Research Info

Title
Semi-supervised hierarchical ensemble clustering based on an innovative distance metric and constraint information
Type Article
Keywords
Ensemble clustering, AHC, Semi-supervised clustering, Distance metric, Information constraints
Abstract
Agglomerative Hierarchical Clustering (AHC) is a bottom-up clustering strategy in which each object starts as its own cluster and pairs of clusters are merged as one moves up the hierarchy. It has been shown that no single AHC algorithm performs well in all situations. Ensemble clustering techniques have been introduced to address this problem. These techniques combine several output partitions into a consensus partition with higher accuracy than an individual clustering algorithm achieves. This paper proposes an AHC-based semi-supervised ensemble clustering algorithm to improve performance. In semi-supervised clustering, class membership information is available for some of the objects. Here, we introduce the Semi-Supervised Ensemble Hierarchical Clustering based on Constraints Information (SSEHCCI) algorithm. SSEHCCI is built from several individual AHC-based clustering algorithms. It includes a flexible weighting policy to generate base partitions and uses the constraint information to configure the semi-supervised clustering. In addition, SSEHCCI uses an innovative distance measure to calculate the distance between each pair of objects. Experimental results show that SSEHCCI performs better than existing semi-supervised algorithms on some University of California Irvine (UCI) datasets. Specifically, the average accuracy gain of SSEHCCI over SSDC and RSSC was 2.6% and 1.8%, respectively.
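To make the idea of AHC-based ensemble clustering concrete, the following is a minimal Python sketch of a generic co-association ensemble. It is not the SSEHCCI algorithm: the weighting policy, the constraint information, and the paper's distance measure are not reproduced here. The function names (base_partitions, co_association, consensus), the choice of linkage methods, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def base_partitions(X, k, methods=("single", "complete", "average", "ward")):
    """Run AHC once per linkage method and cut each tree into at most k clusters."""
    return [
        fcluster(linkage(X, method=m), t=k, criterion="maxclust")
        for m in methods
    ]


def co_association(partitions):
    """Fraction of base partitions that put each pair of objects in the same cluster."""
    n = len(partitions[0])
    C = np.zeros((n, n))
    for labels in partitions:
        C += labels[:, None] == labels[None, :]
    return C / len(partitions)


def consensus(X, k):
    """Combine the base partitions by running AHC on the co-association distances."""
    C = co_association(base_partitions(X, k))
    D = 1.0 - C                      # objects that rarely co-occur are far apart
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method="average")
    return fcluster(Z, t=k, criterion="maxclust")


# Toy data (assumed for illustration): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels = consensus(X, k=2)
```

A semi-supervised variant would additionally use the known class memberships, for example when building the distance matrix, but how SSEHCCI does this is not specified in the abstract.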
Researchers Baohua Shen (First researcher), Juan Jiang (Second researcher), Feng Qian (Third researcher), Daoguo Li (Fourth researcher), Yanming Ye (Fifth researcher), Gholamreza Ahmadi (Not in first six researchers)