An Efficient Visual Analysis Method for Cluster Tendency Evaluation, Data Partitioning and Internal Cluster Validation

keywords: Visual clustering, graph analysis, cluster assessment tendency, automatic clustering, visual data partitioning and validation measures.
Visual methods have been extensively studied and performed in cluster data analysis. Given a pairwise dissimilarity matrix D of a set of n objects, visual methods such as Enhanced-Visual Assessment Tendency (E-VAT) algorithm generally represent D as an n times n image I( overlineD) where the objects are reordered to expose the hidden cluster structure as dark blocks along the diagonal of the image. A major constraint of such methods is their lack of ability to highlight cluster structure when D contains composite shaped datasets. This paper addresses this limitation by proposing an enhanced visual analysis method for cluster tendency assessment, where D is mapped to D' by graph based analysis and then reordered to overlineD' using E-VAT resulting graph based Enhanced Visual Assessment Tendency (GE-VAT). An Enhanced Dark Block Extraction (E-DBE) for automatic determination of the number of clusters in I( overlineD') is then proposed as well as a visual data partitioning method for cluster formation from I( overlineD') based on the disparity between diagonal and off-diagonal blocks using permuted indices of GE-VAT. Cluster validation measures are also performed to evaluate the cluster formation. Extensive experimental results on several complex synthetic, UCI and large real-world data sets are analyzed to validate our algorithm.
reference: Vol. 32, 2013, No. 5, pp. 1013–1037