Data Clustering: An Approach for Evaluating the Adequate Number of Groups in Partitioned Techniques
Guillermo Molero-Castillo, Yaimara Céspedes-González, Alejandro Velázquez-Mena

Partitioned clustering techniques, such as k-means, are well suited to applications involving large amounts of data, but they require the number of groups (k) to be specified a priori. In practice, the analysis must therefore be repeated with different numbers of groups, and the solution that best suits the objective of the problem is chosen. Validating the results requires mechanisms that evaluate the formation of the groups appropriately. One such strategy uses validation indexes, which help determine whether the groups are adequately formed. These indexes are based on estimates of how compact and how well separated the formed groups are. This paper presents validation indexes used as a strategy to determine the appropriate number of groups. The results obtained indicate that this evaluation approach provides an adequate way to determine the desired number of groups.
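
As a rough illustration of the procedure described above (repeat the clustering for several values of k and score each result with a validation index), the following sketch uses the mean silhouette coefficient. The abstract does not name the indexes the paper evaluates, so the choice of silhouette is an assumption, as are the toy data set, the farthest-first initialization (chosen only to keep the run deterministic), and the helper names `farthest_first`, `kmeans`, and `silhouette`.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding (an assumption, not from the paper): start at the
    first point, then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean of (b - a) / max(a, b), where a is the mean distance to the point's
    own cluster (compactness) and b the mean distance to the nearest other
    cluster (separation)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Toy data (hypothetical): three small, well-separated groups of five points each.
offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx, dy in offsets]

# Repeat the clustering for several k and keep the best-scoring one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
print("selected k:", best_k)
```

On data this clean the index peaks at the true number of groups; on real data the scores are usually closer together, so it is common to compare several indexes rather than rely on one.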

DOI: 10.15640/jcsit.v5n1a3