The aim of this paper has been to familiarize researchers working with post-genomic data with the multitude of validation techniques available for cluster analysis. To this end, the different types of validation measures have been reviewed, and specific weaknesses of individual measures have been addressed. It is hoped that the analysis has demonstrated not only the importance but also the intricacy of cluster validation. Analytical validation techniques are not sufficient on their own: an understanding of the working principles of clustering algorithms and validation measures, and of the interactions between them, is crucial for fair and objective cluster validation. Owing to the biases intrinsic to many internal validation techniques, the results obtained require careful analysis and should always be double-checked using alternative, complementary validation techniques.
Researchers should be aware that entirely objective cluster validation is possible only on data with a known, well-defined cluster structure; the development and evaluation of new clustering algorithms should therefore always include such data. In this context, the development of synthetic datasets that realistically mimic the properties of biological data [such as simulated gene-expression data (Mendes et al., 2003; Michaud et al., 2003)] is of particular importance, as such datasets permit a controlled study of an algorithm's sensitivity to specific data properties.
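
The kind of controlled study described above can be illustrated with a minimal sketch. It is not the procedure of any of the cited works: the data are toy two-dimensional Gaussian clusters rather than simulated gene-expression profiles, the clustering algorithm is a plain k-means, and the helper names (`make_blobs`, `kmeans`, `rand_index`, `sensitivity_sweep`) are hypothetical and chosen for illustration only. The sketch varies a single data property, the within-cluster spread, and scores each clustering against the known structure using the Rand index as an external measure.

```python
import random
from itertools import combinations


def make_blobs(centers, n_per_cluster, spread, rng):
    """Draw Gaussian points around each center; return (points, true_labels)."""
    points, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd-style k-means (illustrative stand-in for any algorithm)."""
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: (p[0] - centroids[j][0]) ** 2
                                        + (p[1] - centroids[j][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    agree = total = 0
    for i, j in combinations(range(len(a)), 2):
        agree += (a[i] == a[j]) == (b[i] == b[j])
        total += 1
    return agree / total


def sensitivity_sweep(spreads, seed=0, n_per_cluster=30):
    """Score k-means against the known structure for each noise level."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 10.0)]  # assumed ground-truth structure
    results = []
    for spread in spreads:
        points, truth = make_blobs(centers, n_per_cluster, spread, rng)
        found = kmeans(points, len(centers), rng)
        results.append((spread, rand_index(truth, found)))
    return results


if __name__ == "__main__":
    for spread, score in sensitivity_sweep([0.5, 2.0, 5.0, 10.0]):
        print(f"spread={spread:5.1f}  Rand index={score:.3f}")
```

Because the true partition is known by construction, any loss of agreement as the spread grows can be attributed to that single data property. A realistic study would replace the toy generator with a simulator of the relevant biological data and might prefer a chance-corrected external index.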