Comprehensive analysis of multiple microarray datasets by binarization of consensus partition matrix
Abu-Jamous B., Fa R., Roberts DJ., Nandi AK.
Clustering methods are increasingly applied to gene expression datasets. Different clustering methods produce different results on the same dataset, and the same set of genes clusters differently across different microarray datasets. Most approaches cluster gene profiles from only one dataset, using either a single method or an ensemble of methods. We propose using the binarization of consensus partition matrix (Bi-CoPaM) method to comprehensively analyze the results of clustering the same set of genes by different clustering methods and from different datasets. The method generates a tunable consensus result that can be tightened or widened to control the assignment of doubtful genes, that is, genes assigned to different clusters in different individual results. We apply the method to a subset of 384 yeast genes using four clustering methods and five microarray datasets. The results demonstrate the power of Bi-CoPaM in fusing many different individual results into a tunable consensus result, and show that such comprehensive analysis can overcome many of the defects in any individual dataset or clustering method. © 2012 IEEE.
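
The idea of a consensus that can be tightened or widened can be illustrated with a minimal sketch. The abstract does not give the exact rule, so the following rests on assumptions: the individual partitions are taken to be already aligned (cluster k means the same group in every run), the consensus is a plain average of binary membership matrices, and binarization is a simple value threshold. The names `consensus_matrix`, `binarize`, and `threshold` are hypothetical and chosen for illustration only.

```python
from typing import List

# A partition is a binary membership matrix: partition[k][g] == 1 means
# gene g was placed in cluster k by one (method, dataset) run.
Partition = List[List[int]]


def consensus_matrix(partitions: List[Partition]) -> List[List[float]]:
    """Average aligned binary partitions into a fuzzy consensus matrix.

    Assumption: clusters are already matched across runs, so row k refers
    to the same cluster in every partition.
    """
    n_runs = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[k][g] for p in partitions) / n_runs for g in range(n_genes)]
        for k in range(n_clusters)
    ]


def binarize(consensus: List[List[float]], threshold: float) -> Partition:
    """Assign gene g to cluster k when its consensus membership >= threshold.

    Assumption: a plain value threshold stands in for the tuning parameter.
    A high threshold tightens clusters (doubtful genes stay unassigned);
    a low threshold widens them (doubtful genes may join several clusters).
    """
    return [[1 if m >= threshold else 0 for m in row] for row in consensus]


# Three toy runs over 3 genes and 2 clusters; gene 1 is the doubtful one.
runs = [
    [[1, 1, 0], [0, 0, 1]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 1]],
]
fuzzy = consensus_matrix(runs)
tight = binarize(fuzzy, 1.0)  # gene 1 is left out of both clusters
wide = binarize(fuzzy, 0.3)   # gene 1 is included in both clusters
```

In this toy case, gene 1 is placed in cluster 0 by two runs and in cluster 1 by one run, so the tight setting excludes it from both clusters while the wide setting includes it in both.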