TY - GEN
T1 - Clustering nodes in large-scale biological networks using external memory algorithms
AU - Arefin, Ahmed Shamsul
AU - Inostroza-Ponta, Mario
AU - Mathieson, Luke
AU - Berretta, Regina
AU - Moscato, Pablo
PY - 2011
Y1 - 2011
N2 - Novel analytical techniques have dramatically enhanced our understanding of many application domains including biological networks inferred from gene expression studies. However, there are clear computational challenges associated to the large datasets generated from these studies. The algorithmic solution of some NP-hard combinatorial optimization problems that naturally arise on the analysis of large networks is difficult without specialized computer facilities (i.e. supercomputers). In this work, we address the data clustering problem of large-scale biological networks with a polynomial-time algorithm that uses reasonable computing resources and is limited by the available memory. We have adapted and improved the MSTkNN graph partitioning algorithm and redesigned it to take advantage of external memory (EM) algorithms. We evaluate the scalability and performance of our proposed algorithm on a well-known breast cancer microarray study and its associated dataset.
AB - Novel analytical techniques have dramatically enhanced our understanding of many application domains including biological networks inferred from gene expression studies. However, there are clear computational challenges associated to the large datasets generated from these studies. The algorithmic solution of some NP-hard combinatorial optimization problems that naturally arise on the analysis of large networks is difficult without specialized computer facilities (i.e. supercomputers). In this work, we address the data clustering problem of large-scale biological networks with a polynomial-time algorithm that uses reasonable computing resources and is limited by the available memory. We have adapted and improved the MSTkNN graph partitioning algorithm and redesigned it to take advantage of external memory (EM) algorithms. We evaluate the scalability and performance of our proposed algorithm on a well-known breast cancer microarray study and its associated dataset.
KW - Data clustering
KW - external memory algorithms
KW - gene expression data analysis
KW - graph algorithms
UR - http://www.scopus.com/inward/record.url?scp=80455162373&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-24669-2_36
DO - 10.1007/978-3-642-24669-2_36
M3 - Conference proceeding contribution
AN - SCOPUS:80455162373
SN - 9783642246685
VL - 7017 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 375
EP - 386
BT - Algorithms and Architectures for Parallel Processing - 11th International Conference, ICA3PP 2011, Proceedings
A2 - Xiang, Yang
A2 - Cuzzocrea, Alfredo
A2 - Hobbs, Michael
A2 - Zhou, Wanlei
T2 - 11th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2011
Y2 - 24 October 2011 through 26 October 2011
ER -