A Parallel Algorithm for Large-Scale Multiple Sequence Alignment

keywords: Bioinformatics, parallel algorithm, multiple sequence alignment
Multiple sequence alignment is a central and extensively studied problem in computational biology. Two or more protein sequences are compared to evaluate their similarity and to identify conserved regions. This work reports a methodology for running a multiple sequence alignment algorithm (ClustalW) in parallel on networked computers. The modules that compose the distributed system are described in detail, with special attention to how a dynamic programming algorithm is run with multilevel parallelism. Extensive experiments were carried out to evaluate the performance and scalability of the method. The results suggest that the method is promising for large-scale multiple protein sequence alignment.
mathematics subject classification 2000: 68W10: Parallel algorithms; 92-08: Computational Methods; 92D20: Protein sequences
reference: Vol. 29, 2010, No. 6+, pp. 1233–1250
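
The abstract refers to a dynamic programming algorithm run in parallel. As a rough illustration only, and not the authors' implementation, the sketch below scores every pair of sequences with a Needleman-Wunsch recurrence and hands the independent pairs to a pool of workers, which is the kind of coarse-grained parallelism that an all-pairs alignment stage allows. The function names (nw_score, pairwise_scores), the linear gap penalty, and the match/mismatch scores are assumptions made for this example; ClustalW itself uses substitution matrices and affine gap penalties, and the paper's multilevel scheme is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Illustrative scores only (assumption); not ClustalW's scoring scheme.
MATCH, MISMATCH, GAP = 1, -1, -2


def nw_score(a: str, b: str) -> int:
    """Global alignment score (Needleman-Wunsch, linear gaps), two rolling rows."""
    prev = [j * GAP for j in range(len(b) + 1)]  # row 0: b prefix aligned to gaps
    for i, ca in enumerate(a, start=1):
        curr = [i * GAP]  # column 0: a prefix aligned to gaps
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (MATCH if ca == cb else MISMATCH)
            curr.append(max(diag, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


def pairwise_scores(seqs, workers=4):
    """Score all unordered pairs; each pair is an independent task."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda p: nw_score(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, scores))


if __name__ == "__main__":
    print(pairwise_scores(["ACGT", "ACGT", "AGT"]))
```

A thread pool keeps the sketch self-contained; in CPython it does not speed up CPU-bound scoring, so a real deployment would swap in a process pool or workers on separate networked machines, with the same one-task-per-pair decomposition.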