Multiple sequence alignment with hierarchical clustering

F Corpet - Nucleic acids research, 1988 - academic.oup.com
Abstract
An algorithm is presented for the multiple alignment of sequences, either proteins or nucleic acids, that is both accurate and easy to use on microcomputers. The approach is based on the conventional dynamic-programming method of pairwise alignment. Initially, a hierarchical clustering of the sequences is performed using the matrix of the pairwise alignment scores. The closest sequences are aligned, creating groups of aligned sequences. Close groups are then aligned until all sequences are aligned in one group. The pairwise alignments included in the multiple alignment form a new matrix that is used to produce a hierarchical clustering. If this clustering differs from the first one, the process can be iterated. The method is illustrated by an example: a global alignment of 39 sequences of cytochrome c.
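The first two steps described in the abstract (pairwise scoring by dynamic programming, then hierarchical clustering of the score matrix to decide the order of alignment) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `nw_score` and `guide_order`, the match/mismatch/gap values, the linear gap penalty, and the use of average linkage are all assumptions made here, since the abstract does not specify them. The group-to-group alignment step and the iteration are omitted.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score by dynamic programming (linear gap penalty).

    The scoring values are illustrative assumptions, not the paper's.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def guide_order(seqs):
    """Return the group merges in the order they would be aligned.

    Each merge is a pair of tuples of sequence indices. Groups are joined
    by highest average pairwise score (average linkage, an assumption).
    """
    # Matrix of pairwise alignment scores, keyed by unordered index pair.
    score = {
        frozenset((i, j)): nw_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }

    def similarity(g, h):
        total = sum(score[frozenset((i, j))] for i in g for j in h)
        return total / (len(g) * len(h))

    groups = [(i,) for i in range(len(seqs))]
    merges = []
    while len(groups) > 1:
        # Pick the closest pair of groups and merge them.
        g, h = max(combinations(groups, 2), key=lambda p: similarity(*p))
        groups.remove(g)
        groups.remove(h)
        groups.append(g + h)
        merges.append((g, h))
    return merges


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "TTTT"]  # toy data, not from the paper
    for left, right in guide_order(toy):
        print("align group", left, "with group", right)
```

In a full progressive aligner, each merge returned by `guide_order` would be carried out as an alignment between two groups of already aligned sequences, and the clustering would be recomputed from the resulting alignment to check whether another iteration is needed.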
Oxford University Press