A general method applicable to the search for similarities in the amino acid sequence of two proteins

SB Needleman, CD Wunsch - Journal of molecular biology, 1970 - Elsevier
Abstract
A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed. From these findings it is possible to determine whether significant homology exists between the proteins. This information is used to trace their possible evolutionary development.
The maximum match is a number dependent upon the similarity of the sequences. One of its definitions is the largest number of amino acids of one protein that can be matched with those of a second protein allowing for all possible interruptions in either of the sequences. While the interruptions give rise to a very large number of comparisons, the method efficiently excludes from consideration those comparisons that cannot contribute to the maximum match.
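The definition above can be read as a recurrence: at each pair of positions, either pair the two residues when they are alike, or introduce an interruption by skipping a residue in one of the sequences. The Python sketch below is a minimal illustration under that reading, not the authors' original procedure. The function name `maximum_match` and the toy sequences are hypothetical, and interruptions are assumed to carry no penalty. Memoization plays the role of excluding redundant comparisons, since each pair of positions is evaluated only once.

```python
from functools import lru_cache


def maximum_match(a: str, b: str) -> int:
    """Largest number of residues of `a` that can be matched, in order,
    with residues of `b`, allowing interruptions in either sequence.

    Assumption: interruptions are free (no gap penalty).
    """

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        # Nothing left to match once either sequence is exhausted.
        if i == len(a) or j == len(b):
            return 0
        # Interruption: skip one residue in `a` or one residue in `b`.
        skipped = max(best(i + 1, j), best(i, j + 1))
        # Like residues may be paired, contributing one to the match.
        if a[i] == b[j]:
            return max(skipped, 1 + best(i + 1, j + 1))
        return skipped

    return best(0, 0)


# Hypothetical toy sequences in one-letter amino acid code.
print(maximum_match("MKVLA", "MVKLA"))
```

The recursion depth grows with the combined sequence length, so this form suits short sequences only; an explicit array avoids that limit.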
Comparisons are made from the smallest unit of significance, a pair of amino acids, one from each protein. All possible pairs are represented by a two-dimensional array, and all possible comparisons are represented by pathways through the array. For this maximum match only certain of the possible pathways must be evaluated. A numerical value, one in this case, is assigned to every cell in the array representing like amino acids. The maximum match is the largest number that would result from summing the cell values of every pathway.
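The array and pathway description can be sketched in Python as follows. This is a simplified illustration using the now-common forward dynamic-programming formulation, which may differ in detail from the procedure in the paper. The names `match_array` and `maximum_match_path` are hypothetical, as are the toy sequences, and interruptions are assumed to carry no penalty. Cells for like amino acids hold one, all others hold zero, and a pathway is a list of cells whose row and column indices both strictly increase.

```python
def match_array(a: str, b: str) -> list[list[int]]:
    """Two-dimensional array: 1 where the residues are alike, else 0."""
    return [[1 if x == y else 0 for y in b] for x in a]


def maximum_match_path(a: str, b: str) -> tuple[int, list[tuple[int, int]]]:
    """Return the maximum match and one pathway (matched cells) achieving it."""
    cell = match_array(a, b)
    n, m = len(a), len(b)
    # score[i][j] = best pathway sum using the first i residues of a
    # and the first j residues of b.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + cell[i - 1][j - 1],  # pair the residues
                score[i - 1][j],                           # interruption in b
                score[i][j - 1],                           # interruption in a
            )
    # Trace back from the corner to recover one maximal pathway.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        if cell[i - 1][j - 1] == 1 and score[i][j] == score[i - 1][j - 1] + 1:
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i - 1][j] >= score[i][j - 1]:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return score[n][m], path


# Hypothetical toy sequences in one-letter amino acid code.
total, cells = maximum_match_path("MKVLA", "MVKLA")
print(total, cells)
```

Filling the array takes time proportional to the product of the two sequence lengths, so no pathway has to be enumerated individually.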