Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs

M Kozak - Nucleic acids research, 1984 - academic.oup.com
Abstract
5′-Noncoding sequences have been tabulated for 211 messenger RNAs from higher eukaryotic cells. The 5′-proximal AUG triplet serves as the initiator codon in 95% of the mRNAs examined. The most conspicuous conserved feature is the presence of a purine (most often A) three nucleotides upstream from the AUG initiator codon; only 6 of the mRNAs in the survey have a pyrimidine in that position. There is a predominance of C in positions −1, −2, −4 and −5, just upstream from the initiator codon. The sequence CC(A/G)CCAUG(G) thus emerges as a consensus sequence for eukaryotic initiation sites. The extent to which the ribosome binding site in a given mRNA matches the −1 to −5 consensus sequence varies: more than half of the mRNAs in the tabulation have 3 or 4 nucleotides in common with the CCACC consensus, but only ten mRNAs conform perfectly.
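The comparison described in the abstract, counting how many of positions −5 to −1 agree with CCACC and checking for a purine at −3, can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and are not taken from the paper. Treating the first AUG in the string as the initiator is a simplifying assumption, which the abstract reports holds for about 95% of the mRNAs surveyed.

```python
# Illustrative sketch only: names and example sequences are hypothetical.
CONSENSUS = "CCACC"  # positions -5, -4, -3, -2, -1 relative to the AUG


def upstream_context(mrna: str, aug_index: int) -> str:
    """Return the five nucleotides immediately upstream of the AUG."""
    if aug_index < len(CONSENSUS):
        raise ValueError("fewer than five nucleotides upstream of the AUG")
    return mrna[aug_index - len(CONSENSUS):aug_index]


def consensus_matches(context: str) -> int:
    """Count positions (-5 to -1) that agree with the CCACC consensus."""
    return sum(a == b for a, b in zip(context, CONSENSUS))


def has_purine_at_minus3(context: str) -> bool:
    """Position -3 is the third character of the five-nucleotide context."""
    return context[2] in "AG"


def score_first_aug(mrna: str) -> tuple[int, bool]:
    """Score the context of the 5'-proximal AUG (assumed to be the initiator)."""
    aug_index = mrna.find("AUG")
    if aug_index == -1:
        raise ValueError("no AUG found")
    context = upstream_context(mrna, aug_index)
    return consensus_matches(context), has_purine_at_minus3(context)


if __name__ == "__main__":
    print(score_first_aug("GGCCACCAUGGCU"))  # perfect CCACC context
    print(score_first_aug("GGUUUCUAUGGC"))   # weak context, pyrimidine at -3
```

A match count of 5 corresponds to what the abstract calls perfect conformity, and counts of 3 or 4 correspond to the partial matches found in more than half of the tabulated mRNAs.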
Oxford University Press