Interpreting cDNA sequences: some insights from studies on translation

M Kozak - Mammalian genome, 1996 - Springer
Abstract
This review discusses some rules for assessing the completeness of a cDNA sequence and identifying the start site for translation. Features commonly invoked—such as an ATG codon in a favorable context for initiation, or the presence of an upstream in-frame terminator codon, or the prediction of a signal peptide-like sequence at the amino terminus—have some validity; but examples drawn from the literature illustrate limitations to each of these criteria. The best advice is to inspect a cDNA sequence not only for these positive features but also for the absence of certain negative indicators. Three specific warning signs are discussed and documented: (i) The presence of numerous ATG codons upstream from the presumptive start site for translation often indicates an aberration (sometimes a retained intron) at the 5′ end of the cDNA. (ii) Even one strong, upstream, out-of-frame ATG codon poses a problem if the reading frame set by the upstream ATG overlaps the presumptive start of the major open reading frame. Many cDNAs that display this arrangement turn out to be incomplete; that is, the out-of-frame ATG codon is within, rather than upstream from, the protein coding domain. (iii) A very weak context at the putative start site for translation often means that the cDNA lacks the authentic initiator codon. In addition to presenting some criteria that may aid in recognizing incomplete cDNA sequences, the review includes some advice for using in vitro translation systems for the expression of cDNAs. Some unresolved questions about translational regulation are discussed by way of illustrating the importance of verifying mRNA structures before making deductions about translation.
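The three warning signs above lend themselves to a quick automated screen. The following is a minimal sketch in Python, not a method taken from the review. The abstract does not define "strong" or "weak" context, so the code assumes the commonly cited rule of a purine at position -3 and a G at position +4 relative to the A of the ATG. The function names and the threshold for "numerous" upstream ATGs (`max_upstream_atgs`) are hypothetical choices made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def context_strength(seq, atg_pos):
    """Classify ATG context (assumed rule: purine at -3, G at +4)."""
    minus3 = seq[atg_pos - 3] if atg_pos >= 3 else None
    plus4 = seq[atg_pos + 3] if atg_pos + 3 < len(seq) else None
    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "adequate", "strong")[hits]


def orf_end(seq, atg_pos):
    """Index just past the first in-frame stop codon, or len(seq) if none."""
    for i in range(atg_pos, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3
    return len(seq)


def warning_signs(cdna, start, max_upstream_atgs=2):
    """Screen a cDNA for the three negative indicators.

    cdna  -- sense-strand sequence (DNA or RNA letters)
    start -- 0-based index of the presumptive initiator ATG
    max_upstream_atgs -- hypothetical cutoff for "numerous" upstream ATGs
    """
    seq = cdna.upper().replace("U", "T")
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG at the presumptive start position")

    # (i) every ATG lying upstream of the presumptive start
    upstream = [i for i in range(start) if seq[i:i + 3] == "ATG"]

    # (ii) strong, out-of-frame upstream ATGs whose reading frame
    #      runs past the presumptive start codon
    overlapping = [
        i for i in upstream
        if (start - i) % 3 != 0
        and context_strength(seq, i) == "strong"
        and orf_end(seq, i) > start
    ]

    return {
        "upstream_atgs": upstream,
        "many_upstream_atgs": len(upstream) > max_upstream_atgs,
        "overlapping_out_of_frame_atgs": overlapping,
        # (iii) very weak context at the putative start site
        "weak_start_context": context_strength(seq, start) == "weak",
    }


if __name__ == "__main__":
    # Toy sequence: a strong out-of-frame ATG at index 3 overlaps the start at 13.
    print(warning_signs("ACCATGGCCCACCATGGCTTAA", 13))
```

A flag from this screen only means the sequence deserves closer inspection. As the abstract stresses, each criterion has documented exceptions, so the mRNA structure still has to be verified experimentally.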