Codon usage tabulated from international DNA sequence databases: status for the year 2000

Y Nakamura, T Gojobori, T Ikemura - Nucleic acids research, 2000 - academic.oup.com
Abstract
Codon usage frequencies for each of the 257 468 complete protein-coding sequences (CDSs) have been compiled from the taxonomical divisions of the GenBank DNA sequence database. The sum of the codons used by each of 8792 organisms has also been calculated. The data files can be obtained from the anonymous FTP sites of DDBJ, Kazusa and EBI. A list of the codon usage of genes and the sum of the codons used by each organism can be obtained through the web site http://www.kazusa.or.jp/codon/. The present study also reports recent developments on the WWW site. The new web interface provides data in the CodonFrequency-compatible format as well as in the traditional table format. Use of the database is facilitated by keyword-based search and by the availability of codon usage tables for selected genes from each species. These new tools will allow users to further analyze variations in codon usage among different genomes.
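The tabulation described above (codon counts per CDS, then a sum per organism) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy sequences and the per-thousand normalization are assumptions made for illustration only, and the sketch assumes each CDS is complete and read in frame from its first base.

```python
from collections import Counter


def count_codons(cds: str) -> Counter:
    """Count codons in one CDS, read in frame from its first base.

    Assumption: the sequence is a complete CDS; trailing bases that do
    not fill a whole codon are ignored.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def sum_codon_usage(cds_list) -> Counter:
    """Sum codon counts over all CDSs attributed to one organism."""
    total = Counter()
    for cds in cds_list:
        total += count_codons(cds)
    return total


def per_thousand(counts: Counter) -> dict:
    """Express counts as frequency per 1000 codons (illustrative choice)."""
    n = sum(counts.values())
    if n == 0:
        return {}
    return {codon: 1000.0 * c / n for codon, c in counts.items()}


# Hypothetical toy CDSs, not taken from GenBank.
hypothetical_cds = ["ATGGCTGCTTAA", "ATGGCTAAATAG"]
organism_total = sum_codon_usage(hypothetical_cds)
organism_freq = per_thousand(organism_total)
```

Here `organism_total` holds the summed counts for the two toy sequences, and `organism_freq` rescales them so that tables from organisms with very different numbers of CDSs can be compared.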
Oxford University Press