NCBI Reference Sequences: current status, policy and new initiatives

KD Pruitt, T Tatusova, W Klimke, DR Maglott

Nucleic Acids Research, 2009 (academic.oup.com)

Abstract

NCBI's Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) is a curated non-redundant collection of sequences representing genomes, transcripts and proteins. RefSeq records integrate information from multiple sources and represent a current description of the sequence, the gene and sequence features. The database includes over 5300 organisms spanning prokaryotes, eukaryotes and viruses, with records for more than 5.5 × 10⁶ proteins (RefSeq release 30). Feature annotation is applied by a combination of curation, collaboration, propagation from other sources and computation. We report here on the recent growth of the database, recent changes to feature annotations and record types for eukaryotic (primarily vertebrate) species and policies regarding species inclusion and genome annotation. In addition, we introduce RefSeqGene, a new initiative to support reporting variation data on a stable genomic coordinate system.

Oxford University Press
