A map of human genome variation from population-scale sequencing

1000 Genomes Project Consortium - Nature, 2010 - ncbi.nlm.nih.gov
Abstract
The 1000 Genomes Project aims to provide a deep characterisation of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. We present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput sequencing platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations, high-coverage sequencing of two mother-father-child trios, and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million SNPs, 1 million short insertions and deletions and 20,000 structural variants, the majority of which were previously undescribed. We show that over 95% of the currently accessible variants found in any individual are present in this dataset; on average, each person carries approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We find many putative functional variants with large allele frequency differences between populations. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
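
To make the units of the de novo rate concrete, here is a minimal sketch of how a per-base-pair, per-generation rate could be computed from a single trio. It assumes the rate is the number of confirmed de novo substitutions in the child divided by the number of callable positions, counted once for each of the two inherited haploid genomes. The function name `de_novo_rate` and the input values are hypothetical illustrations, not the consortium's pipeline or figures from the paper.

```python
def de_novo_rate(n_de_novo: int, callable_bases: int, haploid_copies: int = 2) -> float:
    """Estimate a per-base-pair, per-generation substitution rate from one trio.

    n_de_novo      -- confirmed de novo substitutions seen in the child (assumed input)
    callable_bases -- genome positions where a de novo call could be made (assumed input)
    haploid_copies -- each position is inherited twice, once from each parent
    """
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_de_novo / (haploid_copies * callable_bases)


# Hypothetical inputs chosen only to illustrate the arithmetic.
rate = de_novo_rate(n_de_novo=50, callable_bases=2_500_000_000)
print(f"{rate:.1e} per base pair per generation")
```

With these illustrative inputs, 50 events over 2 × 2.5 billion inherited base pairs gives a rate of the order of 10⁻⁸, the same order of magnitude the abstract reports.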