Principles for the post-GWAS functional characterization of cancer risk loci

ML Freedman, ANA Monteiro, SA Gayther… - Nature …, 2011 - nature.com
ML Freedman, ANA Monteiro, SA Gayther, GA Coetzee, A Risch, C Plass, G Casey…
Nature Genetics, 2011
Perspective. Nature Genetics, volume 43, number 6, June 2011, p. 514. © 2011 Nature America, Inc. All rights reserved.
… region if a similar association was found in this population, as the African-American population generally has smaller LD block structure than the European population (ref. 9). Alternatively, LD structure can be ignored and arbitrary physical limits can be set to define boundaries, for example, by choosing to sequence 1 Mb across the risk allele. The region can be further narrowed through incorporation of biological information for the presence of a compelling candidate gene or transcript. However, note that relying on biological assumptions undermines the agnostic approach of GWAS.
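As a minimal sketch of the "arbitrary physical limits" option, the window to be sequenced can be computed directly from the position of the risk allele. The function name `fixed_window`, the choice to centre the window on the variant, and the example coordinates are illustrative assumptions rather than anything specified in the text.

```python
def fixed_window(pos, size=1_000_000, chrom_length=None):
    """Return a 1-based inclusive (start, end) interval of `size` bp
    centred on `pos` (assumption: the window is centred on the variant).

    The window is truncated, not shifted, if it runs past either end
    of the chromosome.
    """
    half = size // 2
    start = pos - half
    end = start + size - 1
    start = max(1, start)
    if chrom_length is not None:
        end = min(chrom_length, end)
    return start, end


if __name__ == "__main__":
    # Hypothetical risk-allele position on a hypothetical chromosome.
    start, end = fixed_window(128_413_305)
    print(f"sequence {start:,}-{end:,} ({end - start + 1:,} bp)")
```

A window defined this way ignores LD structure entirely, so it may either miss correlated variants outside the limits or include a large amount of sequence unrelated to the association signal.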
The depth of coverage and the number of subjects to be sequenced are important considerations. Current targeted enrichment technologies yield non-uniform sequencing coverage, which could increase the heterozygote false-negative rate. Sequencing coverage of 25× or greater may be required, especially if sequencing-based genotyping and not just variant discovery is a goal. The likelihood of identifying less common variants also depends on the number of subjects sequenced, and DNA from several hundred subjects is often needed. In summary, because both the size of the region and the number of individuals to be sequenced influence cost, the final design will likely be a compromise. Costs can be offset to some extent with the use of molecular barcoding, when individual genotypes are important, and DNA pooling, when variant discovery is important.
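The two design parameters discussed above can be explored with a rough back-of-the-envelope calculation. The Python sketch below rests on simplifying assumptions that are not part of the text: chromosomes are sampled independently, reads at a heterozygous site follow a binomial distribution with an alternate-allele fraction of 0.5, sequencing error is ignored, and a heterozygote is called only when at least `min_alt_reads` reads carry the alternate allele. The function names and the threshold of three reads are hypothetical.

```python
from math import comb


def prob_variant_sampled(allele_freq, n_subjects):
    """Probability that at least one copy of a variant with frequency
    `allele_freq` is present among the 2N chromosomes of N diploid
    subjects (assumes independent sampling of chromosomes)."""
    return 1.0 - (1.0 - allele_freq) ** (2 * n_subjects)


def het_false_negative_rate(depth, min_alt_reads=3, alt_fraction=0.5):
    """Probability that a true heterozygous site shows fewer than
    `min_alt_reads` alternate reads at the given read depth, under a
    Binomial(depth, alt_fraction) model with no sequencing error."""
    return sum(
        comb(depth, k) * alt_fraction**k * (1.0 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )


if __name__ == "__main__":
    for n in (50, 100, 300):
        p = prob_variant_sampled(0.005, n)
        print(f"N={n:>3}: P(0.5% variant present in sample) = {p:.3f}")
    for depth in (8, 15, 25):
        fn = het_false_negative_rate(depth)
        print(f"depth={depth:>2}x: heterozygote false-negative rate = {fn:.2e}")
```

Because targeted enrichment gives non-uniform coverage, the depth at any one site can fall well below the mean, so the false-negative rate should be read at the depth of the poorly covered sites rather than at the average depth.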