Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA

Y Yang, J Zhang, J Hoh, F Matsuda, P Xu, M Lathrop, J Ott
Proceedings of the National Academy of Sciences, 2003 - National Acad Sciences
The efficiency of single-nucleotide polymorphism haplotype analysis may be increased by DNA pooling, which can dramatically reduce the number of genotyping assays. We develop a method for obtaining maximum likelihood estimates of haplotype frequencies for different pool sizes, assess the accuracy of these estimates, and show that pooling DNA samples is efficient in estimating haplotype frequencies. Although pooling K individuals increases ambiguities, at least for small pool size K and small numbers of loci, the uncertainty of estimation increases by a factor of less than K relative to unpooled DNA. We also derive the asymptotic variance-covariance matrix of the maximum likelihood estimates and evaluate the accuracy of the variance estimates by Monte Carlo methods. When the number of pools is moderately large, the asymptotic variance estimates are rather accurate. Our analysis allows for completely or partially missing genotyping information. Finally, our methods are applied to single-nucleotide polymorphisms in the angiotensinogen gene.
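The abstract does not say which algorithm is used to maximize the likelihood. As a rough illustration only, the sketch below uses a generic expectation-maximization (EM) scheme, a common choice for this kind of problem, under several assumptions that are not taken from the paper: loci are biallelic (alleles coded 0 and 1), each pool of K individuals is observed as the count of allele 1 at each locus among its 2K haplotypes, haplotypes are sampled independently, and there is no missing data. The function names (`compatible_configs`, `multinomial_prob`, `em_pooled`) are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement, product
from math import factorial


def compatible_configs(pool_counts, n_haps):
    """List every multiset of n_haps haplotypes whose per-locus
    allele-1 counts match the observed pooled counts."""
    n_loci = len(pool_counts)
    haplotypes = list(product((0, 1), repeat=n_loci))
    configs = []
    for combo in combinations_with_replacement(haplotypes, n_haps):
        sums = tuple(sum(h[j] for h in combo) for j in range(n_loci))
        if sums == tuple(pool_counts):
            configs.append(Counter(combo))
    return configs


def multinomial_prob(config, freqs):
    """Multinomial probability of a haplotype multiset given frequencies."""
    coef = factorial(sum(config.values()))
    prob = 1.0
    for hap, count in config.items():
        coef //= factorial(count)
        prob *= freqs[hap] ** count
    return coef * prob


def em_pooled(pools, pool_size, n_iter=200):
    """EM estimate of haplotype frequencies from pooled allele counts.

    pools: list of tuples, one per pool, giving the allele-1 count
           (0..2*pool_size) at each locus.
    """
    n_haps = 2 * pool_size
    n_loci = len(pools[0])
    haplotypes = list(product((0, 1), repeat=n_loci))
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    configs_per_pool = [compatible_configs(p, n_haps) for p in pools]

    for _ in range(n_iter):
        expected = dict.fromkeys(haplotypes, 0.0)
        # E-step: expected haplotype counts, weighting each compatible
        # configuration by its posterior probability within the pool.
        for configs in configs_per_pool:
            weights = [multinomial_prob(c, freqs) for c in configs]
            total = sum(weights)
            for config, weight in zip(configs, weights):
                for hap, count in config.items():
                    expected[hap] += count * weight / total
        # M-step: renormalize expected counts into frequencies.
        norm = n_haps * len(pools)
        freqs = {h: expected[h] / norm for h in haplotypes}
    return freqs


if __name__ == "__main__":
    # Two loci, pools of K = 2 individuals (4 haplotypes per pool).
    example_pools = [(0, 0), (4, 4), (2, 2), (1, 1)]
    for hap, f in em_pooled(example_pools, pool_size=2).items():
        print(hap, round(f, 4))
```

The brute-force enumeration in `compatible_configs` grows quickly with both the pool size and the number of loci, so this sketch is only practical in the small-K, few-loci setting the abstract discusses. Setting `pool_size=1` reduces it to ordinary haplotype estimation from unpooled genotypes.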