Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals

DV Zaykin, PH Westfall, SS Young, MA Karnoub, MJ Wagner, MG Ehm
Human Heredity, 2002 - karger.com
Abstract
There have been increasing efforts to relate drug efficacy and disease predisposition to genetic polymorphisms. We present statistical tests for association of haplotype frequencies with discrete and continuous traits in samples of unrelated individuals. Haplotype frequencies are estimated through the expectation-maximization algorithm, and each individual in the sample is expanded into all possible haplotype configurations with corresponding probabilities, conditional on that individual's genotype. A regression-based approach is then used to relate inferred haplotype probabilities to the response. The relationship of this technique to commonly used approaches developed for case-control data is discussed. Using simulated data, we compare tests based on inferred haplotypes with single-marker tests, confirming the proper size of the test under H₀ and finding an increase in power under the alternative. More importantly, analysis of real data comprising a dense map of single nucleotide polymorphisms spaced along a 12-cM chromosomal region allows us to confirm the utility of the haplotype approach as well as the validity and usefulness of the proposed statistical technique. The method appears to be successful in relating data from multiple, correlated markers to response.
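The pipeline described in the abstract (EM estimation of haplotype frequencies, expansion of each individual into compatible haplotype configurations, then regression on the inferred probabilities) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes biallelic markers coded as 0/1/2 allele counts, Hardy-Weinberg proportions for haplotype pairs, and a plain least-squares fit in place of whatever test statistic the paper actually uses. All function names here are hypothetical.

```python
from itertools import product
import numpy as np

def compatible_pairs(genotype, haplotypes):
    """Unordered haplotype pairs (i <= j) whose allele counts sum to the genotype."""
    return [(i, j)
            for i in range(len(haplotypes))
            for j in range(i, len(haplotypes))
            if all(a + b == g for a, b, g in zip(haplotypes[i], haplotypes[j], genotype))]

def config_posteriors(pairs, freqs):
    """P(configuration | genotype), assuming Hardy-Weinberg proportions."""
    w = np.array([freqs[i] * freqs[j] * (1.0 if i == j else 2.0) for i, j in pairs])
    return w / w.sum()

def expected_dosages(configs, freqs):
    """Expected number of copies of each haplotype carried by each individual."""
    dosage = np.zeros((len(configs), len(freqs)))
    for row, pairs in enumerate(configs):
        for (i, j), p in zip(pairs, config_posteriors(pairs, freqs)):
            dosage[row, i] += p
            dosage[row, j] += p
    return dosage

def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """EM estimate of haplotype frequencies from unphased genotypes."""
    haplotypes = list(product((0, 1), repeat=len(genotypes[0])))
    configs = [compatible_pairs(g, haplotypes) for g in genotypes]
    freqs = np.full(len(haplotypes), 1.0 / len(haplotypes))
    for _ in range(n_iter):
        # E-step: expected haplotype counts; M-step: renormalise to frequencies.
        counts = expected_dosages(configs, freqs).sum(axis=0)
        new_freqs = counts / counts.sum()
        converged = np.abs(new_freqs - freqs).max() < tol
        freqs = new_freqs
        if converged:
            break
    return haplotypes, freqs, configs

def fit_trait_on_dosages(dosage, trait):
    """Least-squares fit of a trait on haplotype dosages.

    The first haplotype is dropped as the reference category because each
    row of dosages sums to 2 and would be collinear with the intercept.
    """
    design = np.column_stack([np.ones(len(dosage)), dosage[:, 1:]])
    coef, *_ = np.linalg.lstsq(design, np.asarray(trait, dtype=float), rcond=None)
    return coef

# Toy data (made up for illustration): two SNPs, five individuals.
genotypes = [(0, 0), (0, 0), (2, 2), (2, 2), (1, 1)]
haplotypes, freqs, configs = em_haplotype_frequencies(genotypes)
dosage = expected_dosages(configs, freqs)
```

Only the double heterozygote `(1, 1)` is ambiguous in the toy data: it is compatible with both the 00/11 and the 01/10 haplotype pairs, and the EM weights these two configurations by the current frequency estimates. Using expected dosages as regressors is one simple way to carry that phase uncertainty into the regression step.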