Global Phylogeny of Mycobacterium tuberculosis Based on Single Nucleotide Polymorphism (SNP) Analysis: Insights into Tuberculosis Evolution, Phylogenetic …

I Filliol, AS Motiwala, M Cavatore, W Qi, MH Hazbón, M Bobadilla del Valle, J Fyfe…
Journal of Bacteriology, 2006 - Am Soc Microbiol
Abstract
We analyzed a global collection of Mycobacterium tuberculosis strains using 212 single nucleotide polymorphism (SNP) markers. SNP nucleotide diversity was high (average across all SNPs, 0.19), and 96% of the SNP locus pairs were in complete linkage disequilibrium. Cluster analyses identified six deeply branching, phylogenetically distinct SNP cluster groups (SCGs) and five subgroups. The SCGs were strongly associated with the geographical origin of the M. tuberculosis samples and the birthplace of the human hosts. The most ancestral cluster (SCG-1) predominated in patients from the Indian subcontinent, while SCG-1 and another ancestral cluster (SCG-2) predominated in patients from East Asia, suggesting that M. tuberculosis first arose in the Indian subcontinent and spread worldwide through East Asia. Restricted SCG diversity and the prevalence of less ancestral SCGs in indigenous populations in Uganda and Mexico suggested a more recent introduction of M. tuberculosis into these regions. The East African Indian and Beijing spoligotypes were concordant with SCG-1 and SCG-2, respectively; X and Central Asian spoligotypes were also associated with one SCG or subgroup combination. Other clades had less consistent associations with SCGs. Mycobacterial interspersed repetitive unit (MIRU) analysis provided less robust phylogenetic information, and only 6 of the 12 MIRU microsatellite loci were highly differentiated between SCGs as measured by GST. Finally, an algorithm was devised to identify two minimal sets of either 45 or 6 SNPs that could be used in future investigations to enable global collaborations for studies on evolution, strain differentiation, and biological differences of M. tuberculosis.
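The abstract names two computations without giving their details: a per-SNP diversity value and an algorithm that reduces the marker panel to a minimal set. The Python sketch below illustrates both under stated assumptions and is not the authors' method. It assumes diversity is the standard unbiased gene diversity, n/(n-1) × (1 − Σ pᵢ²), and it substitutes a simple greedy selection for the unspecified minimal-set algorithm. The names `locus_diversity`, `greedy_minimal_snps`, and `toy_profiles`, as well as the toy alleles, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def locus_diversity(alleles):
    """Unbiased gene diversity at one SNP locus: n/(n-1) * (1 - sum(p_i^2)).

    Assumption: the abstract does not state which diversity estimator was used.
    """
    n = len(alleles)
    if n < 2:
        return 0.0
    counts = Counter(alleles)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def greedy_minimal_snps(profiles):
    """Greedily pick SNP indices until every pair of groups differs at a chosen SNP.

    profiles maps a group label to a tuple of alleles, one per SNP.
    Greedy selection is an illustrative stand-in for the paper's algorithm,
    and it is not guaranteed to find the smallest possible set.
    """
    unresolved = set(combinations(sorted(profiles), 2))
    n_snps = len(next(iter(profiles.values())))
    chosen = []
    while unresolved:
        best, best_split = None, set()
        for j in range(n_snps):
            # Pairs of groups that this SNP would newly separate.
            split = {(a, b) for a, b in unresolved
                     if profiles[a][j] != profiles[b][j]}
            if len(split) > len(best_split):
                best, best_split = j, split
        if best is None:
            raise ValueError("some groups have identical SNP profiles")
        chosen.append(best)
        unresolved -= best_split
    return chosen


# Hypothetical toy data: four groups typed at four SNPs (not from the paper).
toy_profiles = {
    "group_A": ("A", "C", "G", "T"),
    "group_B": ("A", "C", "G", "C"),
    "group_C": ("G", "C", "G", "T"),
    "group_D": ("G", "T", "G", "C"),
}

if __name__ == "__main__":
    first_snp = [profile[0] for profile in toy_profiles.values()]
    print("diversity at SNP 0:", round(locus_diversity(first_snp), 3))
    print("selected SNP indices:", greedy_minimal_snps(toy_profiles))
```

On the toy data, two of the four SNPs are enough to separate all four groups, which mirrors the idea of replacing a 212-marker panel with a much smaller one. Real data would also have to handle missing calls and within-group variation, which this sketch ignores.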
American Society for Microbiology