Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database

TG Bell, M Yousif, A Kramvis - Springerplus, 2016 - Springer
Abstract
Background
Hepatitis B virus (HBV) DNA sequence data from thousands of samples are present in the public sequence databases. However, there are no publicly available, up-to-date multiple sequence alignments containing full-length and subgenomic fragments for each genotype. Such alignments are useful in many analysis applications, including data mining and phylogenetic analyses.
Results
All HBV sequence data in the GenBank public database were downloaded by issuing a query (67,893 sequences). Full-length and subgenomic sequences that had been genotyped by the submitters (30,852 sequences) were placed into a multiple sequence alignment for each genotype (genotype A: 5868 sequences, B: 4630, C: 7820, D: 8300, E: 2043, F: 985, G: 189, H: 108, I: 23), according to the results of offline BLAST searches against a custom reference library of full-length sequences. Further curation was performed to improve the alignments.
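The placement step can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes BLAST results are available as tab-separated lines with the default twelve tabular columns, that each hit is ungapped and on the plus strand, and that a single reference length applies. All function names (`best_hits`, `place_fragment`, `bin_by_genotype`) are hypothetical.

```python
from collections import defaultdict

# Default column order of BLAST tabular output (assumed input format).
BLAST_COLUMNS = ("qseqid sseqid pident length mismatch gapopen "
                 "qstart qend sstart send evalue bitscore").split()


def best_hits(blast_lines):
    """Return the highest-bitscore hit per query sequence."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, line.rstrip("\n").split("\t")))
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


def place_fragment(seq, hit, ref_length):
    """Pad a fragment with gaps so it sits at its hit position.

    Simplification (an assumption of this sketch): the hit is ungapped
    and on the plus strand, so only leading and trailing gaps are needed.
    """
    qstart, qend = int(hit["qstart"]), int(hit["qend"])
    sstart = int(hit["sstart"])
    aligned = seq[qstart - 1:qend]          # BLAST coordinates are 1-based
    row = "-" * (sstart - 1) + aligned
    return row[:ref_length].ljust(ref_length, "-")


def bin_by_genotype(records, hits, ref_length):
    """Group padded rows by submitter-assigned genotype.

    records: iterable of (seq_id, genotype, sequence) tuples.
    Sequences without a BLAST hit are skipped.
    """
    alignments = defaultdict(dict)
    for seq_id, genotype, seq in records:
        hit = hits.get(seq_id)
        if hit is None:
            continue
        alignments[genotype][seq_id] = place_fragment(seq, hit, ref_length)
    return alignments
```

Each genotype then maps to a set of equal-length rows in which a subgenomic fragment occupies the reference coordinates reported by its best hit. A real pipeline would also have to handle gapped hits, reverse-strand matches, and the circular genome, which this sketch leaves out.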
Conclusions
The algorithm described in this paper generates, for each of the nine HBV genotypes, a multiple sequence alignment containing full-length and subgenomic fragments. The alignments can be updated as new sequences become available in the online public sequence databases. The alignments are available at http://hvdr.bioinf.wits.ac.za/alignments