Leveraging information across HLA alleles/supertypes improves epitope prediction

D Heckerman, C Kadie, J Listgarten - Journal of Computational Biology, 2007 - liebertpub.com
We present a model for predicting HLA class I restricted CTL epitopes. In contrast to almost all other work in this area, we train a single model on epitopes from all HLA alleles and supertypes, yet retain the ability to make epitope predictions for specific HLA alleles. We are therefore able to leverage data across all HLA alleles and/or their supertypes, automatically learning what information should be shared and also how to combine allele-specific, supertype-specific, and global information in a principled way. We show that this leveraging can improve prediction of epitopes having HLA alleles with known supertypes, and dramatically increases our ability to predict epitopes having alleles which do not fall into any of the known supertypes. Our model, which is based on logistic regression, is simple to implement and understand, is solved by finding a single global maximum, and is more accurate (to our knowledge) than any other model.
Mary Ann Liebert
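
The abstract describes a single logistic regression trained on epitopes from all alleles, combining allele-specific, supertype-specific, and global information. One way to picture this is a feature set that encodes each peptide position at all three levels, so the fit itself decides how much weight to share. The sketch below is only an illustration under stated assumptions: the feature encoding, the L2 penalty, the gradient-ascent optimizer, the function names, and the toy peptides and allele labels are all invented here and are not taken from the paper.

```python
import math
from collections import defaultdict

def features(peptide, allele, supertype=None):
    """Binary features at three levels of specificity (hypothetical encoding)."""
    feats = ["bias"]
    for pos, aa in enumerate(peptide):
        feats.append(f"global|{pos}{aa}")                    # shared by every allele
        feats.append(f"allele={allele}|{pos}{aa}")           # specific to one allele
        if supertype is not None:                            # allele may have no supertype
            feats.append(f"supertype={supertype}|{pos}{aa}")
    return feats

def predict(weights, feats):
    """Logistic regression probability that the peptide is an epitope."""
    z = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, l2=0.01, lr=0.1, epochs=200):
    """Batch gradient ascent on the L2-penalized log-likelihood.

    This objective is concave, so it has a single global maximum; the
    penalty and the optimizer are assumptions made for this sketch.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, label in examples:
            err = label - predict(weights, feats)
            for f in feats:
                grad[f] += err
        for f in set(grad) | set(weights):
            weights[f] += lr * (grad[f] - l2 * weights[f])
    return weights

# Invented toy data: (peptide, allele, supertype or None, is_epitope).
toy = [
    ("LLDVTAAKV", "alleleA", "superS", 1),
    ("GGDVTAAKG", "alleleA", "superS", 0),
    ("LMDKTAARV", "alleleB", "superS", 1),
    ("GMDKTAARG", "alleleB", "superS", 0),
    ("LADETPAQV", "alleleU", None, 1),   # allele with no known supertype
    ("GADETPAQG", "alleleU", None, 0),
]
weights = train([(features(p, a, s), y) for p, a, s, y in toy])

# An allele with no training data and no supertype can still be scored,
# because the global features carry information learned from other alleles.
print(predict(weights, features("LSDFTAAEV", "alleleNew")))
```

In this toy setup the query allele never appears in training, so its allele-specific weights are zero and the prediction rests entirely on the global features. That is the kind of sharing across alleles the abstract refers to, shown here only in miniature.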