dc.contributor.author
Abella, Jayvee R.
dc.contributor.author
Antunes, Dinler Amaral
dc.contributor.author
Clementi, Cecilia
dc.contributor.author
Kavraki, Lydia E.
dc.date.accessioned
2021-03-17T09:39:23Z
dc.date.available
2021-03-17T09:39:23Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/29962
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-29704
dc.description.abstract
Predicting stable peptide binding to Class I HLAs is an important component in designing immunotherapies. While the best-performing predictors are based on machine learning algorithms trained on peptide-HLA (pHLA) sequences, the use of structure for training predictors deserves further exploration. Given enough pHLA structures, a predictor based on the residue-residue interactions found in these structures has the potential to generalize to alleles with little or no experimental data. We have previously developed APE-Gen, a modeling approach able to produce pHLA structures in a scalable manner. In this work, we use APE-Gen to model over 150,000 pHLA structures, the largest dataset of its kind, and use them to train a structure-based pan-allele model. We extract simple, homogeneous features based on residue-residue distances between peptide and HLA, and build a random forest model for predicting stable pHLA binding. Our model achieves competitive AUROC values on leave-one-allele-out validation tests while using significantly less data than popular sequence-based methods. Additionally, our model offers an interpretation analysis that can reveal how the model combines the features to arrive at any given prediction. This interpretation analysis can be used to check whether the model is in line with chemical intuition, and we showcase particular examples. Our work is a significant step toward using structure to achieve generalizable and more interpretable prediction of stable pHLA binding.
en
dc.format.extent
9 pages
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
structural modeling
en
dc.subject
random forests
en
dc.subject
machine learning
en
dc.subject
peptide binding
en
dc.subject
immunopeptidomics
en
dc.subject
antigen presentation
en
dc.subject.ddc
500 Natural sciences and mathematics::530 Physics::530 Physics
dc.subject.ddc
000 Computer science, information, and general works::000 Computer science, knowledge, and systems::000 Computer science, information, and general works
dc.subject.ddc
500 Natural sciences and mathematics::540 Chemistry::540 Chemistry and allied sciences
dc.title
Large-Scale Structure-Based Prediction of Stable Peptide Binding to Class I HLAs Using Random Forests
dc.type
Scientific article
dc.identifier.sepid
80576
dcterms.bibliographicCitation.articlenumber
1583
dcterms.bibliographicCitation.doi
10.3389/fimmu.2020.01583
dcterms.bibliographicCitation.journaltitle
Frontiers in Immunology
dcterms.bibliographicCitation.originalpublishername
Frontiers Media
dcterms.bibliographicCitation.originalpublisherplace
Lausanne
dcterms.bibliographicCitation.volume
11
dcterms.bibliographicCitation.url
http://dx.doi.org/10.3389/fimmu.2020.01583
refubium.affiliation
Physics
refubium.affiliation.other
Institut für Theoretische Physik
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
1664-3224