dc.contributor.author
Morger, Andrea
dc.contributor.author
Svensson, Fredrik
dc.contributor.author
Arvidsson McShane, Staffan
dc.contributor.author
Gauraha, Niharika
dc.contributor.author
Norinder, Ulf
dc.contributor.author
Spjuth, Ola
dc.contributor.author
Volkamer, Andrea
dc.date.accessioned
2023-03-10T14:16:09Z
dc.date.available
2023-03-10T14:16:09Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/38310
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-38029
dc.description.abstract
Machine learning methods are widely used in drug discovery and toxicity prediction. Although they generally perform well in cross-validation studies, their predictive power often drops when the query samples have drifted away from the descriptor space of the training data. The assumption underlying the application of machine learning algorithms, that training and test data stem from the same distribution, may therefore not always hold. In this work, conformal prediction is used to assess the calibration of the models. Deviations from the expected error rate may indicate that training and test data originate from different distributions. Using the Tox21 datasets as an example, which consist of the chronologically released Tox21Train, Tox21Test and Tox21Score subsets, we observed that although internally valid models could be trained using cross-validation on Tox21Train, predictions on the external Tox21Score data resulted in higher error rates than expected. To improve the predictions on the external sets, a strategy that exchanges the calibration set for more recent data, such as Tox21Test, was successfully introduced. We conclude that conformal prediction can be used to diagnose data drifts and other issues related to model calibration. The proposed improvement strategy (exchanging only the calibration data) is convenient because it does not require retraining the underlying model.
en
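The calibration check and the calibration-set exchange described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `p_values` and `error_rate` are hypothetical, the nonconformity scores are synthetic numbers drawn from normal distributions rather than Tox21 data, and class-conditional variants of conformal prediction are ignored. It assumes the standard inductive conformal p-value, where the true label is counted as an error when its p-value does not exceed the chosen significance level.

```python
import numpy as np


def p_values(cal_scores, test_scores):
    """Conformal p-value for each test nonconformity score.

    p = (number of calibration scores >= test score + 1) / (n_cal + 1)
    """
    cal = np.asarray(cal_scores, dtype=float)
    test = np.asarray(test_scores, dtype=float)
    # Compare every test score against every calibration score.
    n_greater_equal = (cal[None, :] >= test[:, None]).sum(axis=1)
    return (n_greater_equal + 1) / (cal.size + 1)


def error_rate(cal_scores, test_scores, significance):
    """Fraction of test samples whose true label falls outside the prediction set.

    The scores passed in are the nonconformity scores of the TRUE labels,
    so an error occurs whenever the p-value is <= significance.
    """
    return float(np.mean(p_values(cal_scores, test_scores) <= significance))


# Synthetic illustration (assumption): the external data have drifted,
# so their nonconformity scores are shifted upwards.
rng = np.random.default_rng(0)
old_calibration = rng.normal(loc=0.0, scale=1.0, size=2000)  # same distribution as training
new_calibration = rng.normal(loc=1.0, scale=1.0, size=2000)  # more recent data
external_test = rng.normal(loc=1.0, scale=1.0, size=2000)    # drifted external set

significance = 0.2
error_old = error_rate(old_calibration, external_test, significance)
error_new = error_rate(new_calibration, external_test, significance)

print(f"expected error rate:          {significance:.2f}")
print(f"observed, old calibration:    {error_old:.2f}")
print(f"observed, new calibration:    {error_new:.2f}")
```

In this toy setting the observed error rate with the old calibration set lies well above the significance level, which is the kind of deviation the abstract treats as a sign of drift. Swapping in calibration scores from the newer distribution brings it back close to the expected level, and the scoring model itself is left untouched.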
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
Toxicity prediction
en
dc.subject
Conformal prediction
en
dc.subject
Applicability domain
en
dc.subject
Calibration plots
en
dc.subject
Tox21 datasets
en
dc.subject.ddc
600 Technology, medicine, applied sciences::610 Medicine and health::610 Medicine and health
dc.title
Assessing the calibration in toxicological in vitro models with conformal prediction
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
35
dcterms.bibliographicCitation.doi
10.1186/s13321-021-00511-5
dcterms.bibliographicCitation.journaltitle
Journal of Cheminformatics
dcterms.bibliographicCitation.originalpublishername
Springer Nature
dcterms.bibliographicCitation.volume
13
refubium.affiliation
Charité - Universitätsmedizin Berlin
refubium.funding
Springer Nature DEAL
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.bibliographicCitation.pmid
33926567
dcterms.isPartOf.eissn
1758-2946