dc.contributor.author
Morger, Andrea
dc.contributor.author
Mathea, Miriam
dc.contributor.author
Achenbach, Janosch H.
dc.contributor.author
Wolf, Antje
dc.contributor.author
Buesen, Roland
dc.contributor.author
Schleifer, Klaus-Juergen
dc.contributor.author
Landsiedel, Robert
dc.contributor.author
Volkamer, Andrea
dc.date.accessioned
2020-07-20T12:54:17Z
dc.date.available
2020-07-20T12:54:17Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/27855
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-27608
dc.description.abstract
Risk assessment of newly synthesised chemicals is a prerequisite for regulatory approval. In this context, in silico methods have great potential to reduce time, cost, and ultimately animal testing, as they make use of the ever-growing amount of available toxicity data. Here, KnowTox is presented, a novel pipeline that combines three in silico toxicology approaches to allow for confident prediction of potentially toxic effects of query compounds: machine learning models for 88 endpoints, alerts for 919 toxic substructures, and computational support for read-across. It is mainly based on the ToxCast dataset, which after preprocessing comprises a sparse matrix of 7912 compounds tested against 985 endpoints. When applying machine learning models, the applicability and reliability of predictions for new chemicals are of utmost importance. Therefore, first, the conformal prediction technique was deployed, which comprises an additional calibration step and by definition creates internally valid predictors at a given significance level. Second, to further improve validity and information efficiency, two adaptations are suggested and exemplified on the androgen receptor antagonism endpoint. Introducing KNNRegressor normalisation achieved an absolute increase in validity of 23% on the in-house dataset of 534 compounds. This increase in validity comes at the cost of efficiency, which could in turn be improved by 20% for the initial ToxCast model by balancing the dataset during model training. Finally, the value of the developed pipeline for risk assessment is discussed using two in-house triazole molecules. Compared to a single toxicity prediction method, combining the complementary outputs of different approaches can have a higher impact on guiding toxicity testing and on de-selecting the development-candidate compounds most likely to be harmful early in the development process.
en
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
Toxicity prediction
en
dc.subject
Random forest
en
dc.subject
Conformal prediction
en
dc.subject
Confidence estimation
en
dc.subject
Applicability domain
en
dc.subject
Androgen receptor
en
dc.subject.ddc
600 Technology, medicine, applied sciences::610 Medicine and health::610 Medicine and health
dc.title
KnowTox: pipeline and case study for confident prediction of potential toxic effects of compounds in early phases of development
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
24
dcterms.bibliographicCitation.doi
10.1186/s13321-020-00422-x
dcterms.bibliographicCitation.journaltitle
Journal of Cheminformatics
dcterms.bibliographicCitation.originalpublishername
BMC
dcterms.bibliographicCitation.volume
12
refubium.affiliation
Charité - Universitätsmedizin Berlin
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
1758-2946