dc.contributor.author
Mousa, Nesma
dc.contributor.author
Varbanov, Hristo P.
dc.contributor.author
Kaipanchery, Vidya
dc.contributor.author
Gabano, Elisabetta
dc.contributor.author
Ravera, Mauro
dc.contributor.author
Toropov, Andrey A.
dc.contributor.author
Charochkina, Larisa
dc.contributor.author
Menezes, Filipe
dc.contributor.author
Godin, Guillaume
dc.contributor.author
Tetko, Igor V.
dc.date.accessioned
2025-05-16T08:07:35Z
dc.date.available
2025-05-16T08:07:35Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/47678
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-47396
dc.description.abstract
Predicting the solubility and lipophilicity of platinum(II, IV) complexes is essential for prioritizing potential anticancer candidates in drug discovery. This study introduces the first publicly available online model for predicting the solubility of platinum complexes, addressing the lack of literature and models in this regard. Using a time-split dataset, we developed a consensus model with a Root Mean Squared Error (RMSE) of 0.62 in 5-fold cross-validation on a training set of 284 historical compounds (solubility data reported prior to 2017). However, the RMSE increased to 0.86 when the model was applied to a prospective test set of 108 compounds reported after 2017. Further analysis showed that the high prediction errors are primarily attributable to the underrepresentation of novel chemical scaffolds, particularly Pt(IV) derivatives, in the training sets. For instance, a series of eight phenanthroline-containing compounds, not covered by the training set's chemical space, had an RMSE of 1.3. When the model was redeveloped using a combined dataset, the RMSE for this series decreased to 0.34 under the same validation protocol. Additionally, we developed an interpretable linear model to identify structural features and functional groups that influence the solubility of platinum complexes. We further validated the correlation between solubility and lipophilicity, consistent with the Yalkowsky General Solubility Equation. Building on these insights, we developed a final multitask model that simultaneously predicts solubility and lipophilicity as two endpoints, with RMSE = 0.62 and 0.44, respectively. The data and the final model are available at https://ochem.eu/article/31.
en
dc.format.extent
16 pages
dc.rights.uri
https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject
Platinum Pt(II)/Pt(IV) complexes
en
dc.subject
Water solubility
en
dc.subject
Lipophilicity
en
dc.subject
Consensus model
en
dc.subject
Neural networks
en
dc.subject
Representation learning
en
dc.subject.ddc
500 Natural sciences and mathematics::570 Life sciences; biology::570 Life sciences; biology
dc.title
Online OCHEM multi-task model for solubility and lipophilicity prediction of platinum complexes
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
112890
dcterms.bibliographicCitation.doi
10.1016/j.jinorgbio.2025.112890
dcterms.bibliographicCitation.journaltitle
Journal of Inorganic Biochemistry
dcterms.bibliographicCitation.volume
269
dcterms.bibliographicCitation.url
https://doi.org/10.1016/j.jinorgbio.2025.112890
refubium.affiliation
Biologie, Chemie, Pharmazie
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
1873-3344
refubium.resourceType.provider
WoS-Alert