dc.contributor.author
Le, Tuan
dc.contributor.author
Winter, Robin
dc.contributor.author
Noé, Frank
dc.contributor.author
Clevert, Djork-Arné
dc.date.accessioned
2020-11-12T13:13:05Z
dc.date.available
2020-11-12T13:13:05Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/28844
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-28593
dc.description.abstract
Protecting molecular structures from disclosure to external parties is of great relevance for industrial and private associations, such as pharmaceutical companies. Within the framework of external collaborations, it is common to exchange datasets by encoding the molecular structures into descriptors. Molecular fingerprints such as the extended-connectivity fingerprints (ECFPs) are frequently used for such an exchange, because they typically perform well on quantitative structure-activity relationship tasks. ECFPs are often considered to be non-invertible due to the way they are computed. In this paper, we present a fast reverse-engineering method to deduce the molecular structure given revealed ECFPs. Our method includes the Neuraldecipher, a neural network model that predicts a compact vector representation of compounds, given ECFPs. We then utilize another pre-trained model to retrieve the molecular structure as a SMILES representation. We demonstrate that our method is able to reconstruct molecular structures to some extent, and that its performance improves when ECFPs with larger fingerprint sizes are revealed. For example, given ECFP count vectors of length 4096, we are able to correctly deduce up to 69% of molecular structures on a validation set (112 K unique samples) with our method.
en
dc.format.extent
12 pages
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
chemical-structures
en
dc.subject.ddc
500 Natural sciences and mathematics::510 Mathematics::510 Mathematics
dc.title
Neuraldecipher - reverse-engineering extended-connectivity fingerprints (ECFPs) to their molecular structures
dc.type
Scientific article
dcterms.bibliographicCitation.doi
10.1039/D0SC03115A
dcterms.bibliographicCitation.journaltitle
Chemical Science
dcterms.bibliographicCitation.number
38
dcterms.bibliographicCitation.pagestart
10378
dcterms.bibliographicCitation.pageend
10389
dcterms.bibliographicCitation.volume
11
dcterms.bibliographicCitation.url
https://doi.org/10.1039/D0SC03115A
refubium.affiliation
Mathematik und Informatik
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
2041-6539
refubium.resourceType.provider
WoS-Alert