dc.contributor.author
Winter, Jan Robin
dc.date.accessioned
2024-04-18T11:00:10Z
dc.date.available
2024-04-18T11:00:10Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/42166
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-41891
dc.description.abstract
Representing molecules in a computer-interpretable way is crucial for applying computational methods to chemistry, and to pharmaceutical drug development in particular. Recently, there has been a surge of interest in using machine learning to predict molecular properties, such as the binding affinity to protein targets of interest, or to generate molecular structures with desirable properties. However, as chemical entities are challenging to represent in an expressive and computer-interpretable way, much work in cheminformatics has concerned itself with defining clever feature extractors that encode the chemical graph structure in a uniform, fixed-size numerical form. Recently, Deep Neural Networks have shown great success in learning to extract meaningful features directly from raw data representations, outperforming hand-crafted feature extraction protocols and revolutionizing fields such as image analysis and natural language processing. Deep Neural Networks have also been applied directly to raw data representations of molecules, such as their structural graphs. However, the capabilities of this approach in pharmaceutical drug development are usually limited by the scarcity of labeled data, since collecting such data typically involves expensive wet-lab experiments. Unsupervised Learning, on the other hand, is a powerful machine learning strategy that enables the training of Deep Neural Networks without the need for labeled training data. In this thesis, we discuss how Unsupervised Learning can be used to train powerful feature extractors on unlabeled chemical structures. We propose novel methods to extract expressive representations from different input representations of molecules, such as line notations, graphs, and point clouds. We show how these representations can be used efficiently as input for downstream molecular property prediction models or to generate novel molecules with desirable properties.
Moreover, we discuss certain symmetries of molecular representations that are crucial to respect (e.g., permutation invariance of molecular graphs, or rotation and translation invariance of molecular conformations) and develop novel methods specifically designed to extract invariant representations.
en
dc.format.extent
142 pages
dc.rights.uri
https://creativecommons.org/licenses/by-nd/4.0/
dc.subject
Drug Development
en
dc.subject
Unsupervised Learning
en
dc.subject
Machine Learning
en
dc.subject
Representation Learning
en
dc.subject
Deep Learning
en
dc.subject.ddc
000 Computer science, information and general works::000 Computer science, knowledge, systems::000 Computer science, information and general works
dc.title
Unsupervised Learning of Molecular Representations for Drug Development
dc.contributor.gender
male
dc.contributor.firstReferee
Noé, Frank
dc.contributor.furtherReferee
Bender, Andreas
dc.contributor.furtherReferee
Clevert, Djork-Arné
dc.date.accepted
2023-08-25
dc.identifier.urn
urn:nbn:de:kobv:188-refubium-42166-7
dc.title.translated
Unüberwachtes Lernen von Molekülrepräsentationen für die Medikamententwicklung
ger
refubium.affiliation
Mathematik und Informatik
dcterms.accessRights.dnb
free
dcterms.accessRights.openaire
open access
dcterms.accessRights.proquest
accept