Computational approaches have become integral to modern oncology, significantly advancing drug discovery across all stages of development. Among these, computer-aided drug design (CADD) has proven particularly effective in identifying novel molecules targeting key cancer-related proteins in early-stage drug research. This thesis applies CADD techniques to the discovery of new inhibitors for two cancer-related therapeutic targets, human cytochrome P450 19A1 (CYP19A1, also known as aromatase) and kinesin family member C3 (KIFC3), employing computational strategies tailored to the level of prior information available for each target.
Human CYP19A1 is essential for the progression of estrogen receptor-positive (ER+) breast cancer in postmenopausal women. We developed and rigorously validated machine learning models to predict aromatase inhibition using bioactivity data from the ChEMBL and PubChem BioAssay databases. These models enabled a virtual screening campaign that identified promising aromatase inhibitors, which were subsequently tested in an enzymatic assay based on heterologous expression of human CYP19A1 in yeast. Our approach led to the discovery of several active inhibitors with novel chemical scaffolds. Among them, compound 9, a non-covalent aromatase inhibitor containing coumarin and imidazole substructures, showed the highest potency, with an experimentally determined IC50 of 271 ± 51 nM. Interestingly, the two machine learning approaches, random forest and message passing neural network, covered different regions of chemical space in their top-ranked predicted CYP19A1 inhibitors.
Additionally, we investigated a novel cancer-related target, KIFC3, for which no inhibitors were previously known. By identifying potential binding sites through comparison with KIFC1, one of KIFC3's nearest neighbors in the family, and through binding site detection software, we enabled a docking-based prioritization of analogs of KIFC1 inhibitors as potential binders of KIFC3. Promising candidates were selected and experimentally tested, leading to the identification of the first KIFC3 inhibitor, compound 5. Structural optimization of compound 5 via pharmacophore-based and shape-based virtual screening led to the discovery of compound 8, an inhibitor with enhanced activity.
In summary, this dissertation explores computational strategies for the virtual screening of cancer-related targets with varying levels of prior knowledge. Through the integration of machine learning, molecular docking, and experimental validation, we successfully identified chemically novel, highly active aromatase inhibitors and the first known KIFC3 inhibitor.