Phenotype Relevant Network-based Biomarker Discovery Integrating Multiple Omics Data: EMT Network-based Lung Cancer Prognosis Prediction

Shao, Borong

Phenotype Relevant Network-based Biomarker Discovery Integrating Multiple Omics Data

Metadata

dc.contributor.author

Shao, Borong

dc.date.accessioned

2018-07-19T09:39:43Z

dc.date.available

2018-07-19T09:39:43Z

dc.date.issued

2018

dc.identifier.uri

https://refubium.fu-berlin.de/handle/fub188/22487

dc.identifier.uri

http://dx.doi.org/10.17169/refubium-294

dc.description.abstract

Network-based feature selection methods for omics data have been developed in recent years. Their performance gain, however, has been shown to depend on the datasets, networks, and evaluation metrics used, and the reproducibility and robustness of the resulting biomarkers still need to be improved. One of the major challenges in this endeavor is the curse of dimensionality. To mitigate this issue, we proposed the Phenotype Relevant Network-based Feature Selection (PRNFS) framework. By employing a much smaller but phenotype-relevant network, we could avoid irrelevant information and select robust molecular signatures. The advantages of PRNFS were demonstrated by applying it to lung cancer prognosis prediction. Specifically, we constructed epithelial mesenchymal transition (EMT) networks and employed them for feature selection. We mapped multiple types of omics data onto these networks, one type at a time, to select single-omics signatures, and then integrated them into multi-omics signatures. We then introduced a multiplex network-based feature selection method to select multi-omics signatures directly. Both single-omics and multi-omics EMT signatures were evaluated on TCGA data as well as on an independent multi-omics dataset. The results showed that EMT signatures achieved a significant performance gain, although the EMT networks covered less than 2.5% of the original data dimensions. Frequently selected EMT features achieved average AUC values of 0.83 on TCGA data. Applying EMT signatures to the independent dataset stratified the patients into significantly different prognostic groups. Multi-omics features outperformed single-omics features on both the TCGA data and the independent data. Additionally, we tested the performance of several relational and non-relational databases for storing and retrieving omics data. Since biological data have large volume, high velocity, and wide variety, database systems are needed that meet the requirements of integrative omics data analysis. Based on the results, we provided several recommendations for building scalable omics data infrastructures.

dc.format.extent

vi, 186 Seiten

dc.language

eng

dc.rights.uri

http://www.fu-berlin.de/sites/refubium/rechtliches/Nutzungsbedingungen

dc.subject

Feature selection

dc.subject

Data integration

dc.subject

Cancer prognosis

dc.subject

Epithelial Mesenchymal Transition

dc.subject

Multiomics

dc.subject

Multiplex

dc.subject

Survival analysis

dc.subject

NoSQL

dc.subject.ddc

000 Computer science, information, and general works::000 Computer Science, knowledge, systems::000 Computer science, information, and general works

dc.title

Phenotype Relevant Network-based Biomarker Discovery Integrating Multiple Omics Data

dc.type

Dissertation

dcterms.format

Text

dc.contributor.gender

female

dc.contributor.firstReferee

Conrad, Tim

dc.contributor.furtherReferee

Klau, Gunnar

dc.date.accepted

2018-07-09

dc.identifier.urn

urn:nbn:de:kobv:188-refubium-22487-4

dc.title.subtitle

EMT Network-based Lung Cancer Prognosis Prediction

refubium.affiliation

Mathematik und Informatik

dcterms.accessRights.dnb

free

dcterms.accessRights.openaire

open access

This Item appears in the following Collection(s)

Dissertationen FU

Files in This Item

phd_thesis_ShaoBorong.pdf

Size: 7.895MB

Format: PDF

Checksum (MD5): d5991e461b1bc4fc689b9cca4f10fb18

Refubium - Freie Universität Berlin Repository
