New approaches for unsupervised transcriptomic data analysis based on Dictionary learning

Rams, Mona

Metadata

dc.contributor.author

Rams, Mona

dc.date.accessioned

2022-11-10T11:38:33Z

dc.date.available

2022-11-10T11:38:33Z

dc.date.issued

2022

dc.identifier.uri

https://refubium.fu-berlin.de/handle/fub188/36752

dc.identifier.uri

http://dx.doi.org/10.17169/refubium-36465

dc.description.abstract

The era of high-throughput data generation enables new access to biomolecular profiles and their exploitation. However, the analysis of such biomolecular data, for example transcriptomic data, suffers from the so-called "curse of dimensionality", which arises when a dataset has significantly more variables than data points. As a consequence, overfitting and unintentional learning of process-independent patterns can occur, which can lead to insignificant results in the application. A common way of counteracting this problem is to apply dimension reduction methods and then analyse the resulting low-dimensional representation, which has a smaller number of variables. In this thesis, two new methods for the analysis of transcriptomic datasets are introduced and evaluated. Our methods are based on the concepts of Dictionary learning, which is an unsupervised dimension reduction approach. Unlike many dimension reduction approaches that are widely applied for transcriptomic data analysis, Dictionary learning does not impose constraints on the components that are to be derived. This allows for great flexibility when adjusting the representation to the data. Further, Dictionary learning belongs to the class of sparse methods. The result of a sparse method is a model with few non-zero coefficients, which is often preferred for its simplicity and ease of interpretation. Sparse methods exploit the fact that the analysed datasets are highly structured. Indeed, transcriptomic data are characteristically highly structured, for example through the connections between genes and pathways. Nonetheless, the application of Dictionary learning in medical data analysis has so far been mainly restricted to image analysis. Another advantage of Dictionary learning is that it is an interpretable approach. Interpretability is a necessity in biomolecular data analysis to gain a holistic understanding of the investigated processes.
Our two new transcriptomic data analysis methods are each designed for one main task: (1) identification of subgroups among samples from mixed populations, and (2) temporal ordering of samples from dynamic datasets, also referred to as "pseudotime estimation". Both methods are evaluated on simulated and real-world data and compared to other methods that are widely applied in transcriptomic data analysis. Our methods achieve high performance and overall outperform the comparison methods.

dc.format.extent

l, 167 pages

dc.language

eng

dc.rights.uri

https://creativecommons.org/licenses/by-nc/4.0/

dc.subject

Dictionary learning

dc.subject

Transcriptomic

dc.subject

Machine learning

dc.subject

Applied Mathematics

dc.subject

Dimension reduction

dc.subject.ddc

500 Natural sciences and mathematics::510 Mathematics::519 Probabilities and applied mathematics

dc.title

New approaches for unsupervised transcriptomic data analysis based on Dictionary learning

dc.type

Dissertation

dcterms.format

Text

dc.contributor.gender

female

dc.contributor.firstReferee

Conrad, Tim

dc.contributor.furtherReferee

Renard, Bernhard

dc.date.accepted

2022-10-25

dc.identifier.urn

urn:nbn:de:kobv:188-refubium-36752-1

refubium.affiliation

Mathematik und Informatik

dcterms.accessRights.dnb

free

dcterms.accessRights.openaire

open access

dcterms.accessRights.proquest

This Item appears in the following Collection(s)

Dissertationen FU

Files in This Item

Thesis_Rams_Mona.pdf

Size: 24.41MB

Format: PDF

Checksum (MD5): 079217b2e18775fd2f75541e25c8b47f

Refubium - Freie Universität Berlin Repository
