dc.contributor.author
Hauschild, Anne-Christin
dc.contributor.author
Frisch, Tobias
dc.contributor.author
Baumbach, Jörg Ingo
dc.contributor.author
Baumbach, Jan
dc.date.accessioned
2018-06-08T10:24:40Z
dc.date.available
2018-02-23T11:08:04.271Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/20385
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-23688
dc.description.abstract
Computational breath analysis is a growing research area that aims to identify
volatile organic compounds (VOCs) in human breath to support next-generation
medical diagnostics. While inexpensive and non-invasive bioanalytical
technologies for metabolite detection in exhaled air and bacterial/fungal
vapor exist, and first studies on the power of supervised machine learning
methods for profiling the resulting data have been conducted, we lack methods
to extract hidden data features emerging from confounding factors. Here, we
present Carotta, a new cluster analysis framework dedicated
to uncovering such hidden substructures by sophisticated unsupervised
statistical learning methods. We study the power of transitivity clustering
and hierarchical clustering to identify groups of VOCs with similar expression
behavior over most patient breath samples and/or groups of patients with a
similar VOC intensity pattern. This enables the discovery of dependencies
between metabolites. On the one hand, this allows us to eliminate the effect
of potential confounding factors hindering disease classification, such as
smoking. On the other hand, we may also identify VOCs associated with disease
subtypes or concomitant diseases. Carotta is open-source software with an
intuitive graphical user interface that supports data handling, analysis and
visualization. The back-end is designed to be modular, allowing for easy
extensions with plugins in the future, such as new clustering methods and
statistics. Carotta requires little prior knowledge or technical skill to
operate. We demonstrate its power and applicability on one artificial dataset
and, as an example, on a real-world dataset on chronic obstructive pulmonary
disease (COPD). While the artificial data serve as a proof of concept, we
demonstrate how Carotta finds candidate markers in our real dataset
associated with confounders rather than the primary disease (COPD) and
bronchial carcinoma (BC). Carotta is publicly
available at http://carotta.compbio.sdu.dk [1].
en
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.subject
multicapillary column/ion mobility spectrometry
dc.subject
breath analysis
dc.subject.ddc
600 Technology, medicine, applied sciences::610 Medicine and health
dc.title
Carotta: Revealing Hidden Confounder Markers in Metabolic Breath Profiles
dc.type
Scientific article
dcterms.bibliographicCitation
Metabolites. - 5 (2015), 2, pp. 344-363
dcterms.bibliographicCitation.doi
10.3390/metabo5020344
dcterms.bibliographicCitation.url
http://www.mdpi.com/2218-1989/5/2/344
refubium.affiliation
Mathematik und Informatik
de
refubium.mycore.fudocsId
FUDOCS_document_000000029108
refubium.note.author
The article was published in a pure open-access journal.
refubium.resourceType.isindependentpub
no
refubium.mycore.derivateId
FUDOCS_derivate_000000009458
dcterms.accessRights.openaire
open access