dc.contributor.author
Niehus, Sebastian
dc.date.accessioned
2022-07-28T08:52:30Z
dc.date.available
2022-07-28T08:52:30Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/35614
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-35328
dc.description.abstract
In recent years, advances in sequencing technologies have made population-scale sequencing studies possible. These studies sequence and analyze a large set of individuals from one or multiple populations in order to gain insight into underlying genetic structure, similarities, and differences. Collections of genetic variation and possible connections to various diseases are some of the products of this area of research. The potential of population studies is widely considered to be huge, and many more endeavors of this kind are expected in the near future. This opportunity comes with a major challenge: many computational tools used for the analysis of sequencing data were not designed for cohorts of this size and may suffer from limited scalability. It is therefore vital that the computational tools required for the analysis of population-scale data keep up with the quickly growing amounts of data.
This thesis contributes to the field of population-scale genetics through the development and application of a novel approach for structural variant detection. The approach has been designed explicitly with the large amounts of population-scale sequencing data in mind and is capable of jointly analyzing tens of thousands of whole-genome short-read sequencing samples. This joint analysis is driven by a tailored joint likelihood ratio model that integrates information from many genomes. The efficient approach not only saves computational resources but also allows data to be combined across all samples to make sensitive and specific predictions about the presence and genotypes of structural variants within the analyzed population. This thesis demonstrates that the approach, and the computational tool PopDel that implements it, compare favorably to current state-of-the-art structural variant callers that have been used in previous population-scale studies. Extensive benchmarks on simulated and real-world sequencing data show the performance of the presented approach. Further, a first finding of medical relevance that stems directly from the application of PopDel to the genomes of almost 50,000 Icelanders is presented.
This thesis therefore provides a novel tool and new ideas to further push the boundaries of the analysis of massive amounts of next-generation sequencing data and to deepen our understanding of structural variation and its implications for human health.
en
dc.format.extent
XI, 140 pages, xxxviii
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
Structural Variation
en
dc.subject
Variant Calling
en
dc.subject
Population Genomics
en
dc.subject
Bioinformatics
en
dc.subject.ddc
000 Computer science, information, general works::000 Computer science, knowledge, systems::006 Special computer methods
dc.subject.ddc
500 Natural sciences and mathematics::570 Life sciences; biology::576 Genetics and evolution
dc.title
Multi-Sample Approaches and Applications for Structural Variant Detection
dc.contributor.gender
male
dc.contributor.firstReferee
Reinert, Knut
dc.contributor.furtherReferee
Kehr, Birte
dc.date.accepted
2022-07-19
dc.identifier.urn
urn:nbn:de:kobv:188-refubium-35614-6
refubium.affiliation
Mathematik und Informatik
refubium.note.author
BMBF-funded
en
dcterms.accessRights.dnb
free
dcterms.accessRights.openaire
open access
dcterms.accessRights.proquest
accept