dc.contributor.author
Löber, Ulrike
dc.date.accessioned
2019-06-26T11:42:27Z
dc.date.available
2019-06-26T11:42:27Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/24935
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-2695
dc.description.abstract
For hundreds of millions of years, retroviruses have been integrating into the genomes of vertebrates. This thesis contributes to the development of new methods for the retrieval, characterization and comparison of viruses that have integrated into the genome (endogenous retroviruses, or ERVs) and of their integration sites in host genomes. The koala retrovirus is an outstanding study subject since it is currently in transition from an exogenous to an endogenous retrovirus. In the past decades, high-throughput sequencing (HTS) has allowed scientists to investigate genomic data at high coverage and low cost. However, the development of new sequencing technologies has also facilitated the production of vast amounts of data, and the bottleneck has shifted from data production to the analysis of so-called “big data”. Consequently, new algorithms and pipelines need to be established to process biological data. Solutions for automated handling of short-read HTS data exist for many problems and can be improved and extended. Recent improvements in HTS that yield longer sequence fragments have helped solve problems connected to short-read sequencing but have produced new challenges for genomic data processing. In this thesis, I present pipelines to comprehensively profile endogenous retroviruses from short-read HTS data for museum koala samples (ancient DNA) and describe a new method to amplify retroviral integration sites that facilitates long-read HTS. The thesis is divided into five sections. In the first part, I describe the biological problem, the evolution of sequencing technologies, the resulting information technology problems and proposed solutions (chapter 1). In the second chapter, I present a comparison of three different target enrichment techniques to retrieve retroviral integration sites from museum koala samples, together with the computational pipeline I developed for this purpose.
In chapter 3, I describe a method (sonication inverse polymerase chain reaction) for target enrichment of long sequence fragments to exploit the capacities of third-generation sequencing technologies. An analysis pipeline for processing sonication inverse PCR products is established, and the remaining problems resulting from artificial read structures are discussed. In chapter 4, the method described in chapter 3 is used to profile koala retrovirus integrations, and the striking discovery of a new retroviral recombinant in koalas is reported. Finally, I discuss our findings, compare short- and long-read HTS technologies, and outline further applications and remaining computational problems. Overall, this thesis contributes to the automated computational processing of HTS data from target enrichment techniques to profile endogenous retroviruses in host genomes.
en
dc.format.extent
X, 118 pages
dc.rights.uri
http://www.fu-berlin.de/sites/refubium/rechtliches/Nutzungsbedingungen
dc.subject.ddc
500 Natural sciences and mathematics::500 Natural sciences::500 Natural sciences and mathematics
dc.title
Development of Bioinformatic Tools for Retroviral Analysis from High Throughput Sequence Data
dc.contributor.gender
female
dc.contributor.firstReferee
Reinert, Knut
dc.contributor.furtherReferee
Greenwood, Alex
dc.date.accepted
2019-06-18
dc.identifier.urn
urn:nbn:de:kobv:188-refubium-24935-3
refubium.affiliation
Mathematik und Informatik
dcterms.accessRights.dnb
free
dcterms.accessRights.openaire
open access