dc.contributor.author
Hombach, Daniela
dc.contributor.author
Schwarz, Jana Marie
dc.contributor.author
Robinson, Peter N.
dc.contributor.author
Schuelke, Markus
dc.contributor.author
Seelow, Dominik
dc.date.accessioned
2018-06-08T03:19:47Z
dc.date.available
2016-07-12T11:34:27.572Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/14943
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-19131
dc.description.abstract
Background: The modelling of gene regulation is a major challenge in biomedical
research. This process is dominated by transcription factors (TFs), and
mutations in their binding sites (TFBSs) may cause the misregulation of genes,
eventually leading to disease. The consequences of DNA variants on TF binding
are modelled in silico using binding matrices, but it remains unclear whether
these are capable of accurately representing in vivo binding. In this study,
we present a systematic comparison of binding models for 82 human TFs from
three freely available sources: JASPAR matrices, HT-SELEX-generated models and
matrices derived from protein binding microarrays (PBMs). We determined their
ability to detect experimentally verified “real” in vivo TFBSs derived from
ENCODE ChIP-seq data. As negative controls we chose random downstream exonic
sequences, which are unlikely to harbour TFBSs. All models were assessed by
receiver operating characteristic (ROC) analysis. Results: While the area
under the curve was low for most of the tested models, with only 47 % reaching
a score of 0.7 or higher, we noticed strong differences between the various
position-specific scoring matrices, with JASPAR and HT-SELEX models showing
higher success rates than PBM-derived models. In addition, we found that while
TFBS sequences showed a higher degree of conservation than randomly chosen
sequences, there was high variability between individual TFBSs. Conclusions:
Our results show that only a few of the matrix-based models used to predict
potential TFBSs are able to reliably detect experimentally confirmed TFBSs. We
compiled our findings in a freely accessible web application called ePOSSUM
(http://mutationtaster.charite.de/ePOSSUM/), which uses a Bayes classifier to
assess the impact of genetic alterations on TF binding in user-defined
sequences. Additionally, ePOSSUM provides information on the reliability of
the prediction using our test set of experimentally confirmed binding sites.
en
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.subject
Transcription factor binding sites
dc.subject
TFBS prediction
dc.subject
Genetic variation
dc.subject.ddc
600 Technology, medicine, applied sciences::610 Medicine and health
dc.title
A systematic, large-scale comparison of transcription factor binding site
models
dc.type
Scientific article
dcterms.bibliographicCitation
BMC Genomics. - 17 (2016), Article No. 388
dcterms.bibliographicCitation.doi
10.1186/s12864-016-2729-8
dcterms.bibliographicCitation.url
http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2729-8
refubium.affiliation
Charité - Universitätsmedizin Berlin
de
refubium.mycore.fudocsId
FUDOCS_document_000000024987
refubium.note.author
The article was published in an open-access journal.
refubium.resourceType.isindependentpub
no
refubium.mycore.derivateId
FUDOCS_derivate_000000006761
dcterms.accessRights.openaire
open access