dc.contributor.author
Heller, David
dc.contributor.author
Krestel, Ralf
dc.contributor.author
Ohler, Uwe
dc.contributor.author
Vingron, Martin
dc.contributor.author
Marsico, Annalisa
dc.date.accessioned
2018-06-08T10:32:40Z
dc.date.available
2018-01-16T09:56:20.324Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/20618
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-23919
dc.description.abstract
RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional
regulation and recognize target RNAs via sequence-structure motifs. The extent
to which RNA structure influences protein binding in the presence or absence
of a sequence motif is still poorly understood. Existing RNA motif finders
either take the structure of the RNA only partially into account, or employ
models which are not directly interpretable as sequence-structure motifs. We
developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and
Gibbs sampling which fully captures the relationship between RNA sequence and
secondary structure preference of a given RBP. Compared to previous methods
which output separate logos for sequence and structure, it directly produces a
combined sequence-structure motif when trained on a large set of sequences.
ssHMM’s model is visualized intuitively as a graph and facilitates biological
interpretation. ssHMM can be used to find novel bona fide sequence-structure
motifs of uncharacterized RBPs, such as the one presented here for the YY1
protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers
known RBP motifs from CLIP-Seq data, and scales linearly with the input size.
It is considerably faster than MEMERIS and RNAcontext on large datasets and
on par with GraphProt. It is freely available on GitHub and as a
Docker image.
en
dc.rights.uri
http://creativecommons.org/licenses/by-nc/4.0/
dc.subject
Protein-nucleic acid interaction
dc.subject
Computational Methods
dc.subject.ddc
500 Natural sciences and mathematics::540 Chemistry
dc.title
ssHMM: extracting intuitive sequence-structure motifs from high-throughput
RNA-binding protein data
dc.type
Scientific article
dcterms.bibliographicCitation
Nucleic Acids Research. - 45 (2017), 19, pp. 11004-11018
dcterms.bibliographicCitation.doi
10.1093/nar/gkx756
dcterms.bibliographicCitation.url
http://doi.org/10.1093/nar/gkx756
refubium.affiliation
Biologie, Chemie, Pharmazie
de
refubium.mycore.fudocsId
FUDOCS_document_000000028811
refubium.resourceType.isindependentpub
no
refubium.mycore.derivateId
FUDOCS_derivate_000000009338
dcterms.accessRights.openaire
open access