dc.contributor.author
Umer, Husen M.
dc.contributor.author
Audain, Enrique
dc.contributor.author
Zhu, Yafeng
dc.contributor.author
Pfeuffer, Julianus
dc.contributor.author
Sachsenberg, Timo
dc.contributor.author
Lehtiö, Janne
dc.contributor.author
Branca, Rui
dc.contributor.author
Perez-Riverol, Yasset
dc.date.accessioned
2022-04-25T08:51:03Z
dc.date.available
2022-04-25T08:51:03Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/34810
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-34529
dc.description.abstract
We have implemented the pypgatk package and the pgdb workflow to create proteogenomics databases based on ENSEMBL resources. The tools generate protein sequences from novel protein-coding transcripts by performing a three-frame translation of pseudogenes, lncRNAs and other non-canonical transcripts, such as those produced by alternative splicing events. They also include exonic out-of-frame translation from otherwise canonical protein-coding mRNAs. Moreover, the tools enable the generation of variant protein sequences from multiple sources of genomic variants, including COSMIC, cBioPortal, gnomAD and mutations detected from sequencing of patient samples. pypgatk and pgdb provide multiple functionalities for database handling, including optimized target/decoy generation by the DecoyPyrat algorithm. Finally, we have reanalyzed six public datasets in PRIDE by generating cell-type specific databases for 65 cell lines using the pypgatk and pgdb workflow, revealing a wealth of non-canonical or cryptic peptides amounting to >5% of the total number of peptides identified.
en
dc.format.extent
3 pages
dc.rights.uri
https://creativecommons.org/licenses/by-nc/4.0/
dc.subject
proteogenomics databases
en
dc.subject
non-canonical peptides
en
dc.subject
generation of protein sequences
en
dc.subject.ddc
500 Natural sciences and mathematics::570 Life sciences; biology::570 Life sciences; biology
dc.title
Generation of ENSEMBL-based proteogenomics databases boosts the identification of non-canonical peptides
dc.type
Scientific article
dcterms.bibliographicCitation.doi
10.1093/bioinformatics/btab838
dcterms.bibliographicCitation.journaltitle
Bioinformatics
dcterms.bibliographicCitation.number
5
dcterms.bibliographicCitation.pagestart
1470
dcterms.bibliographicCitation.pageend
1472
dcterms.bibliographicCitation.volume
38
dcterms.bibliographicCitation.url
https://doi.org/10.1093/bioinformatics/btab838
refubium.affiliation
Mathematik und Informatik
refubium.affiliation.other
Institut für Informatik
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
1460-2059
refubium.resourceType.provider
WoS-Alert