dc.contributor.author
Gottlieb, Simon Gene
dc.contributor.author
Reinert, Knut
dc.date.accessioned
2025-12-15T06:16:33Z
dc.date.available
2025-12-15T06:16:33Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/50834
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-50561
dc.description.abstract
Adding rank support to strings over a fixed-sized alphabet has numerous applications. Prominent among those is the (bidirectional) FM-Index which is commonly utilized to index and analyze genomic data. At its core lies the rank operation on the Burrows-Wheeler-Transform (BWT) which, given a position in the BWT and a character, answers how often the specified character appears from the start to that position. Implementing those rank queries is usually based on bit vectors with rank support. In this work, we discuss three implementation improvements. First, a novel approach named paired-blocks that reduces the space overhead of the support structure by half to a total of only 1.6%. Second, a method for masking bits for the population count (also known as popcount) which greatly improves the runtime of 512-bit wide blocks in conjunction with AVX512 SIMD extensions. Third, a revised method for EPR-dictionaries (Pockrandt et al. in International conference on research in computational molecular biology. Springer, New York, 2017) called flattened bit vectors (fBV) with less space consumption and faster rank operations on strings, which is competitive in size and depending on the parameters between 2×and 9×faster than Wavelet Trees (Gog et al. in 13th International Symposium on Experimental Algorithms. Springer, New York, 2014).
en
dc.format.extent
17 Seiten
dc.rights
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
Rank support
en
dc.subject.ddc
000 Informatik, Informationswissenschaft, allgemeine Werke::000 Informatik, Wissen, Systeme::004 Datenverarbeitung; Informatik
dc.title
Engineering rank queries on bit vectors and strings
dc.type
Wissenschaftlicher Artikel
dc.date.updated
2025-12-15T02:18:13Z
dcterms.bibliographicCitation.articlenumber
21
dcterms.bibliographicCitation.doi
10.1186/s13015-025-00291-9
dcterms.bibliographicCitation.journaltitle
Algorithms for Molecular Biology
dcterms.bibliographicCitation.number
1
dcterms.bibliographicCitation.volume
20
dcterms.bibliographicCitation.url
https://doi.org/10.1186/s13015-025-00291-9
refubium.affiliation
Informatik & Mathematik
refubium.affiliation.other
Institut für Informatik

refubium.funding
Springer Nature DEAL
refubium.note.author
Gefördert aus Open-Access-Mitteln der Freien Universität Berlin.
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
1748-7188
refubium.resourceType.provider
DeepGreen