×

Burrows-Wheeler transform and run-length enconding. (English) Zbl 1405.68466

Brlek, Srečko (ed.) et al., Combinatorics on words. 11th international conference, WORDS 2017, Montréal, QC, Canada, September 11–15, 2017. Proceedings. Cham: Springer (ISBN 978-3-319-66395-1/pbk; 978-3-319-66396-8/ebook). Lecture Notes in Computer Science 10432, 228-239 (2017).
Summary: In this paper we study the clustering effect of the Burrows-Wheeler Transform (BWT) from a combinatorial viewpoint. In particular, given a word \(w\) we define the BWT-clustering ratio of \(w\) as the ratio between the number of clusters produced by BWT and the number of the clusters of \(w\). The number of clusters of a word is measured by its run-length encoding. We show that the BWT-clustering ratio ranges in \(]0, 2]\). Moreover, given a rational number \(r\,\in \,]0,2]\), it is possible to find infinitely many words having BWT-clustering ratio equal to \(r\). Finally, we show how the words can be classified according to their BWT-clustering ratio. The behavior of such a parameter is studied for very well-known families of binary words.
For the entire collection see [Zbl 1369.68013].

MSC:

68W32 Algorithms on strings