1 Introduction

The Lempel-Ziv (LZ) factorization [20] of a string has played an important role in data compression for more than 40 years, and it is also the basis of important algorithms on strings, such as the linear-time detection of all maximal repetitions (runs) in a string [16]. Because of its importance in data compression, there is an extensive literature on algorithms that compute the LZ-factorization; [1,2,3,4, 10, 12, 13, 18, 19] is an incomplete list.

A variant of the LZ-factorization is the f-factorization, which played an important role in solving a long-standing open problem: it enabled the development of the first linear-time algorithm for the computation of seeds by Kociumaka et al. [15].

Definition 1

Let \(S=S[0..n-1]\) be a string of length n on an alphabet \(\varSigma \). The f-factorization \(s_1s_2\cdots s_m\) of S can be defined as follows. Given \(s_1s_2\cdots s_{j-1}\), the next factor \(s_j\) is obtained by a case distinction on the character \(c=S[i]\), where \(i = |s_1s_2\cdots s_{j-1}|\):

  (a) if c does not occur in \(s_1s_2\cdots s_{j-1}\) then \(s_j=c\);

  (b) else \(s_j\) is the longest prefix of \(S[i..n-1]\) that is a substring of \(s_1s_2\cdots s_{j-1}\).

The difference from the LZ-factorization is that the factors must be non-overlapping. There are two linear-time algorithms that compute the f-factorization [5, 6]. Both of them compute the \(\mathsf {LPnF}\)-array (defined below), from which the f-factorization can be derived (in case (b), the factor \(s_j\) is the length-\(\mathsf {LPnF}[i]\) prefix of \(S[i..n-1]\)).
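For concreteness, the case distinction of Definition 1 can be rendered as a naive quadratic-time function (our own illustration; the function name and structure are not taken from the cited papers):

```python
def f_factorization_naive(S):
    """Compute the f-factorization of S directly from Definition 1.

    Quadratic-time reference implementation for illustration only; the
    linear-time algorithms [5, 6] proceed via the LPnF-array instead.
    """
    factors = []
    i = 0
    while i < len(S):
        prev = S[:i]                       # s_1 s_2 ... s_{j-1}
        if S[i] not in prev:               # case (a): new character
            factors.append(S[i])
            i += 1
        else:                              # case (b): longest prefix of S[i..n-1]
            l = 1                          # that is a substring of s_1 ... s_{j-1}
            while i + l < len(S) and S[i:i + l + 1] in prev:
                l += 1
            factors.append(S[i:i + l])
            i += l
    return factors
```

For example, `f_factorization_naive("abaab")` yields the factors `['a', 'b', 'a', 'ab']`: the final factor is `ab` because it occurs in the non-overlapping prefix `aba`.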

Fig. 1. The \(\mathsf {LPnF}\), \(\mathsf {LPF}\), and \(\mathsf {prevOcc}\) arrays of the string \(S=aaaaaaaaaaaaaaaa\).

Definition 2

For a string S of length n, the longest previous non-overlapping factor \((\mathsf {LPnF})\) array of size n is defined for \(0\le i < n\) by

$$ \mathsf {LPnF}[i] = \max \{\ell \mid 0\le \ell \le n-i; S[i..i+\ell -1] \text{ is a substring of } S[0..i-1]\} $$

In the following, we will give a simple algorithm that directly bases the computation of the \(\mathsf {LPnF}\)-array on the \(\mathsf {LPF}\)-array, which is used in several algorithms that compute the LZ-factorization. The \(\mathsf {LPF}\)-array is defined for \(0\le i < n\) by

$$ \mathsf {LPF}[i] = \max \{\ell \mid 0\le \ell \le n-i; S[i..i+\ell -1] \text{ is a substring of } S[0..i+\ell -2]\} $$

In data compression, we are not only interested in the length of the longest previous factor but also in a previous position at which it occurred (because otherwise decompression would be impossible). For an \(\mathsf {LPF}\)-array, the positions of previous occurrences are stored in an array \(\mathsf {prevOcc}\). If \(\mathsf {LPF}[i] = 0\), we set \(\mathsf {prevOcc}[i] = \bot \) (for decompression, one can use the definition \(\mathsf {prevOcc}[i] = S[i]\)). Figure 1 depicts the \(\mathsf {LPF}\)-array of \(S=a^{16}\) and two of many possible instances of the \(\mathsf {prevOcc}\)-array: one that stores the rightmost (rm) positions of occurrences of longest previous factors and one that stores the leftmost (lm) positions.
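The two definitions can be checked against a brute-force computation (our own quadratic illustration, not one of the cited algorithms):

```python
def lpf_lpnf_naive(S):
    """Brute-force LPF and LPnF arrays, directly from the definitions.

    LPF[i]:  longest prefix of S[i..n-1] occurring in S[0..i+l-2] (overlap allowed).
    LPnF[i]: longest prefix of S[i..n-1] occurring in S[0..i-1] (no overlap).
    """
    n = len(S)
    LPF, LPnF = [0] * n, [0] * n
    for i in range(n):
        for l in range(1, n - i + 1):
            if S[i:i + l] in S[0:i + l - 1]:   # previous occurrence may overlap i
                LPF[i] = l
            if S[i:i + l] in S[0:i]:           # previous occurrence ends before i
                LPnF[i] = l
    return LPF, LPnF
```

For \(S=a^{16}\) this reproduces Fig. 1: \(\mathsf {LPF}[i] = 16-i\) and \(\mathsf {LPnF}[i] = \min \{i, 16-i\}\) for \(i>0\).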

2 Computing \(\mathsf {LPnF}\) from \(\mathsf {LPF}\)

Algorithm 1 computes the \(\mathsf {LPnF}\)-array by a right-to-left scan of the \(\mathsf {LPF}\)-array and its \(\mathsf {prevOcc}\)-array. The computation of an entry \(\ell =\mathsf {LPnF}[i]\) is solely based on entries \(\mathsf {LPF}[j]\) and \(\mathsf {prevOcc}[j]\) with \(j\le i\). Consequently, after the calculation of \(\ell \), it can be stored in \(\mathsf {LPF}[i]\). Since Algorithm 1 overwrites the \(\mathsf {LPF}\)-array with the \(\mathsf {LPnF}\)-array (and the \(\mathsf {prevOcc}\)-array of \(\mathsf {LPF}\) with the \(\mathsf {prevOcc}\)-array of \(\mathsf {LPnF}\)), no extra space is needed. Algorithm 1 is based on the following simple idea:

  1. If the factor starting at position i and its previous occurrence starting at position \(j=\mathsf {prevOcc}[i]\) do not overlap, then clearly \(\mathsf {LPnF}[i] = \mathsf {LPF}[i]\).

  2. Otherwise, the length of the (currently best) previous non-overlapping factor is \(\ell =i-j\). A longer previous non-overlapping factor exists if \(\mathsf {LPF}[j] > \ell \) (note that \(\mathsf {LPF}[i] > \ell \) holds): the prefix of \(S[i..n-1]\) of length \(\min \{\mathsf {LPF}[i],\mathsf {LPF}[j]\}\) also occurs at position \(\mathsf {prevOcc}[j]\), and even if the two occurrences (starting at i and \(\mathsf {prevOcc}[j]\)) overlap, their non-overlapping part must be greater than \(\ell \) because \(\mathsf {prevOcc}[j]<j\).

  3. Step 2 is repeated until there is no further candidate (condition of the while-loop in line 7) or the two occurrences under consideration do not overlap (line 11 of Algorithm 1).

Algorithm 1 (listing omitted)
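The three steps above can be sketched in Python as follows. This is our reconstruction from the description, so its line numbering does not match the references to lines 7, 11, and 12 of the original listing; `None` plays the role of \(\bot \):

```python
def lpnf_from_lpf(LPF, prevOcc):
    """Overwrite LPF with LPnF (and prevOcc accordingly), in place.

    Reconstruction of Algorithm 1 from the three steps in the text.
    """
    n = len(LPF)
    for i in range(n - 1, -1, -1):          # right-to-left scan
        j = prevOcc[i]
        if LPF[i] > 0 and j + LPF[i] > i:   # step 1 fails: occurrences overlap
            ell = i - j                     # step 2: currently best length
            while LPF[j] > ell:             # a longer candidate may exist
                k, m = prevOcc[j], min(LPF[i], LPF[j])
                if k + m <= i:              # candidates do not overlap: done
                    ell, j = m, k
                    break
                ell, j = i - k, k           # overlap: improved candidate, repeat
            LPF[i], prevOcc[i] = ell, j     # entries right of i are never read again
    return LPF, prevOcc
```

With the rightmost \(\mathsf {prevOcc}\)-array of \(S=aaaa\), i.e., `LPF = [0, 3, 2, 1]` and `prevOcc = [None, 0, 1, 2]`, the call turns `LPF` into the \(\mathsf {LPnF}\)-array `[0, 1, 2, 1]`.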

On the one hand, the example in Fig. 1 shows that Algorithm 1 may have a quadratic run-time if it uses the \(\mathsf {prevOcc}\)-array that stores the rightmost positions of previous occurrences. On the other hand, the next lemma proves that Algorithm 1 has a linear worst-case time complexity if it uses the \(\mathsf {prevOcc}\)-array that stores the leftmost positions of previous occurrences. Its proof is based on the following notion: An integer p with \(0<p\le |\omega |\) is called a period of \(\omega \in \varSigma ^+\) if \(\omega [i]=\omega [i+p]\) for all \(i\in \{0,1,\dots ,|\omega |-p-1\}\).

Lemma 1

If \(\mathsf {prevOcc}\) stores the leftmost positions of previous occurrences, then the else-case on line 12 in Algorithm 1 cannot occur.

Proof

For a proof by contradiction, suppose that the else-case on line 12 in Algorithm 1 occurs for some i. We have \(\mathsf {LPF}[i]>0\), \(j=\mathsf {prevOcc}[i]\) is the leftmost occurrence of the longest previous factor \(\omega _i\) starting at i, and \(j+ \mathsf {LPF}[i] > i\). Suppose \(\mathsf {LPF}[j] > i-j\), i.e., the while-loop is executed. Let \(m = \min \{\mathsf {LPF}[i],\mathsf {LPF}[j]\}\) and \(k=\mathsf {prevOcc}[j]\). If \(m = \mathsf {LPF}[i]\), then it would follow that an occurrence of \(\omega _i\) starts at k. This contradicts the fact that j is the leftmost occurrence of \(\omega _i\). Consequently, \(m = \mathsf {LPF}[j] < \mathsf {LPF}[i]\). The else-case on line 12 occurs when \(k+ m > i\). This implies \(k+ m > j\) because \(i > j\). Let \(\omega _j\) be the longest previous factor starting at j. Let \(a=S[k+m]\) (the character following the occurrence of \(\omega _j\) starting at k) and \(b=S[j+m]\) (the character following the occurrence of \(\omega _j\) starting at j); see Fig. 2. By definition, \(a\ne b\). We will derive a contradiction by showing that \(a=b\) must hold in the else-case on line 12.

Since \(k+ m > j\), the occurrence of \(\omega _j\) starting at k overlaps with the occurrence of \(\omega _j\) starting at j. Let u be the non-overlapping part of the occurrence of \(\omega _j\) starting at k, i.e., \(u = S[k..j-1]\). Because the occurrence of \(\omega _j\) starting at j has u as a prefix and overlaps with the occurrence of \(\omega _j\) starting at k, it follows that |u| is a period of \(\omega _j\); see Fig. 2. By a similar reasoning, one can see that |v| is a period of \(\omega _i\), where \(v = S[j..i-1]\). Since \(\omega _j\) is a length-m prefix of \(S[j..n-1]\) and \(\omega _i\) is a length-\(\mathsf {LPF}[i]\) prefix of \(S[j..n-1]\), where \(m = \mathsf {LPF}[j] < \mathsf {LPF}[i]\), it follows that \(\omega _j\) is a prefix of \(\omega _i\). Hence |v| is also a period of \(\omega _j\). In summary, both |u| and |v| are periods of \(\omega _j\). Fine and Wilf’s theorem [8] states that if \(|\omega _j| \ge |u| + |v| - \gcd (|u|,|v|)\), then the greatest common divisor \(\gcd (|u|,|v|)\) of |u| and |v| is also a period of \(\omega _j\). Since \(k+m > i\) implies \(m = |\omega _j| > (i-j) + (j-k) = |v| + |u|\), the theorem is applicable. Let \(\gamma \) be the length-\(\gcd (|u|,|v|)\) prefix of \(\omega _j\). It follows that \(v=\gamma ^q\) for some integer \(q>0\); hence \(|\gamma |\) is a period of \(\omega _i\). Recall that \(a=S[k+m]\) is the character \(\omega _j[m-|u|] = \omega _i[m-|u|]\) and \(b=S[j+m] = \omega _i[m]\). We derive \(a = \omega _i[m-|u|] = \omega _i[m] = b\) because \(|\gamma |\) is a period of \(\omega _i\) and |u| is a multiple of \(|\gamma |\). This contradiction proves the lemma.

Fig. 2. Proof of Lemma 1: i, j, and k are positions, while a and b are characters.

To the best of our knowledge, Abouelhoda et al. [1] were the first to compute the LZ-factorization based on the suffix array (and the \(\mathsf {LCP}\)-array) of S. Their algorithm computes the \(\mathsf {LPF}\)-array and the \(\mathsf {prevOcc}\)-array that stores leftmost positions of previous occurrences of longest factors. So the combination of their algorithm and Algorithm 1 gives a linear-time algorithm that computes the \(\mathsf {LPnF}\)-array. Subsequent work (e.g. [2,3,4, 12, 13, 18]) concentrated on LZ-factorization algorithms that are faster in practice or more space-efficient (or both). Some of them also first compute the arrays \(\mathsf {LPF}\) and \(\mathsf {prevOcc}\), but their \(\mathsf {prevOcc}\)-arrays store neither leftmost nor rightmost occurrences (in fact, these algorithms are faster because they use lexicographically nearby suffixes, a local property, whereas being the leftmost occurrence is a global property). However, leftmost occurrences can easily be obtained by Algorithm 2, which is based on the following simple observation: if \(\mathsf {LPF}[i]>0\), \(j=\mathsf {prevOcc}[i]\), and \(\mathsf {LPF}[j]\ge \mathsf {LPF}[i]\), then \(\mathsf {prevOcc}[j]\) is also the starting position of an occurrence of the factor starting at i. Since \(\mathsf {prevOcc}[j] < j\), an occurrence left of j has been found. The while-loop in Algorithm 2 repeats this procedure until the leftmost occurrence is found. Note that the algorithm overwrites the \(\mathsf {prevOcc}\)-array. Consequently, if its for-loop is executed for i, then for every \(0\le j < i\), \(\mathsf {prevOcc}[j]\) stores a leftmost position.

The next example shows that Algorithm 2 is not linear in the worst case. Consider the string \(S=a^1\#_1a^2\#_2a^3\#_3a^4\#_4\dots a^m\#_m\), where \(m > 0\) and the \(\#_k\) are pairwise distinct separator symbols. Clearly, the length of S is \(n = m + \sum ^m_{k=1} k = m + m(m+1)/2 = m(m+3)/2\). If Algorithm 2 is applied to the arrays \(\mathsf {LPF}\) and \(\mathsf {prevOcc}\) in Fig. 3, it computes the leftmost (lm) \(\mathsf {prevOcc}\)-array, and the number of iterations of its while-loop (last row in Fig. 3) is \(\sum ^{m-1}_{j=1} \sum ^j_{k=1} k = (\sum ^{m-1}_{j=1} j^2 + \sum ^{m-1}_{j=1} j)/2 = (m-1)m(m+1)/6\).

Algorithm 2 (listing omitted)
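The observation above translates into the following sketch (our reconstruction of Algorithm 2, not the original listing). Because \(\mathsf {prevOcc}\) is rewritten in place and i increases, every \(\mathsf {prevOcc}[j]\) read inside the while-loop already holds a leftmost position:

```python
def leftmost_prevocc(LPF, prevOcc):
    """Rewrite prevOcc in place so that it stores leftmost previous occurrences."""
    for i in range(len(LPF)):
        if LPF[i] > 0:
            j = prevOcc[i]
            while LPF[j] >= LPF[i]:   # the factor at i also occurs at prevOcc[j],
                j = prevOcc[j]        # which lies strictly left of j
            prevOcc[i] = j            # LPF[j] < LPF[i]: no occurrence left of j
    return prevOcc
```

On the rightmost \(\mathsf {prevOcc}\)-array of \(S=aaaa\), `leftmost_prevocc([0, 3, 2, 1], [None, 0, 1, 2])` returns `[None, 0, 0, 0]`.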
Fig. 3. \(\mathsf {LPF}\) and \(\mathsf {prevOcc}\) arrays of the string \(S=a^1\#_1a^2\#_2a^3\#_3a^4\#_4\).

3 Direct Computation of the f-Factorization

Algorithm 3 computes the f-factorization of S based on backward search on \(T=S^{rev}\) and range maximum queries (\(\mathsf {RMQ}\)s) on the suffix array of T. It uses ideas of [2, Algorithm CPS2] and [18, Algorithm LZ_bwd]. In fact, Algorithm 3 computes the right-to-left f-factorization of the reverse string \(S^{rev}\) of S. It is not difficult to see that \(s_1s_2\cdots s_m\) is the (left-to-right) f-factorization of S if and only if \(s^{rev}_m\cdots s^{rev}_2 s^{rev}_1\) is the right-to-left f-factorization of \(S^{rev}\). In this section, we assume a basic knowledge of suffix arrays (\(\mathsf {SA}\)), the Burrows-Wheeler transform (\(\mathsf {BWT}\)), and wavelet trees; see e.g. [7, 18]. Given a substring \(\omega \) of T, there is a suffix array interval [sp..ep], called the \(\omega \)-interval, such that \(\omega \) is a prefix of the suffix \(T[\mathsf {SA}[k]..n]\) if and only if \(sp\le k \le ep\). For a character c, the \(c\omega \)-interval can be computed by one backward search step backwardSearch(c, [sp..ep]); this takes \(O(\log |\varSigma |)\) time if backward search is based on the wavelet tree of the \(\mathsf {BWT}\) of T. A linear-time preprocessing is sufficient to obtain a space-efficient data structure that supports \(\mathsf {RMQ}\)s in constant time; see [9] and the references therein. \(\mathsf {RMQ}(sp,ep)\) returns the index of the maximum value among \(\mathsf {SA}[sp],\mathsf {SA}[sp+1],\dots ,\mathsf {SA}[ep]\); hence \(\mathsf {SA}[\mathsf {RMQ}(sp,ep)]\) is the maximum of these \(\mathsf {SA}\)-values.

Suppose Algorithm 3 has already computed \(s^{rev}_{j-1}\cdots s^{rev}_2 s^{rev}_1\) and let \(i=n - (|s^{rev}_{j-1}\cdots s^{rev}_2 s^{rev}_1|+1)\). It computes the next factor \(s^{rev}_j\) as follows. First, it stores the starting position i in a variable pos. In line 6, backwardSearch(T[i], [0..n]) returns the c-interval [sp..ep], where \(c=T[i]\). In line 7, the maximum max of \(\mathsf {SA}[sp],\mathsf {SA}[sp+1],\dots ,\mathsf {SA}[ep]\) is determined. If \(max = pos\) (\(max < pos\) is impossible because \(c=T[pos]\)), then there is no occurrence of c in \(T[pos+1..n]\), so that \(s^{rev}_j = c\) (the algorithm outputs 0, meaning that the next factor is the next character). Otherwise, there is a previous occurrence of c at position \(max > pos\) and the process is iterated, i.e., i is decremented by one and the new T[i..pos]-interval is computed, etc. Consider an iteration of the repeat-loop, where [sp..ep] is the T[i..pos]-interval for some \(i<pos\). The repeat-loop must be terminated early (line 9) if \(max \le pos\) because then the rightmost occurrence of T[i..pos] starts left of \(pos+1\); in other words, T[i..pos] is not a substring of \(T[pos+1..n]\). Since the repeat-loop did not terminate in the previous iteration, \(T[i+1..pos]\) is a substring of \(T[pos+1..n]\) that has a previous occurrence at position \(m > pos\), where m is the maximum \(\mathsf {SA}\)-value of the previous iteration. So \(s^{rev}_j = T[i+1..pos]\) and the algorithm outputs its length \(|s^{rev}_j| = pos - (i+1) +1 = pos - i\), which coincides with \(|s_j|\). Note that the algorithm can easily be extended so that it also computes positions of previous occurrences. Algorithm 3 has run-time \(O(n\log |\varSigma |)\) because one backward search step takes \(O(\log |\varSigma |)\) time.
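The control flow just described can be sketched as follows. In this sketch (our own, under simplifying assumptions), a plainly sorted suffix array with binary search and a linear maximum over the interval stand in for the wavelet-tree backward search and the constant-time RMQ structure, so the sketch does not achieve the \(O(n\log |\varSigma |)\) bound:

```python
def f_factor_lengths(S):
    """Factor lengths of the f-factorization of S (0 = single new character),
    obtained via the right-to-left factorization of T = reverse(S)."""
    T = S[::-1]
    n = len(T)
    SA = sorted(range(n), key=lambda k: T[k:])   # naive suffix array of T

    def sa_interval(pat):
        # [sp..ep] of suffixes having pat as a prefix (binary search on SA)
        lo, hi = 0, n
        while lo < hi:                           # leftmost suffix >= pat
            mid = (lo + hi) // 2
            if T[SA[mid]:SA[mid] + len(pat)] < pat:
                lo = mid + 1
            else:
                hi = mid
        sp, hi = lo, n
        while lo < hi:                           # leftmost suffix > pat
            mid = (lo + hi) // 2
            if T[SA[mid]:SA[mid] + len(pat)] <= pat:
                lo = mid + 1
            else:
                hi = mid
        return sp, lo - 1

    out, pos = [], n - 1
    while pos >= 0:                  # the next factor ends at position pos of T
        i = pos
        while True:
            sp, ep = sa_interval(T[i:pos + 1])
            mx = max(SA[sp:ep + 1])  # stands in for SA[RMQ(sp, ep)]
            if mx <= pos:            # T[i..pos] does not occur right of pos
                break
            if i == 0:               # matched all the way to the start of T
                i = -1
                break
            i -= 1                   # one more backward search step
        length = pos - i             # the factor is T[i+1..pos]; 0 = fresh char
        out.append(length)
        pos -= max(1, length)
    return out
```

For \(S=abab\) the output is `[0, 0, 2]`, matching the f-factorization a, b, ab.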

Algorithm 3 (listing omitted)

Kolpakov and Kucherov [17] used the reversed f-factorization (they call it the reversed LZ-factorization) to search for gapped palindromes. The reversed f-factorization is defined by replacing case (b) in Definition 1 with: (b) else \(s_j\) is the longest prefix of \(S[i..n-1]\) that is a substring of \((s_1s_2\cdots s_{j-1})^{rev}\). It is not difficult to see that Algorithm 3 can be modified in such a way that it computes the reversed f-factorization of S in \(O(n \log |\varSigma |)\) time (to find the next factor \(s_j\), match prefixes of \(S[i..n-1]\) against \(T=S^{rev}\)).
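The modified case (b) can again be rendered naively (our quadratic illustration, matching prefixes against the reverse of the already-factorized prefix):

```python
def reversed_f_factorization(S):
    """Reversed f-factorization: case (b) matches against (s_1 ... s_{j-1})^rev."""
    factors, i = [], 0
    while i < len(S):
        prev_rev = S[:i][::-1]             # (s_1 s_2 ... s_{j-1})^rev
        if S[i] not in prev_rev:           # case (a): new character
            factors.append(S[i])
            i += 1
        else:                              # case (b): longest prefix of S[i..n-1]
            l = 1                          # occurring in the reversed prefix
            while i + l < len(S) and S[i:i + l + 1] in prev_rev:
                l += 1
            factors.append(S[i:i + l])
            i += l
    return factors
```

For example, `reversed_f_factorization("abbab")` yields `['a', 'b', 'ba', 'b']`: the factor `ba` is a substring of \((ab)^{rev} = ba\).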

4 Experimental Results

Our implementation is based on the sdsl-lite library [11], and we experimentally compared it with the \(\mathsf {LPnF}\) construction algorithm of Crochemore and Tischler [6], called the \(\texttt {CT}\)-algorithm henceforth. Another \(\mathsf {LPnF}\) construction algorithm is described in [5], but we could not find an implementation (this algorithm is most likely slower than the \(\texttt {CT}\)-algorithm because it uses two kinds of range minimum queries, one on the suffix array and one on the \(\mathsf {LCP}\)-array, and range minimum queries are slow; see below). The experiments were conducted on a 64-bit Ubuntu 16.04.4 LTS system equipped with two 16-core Intel Xeon E5-2698v3 processors and 256 GB of RAM. All programs were compiled with the -O3 option using g++ (version 5.4.1). Our programs are publicly available. The test data (the files dblp.xml, dna, english, and proteins) originate from the Pizza & Chili corpus. In our first experiment, we computed the \(\mathsf {LPnF}\)-array from the \(\mathsf {LPF}\)-array. Three algorithms that compute the \(\mathsf {LPF}\)-array were considered:

  • AKO: algorithm by Abouelhoda et al. [1]

  • LZ_OG: algorithm by Ohlebusch and Gog [18]

  • KKP3: algorithm by Kärkkäinen et al. [14]

It is known that AKO is slower than the others, but in contrast to the other algorithms it calculates leftmost \(\mathsf {prevOcc}\)-arrays. Thus, there was a slight chance that AKO in combination with Algorithm 1 would be faster than LZ_OG or KKP3 in combination with Algorithm 1. However, our experiments showed that this is not the case. AKO is missing in Fig. 4 because the differences between the run-times of the other algorithms become more apparent without it. For the same reason, we did not take the suffix array construction time into account (note that each of the algorithms needs the suffix array). To find out whether or not it is advantageous to compute a leftmost \(\mathsf {prevOcc}\)-array by Algorithm 2 before Algorithm 1 is applied, we also considered the combinations of LZ_OG and KKP3 with both algorithms. Figures 4 and 5 show the results of the first experiment. As one can see in Fig. 4, for real-world data it seems disadvantageous to apply Algorithm 2 before Algorithm 1 because the overall run-time becomes slightly worse. However, for ‘problematic’ strings such as \(a^n\) and \(a^nb\) it is advisable to use Algorithm 2: with it, both LZ_OG and KKP3 outperformed the CT-algorithm (data not shown), but without it, neither terminated within 20 minutes. All in all, KKP3 in combination with Algorithms 1 and 2 is the best choice for the construction of the \(\mathsf {LPnF}\)-array. In particular, it clearly outperforms the CT-algorithm in terms of run-time and memory usage.

Fig. 4. Run-time comparison of \(\mathsf {LPnF}\)-array construction (without suffix array construction, which on average takes 50% of the overall run-time).

Fig. 5. Peak memory comparison of \(\mathsf {LPnF}\)-array construction (with suffix array construction).

In the second experiment, we compared Algorithm 3 (the only algorithm that computes the f-factorization directly) with the other algorithms (which first compute the \(\mathsf {LPnF}\)-array and then derive the f-factorization from it). Algorithm 3 uses only 44% of the memory required by KKP3, but its run-time is an order of magnitude worse (data not shown). We attribute the rather poor run-time to the range maximum queries, which are slow in practice.