Abstract
We revisit the problem of finding shortest unique substring (SUS) proposed recently by Pei et al. (ICDE’13). We propose an optimal O(n) time and space algorithm that can find an SUS for every location of a string of size n and thus significantly improve their O(n 2) time complexity. Our method also supports finding all the SUSes covering every location, whereas theirs can find only one SUS for every location. Further, our solution is simpler and easier to implement and can also be more space efficient in practice, since we only use the inverse suffix array and the longest common prefix array of the string, while their algorithm uses the suffix tree of the string and other auxiliary data structures. Our theoretical results are validated by an empirical study that shows our method is much faster and more space-saving.
Part of the work was done while the first and the third authors were with TÜBİTAK-BİLGEM-UEKAE Mathematical and Computational Sciences Labs in 2013 summer. All missed proofs and pseudocode can be found in the full version of this paper [2].
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Crochemore, M., Rytter, W.: Jewels of stringology. World Scientific (2003)
İleri, A.M., Külekci, M.O., Xu, B.: Shortest unique substring query revisited, http://arxiv.org/abs/1312.2738
Kasai, T., Lee, G.H., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)
Ko, P., Aluru, S.: Space efficient linear time construction of suffix arrays. Journal of Discrete Algorithms 3(2-4), 143–156 (2005)
Pei, J., Wu, W.C.H., Yeh, M.Y.: On shortest unique substring queries. In: Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE), pp. 937–948 (2013)
Tsuruta, K., Inenaga, S., Bannai, H., Takeda, M.: Shortest unique substrings queries in optimal time. In: Geffert, V., Preneel, B., Rovan, B., Štuller, J., Tjoa, A.M. (eds.) SOFSEM 2014. LNCS, vol. 8327, pp. 503–513. Springer, Heidelberg (2014)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
İleri, A.M., Külekci, M.O., Xu, B. (2014). Shortest Unique Substring Query Revisited. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_18
Download citation
DOI: https://doi.org/10.1007/978-3-319-07566-2_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07565-5
Online ISBN: 978-3-319-07566-2
eBook Packages: Computer ScienceComputer Science (R0)