Extending the Edit Distance Using Frequencies of Common Characters

Muhammad Fuad, Muhammad Marwan; Marteau, Pierre-François

doi:10.1007/978-3-540-85654-2_18

Muhammad Marwan Muhammad Fuad¹ & Pierre-François Marteau¹

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5181)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

Similarity search in time series has recently attracted many researchers. In this context, reducing the dimensionality of the data is required to scale up the similarity search. Symbolic representation is a promising technique for dimensionality reduction, since it allows researchers to benefit from the wealth of algorithms developed for textual databases. To improve the effectiveness of similarity search, we propose in this paper an extension of the edit distance, which we call the extended edit distance. This new distance is applied to symbolic sequential data objects, and we test it on time series databases in classification task experiments. We also prove that our distance is a metric.
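
The abstract does not give the formula of the extended edit distance, so the following Python sketch is only an illustration of the general idea suggested by the title: the classic unit-cost edit distance (computed with the Wagner-Fischer dynamic program) combined with a term based on the frequencies of characters common to both strings. The function names, the weight `lam`, and the way the two terms are combined are assumptions made for this example and should not be read as the authors' definition.

```python
from collections import Counter


def edit_distance(s: str, t: str) -> int:
    """Classic unit-cost edit distance (Wagner-Fischer dynamic program)."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]  # deleting the first i characters of s
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (cs != ct),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def common_char_count(s: str, t: str) -> int:
    """Sum over characters of the smaller of their two frequencies."""
    freq_s, freq_t = Counter(s), Counter(t)
    return sum(min(freq_s[c], freq_t[c]) for c in freq_s)


def frequency_penalized_distance(s: str, t: str, lam: float = 1.0) -> float:
    """Hypothetical combination (an assumption, not taken from the paper):
    edit distance plus lam times the number of characters not shared."""
    unshared = len(s) + len(t) - 2 * common_char_count(s, t)
    return edit_distance(s, t) + lam * unshared
```

Under this assumed combination, two strings built from the same multiset of symbols (for example "abab" and "baba") incur no frequency penalty and differ only by their edit distance, whereas strings over different symbols are pushed further apart.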

Author information

Authors and Affiliations

VALORIA, Université de Bretagne Sud, BP. 573, 56017, Vannes, France
Muhammad Marwan Muhammad Fuad & Pierre-François Marteau

Editor information

Sourav S. Bhowmick, Josef Küng, Roland Wagner

About this paper

Cite this paper

Muhammad Fuad, M.M., Marteau, P.-F. (2008). Extending the Edit Distance Using Frequencies of Common Characters. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2008. Lecture Notes in Computer Science, vol 5181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85654-2_18

DOI: https://doi.org/10.1007/978-3-540-85654-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85653-5
Online ISBN: 978-3-540-85654-2
eBook Packages: Computer Science, Computer Science (R0)
