Abstract
Regular expression matching is a key task (and often the computational bottleneck) in a variety of widely used software tools and applications, for instance, the unix grep and sed commands, scripting languages such as awk and perl, programs for analyzing massive data streams, etc. We show how to solve this ubiquitous task in linear space and $O(nm(\log\log n)/(\log n)^{3/2} + n + m)$ time, where $m$ is the length of the expression and $n$ the length of the string. This is the first improvement for the dominant $O(nm/\log n)$ term in Myers’ $O(nm/\log n + (n + m)\log n)$ bound [JACM 1992]. We also get improved bounds for external memory.
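To make the size of the improvement concrete, the ratio of the dominant term in Myers' bound to the dominant term in the new bound can be worked out directly. This is only arithmetic on the two bounds stated above, not a statement quoted from the paper:

```latex
\[
\frac{nm/\log n}{nm\,(\log\log n)/(\log n)^{3/2}}
  = \frac{(\log n)^{3/2}}{\log n \cdot \log\log n}
  = \frac{\sqrt{\log n}}{\log\log n}.
\]
```

Under this reading, the dominant term shrinks by a factor of $\sqrt{\log n}/\log\log n$, which grows without bound as $n$ increases.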
References
Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: principles, techniques, and tools. Addison-Wesley Longman Publishing Co., Inc., Boston (1986)
Arlazarov, V.L., Dinic, E.A., Kronrod, M.A., Faradzev, I.A.: On economic construction of the transitive closure of a directed graph. Dokl. Acad. Nauk. 194, 487–488 (1970)
Bille, P.: New algorithms for regular expression matching. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 643–654. Springer, Heidelberg (2006)
Bille, P., Farach-Colton, M.: Fast and compact regular expression matching. Theoret. Comput. Sci. 409, 486–496 (2008)
Chan, T.M.: More algorithms for all-pairs shortest paths in weighted graphs. In: Proc. 39th STOC, pp. 590–598 (2007)
Frederickson, G.N.: Ambivalent data structures for dynamic 2-edge-connectivity and k smallest spanning trees. SIAM J. Comput. 26(2), 484–538 (1997); announced at FOCS 1991
Galil, Z.: Open problems in stringology. In: Apostolico, A., Galil, Z. (eds.) Combinatorial problems on words. NATO ASI Series, vol. F12, pp. 1–8 (1985)
Glushkov, V.M.: The abstract theory of automata. Russian Math. Surveys 16(5), 1–53 (1961)
Johnson, T., Muthukrishnan, S., Rozenbaum, I.: Monitoring regular expressions on out-of-order streams. In: Proc. 23rd ICDE, pp. 1315–1319 (2007)
Kernighan, B., Ritchie, D.: The C Programming Language, 2nd edn. Prentice-Hall, Englewood Cliffs (1988) (1st edn., 1978)
Kleene, S.C.: Representation of events in nerve nets and finite automata. Automata Studies, Ann. Math. Stud. 34, 3–41 (1956)
Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proc. 27th VLDB, pp. 361–370 (2001)
Masek, W., Paterson, M.: A faster algorithm for computing string edit distances. J. Comput. System Sci. 20, 18–31 (1980)
McNaughton, R., Yamada, H.: Regular expressions and state graphs for automata. IRE Trans. on Electronic Computers 9(1), 39–47 (1960)
Murata, M.: Extended path expressions of XML. In: Proc. 20th PODS, pp. 126–137 (2001)
Myers, E.W.: A four-russian algorithm for regular expression pattern matching. J. ACM 39(2), 430–448 (1992)
Navarro, G.: NR-grep: a fast and flexible pattern-matching tool. Softw. Pract. Exper. 31(13), 1265–1312 (2001)
Navarro, G., Raffinot, M.: Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching. J. Comp. Biology 10(6), 903–923 (2003)
Navarro, G., Raffinot, M.: New techniques for regular expression searching. Algorithmica 41(2), 89–116 (2004)
Stroustrup, B.: The C++ Programming Language: Special Edition, 3rd edn. Addison-Wesley, Reading (2000) (1st. edn., 1985)
Thompson, K.: Regular expression search algorithm. Comm. ACM 11, 419–422 (1968)
Wall, L.: The Perl Programming Language. Prentice Hall Software Series (1994)
Wu, S., Manber, U.: Agrep – a fast approximate pattern-matching tool. In: Proc. USENIX, pp. 153–162 (1992)
Yu, F., Chen, Z., Diao, Y., Lakshman, T.V., Katz, R.H.: Fast and memory-efficient regular expression matching for deep packet inspection. In: Proc. ANCS, pp. 93–102 (2006)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bille, P., Thorup, M. (2009). Faster Regular Expression Matching. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds) Automata, Languages and Programming. ICALP 2009. Lecture Notes in Computer Science, vol 5555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02927-1_16
DOI: https://doi.org/10.1007/978-3-642-02927-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02926-4
Online ISBN: 978-3-642-02927-1
eBook Packages: Computer Science, Computer Science (R0)