Abstract
Efficient processing of complex streaming data presents multiple challenges, especially when combined with intelligent detection of hidden anomalies in real time. We call such systems Stream Anomaly Monitoring Systems (SAMS) and describe the CMU/Dynamix ARGUS system as a new kind of SAMS that detects rare but high-value patterns by combining streaming and historical data. Such patterns may correspond to hidden precursors of terrorist activity or to early indicators of the onset of a dangerous disease, such as a SARS outbreak. Our method starts from an extension of the RETE algorithm for matching streaming data against multiple complex persistent queries, and goes further with transitivity inference, conditional materialization of intermediate results, and related techniques to obtain both accuracy and efficiency. Evaluation results show that it outperforms classical techniques such as a modern DBMS.
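The general idea behind Rete-style matching of a stream against a persistent query can be illustrated with a minimal Python sketch. It is not the ARGUS implementation: the `Transfer` record, its field names, the amount threshold, and the query itself are hypothetical and chosen only for illustration. The sketch shows a selection test followed by a join node that stores the records that passed selection, so that each arriving record is joined only against stored partial results rather than by re-running the whole query over the historical data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical stream record; the fields are assumptions for this sketch."""
    tid: int
    src: str
    dst: str
    amount: float


class IncrementalJoin:
    """Rete-style join node with a memory of earlier qualifying records.

    Hypothetical persistent query: report pairs (a, b) of large transfers
    where b arrives after a and b.src == a.dst, i.e. money that reached an
    account in `a` leaves that account again in `b`.
    """

    def __init__(self, min_amount: float):
        self.min_amount = min_amount
        # Memory of records that passed the selection test, indexed by the
        # join key (destination account), so a new record probes one bucket.
        self.by_dst = defaultdict(list)

    def insert(self, t: Transfer):
        """Process one new record and return only the newly created matches."""
        if t.amount < self.min_amount:  # selection test on the single record
            return []
        # Join the new record against stored earlier records with a.dst == t.src.
        matches = [(a, t) for a in self.by_dst.get(t.src, ())]
        # Store the record afterwards, so it can match later arrivals
        # but never itself.
        self.by_dst[t.dst].append(t)
        return matches


if __name__ == "__main__":
    node = IncrementalJoin(min_amount=10_000)
    stream = [
        Transfer(1, "A", "B", 50_000),
        Transfer(2, "B", "C", 200),      # fails the selection test
        Transfer(3, "B", "C", 48_000),   # joins with transfer 1
    ]
    for t in stream:
        for a, b in node.insert(t):
            print(f"match: transfer {a.tid} -> transfer {b.tid}")
```

The work done per arriving record depends on the size of one memory bucket, not on the total volume of historical data, which is the property that makes stored intermediate results attractive for persistent queries.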
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jin, C., Carbonell, J., Hayes, P. (2005). ARGUS: Rete + DBMS = Efficient Persistent Profile Matching on Large-Volume Data Streams. In: Hacid, MS., Murray, N.V., Raś, Z.W., Tsumoto, S. (eds) Foundations of Intelligent Systems. ISMIS 2005. Lecture Notes in Computer Science, vol 3488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11425274_15
DOI: https://doi.org/10.1007/11425274_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25878-0
Online ISBN: 978-3-540-31949-8
eBook Packages: Computer Science