Abstract
Correlated pattern mining has become an increasingly important task in data mining and knowledge discovery. Recently, concise exact representations dedicated to frequent correlated and rare correlated patterns according to the Jaccard measure were presented. In this paper, we offer a new method for inferring new knowledge from these concise representations. We introduce a new generic approach, called Gmjp, that extracts the sets of frequent correlated patterns and of rare correlated patterns, together with their associated concise representations. The inferred knowledge takes the form of association rules, which can be either exact or approximate. We also illustrate the efficiency of our approach on several data sets and show that Jaccard-based classification rules yield very encouraging results.
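To make the vocabulary of the abstract concrete, the following minimal sketch shows one common way to compute the Jaccard measure of an itemset and to separate frequent correlated from rare correlated patterns. It is not the Gmjp algorithm. It assumes the usual definition of the measure as conjunctive support divided by disjunctive support, and assumes that a rule is exact when its confidence equals 1 and approximate otherwise. The function names, the toy transactions and the thresholds `min_supp` and `min_jaccard` are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def conjunctive_support(itemset: FrozenSet[str], db: List[Transaction]) -> int:
    """Number of transactions containing every item of the itemset."""
    return sum(1 for t in db if itemset <= t)


def disjunctive_support(itemset: FrozenSet[str], db: List[Transaction]) -> int:
    """Number of transactions containing at least one item of the itemset."""
    return sum(1 for t in db if itemset & t)


def jaccard(itemset: FrozenSet[str], db: List[Transaction]) -> float:
    """Assumed definition: conjunctive support / disjunctive support."""
    disj = disjunctive_support(itemset, db)
    return conjunctive_support(itemset, db) / disj if disj else 0.0


def classify(itemset: FrozenSet[str], db: List[Transaction],
             min_supp: int, min_jaccard: float) -> str:
    """Label an itemset using two hypothetical thresholds."""
    if jaccard(itemset, db) < min_jaccard:
        return "not correlated"
    if conjunctive_support(itemset, db) >= min_supp:
        return "frequent correlated"
    return "rare correlated"


def rule_kind(premise: FrozenSet[str], conclusion: FrozenSet[str],
              db: List[Transaction]) -> str:
    """Assumed convention: a rule is exact when its confidence equals 1."""
    premise_supp = conjunctive_support(premise, db)
    if premise_supp == 0:
        return "undefined"
    rule_supp = conjunctive_support(premise | conclusion, db)
    return "exact" if rule_supp == premise_supp else "approximate"


# Toy transaction database (illustrative data, not taken from the paper).
DB: List[Transaction] = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("abd"),
    frozenset("cd"),
    frozenset("c"),
]

if __name__ == "__main__":
    for items in ("ab", "cd"):
        pattern = frozenset(items)
        print(items, jaccard(pattern, DB), classify(pattern, DB, 2, 0.2))
    print("a -> b:", rule_kind(frozenset("a"), frozenset("b"), DB))
    print("c -> d:", rule_kind(frozenset("c"), frozenset("d"), DB))
```

On this toy database, the pattern {a, b} occurs in three transactions and never partially, so it is labelled frequent correlated, while {c, d} co-occurs only once and falls below the assumed support threshold, so it is labelled rare correlated.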
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bouasker, S., Ben Yahia, S. (2013). Inferring Knowledge from Concise Representations of Both Frequent and Rare Jaccard Itemsets. In: Decker, H., Lhotská, L., Link, S., Basl, J., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 2013. Lecture Notes in Computer Science, vol 8056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40173-2_12
DOI: https://doi.org/10.1007/978-3-642-40173-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40172-5
Online ISBN: 978-3-642-40173-2
eBook Packages: Computer Science, Computer Science (R0)