Abstract
Creating case representations in unsupervised textual case-based reasoning applications is a challenging task because class knowledge is not available to aid selection of discriminatory features or to evaluate alternative system design configurations. Representation is considered as part of the development of a tool, called CAM, which supports an anomaly report processing task for the European Space Agency. Novel feature selection/extraction techniques are created which use word co-occurrence patterns to calculate similarity between words. These are used together with existing techniques to create five different case representations. A new evaluation technique is introduced to compare these representations empirically, without the need for expensive domain-expert analysis. Alignment between the problem and solution space is measured at a local level, and profiles of these local alignments are used to evaluate the competence of the system design.
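The idea of measuring alignment locally and then profiling it can be illustrated with a small sketch. This is not the measure defined in the paper, which the abstract does not specify. It assumes that each case has a numeric problem vector and a numeric solution vector, that cosine similarity is used in both spaces, and that the local alignment of a case is the mean solution similarity to its k nearest neighbours in the problem space. The names `cosine`, `local_alignment`, and `alignment_profile` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an empty vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def local_alignment(problems, solutions, index, k=2):
    """Assumed measure: mean solution similarity between one case and
    its k nearest neighbours in the problem space."""
    others = [j for j in range(len(problems)) if j != index]
    # Rank the other cases by problem-side similarity, most similar first.
    neighbours = sorted(
        others,
        key=lambda j: cosine(problems[index], problems[j]),
        reverse=True,
    )[:k]
    if not neighbours:
        return 0.0
    total = sum(cosine(solutions[index], solutions[j]) for j in neighbours)
    return total / len(neighbours)


def alignment_profile(problems, solutions, k=2):
    """Local alignment of every case, sorted from best to worst aligned."""
    scores = [
        local_alignment(problems, solutions, i, k)
        for i in range(len(problems))
    ]
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    # Toy data: two groups of similar problems with matching solutions.
    problems = [[1, 0], [1, 0.1], [0, 1], [0.1, 1]]
    solutions = [[1, 0], [1, 0], [0, 1], [0, 1]]
    print(alignment_profile(problems, solutions, k=1))
```

Under these assumptions, a representation whose profile stays high across most cases would be one in which similar problems tend to have similar solutions. Profiles computed for alternative representations could then be compared without class labels.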
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Massie, S., Wiratunga, N., Craw, S., Donati, A., Vicari, E. (2007). From Anomaly Reports to Cases. In: Weber, R.O., Richter, M.M. (eds) Case-Based Reasoning Research and Development. ICCBR 2007. Lecture Notes in Computer Science, vol 4626. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74141-1_25
DOI: https://doi.org/10.1007/978-3-540-74141-1_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74138-1
Online ISBN: 978-3-540-74141-1
eBook Packages: Computer Science, Computer Science (R0)