Improving the Performance of a Named Entity Recognition System with Knowledge Acquisition

Kim, Myung Hee; Compton, Paul

doi:10.1007/978-3-642-33876-2_11


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7603)

Included in the following conference series:

International Conference on Knowledge Engineering and Knowledge Management


Abstract

Named Entity Recognition (NER) is important for extracting information from highly heterogeneous web documents. Most NER systems have been developed on formal documents, but informal web documents usually contain noise as well as incorrect and incomplete expressions. The performance of current NER systems drops dramatically as the informality of web documents increases, so a different kind of NER is needed. Here we propose a Ripple-Down-Rules-based Named Entity Recognition (RDRNER) system. It is a wrapper around the machine-learning-based Stanford NER system that corrects the system’s output using rules added by people to deal with specific application domains. The key advantages of this approach are that it can handle the freer writing style that occurs in web documents and correct errors introduced by the web’s informal characteristics. In these studies, the Ripple-Down Rules approach, with low-cost rule addition, improved the Stanford NER system’s performance on informal web documents in a specific domain to the same level as its state-of-the-art performance on formal documents.
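The wrapper idea described in the abstract can be illustrated with a small sketch. It is not the authors' RDRNER implementation: the rule structure, the function names (`Rule`, `evaluate`, `base_tagger`, `rdr_ner`), and the toy dictionary tagger standing in for Stanford NER are all assumptions made for illustration. It shows a single-classification ripple-down rule chain in which each rule may carry an exception rule (tried when the rule fires) and an alternative rule (tried when it does not), and the conclusion of the last rule that fired overrides the base tagger's label.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Token = Tuple[str, str]  # (word, tag)


@dataclass
class Rule:
    """One node of a ripple-down rule tree (hypothetical structure)."""
    condition: Callable[[List[Token], int], bool]
    conclusion: str
    except_rule: Optional["Rule"] = None  # tried only when this rule fires
    else_rule: Optional["Rule"] = None    # tried only when this rule does not fire


def evaluate(root: Optional[Rule], tokens: List[Token], i: int, default: str) -> str:
    """Return the conclusion of the last rule that fired, else the default tag."""
    result, node = default, root
    while node is not None:
        if node.condition(tokens, i):
            result = node.conclusion
            node = node.except_rule
        else:
            node = node.else_rule
    return result


# Toy stand-in for a machine-learning tagger (assumption, NOT Stanford NER):
# it only recognises exact, correctly capitalised dictionary entries.
KNOWN = {"Google": "ORGANIZATION", "Sydney": "LOCATION"}


def base_tagger(words: List[str]) -> List[Token]:
    return [(w, KNOWN.get(w, "O")) for w in words]


def rdr_ner(words: List[str], root: Optional[Rule]) -> List[Token]:
    """Wrap the base tagger: keep its tag unless a correction rule fires."""
    tagged = base_tagger(words)
    return [(w, evaluate(root, tagged, i, default=t)) for i, (w, t) in enumerate(tagged)]


# Correction rule added for informal text: a lowercase "google" missed by the
# base tagger is an organisation ...
fix_google = Rule(
    condition=lambda toks, i: toks[i][0].lower() == "google" and toks[i][1] == "O",
    conclusion="ORGANIZATION",
)
# ... except when it is used as a verb, as in "google it".
fix_google.except_rule = Rule(
    condition=lambda toks, i: i + 1 < len(toks) and toks[i + 1][0].lower() == "it",
    conclusion="O",
)

print(rdr_ner(["i", "work", "at", "google"], fix_google))
print(rdr_ner(["just", "google", "it"], fix_google))
```

In this sketch a new correction is added by attaching a rule at the point where the previous evaluation ended, so existing behaviour on earlier cases is left unchanged; that locality is what makes rule addition cheap in the ripple-down style.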


Author information

Authors and Affiliations

The University of New South Wales, Sydney, NSW, Australia
Myung Hee Kim & Paul Compton


Editor information

Editors and Affiliations

Vrije Universiteit, Amsterdam, The Netherlands
Annette ten Teije
Institute of Computer Science and Business Informatics, University of Mannheim, Germany
Johanna Völker & Heiner Stuckenschmidt
Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland
Siegfried Handschuh
Knowledge Media Institute, The Open University, Milton Keynes, UK
Mathieu d’Acquin & Andriy Nikolov
Institut de Recherche en Informatique, Université de Toulouse, 118, route de Narbonne, 31062, Toulouse Cedex 4, France
Nathalie Aussenac-Gilles
Université de Toulouse, France
Nathalie Hernandez

Cite this paper

Kim, M.H., Compton, P. (2012). Improving the Performance of a Named Entity Recognition System with Knowledge Acquisition. In: ten Teije, A., et al. Knowledge Engineering and Knowledge Management. EKAW 2012. Lecture Notes in Computer Science, vol 7603. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33876-2_11

DOI: https://doi.org/10.1007/978-3-642-33876-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33875-5
Online ISBN: 978-3-642-33876-2
eBook Packages: Computer Science, Computer Science (R0)
