Self-training and Co-training Applied to Spanish Named Entity Recognition

Kozareva, Zornitsa; Bonev, Boyan; Montoyo, Andres

doi:10.1007/11579427_78

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3789)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

The paper discusses the use of unlabeled data for Spanish Named Entity Recognition. Two techniques are used: self-training to detect the entities in the text, and co-training to classify the detected entities. We introduce a new co-training algorithm that applies voting techniques to decide which unlabeled examples should be added to the training set at each iteration. We also make a proposal for improving performance on the detected entities, and present a brief comparative study with existing co-training algorithms.
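
The abstract does not spell out the voting rule, so the following is only a minimal sketch of co-training with voting, not the authors' algorithm. It assumes three toy "views" of a detected entity (previous word, entity word, next word), a hypothetical `ViewClassifier` that memorises the most frequent label per feature value, and a hypothetical `min_votes` threshold: an unlabeled example joins the training set only when enough learners agree on a single label. The data is invented for illustration.

```python
from collections import Counter, defaultdict


class ViewClassifier:
    """Toy learner (an assumption): most frequent label per value of one feature."""

    def __init__(self, view):
        self.view = view      # index of the feature this learner looks at
        self.table = {}

    def fit(self, examples, labels):
        counts = defaultdict(Counter)
        for x, y in zip(examples, labels):
            counts[x[self.view]][y] += 1
        self.table = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        return self

    def predict(self, x):
        # None means "no opinion": this feature value was never seen in training.
        return self.table.get(x[self.view])


def cotrain_with_voting(labeled, unlabeled, views, min_votes=2, max_iter=10):
    examples = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_iter):
        # Retrain one learner per view on the current training set.
        learners = [ViewClassifier(v).fit(examples, labels) for v in views]
        remaining, added = [], 0
        for x in pool:
            opinions = (learner.predict(x) for learner in learners)
            votes = Counter(p for p in opinions if p is not None)
            top = votes.most_common(2)
            # Accept only a clear winner that reaches the vote threshold.
            clear = bool(top) and (len(top) == 1 or top[0][1] > top[1][1])
            if clear and top[0][1] >= min_votes:
                examples.append(x)
                labels.append(top[0][0])
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:        # nothing new was labeled, so stop early
            break
    return list(zip(examples, labels)), pool


# Invented toy data: (previous word, entity, next word) with an entity class.
labeled = [
    (("en", "Madrid", "hay"), "LOC"),
    (("en", "Sevilla", "llueve"), "LOC"),
    (("ministro", "Rubio", "dijo"), "PER"),
    (("ministro", "Blanco", "opina"), "PER"),
]
unlabeled = [
    ("en", "Madrid", "llueve"),
    ("ministra", "Rubio", "dijo"),
    ("ministra", "Serrano", "dijo"),
    ("desde", "Bilbao", "sale"),
]

train, pool = cotrain_with_voting(labeled, unlabeled, views=[0, 1, 2])
print(len(train), pool)
```

In this toy run the third unlabeled example gets only one vote in the first iteration and is accepted in the second, once "ministra" has been seen with a label; the last example never collects any votes and stays in the pool. Raising `min_votes` makes the selection more conservative at the cost of labeling fewer examples.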

Author information

Authors and Affiliations

Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Spain
Zornitsa Kozareva, Boyan Bonev & Andres Montoyo

Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, México
Alexander Gelbukh
Tecnológico de Monterrey (ITESM), Campus Ciudad de México (CCM), Calle del Puente 222, Col. Ejidos de Huipulco, 14360 DF, Tlalpan, Mexico
Álvaro de Albornoz
Center for Intelligent Systems, Tecnológico de Monterrey, Campus Monterrey, 64849, Monterrey, N.L., Mexico
Hugo Terashima-Marín

About this paper

Cite this paper

Kozareva, Z., Bonev, B., Montoyo, A. (2005). Self-training and Co-training Applied to Spanish Named Entity Recognition. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_78

DOI: https://doi.org/10.1007/11579427_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29896-0
Online ISBN: 978-3-540-31653-4
eBook Packages: Computer Science, Computer Science (R0)
