Abstract
In this paper we explore the use of hidden Markov models for the task of role identification from free text. Role identification is an important stage of the information extraction process, assigning roles to particular types of entities with respect to a particular event. Hidden Markov models (HMMs) have been shown to achieve good performance when applied to information extraction tasks in both semistructured and free text. The main contribution of this work is an analysis of whether and how linguistic processing of textual data can improve the extraction performance of HMMs. The emphasis is on minimal use of computationally expensive linguistic analysis. The overall conclusion is that the performance of HMMs is still worse than that of an equivalent manually constructed system. However, clear paths for improving the method are shown, with the aim of a method that is easily adaptable to new domains.
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sigletos, G., Paliouras, G., Karkaletsis, V. (2002). Role Identification from Free Text Using Hidden Markov Models. In: Vlahavas, I.P., Spyropoulos, C.D. (eds) Methods and Applications of Artificial Intelligence. SETN 2002. Lecture Notes in Computer Science, vol 2308. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46014-4_16
DOI: https://doi.org/10.1007/3-540-46014-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43472-6
Online ISBN: 978-3-540-46014-5
eBook Packages: Springer Book Archive