Part-of-Speech Tagging with Two Sequential Transducers

Kempe, André

doi:10.1007/3-540-44674-5_34

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2088)

Included in the following conference series:

International Conference on Implementation and Application of Automata

Abstract

We present a method of constructing and using a cascade consisting of a left- and a right-sequential finite-state transducer (FST), T₁ and T₂, for part-of-speech (POS) disambiguation. Compared to a hidden Markov model (HMM), this FST cascade has the advantage of significantly higher processing speed, but at the cost of slightly lower accuracy. Applications such as information retrieval, where speed can be more important than accuracy, could benefit from this approach.

In the process of POS tagging, we first assign every word of a sentence a unique ambiguity class c_i that can be looked up in a lexicon encoded by a sequential FST. Every c_i is denoted by a single symbol, e.g. “[ADJ NOUN]”, although it represents a set of alternative tags that a given word can occur with. The sequence of the c_i of all words of one sentence is the input to our FST cascade (Fig. 1). It is mapped by T₁, from left to right, to a sequence of reduced ambiguity classes r_i. Every r_i is denoted by a single symbol, although it represents a set of alternative tags. Intuitively, T₁ eliminates the less likely tags from c_i, thus creating r_i. Finally, T₂ maps the sequence of r_i, from right to left, to an output sequence of single POS tags t_i. Intuitively, T₂ selects the most likely t_i from every r_i (Fig. 1).
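The three steps described above (lexicon lookup, left-to-right reduction by T₁, right-to-left selection by T₂) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon is a plain dictionary rather than a sequential FST, the transition tables `T1` and `T2` are hand-written for one sentence rather than constructed by the method of the paper, and the names `LEXICON`, `run_sequential` and `tag` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# (state, input symbol) -> (output symbol, next state)
Table = Dict[Tuple[str, str], Tuple[str, str]]

# Hypothetical toy lexicon: each word maps to one ambiguity-class symbol.
LEXICON: Dict[str, str] = {
    "the": "[DET]",
    "light": "[ADJ NOUN VERB]",
    "works": "[NOUN VERB]",
}

# Hypothetical T1: reads ambiguity classes left to right and emits
# reduced ambiguity classes (here it drops VERB after a determiner).
T1: Table = {
    ("START", "[DET]"): ("[DET]", "[DET]"),
    ("[DET]", "[ADJ NOUN VERB]"): ("[ADJ NOUN]", "[ADJ NOUN]"),
    ("[ADJ NOUN]", "[NOUN VERB]"): ("[NOUN VERB]", "[NOUN VERB]"),
}

# Hypothetical T2: reads reduced classes right to left and emits one tag
# per class; its state is the tag chosen for the word to the right.
T2: Table = {
    ("END", "[NOUN VERB]"): ("VERB", "VERB"),
    ("VERB", "[ADJ NOUN]"): ("NOUN", "NOUN"),
    ("NOUN", "[DET]"): ("DET", "DET"),
}


def run_sequential(table: Table, start: str, symbols: Iterable[str]) -> List[str]:
    """Run a deterministic transducer that emits one output per input symbol."""
    state = start
    outputs: List[str] = []
    for symbol in symbols:
        # A missing (state, symbol) pair raises KeyError in this toy version.
        output, state = table[(state, symbol)]
        outputs.append(output)
    return outputs


def tag(words: List[str]) -> List[str]:
    """Map words to tags: lexicon lookup, then T1 (left to right), then T2 (right to left)."""
    classes = [LEXICON[word] for word in words]
    reduced = run_sequential(T1, "START", classes)
    # T2 is right-sequential: feed it the reversed sequence, then restore order.
    tags_reversed = run_sequential(T2, "END", reversed(reduced))
    return list(reversed(tags_reversed))
```

With these toy tables, `tag(["the", "light", "works"])` returns `["DET", "NOUN", "VERB"]`: T₁ reduces “[ADJ NOUN VERB]” to “[ADJ NOUN]” using only the left context, and T₂ then picks a single tag from each reduced class using only the right context. Each pass is one table lookup per word, which is the kind of constant work per symbol that a sequential transducer allows.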

Although our approach is related to the concept of bimachines [2] and factorization [1], we proceed differently in that we build two sequential FSTs directly and not by factorization.

References

[1] C.C. Elgot and J.E. Mezei. 1965. On relations defined by generalized finite automata. IBM Journal of Research and Development, pages 47–68, January.
[2] M.P. Schützenberger. 1961. A remark on finite transducers. Information and Control, 4:185–187.

Author information

Authors and Affiliations

Xerox Research Centre Europe - Grenoble Laboratory, 6 chemin de Maupertuis, 38240, Meylan, France
André Kempe

Editor information

Editors and Affiliations

Department of Computer Science, Middlesex College, The University of Western Ontario, London, ON, Canada, N6A 5B7
Shen Yu & Andrei Păun

About this paper

Cite this paper

Kempe, A. (2001). Part-of-Speech Tagging with Two Sequential Transducers. In: Yu, S., Păun, A. (eds) Implementation and Application of Automata. CIAA 2000. Lecture Notes in Computer Science, vol 2088. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44674-5_34

DOI: https://doi.org/10.1007/3-540-44674-5_34
Published: 20 September 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42491-8
Online ISBN: 978-3-540-44674-3
eBook Packages: Springer Book Archive
