Exploring Predicate-Argument Relations for Named Entity Recognition in the Molecular Biology Domain

Wattarujeekrit, Tuangthong; Collier, Nigel

doi:10.1007/11563983_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3735)

Included in the following conference series:

International Conference on Discovery Science

Abstract

In this paper, the semantic relationships between a predicate and its arguments, expressed as semantic roles, are employed to improve lexical-based named entity recognition (NER) in the molecular biology domain. The semantic roles were realized as various sets of syntactic features for a machine learning model, in order to explore which representation lets this knowledge have the greatest positive effect on NER. The empirical results show that the best feature set consists of the predicate's surface form, the predicate's lemma, voice, and a united feature combining the subject-object head's lemma with the transitive-intransitive sense. The performance improvement from using these features indicates the benefit of predicate-argument semantic knowledge for NER. There is still room to enhance NER with this semantic knowledge, for example by employing semantic roles other than agent and theme and by extending the rules for efficient identification of an argument's boundary.
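As a rough illustration of the feature set named in the abstract, the following sketch assembles the four features for a single predicate. The abstract does not describe how the features are encoded, so the `PredicateContext` container, the feature names, the `"NONE"` placeholder, the `|` separator, and the rule that a predicate with an object counts as transitive are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PredicateContext:
    """Hypothetical container for one predicate and its parsed arguments."""
    surface: str                       # predicate as written, e.g. "activates"
    lemma: str                         # base form, e.g. "activate"
    voice: str                         # "active" or "passive"
    subject_head_lemma: Optional[str]  # lemma of the subject's head word, if any
    object_head_lemma: Optional[str]   # lemma of the object's head word, if any


def predicate_features(ctx: PredicateContext) -> Dict[str, str]:
    """Build the four features listed in the abstract as string-valued features."""
    # Assumption: a predicate that has an object is treated as transitive.
    sense = "transitive" if ctx.object_head_lemma else "intransitive"
    subj = ctx.subject_head_lemma or "NONE"
    obj = ctx.object_head_lemma or "NONE"
    return {
        "pred_surface": ctx.surface,
        "pred_lemma": ctx.lemma,
        "voice": ctx.voice,
        # The united feature: argument head lemmas joined with the sense.
        "args_sense": f"{subj}|{obj}|{sense}",
    }


example = PredicateContext("activates", "activate", "active", "protein", "gene")
features = predicate_features(example)
```

A dictionary of string-valued features like this could be fed to a typical feature-based classifier after one-hot encoding; which learner and encoding the paper actually uses is not stated in the abstract.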

Author information

Authors and Affiliations

National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan
Tuangthong Wattarujeekrit & Nigel Collier

Editor information

Editors and Affiliations

School of Computer Science & Engineering, The University of New South Wales, Sydney, Australia
Achim Hoffmann
Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, 567-0047, Ibaraki, Osaka, Japan
Hiroshi Motoda
Max Planck Institute for Computer Science, Saarbrücken, Germany
Tobias Scheffer

Cite this paper

Wattarujeekrit, T., Collier, N. (2005). Exploring Predicate-Argument Relations for Named Entity Recognition in the Molecular Biology Domain. In: Hoffmann, A., Motoda, H., Scheffer, T. (eds) Discovery Science. DS 2005. Lecture Notes in Computer Science, vol 3735. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563983_23

DOI: https://doi.org/10.1007/11563983_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29230-2
Online ISBN: 978-3-540-31698-5
eBook Packages: Computer Science, Computer Science (R0)
