
1 Introduction

In natural language processing (NLP), word sense disambiguation (WSD) is a major challenge [1] whose importance has grown rapidly since the advent of chatbots. WSD differentiates homographs—identically spelled words that carry different meanings—on the basis of the context words surrounding them in each sentence. WSD is a core NLP task concerned with the assignment and identification of word senses [2], and it often finds its way into NLP applications [3]. Numerous supervised approaches to the WSD problem rely on models trained on sense-annotated data [4]. Many of them, however, are not interpretable. Speech recognition in chatbots is one example of NLP in use, and it is well known from real-life operation that chatbots struggle to distinguish between words that take on different meanings in different contexts. Consider the word "book" in the sentence "I want to book a ticket for the upcoming movie." A system may interpret "book" as "reserve," or instead as "reading material," and it typically fails to explain how it reached its conclusion. To date, state-of-the-art NLP techniques have not succeeded in improving interpretability while maintaining classification accuracy. This paper is organized sequentially: a full description of the task is given in Sect. 2, and the work is concluded in Sect. 3.
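The "book" ambiguity above can be illustrated with a minimal gloss-overlap sketch in the spirit of the Lesk algorithm. The two senses and their glosses below are hypothetical stand-ins for a real lexicon's entries, and no stopword filtering is applied.

```python
# A minimal gloss-overlap (simplified Lesk) sketch for the ambiguous
# word "book". The sense inventory and glosses are invented for
# illustration, not taken from any real dictionary.

SENSES = {
    "reading_material": "a written work or composition that has been published",
    "reservation": "arrange for and reserve something for a future time",
}

def lesk(context_words, senses):
    """Pick the sense whose gloss shares the most words with the context."""
    context = set(w.lower() for w in context_words)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses.items():
        overlap = len(context & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

sentence = "I want to book a ticket for the upcoming movie".split()
print(lesk(sentence, SENSES))  # prints "reservation"
```

The gloss of the "reservation" sense shares more words with the sentence ("for", "a") than the "reading material" gloss does, so the overlap heuristic selects it; real Lesk variants use stopword removal and gloss expansion to make the overlap count less superficial.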

2 Description of the Task

Word sense disambiguation is the ability to determine, by computational means, which sense a word assumes from its usage in a specific context. WSD is typically applied to one or more texts (although bags of words, i.e., unordered collections of words, may be used as well). Setting punctuation aside, a text T can be viewed as a sequence of words (w1, w2, …, wn), and WSD can be conceptualized as the task of assigning the appropriate sense(s) to the words in T, i.e., identifying a mapping A between words and senses such that A(i) ⊆ SensesD(wi), where SensesD(wi) is the set of senses encoded in a dictionary D for the word wi, and A(i) comprises those senses that are appropriate in the context of the word wi ∈ T. The mapping A can assign more than one sense to each word wi ∈ T, but typically only the single most appropriate sense is selected, i.e., |A(i)| = 1. Natural language processing is likewise concerned with classification [5]: part-of-speech tagging (assigning grammatical categories to context words), named entity recognition (classifying specific text spans into predefined categories), text categorization (i.e., assigning topic labels to documents), etc. Viewed this way, WSD is really composed of n separate classification tasks, where n is the size of the lexicon. The generic task can be divided into two variants:

  1. (i)

    A lexical sample (or targeted word sense disambiguation) task: the system is required to disambiguate a restricted set of target words, typically occurring one per sentence.

  2. (ii)

    An all-words disambiguation task: the system is expected to disambiguate all open-class words in a text, i.e., its nouns, adjectives, verbs, and adverbs. In what follows, we look at the four main elements of WSD: the selection of word senses (i.e., classes), the use of external knowledge sources, the representation of context, and the choice of an automatic classification method.
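The formal setting above (a mapping A with A(i) ⊆ SensesD(wi)) can be sketched directly. The two-word text and the sense inventory below are purely illustrative.

```python
# A sketch of the formal WSD setting: a dictionary D maps each word to
# its set of senses, and a disambiguation mapping A assigns each word
# w_i in the text T a subset of Senses_D(w_i), typically a singleton.
# The inventory and the assignment are invented for illustration.

senses_D = {
    "bank": {"bank#financial_institution", "bank#river_side"},
    "deposit": {"deposit#money", "deposit#sediment"},
}

T = ["bank", "deposit"]

# A hypothetical disambiguation result: exactly one sense per word.
A = {
    0: {"bank#financial_institution"},
    1: {"deposit#money"},
}

# Check the defining constraint A(i) ⊆ Senses_D(w_i) and |A(i)| = 1.
for i, w in enumerate(T):
    assert A[i] <= senses_D[w] and len(A[i]) == 1
print("valid sense assignment")
```

Under this view, each ambiguous word defines its own small classification task whose label set is SensesD(wi), which is why WSD decomposes into n per-word classifiers.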

2.1 Choice of Word Senses

The term sense refers to an agreed-upon meaning of a word. For example, consider the following sentences:

  1. (a)

    The mouse ate some cheese.

  2. (b)

    Double-click the mouse to make changes.

The word mouse is used in the above sentences with two different senses: a small long-tailed rodent (a) and a computer device (b). The two senses are clearly related, as they may refer to similarly shaped things; however, the intended uses of those things are different. The examples make it clear that determining the sense inventory of a word is a key problem in word sense disambiguation: are we to assign different classes to the two occurrences of mouse in sentences (a) and (b)? Consider the sense list for the noun mouse: should we add a further sense for "a small long-tailed rodent," or does the primary sense already cover it? Because of such doubts, different choices are made in different lexicons.
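The granularity question raised here can be made concrete: a fine-grained inventory lists the rodent and device senses of mouse separately, while a hypothetical coarse-grained lexicon might cluster them into a single entry. The sense identifiers and the clustering below are invented for illustration.

```python
# A sketch of sense-inventory granularity for "mouse": two fine-grained
# senses versus a hypothetical coarse-grained clustering that merges
# them. All identifiers are illustrative, not from a real lexicon.

fine_senses = {
    "mouse#rodent": "a small long-tailed rodent",
    "mouse#device": "a hand-held pointing device for computers",
}

# A coarse-grained lexicon might map both fine senses to one cluster.
coarse_clusters = {
    "mouse#rodent": "mouse#coarse",
    "mouse#device": "mouse#coarse",
}

def coarsen(sense):
    """Map a fine-grained sense to its coarse cluster (identity if unknown)."""
    return coarse_clusters.get(sense, sense)

# Under the coarse inventory, the two occurrences receive the same class.
print(coarsen("mouse#rodent") == coarsen("mouse#device"))  # prints True
```

Which of the two inventories is "right" is exactly the lexicographic decision the text describes: different dictionaries make different choices, and a WSD system inherits whichever choice its sense repository made.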

2.2 External Knowledge Resources

Knowledge is fundamental to WSD. Machine-readable dictionaries, thesauri, glossaries, ontologies, etc., all fall within the scope of this research. Details on these and other resources can be found in Litkowski [6] and Agirre [7].

These resources can be organized as follows:

  1. (i)

    Thesauri, which provide information about relations among words, such as synonymy (e.g., car as a synonym of motorcar), antonymy (e.g., ugly as opposed to beautiful), and possibly others [8]. The thesaurus most widely used in WSD is Roget's International Thesaurus [9].

  2. (ii)

    Machine-readable dictionaries (MRDs)—Recent decades have seen the development of MRDs that are highly valuable sources of information for NLP, such as the Collins English Dictionary, the Oxford Dictionary of English, and the Longman Dictionary of Contemporary English (LDOCE). Besides the extensively studied LDOCE, WordNet has been the machine-readable lexicon most frequently used in the NLP research community.

  3. (iii)

    Ontologies, which are conceptualizations of specific domains of interest, usually comprising a taxonomy and a set of semantic relations. In this regard, WordNet and the SUMO upper ontology can also be regarded as ontologies.

    1. (a)

      Corpora—Corpora are collections of documents used to learn language models. There are two types: sense-annotated and raw (i.e., unlabeled).

    2. (b)

      Raw corpora—The Wall Street Journal (WSJ) corpus [10] covers about 30 million words, and the Gigaword corpus consists of 2 billion words of newswire text [11].

    3. (c)

      Sense-tagged corpora—The main and most widely used sense-tagged corpora include the SemCor corpus, containing sense-tagged occurrences of nouns, adjectives, and verbs; the MultiSemCor parallel corpus of English and Italian; and the Open Mind Word Expert dataset [12].

    4. (d)

      Collocation resources—Examples include the Word Sketch Engine and the Web1T corpus [13], a huge collection of text co-occurrences that has quickly gained acceptance in the WSD community. It provides frequency data for sequences of up to five words, computed over one trillion words of Web text.

    5. (e)

      The second category of resources includes word frequency lists, stoplists (lists of undiscriminating words such as a, an, the, etc.), domain labels [14], etc.

Some of the knowledge sources widely used in the field are described below.

WordNet. WordNet encodes concepts as synsets (sets of synonyms) organized according to psycholinguistic principles (Miller et al. 1990; Fellbaum 1998). WordNet 3.0 contains more than 117,000 synsets and 155,000 words. As an example, consider the synset for the first sense of the noun lion (recall that superscripts and subscripts represent sense identifiers and parts of speech, respectively):

$$ \left\{ {{\text{lion}}_{n}^{1} ,{\text{king of beasts}}_{n}^{1} ,{\text{Panthera leo}}_{n}^{1} } \right\}. $$

Synsets can be thought of as sets of word senses that express (roughly) the same meaning. Following the notation of Sect. 2.1, the function below assigns, to each part-of-speech tagged word, the set of WordNet synsets containing its senses:

$$ {\text{Senses}}_{W\;N} :L \times {\text{POS}} \to {2}^{{{\text{SYNSETS}}}} , $$

where SYNSETS is the complete set of synsets in WordNet. For instance:

$$ \begin{aligned} {\text{Senses}}_{W\;N} {\text{(lion}}_{n} {)} & = \left\{ {\left\{ {{\text{lion}}_{n}^{1} ,{\text{king of beasts}}_{n}^{1} ,{\text{Panthera leo}}_{n}^{1} } \right\}} \right., \\ & \quad \left\{ {{\text{lion}}_{n}^{2} ,{\text{social lion}}_{n}^{2} } \right\},\left\{ {{\text{lion}}_{n}^{3} ,{\text{Leo}}_{n}^{3} } \right\}, \\ & \quad \left. {\left\{ {{\text{lion}}_{n}^{4} ,{\text{Leo}}_{n}^{4} ,{\text{Leo the Lion}}_{n}^{4} } \right\}} \right\}. \\ \end{aligned} $$
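The SensesWN function above can be sketched as a plain mapping from (lemma, POS) pairs to sets of synsets. The entries encode the document's lion example, with sense identifiers written as lemma.pos.sense strings.

```python
# A sketch of Senses_WN : L x POS -> 2^SYNSETS as a dictionary keyed by
# (lemma, pos). Each synset is a frozenset of word senses; identifiers
# mirror the lion example in the text (lemma_pos^sense as "lemma.pos.sense").

synsets_for = {
    ("lion", "n"): {
        frozenset({"lion.n.1", "king_of_beasts.n.1", "panthera_leo.n.1"}),
        frozenset({"lion.n.2", "social_lion.n.2"}),
        frozenset({"lion.n.3", "leo.n.3"}),
        frozenset({"lion.n.4", "leo.n.4", "leo_the_lion.n.4"}),
    }
}

def senses_wn(lemma, pos):
    """Return the set of synsets for a POS-tagged lemma (empty if unknown)."""
    return synsets_for.get((lemma, pos), set())

print(len(senses_wn("lion", "n")))  # prints 4
```

Each sense string belongs to exactly one synset here, mirroring the property discussed next: a word sense unambiguously identifies its synset.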

Every word sense unambiguously identifies a single synset. For instance, given animal1n, its synset is uniquely determined. An excerpt of the WordNet semantic network containing the animal1n synset, together with related synsets such as bird1n, canary1n, fish4n, and shark1n, is shown in Fig. 1. The following information is provided by WordNet for each synset:

Fig. 1

Extract of the WordNet semantic network

  1. (a)

    The gloss, a textual definition of the synset, possibly including usage examples (e.g., the gloss of animal1n reads "a living organism characterized by voluntary movement").

  2. (b)

    Lexical and semantic relations, which connect word senses and synsets: lexical relations connect word senses belonging to different synsets, whereas semantic relations hold between synsets as a whole. Some examples of lexical relations follow:

  3. (c)

    Antonymy: X is an antonym of Y if it expresses the opposite concept (e.g., good1a is the antonym of bad1a). Antonymy holds for all parts of speech.

  4. (d)

    Pertainymy: X is an adjective that pertains to a noun (or another word) Y (e.g., dental1a pertains to tooth1n).

  5. (e)

    Nominalization: a noun X nominalizes a verb Y (e.g., service2n nominalizes serve4v). Some examples of semantic relations follow.

  6. (f)

    Hypernymy (also called kind-of or is-a): Y is a hypernym of X if X is a kind of Y (e.g., motor vehicle1n is a hypernym of car1n). Hypernymy holds between both nominal and verbal synsets.
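The relations listed above can be modeled as labeled edges over sense identifiers. The identifiers mirror the examples in the text (good/bad, dental/tooth, service/serve, car/motor vehicle); the edge store itself is a toy structure, not WordNet's API.

```python
# A sketch of WordNet-style relations as labeled (source, relation,
# target) triples. Antonymy, pertainymy, and nominalization link word
# senses; hypernymy links synsets. Identifiers follow the text's examples.

relations = [
    ("good.a.1", "antonym", "bad.a.1"),
    ("dental.a.1", "pertainym", "tooth.n.1"),
    ("service.n.2", "nominalizes", "serve.v.4"),
    ("car.n.1", "hypernym", "motor_vehicle.n.1"),  # car IS-A motor vehicle
]

def related(source, relation):
    """All targets reachable from `source` via a given relation label."""
    return {t for s, r, t in relations if s == source and r == relation}

print(related("car.n.1", "hypernym"))  # prints {'motor_vehicle.n.1'}
```

Storing relations as explicit edges is also what makes the graph-based, knowledge-based WSD approaches mentioned later possible: disambiguation becomes a matter of walking these edges between candidate senses.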

2.3 Contextual Representation 

Text is an unstructured source of information, so it must be transformed into a representation suitable for an automatic method. To accomplish this, a preprocessing of the input text is typically performed, which can include the following steps (but not necessarily all of them):

  1. i.

    Tokenization, the process of dividing a text into tokens (usually words).

  2. ii.

    Part-of-speech tagging, i.e., assigning grammatical categories to words (e.g., (Ram/NN, is/VBZ, a/DT, good/JJ, boy/NN), (went/VBD, to/TO, school/NN), where DT, JJ, VBZ/VBD, and NN are tags for determiners, adjectives, verbs, and nouns, respectively).

  3. iii.

    Lemmatization, by which morphological variants are reduced to their base form (e.g., was → be, boys → boy).

  4. iv.

    Chunking, the division of a text into syntactically correlated parts (e.g., [Ram]NP [went to school]VP, the noun phrase and the verb phrase of the example, respectively).

  5. v.

    Parsing, i.e., analyzing the syntactic structure of sentences by constructing syntax trees (Fig. 2).

    Fig. 2

    Extract of the WordNet domain tags’ classification
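Steps i–iii above can be sketched as a toy pipeline. The lemma table and tag assignments below are hard-coded for the document's example sentence; they are stand-ins for a real tagger and lemmatizer, not an actual implementation of either.

```python
# A toy preprocessing pipeline: tokenization via a regex, POS tagging
# and lemmatization via small lookup tables. The tables only cover the
# document's example words and are illustrative, not a real tagger.

import re

LEMMAS = {"was": "be", "boys": "boy", "went": "go"}
TAGS = {"ram": "NN", "is": "VBZ", "a": "DT", "good": "JJ",
        "boy": "NN", "went": "VBD", "to": "TO", "school": "NN"}

def tokenize(text):
    """Split a text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def preprocess(text):
    """Return (lemma, POS tag) pairs; unknown words default to tag NN."""
    return [(LEMMAS.get(t, t), TAGS.get(t, "NN")) for t in tokenize(text)]

print(preprocess("Ram went to school"))
# prints [('ram', 'NN'), ('go', 'VBD'), ('to', 'TO'), ('school', 'NN')]
```

Real systems replace each lookup with a statistical or neural component, but the interface is the same: raw text in, a structured sequence of annotated tokens out, ready for the feature extraction described next.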

Figure 3 illustrates this processing flow. As a result of preprocessing, each word can be represented as a sequence of features, or in a more structured way, such as a tree or a graph of relations between words. The context is then represented as a set of features, which include information resulting from the preprocessing steps, such as part-of-speech tags, grammatical relations, lemmas, etc. The following feature types are commonly distinguished:

Fig. 3

Sample of how text is preprocessed

  1. (a)

    Local features: These describe the local context of a word usage, i.e., features of a small number of words surrounding the target word, such as their word forms, their parts of speech, their positions relative to the target word, and so on.

  2. (b)

    Topical features: In contrast to local features, topical features reflect the wider context (e.g., a window of words, a sentence, a paragraph, etc.) and are typically represented as bags of words.

  3. (c)

    Syntactic features: These represent syntactic cues and argument-head relations between the target word and other words in the same sentence (note that these words need not be in the local context).

  4. (d)

    Semantic features: A semantic feature represents aspects of a word such as its sense in context, a domain indicator, etc.

Based on these features, a feature vector can be constructed for each word occurrence (usually within a sentence). A possible feature vector is illustrated in Table 1.

Table 1 Nouns in sentences are represented by feature vectors

Consider the sentences in Table 1: (a) "The tank is full" and (b) "The new tank has yet to be tested in the field," where tank is the target word. The vectors contain ten local features holding the part-of-speech tags of the two words to the left and the eight words to the right of the target, as well as a sense classification tag (either VESSEL or ARMORED MILITARY VEHICLE in our example). Table 2 presents context representations of varying sizes for the same word. The context of a target word might be an n-gram (a sequence of n words including the target word), such as a bigram (n = 2) or a trigram (n = 3), or a whole phrase or sentence.

Table 2 Variations in word context sizes
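The local features of Table 1 can be extracted with a small windowing function. The hand-tagged sentence and the window size of two used below are assumptions for illustration; a real system would take its tags from the preprocessing pipeline.

```python
# A sketch of local-feature extraction: the feature vector for a target
# word is built from the POS tags of the words in a window around it.
# The tagged sentence is supplied by hand; "-" pads out-of-range slots.

def local_features(tagged, target_index, window=2):
    """POS tags of the `window` words on each side of the target."""
    feats = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the target word itself
        i = target_index + offset
        feats.append(tagged[i][1] if 0 <= i < len(tagged) else "-")
    return feats

# Sentence (a) from Table 1, hand-tagged; the target is "tank".
tagged = [("the", "DT"), ("tank", "NN"), ("is", "VBZ"), ("full", "JJ")]
print(local_features(tagged, 1))  # prints ['-', 'DT', 'VBZ', 'JJ']
```

Concatenated with topical, syntactic, and semantic features, such a vector is exactly the flat representation that supervised classifiers consume, as discussed next.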

Word contexts that span an entire text are frequently represented as trees or graphs. Flat representations (such as feature vectors) are best suited to supervised disambiguation approaches, since training instances are usually (though not always) given in this form. Structured representations are more useful in unsupervised and knowledge-based approaches, since they allow full exploitation of the lexical and semantic relations between concepts encoded in computational lexicons and semantic networks.

3 Conclusion

Disambiguating all words is incredibly helpful from a practical standpoint, though we consider disambiguating every content word somewhat academic. For example, in the Senseval-3 all-words test set, tokens lemmatized as the verb be make up roughly 8% of the total, yet this common verb does not appear to have much impact on the success of user queries in information retrieval systems. Testing WSD systems only in vitro is not recommended, because it penalizes their measured performance unnecessarily and provides no information about their benefits in end-to-end applications. Several knowledge-based and supervised systems can perform with precision exceeding 90%, though with low recall, even when fine-grained sense distinctions are used. This observation may also have an impact on the semantic web: the availability of technologies that can disambiguate would undoubtedly aid semantic interoperability, and disambiguation may be required only for the subset of a page's content that conveys the resource's true meaning. Based on the meaning words convey, they can be disambiguated using computational lexicons and domain ontologies. In our opinion, optimization ("disambiguate less, disambiguate better"), in both application-specific and comparative settings, should be studied in subsequent evaluation campaigns.