1 Introduction

Answer sheet evaluation forms the heart and soul of examination systems around the globe. Examinations are carried out regularly for classes ranging from nursery to higher education, in the form of end-term examinations, internal examinations, and weekly tests. This puts considerable pressure on the evaluator to check answer sheets and allot marks within a given time frame. Since the evaluator is also involved in teaching, this leaves very little time to dedicate fully to answer sheet evaluation. Yet this task is crucial and needs proper focus from the evaluator. It also needs to be impartial: it is common for the prejudices of the teacher to affect the marks of the student. For instance, teachers tend to be slightly more inclined to give good marks to more obedient students, a psychological tendency that cannot be ignored.

In recent years, technology has made its way into classrooms, enabling a more efficient learning environment. From online lectures to e-classrooms, digitization has led to the evolution of a teacher- and student-friendly learning ambience. Automation of answer sheet evaluation has also been attempted in the past, but high levels of accuracy have not been achieved: earlier attempts in this field have either been inaccurate or extensively time-consuming. Through this paper, we aim to amalgamate natural language processing concepts such as context identification and text similarity into question answering, to facilitate the whole process of short-answer script evaluation in an accurate and time-efficient manner. Text similarity is a widely used technique for quantifying the relatedness between two texts [1,2,3,4,5,6,7,8,9]. This concept has been studied by researchers worldwide for applications in various domains [10,11,12,13,14,15], and text similarity approaches are being refined on a regular basis to provide optimal results for these applications [16,17,18,19,20,21,22,23,24,25].

Some of the latest research in the field of short answer evaluation highlights that fuzzy WordNet graphs play a significant role in the analysis. Vii et al. [26] show that a WordNet graph for the ideal answer can be generated and then used to create a set of keywords essential for evaluating the student's answer sheet. Since that work uses WordNet as the sense repository, it establishes context in an elaborate manner; however, it lacks fruitful results because testing is done only on a synthetic dataset. To add more relevance to such works, it becomes essential for us to include in this paper a set of more elaborate results obtained by testing on a larger dataset.

To provide meaningful explanations for sheet analysis, handwriting recognition can also be incorporated into the process, as depicted in [27]. Sijimol and Varghese [27] present a model that learns from previous data based on the handwriting of an individual, using cosine sentence similarity. However, this is not a practical approach, and the testing data used is not sufficient for this analysis.

Van Hoecke [28] proposes an algorithm that utilizes sentence-based summarization techniques for grading students' short answers. This poses a limitation: sentence-based ranking is not always accurate, so any error due to faulty sentence ranking is relayed and carried forward into the grading as well. Moreover, most sentence-ranking algorithms are based on machine translation and similarity scores, which are not very accurate; hence, these types of approaches are of limited practical use. Roy et al. [29] compare and contrast the various existing techniques for short answer grading. This type of study is useful for us as it enables us to briefly outline the shortcomings of the existing state-of-the-art techniques.

This paper proposes a novel method for automatically evaluating the answer sheets of students using a machine learning based approach. The technique adopted is the generation of WordNet graphs. WordNet [5, 23], developed at Princeton University, is a computational lexicon consisting of words and the various relations between them. WordNet can be viewed as a graph in which nodes represent words and edges represent relations between words. WordNet is widely used in the literature for resolving several natural language processing tasks, including word sense disambiguation, machine translation, and information retrieval [20, 21]. WordNet graphs play a significant role in information retrieval, as they help incorporate semantic significance and structural dependencies [22].

In this paper, WordNet is used to find the text similarity between the ideal answer provided by the teacher and the answer provided by the student in the answer sheet. WordNet graphs are constructed to represent the ideal answer and the answer under evaluation, and the similarity between these two graphs is computed based on the appearance of common nodes. The marks for the answer under evaluation are assigned in proportion to the similarity between the two graphs. Results for the proposed method are obtained on a dataset consisting of 400 students' answer sheets, selected so as to incorporate both similarity and diversity in the dataset.

The rest of the paper is organized as follows: Sect. 2 highlights the background study related to text similarity. Section 3 describes the proposed approach. Section 4 explains the results obtained. Section 5 concludes the work and states the relevant future scope.

2 Background Study

The main concept utilized in this paper for answer sheet evaluation is finding the text similarity between the ideal answer and the answer provided by the student. To study the latest trends in the field of text similarity, Web of Science (WoS) was taken as the data source. The following query was used for extracting the research papers pertaining to this field:

$$ \text{TI} = (\text{``Text Similarity''}) $$

The research papers were obtained through the above-mentioned query for the years 1989–2017 [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]. The keywords occurring in these research papers were analyzed to visualize a keyword co-occurrence network, as shown in Fig. 1. These keywords depict the various research topics associated with text similarity.

Fig. 1

Keyword co-occurrence network visualization for research papers in “text similarity”

It can be observed from Fig. 1 that graph theory/techniques, semantic dependencies, and structural dependencies are closely associated with this field. Hence, in this paper, a combination of these is used to propose a novel method for calculating text similarity as applied to answer sheet evaluation.

3 Proposed Approach

This section highlights the proposed approach for evaluating the answer sheets of students in an automated manner. As concluded from the previous section, graph theory and semantic and structural dependencies play a significant role in text similarity calculation. Hence, in this paper, a machine learning oriented, WordNet graph-based method is proposed for answer sheet evaluation. WordNet is an online lexical database that stores the senses of a word according to its various part-of-speech tags, with numerous semantic relationships intertwined to form a huge lexical network. WordNet graphs have been widely used in the literature for resolving lexical issues such as word sense disambiguation [20, 22]. The WordNet graph generated in this paper uses the semantic relations hypernym, hyponym, meronym, and holonym.
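As a minimal illustration (not the paper's implementation), these four relations can be queried through NLTK's WordNet interface; the synset `car.n.01` is used here only as an example:

```python
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

car = wn.synset('car.n.01')         # the 'automobile' sense of "car"
print(car.hypernyms())              # more general concepts, e.g. motor_vehicle.n.01
print(car.hyponyms()[:3])           # more specific concepts (first three)
print(car.part_meronyms()[:3])      # parts of a car
print(car.member_holonyms())        # wholes that 'car' belongs to (may be empty)
```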

Siddiqi et al. [24] highlight that several types of short answer evaluation exist, such as those dealing with “True–False” questions, fill-in-the-blanks, sentence completion, “description required”, “justification required”, “example required”, etc. The method proposed in this paper deals with short answer evaluation for questions where a brief description, with a relevant short explanation if needed, is to be provided by the student. Context can be well established for short answer evaluation using WordNet, but for larger queries the context dissolves; for instance, it is difficult to automatically evaluate answers that contain technical words, since not all of them are available in WordNet. Other types of questions may be handled in the future. The method is explained in Table 1.

Table 1 Proposed method for automated answer sheet evaluation using WordNet graph-based text similarity

To illustrate this, let us take the following question and its ideal answer:

  • Question: What is a car?

  • Answer (ideal, as provided by the teacher): Car is a vehicle with four wheels.

This answer text is treated as query Q1. The proposed method is implemented in Python using the Natural Language Toolkit (NLTK) library.

Q1 is tokenized and POS-tagged. The resulting POS tag set is as follows:

Tagged words set for Q1 = [(‘car’, ‘NN’), (‘is’, ‘VBZ’), (‘a’, ‘DT’), (‘vehicle’, ‘NN’), (‘with’, ‘IN’), (‘four’, ‘CD’), (‘wheels’, ‘NNS’)].
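A minimal sketch of this step, assuming the standard NLTK tokenizer and tagger resources are installed (resource names may differ slightly across NLTK versions):

```python
import nltk

# One-time resource downloads (uncomment on first run):
# nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')

q1 = "Car is a vehicle with four wheels"
tagged_q1 = nltk.pos_tag(nltk.word_tokenize(q1.lower()))
print(tagged_q1)
# Expected (tagger output may vary slightly by NLTK version):
# [('car', 'NN'), ('is', 'VBZ'), ('a', 'DT'), ('vehicle', 'NN'),
#  ('with', 'IN'), ('four', 'CD'), ('wheels', 'NNS')]
```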

For the sake of simplicity, the content words chosen for generating the WordNet graph are (‘car’, ‘NN’), (‘vehicle’, ‘NN’), and (‘wheels’, ‘NNS’). The semantic relations hypernym, hyponym, meronym, and holonym are used for this purpose. The WordNet graph is generated using a depth-first search algorithm [20]; it is shown in Fig. 2 and has 49 nodes.
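The paper does not give the traversal code itself; the sketch below is one possible reconstruction that performs a depth-limited DFS over the four relations and collects the node set of the resulting graph. The `depth` parameter and the specific NLTK relation methods chosen are assumptions:

```python
from nltk.corpus import wordnet as wn

# NLTK method names for the four relations used by the proposed method.
RELATIONS = ('hypernyms', 'hyponyms', 'part_meronyms', 'member_holonyms')

def wordnet_nodes(content_words, depth=2):
    """Depth-limited DFS from every synset of each content word.

    Returns the node set of the WordNet graph as a set of synset names.
    """
    nodes = set()

    def dfs(synset, d):
        if synset.name() in nodes:
            return                      # already visited
        nodes.add(synset.name())
        if d == 0:
            return                      # depth limit reached
        for relation in RELATIONS:
            for neighbour in getattr(synset, relation)():
                dfs(neighbour, d - 1)

    for word, tag in content_words:
        pos = wn.NOUN if tag.startswith('NN') else None
        for synset in wn.synsets(word, pos=pos):
            dfs(synset, depth)
    return nodes

ns = wordnet_nodes([('car', 'NN'), ('vehicle', 'NN'), ('wheels', 'NNS')])
```

The exact node count depends on the traversal depth chosen, so the set obtained this way need not match the 49 nodes of Fig. 2 exactly.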

Fig. 2

WordNet graph for ideal answer

The node set of this graph (NS) is as follows:

  • NS= {“Synset(‘Wheeled_Vehicle.N.01’)”, “Synset(‘Car.N.01’)”, “Synset(‘Wheel.V.02’)”, “Synset(‘Wheel.N.01’)”, “Synset(‘Travel.V.01’)”, “Synset(‘Valve.N.03’)”, “Synset(‘Car_Wheel.N.01’)”, “Synset(‘Handwheel.N.02’)”, “Synset(‘Rack.N.04’)”, “Synset(‘Car.N.04’)”, “Synset(‘Cable_Car.N.01’)”, “Synset(‘Minivan.N.01’)”, “Synset(‘Helm.N.01’)”, “Synset(‘Ride.V.02’)”, “Synset(‘Compartment.N.02’)”, “Synset(‘Van.N.05’)”, “Synset(‘Vehicle.N.01’)”, “Synset(‘Steering_System.N.01’)”, “Synset(‘Steering_Wheel.N.01’)”, “Synset(‘Wheel.V.03’)”, “Synset(‘Wagon_Wheel.N.01’)”, “Synset(‘Lathe.N.01’)”, “(‘Vehicle’, ‘NN’)”, “Synset(‘Bicycle_Wheel.N.01’)”, “Synset(‘Instrumentality.N.03’)”, “Synset(‘Sprocket.N.02’)”, “Synset(‘Wheel.N.04’)”, “Synset(‘Conveyance.N.03’)”, “Synset(‘Cab.N.01’)”, “Synset(‘Bicycle.N.01’)”, “Synset(‘Vehicle.N.03’)”, “Synset(‘Bicycle.V.01’)”, “Synset(‘Medium.N.01’)”, “Synset(‘Passenger_Van.N.01’)”, “Synset(‘Vehicle.N.02’)”, “Synset(‘Motor_Vehicle.N.01’)”, “Synset(‘Roulette_Wheel.N.01’)”, “(‘Wheels’, ‘NNS’)”, “(‘Car’, ‘NN’)”, “Synset(‘Wheel.N.03’)”, “Synset(‘Car.N.03’)”, “Synset(‘Fomite.N.01’)”, “Synset(‘Self-Propelled_Vehicle.N.01’)”, “Synset(‘Wagon.N.01’)”, “Synset(‘Car.N.02’)”, “Synset(‘Handwheel.N.01’)”, “Synset(‘Travel.V.05’)”, “Synset(‘Wheel.V.01’)”, “Synset(‘Truck.N.01’)”}

The answer sheets will be evaluated based on these nodes. There may exist two basic types of answer sheets:

  (a) the answer written by the student matches logically with the ideal answer;

  (b) the answer written by the student does not match the ideal answer and is not relevant to the context either.

  • Case 1: When the student has written an accurate and logical answer according to the context

Let us suppose that the 1st candidate has written the answer as:

  • Q2: Car has wheels and an engine.

Now, in order to evaluate the 1st candidate answer sheet, Q2 is tokenized and tagged as follows:

Tagged words set for Q2 = [(‘car’, ‘NN’), (‘has’, ‘VBZ’), (‘wheels’, ‘NNS’), (‘and’, ‘CC’), (‘an’, ‘DT’), (‘engine’, ‘NN’)]

where NN = Noun, VBZ = Verb (3rd person singular present), DT = Determiner, CC = Coordinating Conjunction, IN = Preposition, CD = Cardinal Digit

For the sake of simplicity, the content words chosen for generating the WordNet graph are (‘car’, ‘NN’), (‘wheels’, ‘NNS’), and (‘engine’, ‘NN’). The WordNet graph is generated as shown in Fig. 3; it has a total of 53 nodes.

Fig. 3

WordNet graph for 1st candidate answer sheet

The node set of this graph (NS1) is as follows:

  • NS1= {“Synset(‘Engine.N.02’)”, “Synset(‘Wheeled_Vehicle.N.01’)”, “Synset(‘Car.N.01’)”, “Synset(‘Wheel.V.02’)”, “Synset(‘Wheel.N.01’)”, “Synset(‘Travel.V.01’)”, “Synset(‘Valve.N.03’)”, “Synset(‘Motor.N.01’)”, “Synset(‘Car_Wheel.N.01’)”, “Synset(‘Handwheel.N.02’)”, “Synset(‘Instrument_Of_Torture.N.01’)”, “(‘Engine’, ‘NN’)”, “Synset(‘Rack.N.04’)”, “Synset(‘Engine.N.04’)”, “Synset(‘Car.N.04’)”, “Synset(‘Cable_Car.N.01’)”, “Synset(‘Minivan.N.01’)”, “Synset(‘Helm.N.01’)”, “Synset(‘Ride.V.02’)”, “Synset(‘Compartment.N.02’)”, “Synset(‘Van.N.05’)”, “Synset(‘Automobile_Engine.N.01’)”, “Synset(‘Steering_System.N.01’)”, “Synset(‘Steering_Wheel.N.01’)”, “Synset(‘Wheel.N.04’)”, “Synset(‘Locomotive.N.01’)”, “Synset(‘Instrument_Of_Punishment.N.01’)”, “Synset(‘Lathe.N.01’)”, “Synset(‘Bicycle.V.01’)”, “Synset(‘Sprocket.N.02’)”, “Synset(‘Machine.N.01’)”, “Synset(‘Instrument.N.01’)”, “Synset(‘Bicycle_Wheel.N.01’)”, “(‘Wheels’, ‘NNS’)”, “Synset(‘Cab.N.01’)”, “Synset(‘Wagon_Wheel.N.01’)”, “Synset(‘Bicycle.N.01’)”, “Synset(‘Engine.N.01’)”, “Synset(‘Passenger_Van.N.01’)”, “Synset(‘Wheel.V.03’)”, “Synset(‘Motor_Vehicle.N.01’)”, “Synset(‘Roulette_Wheel.N.01’)”, “Synset(‘Wheel.N.03’)”, “(‘Car’, ‘NN’)”, “Synset(‘Car.N.03’)”, “Synset(‘Travel.V.05’)”, “Synset(‘Self-Propelled_Vehicle.N.01’)”, “Synset(‘Wagon.N.01’)”, “Synset(‘Device.N.01’)”, “Synset(‘Car.N.02’)”, “Synset(‘Handwheel.N.01’)”, “Synset(‘Wheel.V.01’)”, “Synset(‘Truck.N.01’)”}

Now, the nodes that match between NS1 and NS are found and placed in N:

  • N = {“Synset(‘Wheeled_Vehicle.N.01’)”, “Synset(‘Car.N.01’)”, “Synset(‘Wheel.V.02’)”, “Synset(‘Wheel.N.01’)”, “Synset(‘Travel.V.01’)”, “Synset(‘Valve.N.03’)”, “Synset(‘Car_Wheel.N.01’)”, “Synset(‘Handwheel.N.02’)”, “Synset(‘Rack.N.04’)”, “Synset(‘Car.N.04’)”, “Synset(‘Cable_Car.N.01’)”, “Synset(‘Minivan.N.01’)”, “Synset(‘Helm.N.01’)”, “Synset(‘Ride.V.02’)”, “Synset(‘Compartment.N.02’)”, “Synset(‘Van.N.05’)”, “Synset(‘Steering_System.N.01’)”, “Synset(‘Steering_Wheel.N.01’)”, “Synset(‘Wheel.N.04’)”, “Synset(‘Lathe.N.01’)”, “Synset(‘Bicycle.V.01’)”, “Synset(‘Sprocket.N.02’)”, “Synset(‘Bicycle_Wheel.N.01’)”, “(‘Wheels’, ‘NNS’)”, “Synset(‘Cab.N.01’)”, “Synset(‘Wagon_Wheel.N.01’)”, “Synset(‘Bicycle.N.01’)”, “Synset(‘Passenger_Van.N.01’)”, “Synset(‘Wheel.V.03’)”, “Synset(‘Motor_Vehicle.N.01’)”, “Synset(‘Roulette_Wheel.N.01’)”, “Synset(‘Wheel.N.03’)”, “(‘Car’, ‘NN’)”, “Synset(‘Car.N.03’)”, “Synset(‘Self-Propelled_Vehicle.N.01’)”, “Synset(‘Wagon.N.01’)”, “Synset(‘Car.N.02’)”, “Synset(‘Handwheel.N.01’)”, “Synset(‘Wheel.V.01’)”, “Synset(‘Truck.N.01’)”}

It can be observed that N consists of 40 nodes (|N| = 40), which means that out of the 49 nodes in the ideal answer graph, 40 match with the 1st candidate answer graph. The answer is therefore highly relevant to the given context, and for a 10-mark question it can be marked as (40 × 10/49) ≈ 8.1.
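Given the two node sets, the scoring step reduces to a set intersection. A minimal sketch (the function name and stand-in sets are illustrative, not the paper's code):

```python
def score_answer(ideal_nodes, candidate_nodes, max_marks=10):
    """Marks proportional to the candidate's overlap with the ideal-answer graph."""
    common = ideal_nodes & candidate_nodes
    return len(common) * max_marks / len(ideal_nodes)

# Stand-in sets reproducing the worked example: |NS| = 49, |N| = 40.
print(score_answer(set(range(49)), set(range(40))))  # 40 * 10 / 49 ≈ 8.16
```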

  • Case 2: When the answer written by the student does not match with the ideal answer and is not relevant to the context either.

Now suppose the 2nd candidate has written the answer as:

  • Q3: Car is used for transportation.

Now, in order to evaluate the 2nd candidate answer sheet, Q3 is tokenized and tagged as follows:

Tagged words set for Q3 = [(‘car’, ‘NN’), (‘is’, ‘VBZ’), (‘used’, ‘VBN’), (‘for’, ‘IN’), (‘transportation’, ‘NN’)]

where NN = Noun, VBZ = Verb (3rd person singular present), IN = Preposition, VBN = Verb (past participle)

For the sake of simplicity, the content words chosen for generating the WordNet graph are (‘car’, ‘NN’) and (‘transportation’, ‘NN’). The WordNet graph is generated as shown in Fig. 4; it has a total of 27 nodes.

Fig. 4

WordNet graph for 2nd candidate answer sheet

The node set of this graph (NS2) is as follows:

  • NS2= {“Synset(‘Be.V.02’)”, “Synset(‘Exist.V.01’)”, “Synset(‘Equal.V.01’)”, “Synset(‘Practice.V.04’)”, “Synset(‘Exploit.V.01’)”, “Synset(‘Be.V.10’)”, “Synset(‘Secondhand.S.02’)”, “Synset(‘Used.A.01’)”, “(‘Is’, ‘VBZ’)”, “Synset(‘Constitute.V.01’)”, “Synset(‘Be.V.12’)”, “Synset(‘Be.V.11’)”, “Synset(‘Exploited.S.02’)”, “Synset(‘Use.V.01’)”, “Synset(‘Use.V.02’)”, “Synset(‘Stay.V.01’)”, “Synset(‘Embody.V.02’)”, “Synset(‘Be.V.03’)”, “Synset(‘Use.V.06’)”, “Synset(‘Cost.V.01’)”, “Synset(‘Take.V.02’)”, “Synset(‘Be.V.05’)”, “Synset(‘Use.V.03’)”, “Synset(‘Use.V.04’)”, “Synset(‘Be.V.01’)”, “Synset(‘Be.V.08’)”, “(‘Used’, ‘VBN’)”}

Now, the nodes that match between NS2 and NS are found and placed in N:

  • N = Φ // null set

It can be observed that in this case N contains no nodes, i.e. |N| = 0: out of the 49 nodes in the ideal answer graph, none matches the 2nd candidate answer graph. The answer is therefore not relevant to the given context, and hence it would be marked zero. The results for the illustrative example are summarized in Table 2.

Table 2 Results for the considered example

4 Results and Evaluation

To test the effectiveness of this approach, a dataset of answer sheets from 400 students was collected. The answer sheets belong to the subject of social studies; it was observed through experimentation that the proposed system does not apply well to technical subjects such as computer science engineering, because WordNet does not contain all the technical words and definitions. For the result evaluation, these 400 answer sheets were checked beforehand by the teachers. The sheets were scanned, and their text was converted into a machine-readable format using Optical Character Recognition (OCR). The answers in these sheets were then re-evaluated using the proposed method. The marks obtained by the proposed method and the actual teacher-assigned marks were compared to calculate the Root Mean Square Error (RMSE) using Eq. 1.

$$ RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left(X_{obs,i} - X_{model,i}\right)^{2}}{n}} $$
(1)

where Xobs,i = marks of the answer sheet as evaluated by the teacher, Xmodel,i = marks of the answer sheet as calculated by the proposed method, and n = number of observations = 400.
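Equation 1 translates directly into a few lines of Python. A minimal sketch with illustrative marks (not the paper's data):

```python
import math

def rmse(observed, modelled):
    """Root mean square error between teacher marks and predicted marks."""
    n = len(observed)
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(observed, modelled)) / n)

# Illustrative values only; the paper's dataset has n = 400.
print(rmse([8.0, 6.5, 9.0], [8.1, 6.0, 9.2]))  # ≈ 0.316
```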

Table 3 summarizes the performance of the proposed method compared with the state-of-the-art when applied to the considered dataset. Better results are obtained owing to the novelty of the proposed algorithm, which takes into consideration the degree of semantic relatedness of the candidate answer to the ideal answer provided by the teacher/evaluator. This would in turn help in impartial evaluation of the answer sheets.

Table 3 Standard deviation of accuracy and time for the proposed method vs. state-of-the-art methods when tested on the synthetic dataset

Hence, it can be concluded that the proposed method yields promising results. This can be attributed to the fact that the state-of-the-art does not take semantic relationships and lexical expansion into consideration, whereas the proposed method does. It should also be highlighted that IndusMarker [24] generates its word cloud in an automated manner but requires the evaluator to analyze it manually; the proposed system, on the other hand, generates the WordNet graphs and assigns the scores automatically. This in turn helps reduce the time of evaluation, which is another significant aspect of answer sheet checking. To further increase accuracy, more measures of semantic relatedness need to be incorporated.
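One candidate for such an additional measure, offered purely as an illustration of this future direction and not as part of the proposed method, is the Wu-Palmer similarity already available in NLTK:

```python
from nltk.corpus import wordnet as wn

# Wu-Palmer similarity scores two synsets by the depth of their most
# specific common ancestor; values lie in (0, 1], higher = more related.
car, vehicle = wn.synset('car.n.01'), wn.synset('vehicle.n.01')
print(car.wup_similarity(vehicle))
```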

5 Conclusion and Future Scope

This paper proposes a novel concept for answer sheet evaluation using text similarity applied to WordNet graphs. The answer sheets are evaluated by identifying the common nodes between the node set of the ideal answer WordNet graph and that of the candidate answer WordNet graph. This kind of evaluation combines the significant concepts related to text similarity, namely semantic and structural dependencies. The root mean square error for the proposed approach was found to be 0.319 when tested on a dataset consisting of 400 students' answer sheets. Unlike the state-of-the-art, the proposed method generates the WordNet graphs and assigns the scores automatically, which in turn helps reduce the time of evaluation. This shows that the proposed approach to answer sheet evaluation yields promising results in terms of both accuracy and time of evaluation. This work is suitable in scenarios where the student spells the concerned words correctly, since no WordNet graph can be generated for erroneous non-words. In the future, this work might be extended to incorporate measures that resolve this issue; although marks are deducted for incorrect spellings in manual evaluation too, some partial credit can still be assigned there.