Abstract
Answer sheet evaluation is a time-consuming task that requires a lot of effort from teachers, and hence there is a strong need for its automation. This paper proposes a machine learning based approach that relies on WordNet graphs to compute the text similarity between the answer provided by the student and the ideal answer provided by the teacher, thereby facilitating the automation of answer sheet evaluation. This work is the first attempt at short answer evaluation using WordNet graphs. A novel marking algorithm is provided that takes the semantic relations of the answer text into consideration. When tested on 400 answer sheets, the approach yields promising results compared with the state of the art.
1 Introduction
Answer sheet evaluation forms the heart and soul of the examination system around the globe. Examinations are regularly carried out for classes ranging from nursery to higher education, in the form of end-term examinations, internal examinations and weekly tests. This puts a lot of pressure on the evaluator, who must check and allot marks to the students within a given time frame. Since the evaluator is also involved in teaching, this leaves him or her very little time to dedicate to answer sheet evaluation. Yet this task is crucial and needs the evaluator's proper focus. It also needs to be impartial. It is common to see a student's marks affected by the prejudices of the teacher; for instance, teachers are slightly more inclined to give good marks to the more obedient students. This is a psychological fact and cannot be ignored.
In recent years, technology has crept into classrooms, making way for an efficient learning environment. From online lectures to e-classrooms, digitization has led to the evolution of a teacher- and student-friendly learning ambience. Automation of answer sheet evaluation has also been attempted in the past, but high levels of accuracy have not been achieved: earlier attempts in this field have either been inaccurate or extensively time consuming. Through this paper, we aim at amalgamating natural language processing concepts such as context identification and text similarity into question answering, to make the whole process of short answer script evaluation accurate and time efficient. Text similarity is a widely used technique for finding the amount of relatedness between two texts [1,2,3,4,5,6,7,8,9]. This concept has been studied by researchers worldwide for applications in various domains [10,11,12,13,14,15], and text similarity approaches are being refined on a regular basis to provide optimum results for these applications [16,17,18,19,20,21,22,23,24,25].
Some of the latest research in the field of short answer evaluation highlights that fuzzy WordNet graphs play a significant role in the analysis. Vij et al. [26] show that a WordNet graph can be generated for the ideal answer and then used to create a set of keywords essential for evaluating the student's answer sheet. Since that work uses WordNet as the sense repository, it establishes context in a more elaborate manner; however, it lacks fruitful results, as testing is done only on a synthetic dataset. To add more relevance to such works, it becomes essential for us to include in this paper a more elaborate set of results obtained after testing on a larger dataset.
To produce meaningful explanations for sheet analysis, handwriting recognition can also be brought into the process, as depicted in [27]. Sijimol and Varghese [27] present a model that learns from previous data based on the handwriting of an individual, using cosine sentence similarity. However, this is not a practical approach, and the testing data is not sufficient for the analysis.
Van Hoecke [28] works on an algorithm that utilizes sentence-based summarization techniques for grading students' short answers. This poses a limitation: sentence-based ranking is not always accurate, so the error due to faulty sentence ranking is relayed and carried forward into grading as well. Moreover, most sentence-based ranking algorithms rely on machine translation and similarity scores that are not very accurate. Hence, these types of approaches are not practically useful. Roy et al. [29] compare and contrast the various existing techniques for short answer grading. This type of study is useful for us, as it enables us to briefly outline the drawbacks of the existing state-of-the-art techniques.
This paper proposes a novel method of automatically evaluating the answer sheets of students using a machine learning based approach, namely the generation of WordNet graphs. WordNet [5, 23], developed at Princeton University, is a computational lexicon consisting of words and various relations between them. WordNet can be viewed as a graph where nodes represent words and edges represent relations between words. It is widely used in the literature for resolving several natural language processing tasks, including word sense disambiguation, machine translation and information retrieval [20, 21]. WordNet graphs play a significant role in information retrieval, and hence they help in incorporating semantic significance and structural dependencies [22].
In this paper, WordNet is used to find the text similarity between the ideal answer provided by the teacher and the answer written by the student in the answer sheet. WordNet graphs are constructed to represent the ideal answer and the answer under evaluation, and the similarity between these two graphs is computed based on the appearance of common nodes. The marks for the answer under evaluation are assigned in proportion to this similarity. The results for the proposed method are obtained on a dataset consisting of 400 students' answer sheets, selected so as to incorporate both similarity and diversity in the dataset.
The rest of the paper is organized as follows: Sect. 2 highlights the background study related to text similarity, Sect. 3 describes the proposed approach, Sect. 4 explains the results obtained, and Sect. 5 concludes the work and states the relevant future scope.
2 Background Study
The main concept utilized in this paper for answer sheet evaluation is finding the text similarity between the ideal answer and the answer provided by the student. To study the latest trends in the field of text similarity, Web of Science (WoS) was taken as the data source, and the below-mentioned query was used for extracting the research papers pertaining to this field:
The research papers were obtained through the above-mentioned query for the years 1989–2017 [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]. The keywords occurring in these research papers were analyzed to visualize a keyword co-occurrence network, shown in Fig. 1; these keywords depict the various research topics associated with text similarity.
It can be observed from Fig. 1 that graph techniques, semantic dependencies and structural dependencies are closely associated with this field. Hence, in this paper, a combination of these is used to propose a novel method for calculating text similarity, applied to answer sheet evaluation.
3 Proposed Approach
This section highlights the proposed approach for evaluating students' answer sheets in an automated manner. As concluded in the previous section, graph theory and semantic and structural dependencies play a significant role in text similarity calculation. Hence, a machine learning oriented WordNet graph-based method is proposed for answer sheet evaluation. WordNet is an online lexical database that records the senses of a word according to its various part-of-speech tags, with various semantic relationships intertwined to form a huge lexical resource. WordNet graphs have been widely used in the literature for resolving lexical issues such as word sense disambiguation [20, 22]. The WordNet graph generated in this paper uses the semantic relations hypernym, hyponym, meronym and holonym.
Siddiqi et al. [24] highlight that several types of short answer questions occur, such as "true–false" questions, fill in the blanks, sentence completion, "description required", "justification required" and "example required". The method proposed in this paper deals with short answer evaluation for questions where a brief description, with a relevant short explanation if needed, is to be provided by the student. Context can be well established for short answer evaluation using WordNet, but for larger queries the context dissolves; for instance, it is difficult to automatically evaluate answers containing technical words, since not all of them are available in WordNet. Other types of questions may be handled in the future. The method is explained in Table 1.
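The method can be sketched end-to-end as follows. The `RELATIONS` table below is a hypothetical toy stand-in for WordNet (the real system expands synsets over hypernym/hyponym/meronym/holonym links), so the node counts are illustrative only:

```python
# Hedged sketch of the proposed pipeline: take the content words of an
# answer, expand them over semantic relations into a node set, and mark
# the student answer by its node overlap with the ideal-answer graph.
# RELATIONS is a hypothetical toy stand-in for the WordNet lexicon.

RELATIONS = {
    "car": ["motor_vehicle", "cab", "minivan", "wheeled_vehicle"],
    "vehicle": ["conveyance", "fomite", "medium"],
    "wheels": ["wheel", "car_wheel", "steering_wheel"],
    "engine": ["motor", "machine", "locomotive"],
    "motor_vehicle": ["self-propelled_vehicle"],
}

def expand(words):
    """Depth-first expansion of content words over the relation table."""
    nodes, stack = set(), list(words)
    while stack:
        word = stack.pop()
        if word in nodes:
            continue
        nodes.add(word)
        stack.extend(RELATIONS.get(word, []))
    return nodes

def score(ideal_words, student_words, max_marks=10):
    """Marks proportional to node overlap with the ideal-answer graph."""
    ns, ns1 = expand(ideal_words), expand(student_words)
    return len(ns & ns1) * max_marks / len(ns)

print(round(score(["car", "vehicle", "wheels"],
                  ["car", "wheels", "engine"]), 1))  # → 7.1
```

On this toy table, the ideal answer words ("car", "vehicle", "wheels") expand to 14 nodes, of which 10 are shared with the student answer, giving 10 × 10/14 ≈ 7.1 marks.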
To illustrate this, let us take the following text as the question to be evaluated:
Question: What is a car?
Answer: (Ideal, as provided by the teacher): Car is a vehicle with four wheels.
This answer text is treated as query Q1. The implementation of the proposed method is carried out in Python using the Natural Language Toolkit (NLTK) libraries.
Q1 is tokenized and POS tagging is done for it. The resulting tagged set is as follows:
Tagged words set for Q1 = [(‘car’, ‘NN’), (‘is’, ‘VBZ’), (‘a’, ‘DT’), (‘vehicle’, ‘NN’), (‘with’, ‘IN’), (‘four’, ‘CD’), (‘wheels’, ‘NNS’)].
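Selecting the content words from such a tagged set amounts to keeping the open-class tokens; a minimal sketch that keeps only noun tags, matching the worked example:

```python
# Filter a POS-tagged token list down to content words. Only noun tags
# are kept here, matching the content words used in the worked example.
CONTENT_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def content_words(tagged):
    return [(word, tag) for (word, tag) in tagged if tag in CONTENT_TAGS]

tagged_q1 = [("car", "NN"), ("is", "VBZ"), ("a", "DT"), ("vehicle", "NN"),
             ("with", "IN"), ("four", "CD"), ("wheels", "NNS")]
print(content_words(tagged_q1))
# → [('car', 'NN'), ('vehicle', 'NN'), ('wheels', 'NNS')]
```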
For the sake of simplicity, the content words chosen for generating the WordNet graph are (‘car’, ‘NN’), (‘vehicle’, ‘NN’), and (‘wheels’, ‘NNS’). The semantic relations hypernym, hyponym, meronym and holonym are taken for this purpose. The WordNet graph is generated using a depth first search algorithm [20]; it is shown in Fig. 2 and has 49 nodes.
The node set of this graph (NS) is as follows:
NS= {“Synset(‘Wheeled_Vehicle.N.01’)”, “Synset(‘Car.N.01’)”, “Synset(‘Wheel.V.02’)”, “Synset(‘Wheel.N.01’)”, “Synset(‘Travel.V.01’)”, “Synset(‘Valve.N.03’)”, “Synset(‘Car_Wheel.N.01’)”, “Synset(‘Handwheel.N.02’)”, “Synset(‘Rack.N.04’)”, “Synset(‘Car.N.04’)”, “Synset(‘Cable_Car.N.01’)”, “Synset(‘Minivan.N.01’)”, “Synset(‘Helm.N.01’)”, “Synset(‘Ride.V.02’)”, “Synset(‘Compartment.N.02’)”, “Synset(‘Van.N.05’)”, “Synset(‘Vehicle.N.01’)”, “Synset(‘Steering_System.N.01’)”, “Synset(‘Steering_Wheel.N.01’)”, “Synset(‘Wheel.V.03’)”, “Synset(‘Wagon_Wheel.N.01’)”, “Synset(‘Lathe.N.01’)”, “(‘Vehicle’, ‘NN’)”, “Synset(‘Bicycle_Wheel.N.01’)”, “Synset(‘Instrumentality.N.03’)”, “Synset(‘Sprocket.N.02’)”, “Synset(‘Wheel.N.04’)”, “Synset(‘Conveyance.N.03’)”, “Synset(‘Cab.N.01’)”, “Synset(‘Bicycle.N.01’)”, “Synset(‘Vehicle.N.03’)”, “Synset(‘Bicycle.V.01’)”, “Synset(‘Medium.N.01’)”, “Synset(‘Passenger_Van.N.01’)”, “Synset(‘Vehicle.N.02’)”, “Synset(‘Motor_Vehicle.N.01’)”, “Synset(‘Roulette_Wheel.N.01’)”, “(‘Wheels’, ‘NNS’)”, “(‘Car’, ‘NN’)”, “Synset(‘Wheel.N.03’)”, “Synset(‘Car.N.03’)”, “Synset(‘Fomite.N.01’)”, “Synset(‘Self-Propelled_Vehicle.N.01’)”, “Synset(‘Wagon.N.01’)”, “Synset(‘Car.N.02’)”, “Synset(‘Handwheel.N.01’)”, “Synset(‘Travel.V.05’)”, “Synset(‘Wheel.V.01’)”, “Synset(‘Truck.N.01’)”}
The answer sheets will be evaluated based on these nodes. There may exist two basic types of answers:
(a) when the answer written by the student matches logically with the ideal answer;
(b) when the answer written by the student does not match the ideal answer and is not relevant to the context either.
Case 1: When the student has written an accurate and logical answer according to the context
Let us suppose that the 1st candidate has put up the answer as:
Q2: Car has wheels and an engine.
Now, in order to evaluate the 1st candidate's answer sheet, Q2 is tokenized and tagged as follows:
Tagged words set for Q2 = [(‘car’, ‘NN’), (‘has’, ‘VBZ’), (‘wheels’, ‘NNS’), (‘and’, ‘CC’), (‘an’, ‘DT’), (‘engine’, ‘NN’)]
where NN = Noun, NNS = Plural Noun, VBZ = Verb, CC = Coordinating Conjunction, DT = Determiner
For the sake of simplicity, the content words chosen for generating the WordNet graph are (‘car’, ‘NN’), (‘wheels’, ‘NNS’), and (‘engine’, ‘NN’). The WordNet graph is generated as shown in Fig. 3 and has a total of 53 nodes.
The node set of this graph (NS1) is as follows:
NS1= {“Synset(‘Engine.N.02’)”, “Synset(‘Wheeled_Vehicle.N.01’)”, “Synset(‘Car.N.01’)”, “Synset(‘Wheel.V.02’)”, “Synset(‘Wheel.N.01’)”, “Synset(‘Travel.V.01’)”, “Synset(‘Valve.N.03’)”, “Synset(‘Motor.N.01’)”, “Synset(‘Car_Wheel.N.01’)”, “Synset(‘Handwheel.N.02’)”, “Synset(‘Instrument_Of_Torture.N.01’)”, “(‘Engine’, ‘NN’)”, “Synset(‘Rack.N.04’)”, “Synset(‘Engine.N.04’)”, “Synset(‘Car.N.04’)”, “Synset(‘Cable_Car.N.01’)”, “Synset(‘Minivan.N.01’)”, “Synset(‘Helm.N.01’)”, “Synset(‘Ride.V.02’)”, “Synset(‘Compartment.N.02’)”, “Synset(‘Van.N.05’)”, “Synset(‘Automobile_Engine.N.01’)”, “Synset(‘Steering_System.N.01’)”, “Synset(‘Steering_Wheel.N.01’)”, “Synset(‘Wheel.N.04’)”, “Synset(‘Locomotive.N.01’)”, “Synset(‘Instrument_Of_Punishment.N.01’)”, “Synset(‘Lathe.N.01’)”, “Synset(‘Bicycle.V.01’)”, “Synset(‘Sprocket.N.02’)”, “Synset(‘Machine.N.01’)”, “Synset(‘Instrument.N.01’)”, “Synset(‘Bicycle_Wheel.N.01’)”, “(‘Wheels’, ‘NNS’)”, “Synset(‘Cab.N.01’)”, “Synset(‘Wagon_Wheel.N.01’)”, “Synset(‘Bicycle.N.01’)”, “Synset(‘Engine.N.01’)”, “Synset(‘Passenger_Van.N.01’)”, “Synset(‘Wheel.V.03’)”, “Synset(‘Motor_Vehicle.N.01’)”, “Synset(‘Roulette_Wheel.N.01’)”, “Synset(‘Wheel.N.03’)”, “(‘Car’, ‘NN’)”, “Synset(‘Car.N.03’)”, “Synset(‘Travel.V.05’)”, “Synset(‘Self-Propelled_Vehicle.N.01’)”, “Synset(‘Wagon.N.01’)”, “Synset(‘Device.N.01’)”, “Synset(‘Car.N.02’)”, “Synset(‘Handwheel.N.01’)”, “Synset(‘Wheel.V.01’)”, “Synset(‘Truck.N.01’)”}
Now, find out the nodes that match between NS1 and NS and put them in N:
N = {“Synset(‘Wheeled_Vehicle.N.01’)”, “Synset(‘Car.N.01’)”, “Synset(‘Wheel.V.02’)”, “Synset(‘Wheel.N.01’)”, “Synset(‘Travel.V.01’)”, “Synset(‘Valve.N.03’)”, “Synset(‘Car_Wheel.N.01’)”, “Synset(‘Handwheel.N.02’)”, “Synset(‘Rack.N.04’)”, “Synset(‘Car.N.04’)”, “Synset(‘Cable_Car.N.01’)”, “Synset(‘Minivan.N.01’)”, “Synset(‘Helm.N.01’)”, “Synset(‘Ride.V.02’)”, “Synset(‘Compartment.N.02’)”, “Synset(‘Van.N.05’)”, “Synset(‘Steering_System.N.01’)”, “Synset(‘Steering_Wheel.N.01’)”, “Synset(‘Wheel.N.04’)”, “Synset(‘Lathe.N.01’)”, “Synset(‘Bicycle.V.01’)”, “Synset(‘Sprocket.N.02’)”, “Synset(‘Bicycle_Wheel.N.01’)”, “(‘Wheels’, ‘NNS’)”, “Synset(‘Cab.N.01’)”, “Synset(‘Wagon_Wheel.N.01’)”, “Synset(‘Bicycle.N.01’)”, “Synset(‘Passenger_Van.N.01’)”, “Synset(‘Wheel.V.03’)”, “Synset(‘Motor_Vehicle.N.01’)”, “Synset(‘Roulette_Wheel.N.01’)”, “Synset(‘Wheel.N.03’)”, “(‘Car’, ‘NN’)”, “Synset(‘Car.N.03’)”, “Synset(‘Self-Propelled_Vehicle.N.01’)”, “Synset(‘Wagon.N.01’)”, “Synset(‘Car.N.02’)”, “Synset(‘Handwheel.N.01’)”, “Synset(‘Wheel.V.01’)”, “Synset(‘Truck.N.01’)”}
It can be observed that N consists of 40 nodes (|N| = 40), which means that 40 of the 49 nodes in the ideal answer graph match the 1st candidate's answer graph. This means that the answer is very relevant to the given context, and hence, for a 10-mark question, it can be marked as (40 × 10/49) = 8.1.
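The marking rule above is simply proportional node overlap; a minimal sketch, with stand-in node sets sized to match the worked example:

```python
def assign_marks(ideal_nodes, student_nodes, max_marks=10):
    """Marks proportional to the student/ideal node-set overlap."""
    common = ideal_nodes & student_nodes  # N = NS ∩ NS1
    return len(common) * max_marks / len(ideal_nodes)

# Numbers from the worked example: |NS| = 49 ideal nodes, |N| = 40 shared.
ns = set(range(49))                     # stand-in for the 49 ideal nodes
ns1 = set(range(40)) | {100, 101, 102}  # 40 shared nodes plus extras
print(f"{assign_marks(ns, ns1):.2f}")   # → 8.16 (reported as 8.1 in the text)
```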
Case 2: When the answer written by the student does not match with the ideal answer and is not relevant to the context either.
Now suppose the 2nd candidate has put up the answer as:
Q3: Car is used for transportation.
Now, in order to evaluate the 2nd candidate's answer sheet, Q3 is tokenized and tagged as follows:
Tagged words set for Q3 = [(‘car’, ‘NN’), (‘is’, ‘VBZ’), (‘used’, ‘VBN’), (‘for’, ‘IN’), (‘transportation’, ‘NN’)]
where NN = Noun, VBZ = Verb, IN = Preposition, VBN = Verb (past participle)
For the sake of simplicity, the content words chosen for generating the WordNet graph are (‘car’, ‘NN’) and (‘transportation’, ‘NN’). The WordNet graph is generated as shown in Fig. 4 and has a total of 27 nodes.
The node set of this graph (NS2) is as follows:
NS2= {“Synset(‘Be.V.02’)”, “Synset(‘Exist.V.01’)”, “Synset(‘Equal.V.01’)”, “Synset(‘Practice.V.04’)”, “Synset(‘Exploit.V.01’)”, “Synset(‘Be.V.10’)”, “Synset(‘Secondhand.S.02’)”, “Synset(‘Used.A.01’)”, “(‘Is’, ‘VBZ’)”, “Synset(‘Constitute.V.01’)”, “Synset(‘Be.V.12’)”, “Synset(‘Be.V.11’)”, “Synset(‘Exploited.S.02’)”, “Synset(‘Use.V.01’)”, “Synset(‘Use.V.02’)”, “Synset(‘Stay.V.01’)”, “Synset(‘Embody.V.02’)”, “Synset(‘Be.V.03’)”, “Synset(‘Use.V.06’)”, “Synset(‘Cost.V.01’)”, “Synset(‘Take.V.02’)”, “Synset(‘Be.V.05’)”, “Synset(‘Use.V.03’)”, “Synset(‘Use.V.04’)”, “Synset(‘Be.V.01’)”, “Synset(‘Be.V.08’)”, “(‘Used’, ‘VBN’)”}
Now, find out the nodes that match between NS2 and NS and put them in N:
It can be observed that in this case N is empty, i.e. |N| = 0: none of the 49 nodes in the ideal answer graph matches the 2nd candidate's answer. This means that the answer is not relevant to the given context, and hence it is marked zero. The results for this illustrative example are summarized in Table 2.
4 Results and Evaluation
To test the effectiveness of this approach, a dataset of answer sheets from 400 students was collected. The answer sheets belong to the subject of social studies; it was observed through experimentation that the proposed system does not apply well to technical subjects such as computer science engineering, because WordNet does not contain all the relevant technical words and definitions. For the evaluation, these 400 answer sheets were first checked by teachers. The sheets were then scanned and their text converted into a machine-readable format using Optical Character Recognition (OCR). The answers were analyzed according to the proposed method and re-evaluated, and the marks obtained by the proposed method were compared with the actual marks to calculate the Root Mean Square Error (RMSE) using Eq. 1:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(X_{\mathrm{obs},i} - X_{\mathrm{model},i}\right)^{2}}{n}}\qquad(1)$$
where $X_{\mathrm{obs},i}$ denotes the marks of the answer sheet as evaluated by the teacher, $X_{\mathrm{model},i}$ the marks as calculated by the proposed method, and $n = 400$ the number of observations.
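The RMSE comparison of Eq. 1 can be sketched as follows; the marks below are hypothetical, for illustration only:

```python
import math

def rmse(teacher, model):
    """Root mean square error between teacher-assigned and predicted marks."""
    n = len(teacher)
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(teacher, model)) / n)

# Hypothetical marks for four scripts, for illustration only.
print(round(rmse([8, 6, 9, 7], [8.2, 5.5, 9.0, 7.3]), 3))  # → 0.308
```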
Table 3 summarizes the performance of the proposed method compared with the state of the art when applied to the considered dataset. Better results are obtained owing to the novelty of the proposed algorithm, which takes into consideration the degree of semantic relatedness of the candidate answer to the ideal answer provided by the teacher/evaluator. This in turn helps in impartial evaluation of the answer sheets.
Hence, it can be concluded that the proposed method yields promising results. This can be attributed to the fact that, unlike the state of the art, the proposed method takes semantic relationships and lexical expansion into consideration. It should also be highlighted that IndusMarker [24] generates a word cloud in an automated manner which must then be manually analyzed by the evaluator, whereas the proposed system generates the WordNet graphs and assigns the scores automatically. This assists in reducing the time of evaluation, which is another significant aspect of answer sheet checking. To further increase accuracy, more measures of semantic relatedness need to be incorporated.
5 Conclusion and Future Scope
This paper proposes a novel method for answer sheet evaluation using the concept of text similarity applied to WordNet graphs. The answer sheets are evaluated by identifying the common nodes between the node set of the ideal answer WordNet graph and that of the candidate answer WordNet graph. This kind of evaluation combines the significant concepts related to text similarity, namely semantic and structural dependencies. The root mean square error for the proposed approach was found to be 0.319 when tested on a dataset consisting of 400 students' answer sheets. Unlike the state of the art, the proposed method generates the WordNet graphs and assigns the scores automatically, which assists in reducing the time of evaluation. This shows that the proposed approach yields promising results in terms of both accuracy and evaluation time. The present work is suitable in scenarios where the student spells the concerned words correctly, since a WordNet graph cannot be generated for erroneous non-words; in the future, this work might be extended with measures to resolve this issue. Although marks are deducted in manual evaluation too for incorrect spellings, some partial credit is possible there.
References
Li, X., Liu, N., Yao, C. L., & Fan, F. L. (2017). Text similarity measurement with semantic analysis. International Journal of Innovative Computing Information and Control.,13(5), 1693–1708.
Li, X., Yao, C. L., Fan, F. L., & Yu, X. Q. (2017). A text similarity measurement method based on singular value decomposition and semantic relevance. Journal of Information Processing Systems,13(4), 863–875. https://doi.org/10.3745/jips.02.0067.
Al-Smadi, M., Jaradat, Z., Al-Ayyoub, M., & Jararweh, Y. (2017). Paraphrase identification and semantic text similarity analysis in Arabic news tweets using lexical, syntactic, and semantic features. Information Processing and Management,53(3), 640–652. https://doi.org/10.1016/j.ipm.2017.01.002.
Kchaou, D., Bouassida, N., & Ben-Abdallah, H. (2017). UML models change impact analysis using a text similarity technique. IET Software,11(1), 27–37. https://doi.org/10.1049/iet-sen.2015.0113.
Cho, S. G., & Kim, S. B. (2017). A data-driven text similarity measure based on classification algorithms. International Journal of Industrial Engineering-Theory Applications and Practice,24(3), 328–339.
Abdul-Jabbar, S. S., & George, L. E. (2017). A comparative study for string metrics and the feasibility of joining them as combined text similarity measures. ARO The Scientific Journal of Koya University,5(2), 6–18. https://doi.org/10.14500/aro.10180.
Reddy, G. S., & Rajinikanth, T. V. (2017). A text similarity measure for document classification. IADIS-International Journal on Computer Science and Information Systems,12(1), 14–25.
Abdul-Rahman, A., Roe, G., Olsen, M., Gladstone, C., Whaling, R., Cronk, N., et al. (2017). Constructive visual analytics for text similarity detection. Computer Graphics Forum,36(1), 237–248. https://doi.org/10.1111/cgf.12798.
Atoum, I., & Otoom, A. (2016). Efficient hybrid semantic text similarity using WordNet and a corpus. International Journal of Advanced Computer Science and Applications,7(9), 124–130.
Bao, X. A., Dai, S. C., Zhang, N., & Yu, C. H. (2016). Large-scale text similarity computing with spark. International Journal of Grid and Distributed Computing,9(4), 95–100. https://doi.org/10.14257/ijgdc.2016.9.4.09.
Kashyap, A., Han, L., Yus, R., Sleeman, J., Satyapanich, T., Gandhi, S., et al. (2016). Robust semantic text similarity using LSA, machine learning, and linguistic resources. Language Resources and Evaluation,50(1), 125–161. https://doi.org/10.1007/s10579-015-9319-2.
Rahutomo, F., & Aritsugi, M. (2014). Econo-ESA in semantic text similarity. Springerplus. https://doi.org/10.1186/2193-1801-3-149.
Huang, C. H., Liu, Y., Xia, S. Z., & Yin, J. A. (2011). A text similarity measure based on suffix tree. Information-an International Interdisciplinary Journal,14(2), 583–592.
Quan, X. J., Liu, G., Lu, Z., Ni, X. L., & Wenyin, L. (2010). Short text similarity based on probabilistic topics. Knowledge and Information Systems,25(3), 473–491. https://doi.org/10.1007/s10115-009-0250-y.
Sun, Z. H., Errami, M., Long, T., Renard, C., Choradia, N., & Garner, H. (2010). Systematic characterizations of text similarity in full text biomedical publications. PLoS ONE,5(9), 11. https://doi.org/10.1371/journal.pone.0012704.
Atlam, E. (2008). A new approach for text similarity using articles. International Journal of Information Technology & Decision Making,7(1), 23–34. https://doi.org/10.1142/s021962200800279x.
Lewis, J., Ossowski, S., Hicks, J., Errami, M., & Garner, H. R. (2006). Text similarity: An alternative way to search medline. Bioinformatics,22(18), 2298–2304. https://doi.org/10.1093/bioinformatics/btl388.
Liu, T., & Guo, J. (2005). Text similarity computing based on standard deviation. Advances in Intelligent Computing,3644, 456–464.
Ozalp, S. A., Ulusoy, O. (2005). Effective early termination techniques for text similarity join operator. In Proceedings of computer and information sciences—ISCIS 2005 (Vol. 3733, pp 791). Berlin: Springer
Navigli, R., & Lapata, M. (2010). An experimental study of graph connectivity for unsupervised word sense disambiguation. IEEE Transactions on Pattern Analysis and Machine Intelligence,32(4), 678–692.
Jain, A., Mittal, K., & Tayal, D. K. (2014). Automatically incorporating context meaning for query expansion using graph connectivity measures. Progress in Artificial Intelligence,2(2–3), 129–139.
Jain, A., Tayal, D. K., & Vij, S. (2017). A semi-supervised graph-based algorithm for word sense disambiguation. Global Journal of Enterprise Information System,8(2), 13–19.
Siddiqi, R., Harrison, C. J., & Siddiqi, R. (2010). Improving teaching and learning through automated short-answer marking. IEEE Transactions on Learning Technologies,3(3), 237–249.
Jayashankar, S., & Sridaran, R. (2017). Superlative model using word cloud for short answers evaluation in eLearning. Education and Information Technologies,22(5), 2383–2402.
Vij, S., Tayal, D., & Jain, A. (2019). A fuzzy WordNet graph based approach to find key terms for students short answer evaluation. In 2019 4th international conference on internet of things: Smart innovation and usages (IoT-SIU) (pp. 1–6). IEEE.
Sijimol, P. J., & Varghese, S. M. (2018). Handwritten short answer evaluation system (HSAES).
Van Hoecke, O. D. C. S. (2019). Summarization evaluation meets short-answer grading. In Proceedings of the 8th workshop on NLP for computer assisted language learning, pp. 79–85
Roy, S., Rajkumar, A., & Narahari, Y. (2018). Selection of automatic short answer grading techniques using contextual bandits for different evaluation measures. International Journal of Advances in Engineering Sciences and Applied Mathematics,10(1), 105–113.
Vij, S., Tayal, D. & Jain, A. A Machine Learning Approach for Automated Evaluation of Short Answers Using Text Similarity Based on WordNet Graphs. Wireless Pers Commun 111, 1271–1282 (2020). https://doi.org/10.1007/s11277-019-06913-x