1 Introduction

The human learning process lies at the heart of many disciplines: pedagogy, psychology, neuroscience, linguistics, sociology, and anthropology [13]. Research on artificial intelligence in education (AIEd) brings two interdisciplinary fields, AI and education, together with the aim of making forms of knowledge explicit and computationally precise [14]. In addition, AIEd offers tools to open up the “black box” of learning by providing a deeper understanding of its process [11]. This research project develops AI techniques to accurately model and measure students’ learning, with the goal of better understanding and improving that process. The rationale is inspired by Shannon’s work in information theory [15], which was the starting point of our modern communication society. He rigorously quantified the information transferred across a communication channel using the concepts of entropy and mutual information. This theory provided the tools to systematically improve and develop modern communication techniques. The long-term vision of this project is to develop and demonstrate a new paradigm: entropy-based learning analytics. Novel graph-information-based learning analytics will be created that accurately measure and help optimize the learning process. To this end, a strategy similar to Shannon’s will be followed, using concepts such as entropy, mutual information [15], and interaction information [12].

2 Theoretical Framework and Methodology

2.1 Modelling the Learning Problems

At an abstract level, the communication of knowledge from the teacher or learning platform to the student can be considered a communication problem [15], especially at the level of communicating and remembering information. In this abstraction, the knowledge about the studied subject acts as the transmitter, the teaching channel is the noisy communication channel, and the student’s knowledge level is the receiver, as shown in Fig. 1.

Fig. 1. Abstraction of the learning problem as a communication model.

This approach suggests decomposing the modeling of the learning problem into three interacting models: the course subject model (domain knowledge), the teaching methods model (the channel), and the student model (student knowledge), as illustrated in Fig. 1. The learning rate can then be measured independently for each channel using the mutual information between the transmitter (domain knowledge) and the receiver (student knowledge). Later, the “optimal” teaching channel can be personalized automatically to optimize the knowledge transfer to each student. However, in the educational context, the relations between different pieces of information are essential, such as the connection between remembering the multiplication table and solving mathematical word problems. Shannon’s model has no mechanism to represent these additional relationships. As a result, integrating Shannon’s theory with pedagogical theory is essential for this work.
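The proposed measurement can be sketched on the simplest case: a discrete transmitted item and the student’s recollection of it. The joint distribution below is purely hypothetical and chosen so that the student reproduces the taught item correctly 90% of the time; the sketch only illustrates how mutual information quantifies the transfer.

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum over (x,y) of p(x,y) * log2(p(x,y) / (p(x) p(y)))."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical joint distribution: transmitter sends one of two items
# ("fact", "other") with equal probability; the receiver (student)
# reproduces it correctly 90% of the time, else answers "error".
joint = {("fact", "fact"): 0.45, ("fact", "error"): 0.05,
         ("other", "other"): 0.45, ("other", "error"): 0.05}

print(mutual_information(joint))  # 0.9 bits transferred per item
```

A noiseless channel would reach the full 1 bit of the source; the 0.1 bit lost here is the effect of the “noisy” teaching channel.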

2.2 Knowledge Graph Representation for Education Sciences

As established above, learning is more than transferring and storing information alone. It is also about understanding, applying, analyzing, etc. The different levels of learning were described by Bloom in his taxonomy [2], later revised by splitting the taxonomy into a cognitive process dimension (remembering, understanding, applying, ...) and a knowledge dimension (factual, conceptual, procedural, ...) [10]. The knowledge to be learned can be expressed efficiently using graph-like representations from the scientific field of Knowledge Representation and Reasoning [3]. Here, the objective is to develop a probabilistic knowledge graph representation [8]. To this end, we need three graphs that capture different aspects of the problem: 1) the content graph: represents the knowledge to be learned (the content of the course); 2) the expectation graph: represents the learning levels/goals set by the teacher (using, e.g., Bloom’s taxonomy); 3) the student’s graph: represents the student’s estimated knowledge. It contains a probabilistic measure of whether each learning goal is attained.
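A minimal sketch of the three graphs, assuming a toy two-vertex course (the vertex names, Bloom levels, and probabilities are illustrative only, not data from the project):

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)     # prerequisite links
    labels: dict = field(default_factory=dict)  # per-vertex annotations

# 1) Content graph: the knowledge to be learned, with a prerequisite edge
#    (the multiplication-table example from the text).
content = KnowledgeGraph(
    vertices={"multiplication_table", "word_problems"},
    edges={("multiplication_table", "word_problems")})

# 2) Expectation graph: same structure, labelled with the Bloom level the
#    teacher targets for each vertex.
expectation = KnowledgeGraph(
    vertices=content.vertices, edges=content.edges,
    labels={"multiplication_table": "remember", "word_problems": "apply"})

# 3) Student graph: same structure, labelled with the estimated
#    probability that the learning goal is attained.
student = KnowledgeGraph(
    vertices=content.vertices, edges=content.edges,
    labels={"multiplication_table": 0.95, "word_problems": 0.60})

print(student.labels["word_problems"])  # 0.6
```

All three aspects share one structural representation and differ only in their vertex annotations, which is exactly the common-representation hypothesis stated below.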

The objective is to define a common knowledge graph representation that captures these three aspects. This representation is necessary as it serves as the input for calculating the graph entropy that is needed to determine the (mutual and interaction) graph information. The research hypothesis is that it is indeed possible to represent all three aspects within a common graph representation. All processing can then be built from elementary graph operations, which serve as the basis for more complex ones. The challenge of this knowledge graph for education sciences is that it demands adapting known knowledge graphs [3, 8] to the requirements of education. This task requires intensive collaboration with educational scientists to incorporate the various learning levels within the graph representation. Once the different levels of learning (remembering, understanding, applying to examples) can be captured, it becomes possible to measure these levels of learning using the theory described and developed in the following subsection.

2.3 Graph Entropy-Based Learning Analytics

The scientific objective is to develop a graph entropy-based approach to measure a student’s progress in the learning process using mutual and interaction information. These are defined between two or more graphs using an entropy measure [12, 15]. The entropy measures the information transport between the content graph, the expectation graph, and the student (learning) graph. Determining the entropy definition and its properties for the problem at hand will be challenging. Various definitions already exist for information sources (with and without memory/correlation) and for graph topologies. In addition, entropy at the level of remembering is identical to Shannon’s entropy. However, there is currently no graph entropy that fulfills the following requirements: 1) the entropy of a single vertex equals Shannon’s entropy for an information source; 2) the entropy of the union of independent graphs equals the sum of the entropies of the individual graphs; 3) the expectation graph entropy must be able to measure the entropy of a subgraph for a particular learning level; 4) the student’s graph entropy is proportional to the probability that the student obtains/constructs a knowledge vertex, similar to the entropy concept used for Markov processes [15].
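Requirements 1 and 2 can be illustrated on the simplest admissible case: vertices that carry independent information sources. The distributions below are illustrative; the point is that for independent sources, Shannon’s entropy is additive, which any candidate graph entropy must reproduce.

```python
import math

def shannon_entropy(dist):
    """H = -sum_i p_i log2 p_i for a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Two vertices carrying independent information sources (illustrative).
v1 = {"known": 0.5, "unknown": 0.5}       # H(v1) = 1.0 bit
v2 = {"a": 0.25, "b": 0.25, "c": 0.5}     # H(v2) = 1.5 bits

# Joint distribution of the union of the two independent vertices.
joint = {(s1, s2): p1 * p2
         for s1, p1 in v1.items() for s2, p2 in v2.items()}

print(shannon_entropy(v1) + shannon_entropy(v2))  # 2.5
print(shannon_entropy(joint))                     # 2.5: additivity holds
```

Requirements 3 and 4 go beyond this: they couple the entropy to subgraphs per learning level and to the student’s construction probability, which is where the existing definitions fall short.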

These requirements deviate from the existing entropies of possibly correlated information sources [15], \(H = - \sum _i p_i \log p_i\), and from the entropies defined for graphs [5, 6, 9]. The latter propose a representative probability function that encodes the graph structure [6], with \(p_i = f(v_i)/\sum _j f(v_j)\), where \(f(v_i)\) is a function of the properties of vertex \(v_i\). These entropies do not satisfy the requirements mentioned above, as this probabilistic representation does not measure the actual information entropy of vertex \(v_i\); it only captures the graph’s structure. Hence, an additional scientific objective is the derivation of supporting properties for this entropy: the mutual and interaction information. The research hypothesis is that a graph entropy satisfying the above constraints can be found. The strategy is similar to Shannon’s: first define the required properties/constraints for the entropy, then determine an entropy function that fulfills these constraints [15]. The challenge is that the needed entropy lies somewhere between Shannon’s entropy and the current graph entropies. Figure 2 shows the proposed integration of Bloom’s taxonomy with the graph-based entropy.
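The limitation of the structural entropies can be made concrete with a small sketch. Taking \(f(v_i)\) as the vertex degree (one common choice; the specific \(f\) is an assumption here), two graphs with the same degree sequence receive the same entropy, no matter how different the knowledge their vertices carry:

```python
import math

def structural_entropy(degrees):
    """Structural graph entropy with p_i = f(v_i) / sum_j f(v_j),
    taking f(v_i) to be the degree of vertex v_i (illustrative choice)."""
    total = sum(degrees)
    ps = [d / total for d in degrees]
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Two hypothetical 4-vertex path graphs with identical degree sequences
# but entirely different vertex content (e.g. arithmetic prerequisites
# vs. unrelated vocabulary items).
path_a = [1, 2, 2, 1]
path_b = [1, 2, 2, 1]

print(structural_entropy(path_a) == structural_entropy(path_b))  # True
```

The measure is blind to what each vertex actually contains, which is precisely why it cannot serve as the learning analytic sought here.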

Fig. 2. Bloom’s taxonomy and mutual information integrated model.

3 Early Results

The first step taken to examine the graph entropy-based learning analytics was to explore the possibility of measuring the student learning rate through probabilistic graphical models [1]. Figure 3 summarizes the proposed framework. Each block represents a separate editable stage, colored by technology:

- The content graph generates the graph of a specific course subject. It can be seen as a knowledge representation problem.
- The teacher graph adds probabilistic metrics to the knowledge graph. This block represents the rules and facts that serve as the evaluation reference for what students will learn. The teacher’s input of objectives and goals gives power and control to the educators. The main outputs of this block are probabilistic facts and a vertex categorization based on Bloom’s taxonomy.
- The student input consists of answers to questions from specific tests.
- The student learning parameters are where the knowledge reasoning takes place. The student evaluation results serve as interpretations, from which the probabilistic parameters of a specific student are generated. ProbLog [4] provides the algorithm used for this task, namely learning from interpretations (LFI) [7].
- The student graph provides an estimate of the graph learned by the student. It allows the teacher a more in-depth look at what the student has learned and understood. The most probable explanation (MPE) framework [16] was applied to gain more insight into the student’s learning status.
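The core idea of learning from interpretations can be sketched without the ProbLog machinery. In the fully observed case, maximum-likelihood estimation of a probabilistic fact reduces to relative-frequency counting over the interpretations; ProbLog’s actual LFI additionally uses expectation maximization to handle partially observed interpretations. The fact name and observations below are hypothetical.

```python
# Each interpretation records what was observed about a student in one
# evaluation (hypothetical data; fully observed for simplicity).
interpretations = [
    {"knows_de_morgan": True},
    {"knows_de_morgan": True},
    {"knows_de_morgan": False},
    {"knows_de_morgan": True},
]

def lfi_estimate(fact, interpretations):
    """ML estimate of a probabilistic fact from fully observed
    interpretations: the relative frequency of the fact being true."""
    observed = [i[fact] for i in interpretations if fact in i]
    return sum(observed) / len(observed)

print(lfi_estimate("knows_de_morgan", interpretations))  # 0.75
```

The estimated parameter then annotates the corresponding vertex of the student graph, which is where the probabilistic student model in Fig. 3 comes from.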

Fig. 3. Proposed framework in [1].

The proposed framework was tested with synthetic data from multiple educators/students teaching/learning De Morgan’s theorem. Implementation results demonstrated that the proposed framework can measure the student learning rate even when the student evaluation data do not fully cover the course’s objectives. However, it was also shown that ProbLog could not provide the uncertainty level of a specific learned parameter in the case of incomplete data.
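As a concrete reminder of the course content used in the experiment, De Morgan’s laws can be checked exhaustively over all truth assignments:

```python
from itertools import product

# Exhaustive truth-table verification of De Morgan's laws.
for a, b in product([False, True], repeat=2):
    assert (not (a and b)) == ((not a) or (not b))
    assert (not (a or b)) == ((not a) and (not b))

print("De Morgan's laws hold for all truth assignments")
```

In the framework, such a theorem becomes a small content graph whose vertices (negation, conjunction, disjunction, the laws themselves) the student graph annotates with attainment probabilities.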

4 Summary

This project aims to introduce a new paradigm for measuring student learning in the academic setting. The rationale is inspired by Shannon’s work in information theory, which was the starting point of our modern communication society. However, the meaning of information in the educational context is much more complex than in communication systems. This complexity motivated the proposal of using AI techniques, namely probabilistic graphical models, as a tool that captures/encodes the levels of information in the educational setting. Information theory, graph theory, and pedagogical theory are the main pillars of the proposed work, which is summarized by three probabilistic graphical representations: course content, teacher expectation, and student knowledge. Entropy-based analytics will be derived from the interaction between these graphs to quantify the student’s learning process. Early results showed that available probabilistic logic programming tools can serve simple scenarios with a complete evaluation of students’ knowledge. In complex scenarios, these tools failed to provide stable results.