1 Introduction

With the introduction of cognitive science principles into education delivered through digital platforms, there has been a paradigm shift in the methodologies of effective learning. Al-Fraihat et al. (2020) observe that e-learning, through the fusion of technology and education, has become a prevalent means of learning. Xie et al. (2019), Valverde-Berrocoso et al. (2020) and Moubayed et al. (2018) describe the trends in the development of e-learning while exploring its characteristics and types. Rodrigues et al. (2019) and Valverde-Berrocoso et al. (2020) describe e-learning as an innovative web-based system whose main aim is to give users a customized, learner-centered, and interactive learning environment that aids and improves the learning process. The popularity of e-learning platforms has been increasing steadily due to the flexibility and availability of online content.

The primary challenge faced by developers is enabling e-learning systems to customize course content, provide accurate feedback, and motivate students to achieve their goals. Designing a one-stop solution to these problems that learners face in an online environment served as the motivation behind this project. To gain an understanding of learners' expectations, we surveyed a group of 150 graduate and undergraduate students from the Management studies elective. The findings of this survey showed that the most popular media for self-study online were videos, open-source textbooks, and blog websites, and that students wanted a one-stop platform for all these resources.

This paper formulates an entirely autonomous, didactic, and adaptive learning management system that employs a learner-centric model.

1.1 Contributions of this adaptive LMS

  1. An efficient content-based data generation pipeline for suggesting personalized course material to the user based on the grasping speed, current proficiency, and learning goals.

  2. A novel parameter, the learning quotient, is devised to account for the user’s rate of understanding and learning capacity.

  3. Induction of intrinsic motivation and self-regulation in the user’s mind to pursue the course until the end, by using a learner-centered framework.

  4. Course content from various media customized according to user preferences, periodic diagnostic tests, and evaluation against peers.

  5. Dynamic knowledge base updated using web-based mining technologies and feedback from students.

2 Related works

Owing to the overwhelming amount of information available on the internet, many methods have been devised to make e-learning adaptive and personalized. Klašnja-Milićević et al. (2011) state that personalized learning happens when e-learning systems make conscious efforts to design educational experiences that meet the requirements, aims, aptitude, and preferences of their users. Personalization in an e-learning system can be achieved by applying various techniques, such as intelligent agents, tag-based recommendation systems, data mining algorithms, and others, as mentioned by Šumak et al. (2019). Wu et al. (2015) and Xie et al. (2019) show that customizing recommendations using personalized data sources such as learner choices, knowledge levels, accounts, and learning logs largely increases their efficiency. The literature review by Kumar and Sharma (2020) points out that most learning applications use learner information, location, and time for context. Further, Šumak et al. (2019) suggest that dynamic personalization can be done in the learning process via optimization based on the performance of other learners in the same learning process.

Abu-Alsaad (2019) explains that in adaptive e-learning systems, personalization is achieved using a learner model, that stores information about the user, a domain model, that represents the hierarchy and relationship among knowledge elements in a domain, and an adaptation model that acts as a bridge between the former two.

Several works by Kolekar et al. (2018), Chen et al. (2020), Amit and Singh (2018), and Chang et al. (2016) have attempted to recommend learning resources based on the learning style of the users as classified by the Felder-Silverman model (FSLM). Chen et al. (2020) apply collaborative filtering and association rules to extract the preferences of each cluster. Azzi et al. (2019) use web usage mining to capture learners' behavior and then use fuzzy C-means clustering to map the users to FSLM categories. They also consider the average time spent by users on a learning object to infer their learning style. Amit and Singh (2018) critically examine five different learning models and study the influence of learning style on adaptive tutoring. Hochreiter and Schmidhuber (1997) clusters students by similarity in learning styles so that instructors can adapt their teaching approach and presentation of learning materials accordingly, and can identify outlier students and devise approaches for them.

Similarly, Fatahi (2019) clusters learners based on personality and emotion, using the Myers-Briggs Type Indicator for personality modeling and the Ortony, Clore, and Collins model for emotion modeling. The study established that the experimental group obtained higher scores and a higher rate of improvement in their scores. Also, Terzis et al. (2012) explore the positive impact of empathetic feedback and motivation on the learner's intent to use computer-based platforms.

Many of these methods rely on behavioral data alone, whereas the content of the study resources is also needed for a more successful personalization service. In addition, these works have not been deployed in a real-world online learning system. Kirschner (2017) states that "learning style" is a perspective and not a scientific theory with concrete backing. In particular, there are numerous objections that question the most widely used means of measuring learning styles, show the weak connection between learning styles and instructional methods, and point to the scarce empirical outcomes of learning styles research.

Surjono (2013) notes that the assumption that all learners are uniform, ready, and equally motivated is a wrong premise. It does not hold in an online learning environment, which is highly varied. Hochreiter and Schmidhuber (1997) stated that the fast growth of the e-learning industry has shifted the main aim from merely developing platforms for learning to enhancing the performance and satisfaction gained from the learning process.

Hence, Jagadeesan and Subbiah (2020) use the scores of students on skill tests to classify them via a Naive Bayes classifier. Classifying learners by performance and recommending materials accordingly showed considerable improvement in the scores of the learners. Further, Hubalovsky et al. (2019) show that adaptive e-learning has a positive effect on the learning effectiveness of primary school students. They define learning effectiveness as the ratio of the success rate in tests to the average time spent on the exercises. Our work builds on this by defining a learning quotient and classifying users based on this parameter. This is intended to raise motivation among learners and to address the high dropout rate before completion of an online course.

Wu et al. (2017), Alsadoon (2020), and Chang et al. (2016) evaluate these e-learning systems by exposing two groups, a control group and an experimental group, to the traditional teaching method and the e-learning system respectively. The t-test, as explained mathematically in detail by Gerald (2018), can be used to compare the performance of the experimental and control groups in the pre-test, before using the system, and the post-test, after using the system, to quantitatively evaluate the system.

There are various other performance metrics available for the evaluation of frameworks in various fields, as discussed by Botchkarev (2018). This paper proposes a framework for metrics by analyzing the critical parameters that affect the structure of these performance metrics.

Existing literature covers the use of the learner's context, user clustering, collaborative filtering, and a tree structure of users' pedagogical experience. However, these approaches may fail when there is insufficient information about the user or when they are applied to a particular course. The proposed system aims to overcome these shortcomings by calculating the learning quotient of the user from information that can be deduced from their course of learning. Also, the platform is continuously improved based on users' feedback, making the entire model dynamic. This makes the proposed model more adaptable to changing course content, which in turn provides increased satisfaction to the users.

3 The novel learning quotient model

Cognitive Learning theory talks about mental processes involved in gaining knowledge and comprehension and the influence of internal and external factors that affect the process of learning. Yilmaz (2011) states that behaviorism, cognitivism, and constructivism are the three main areas in the spectrum of learning theories. Though external factors are not controllable in a virtual environment, the internal factors of the students could be observed and analyzed with their learning preferences, style, and outcomes. These internal factors define the individual’s cognitive structure that constructs the knowledge.

The proposed learning quotient model performs mathematical cognitive modeling with the user’s facts as inputs and customizes the learning resources provided to them to maximize their learning outcome. This model focuses primarily on the learning quotient of the users which is defined as the speed of learning in a given amount of time. The idea, implementation, and integration with the LMS are discussed in detail in Section 4.2.

4 Proposed implementation

The proposed system is an intelligent agent that uses the pedagogical history and preferred learning modes of the user to display a customized course by referring to its knowledge base. The learnability of the user is determined by the learning quotient model that is formulated based on the classification of students into one of the four classes namely beginner, intermediate, advanced, and expert.

The organization of the CHIRON system is represented in Fig. 1. The input pipeline formulates the knowledge base using data mined by a dynamic web crawler. This data is passed through a cleaning and pre-processing module followed by a tagging mechanism.

Fig. 1 Block diagram of the CHIRON expert system

4.1 Knowledge base

This application covers three courses in management studies: Basics of Management, Human Resource Management, and Marketing Management, whose topics were organized into a dependency tree as indicated in Fig. 2. Each of the three courses consists of units labeled “Easy”, “Medium” or “Hard” by a human expert from the Department of Management Studies.

Fig. 2 Syllabus dependency tree

Data has been scraped from credible web sources to provide course content belonging to three different media - books, articles, and videos. A custom crawler mines articles and books whereas the YouTube open-source API has been employed for scraping the videos. Dynamism is added to web scraping via a trigger that gets fired every time a source site is updated. This ensures that the system is constantly updated with the latest resources. Data scraped from the web is passed through a data cleaning and pre-processing module. The data is then semantically tagged by topics using Latent Dirichlet Allocation as illustrated in Fig. 3.

Fig. 3 Knowledge base tagging pipeline

LDA, a generative probabilistic model of a corpus, can categorise documents under certain topics based on the content using unsupervised learning. Given the document-topic density (α) and the topic-word density (β), the joint distribution of topic mixture 𝜃, a set of N topics z and N words w is given by (1):

$$ p(\theta,z,w|\alpha,\beta) = p(\theta|\alpha)\prod\limits_{n = 1}^{N} p(z_{n}|\theta)p(w_{n}|z_{n},\beta) $$
(1)

The processed data is passed through the LDA pipeline to determine the optimal number of topics that the data can be semantically classified into. This is done based on the coherence score and UMass measure as described by Stevens et al. (2012). Topic coherence uses the extent of semantic similarity between high-scoring words in a topic to assign a score to a single topic. It is calculated as given in (2):

$$ score_{coherence} = \sum\limits_{i<j} score(w_{i},w_{j}) $$
(2)

UMass is an intrinsic measure that uses a pairwise score function that checks whether the probability of two words occurring together is higher than that of them occurring separately. It is calculated as shown in (3):

$$ score_{umass}(w_{i},w_{j}) = \log\left( \frac{D(w_{i},w_{j}) + 1}{D(w_{j})}\right) $$
(3)

Where D(w_i, w_j) is the number of documents in which words w_i and w_j occur together, and D(w_j) is the number of documents in which w_j occurs.

Fig. 4 LDA model visualization

These two scores were used to formulate the LDA model shown in Fig. 4. It was found that a document-topic density (α) and a topic-word density (β) of 0.31 each gave the highest coherence score when the number of topics was 18, as shown in Fig. 5a. This was also verified with the UMass score as shown in Fig. 5b.
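The scores in (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in this work: it assumes each document is represented as a set of tokens, and the function names (`doc_frequency`, `umass_pair`, `topic_coherence`) are hypothetical.

```python
import math
from itertools import combinations


def doc_frequency(docs, *words):
    """D(...): number of documents that contain every word in `words`."""
    return sum(1 for doc in docs if all(w in doc for w in words))


def umass_pair(docs, w_i, w_j):
    """Pairwise UMass score from (3): log((D(w_i, w_j) + 1) / D(w_j)).

    Assumes w_j occurs in at least one document, which holds when the
    words are the top words of a topic fitted on the same corpus.
    """
    return math.log((doc_frequency(docs, w_i, w_j) + 1) / doc_frequency(docs, w_j))


def topic_coherence(docs, top_words):
    """Coherence from (2): sum of pairwise scores over all pairs with i < j."""
    return sum(umass_pair(docs, w_i, w_j) for w_i, w_j in combinations(top_words, 2))


# Toy corpus (assumption): each document is a set of tokens.
docs = [
    {"market", "brand", "price"},
    {"market", "brand"},
    {"staff", "hiring"},
    {"market", "price"},
]
coherence = topic_coherence(docs, ["market", "brand", "price"])
```

A model-selection loop would compute this coherence for each candidate number of topics and keep the setting with the best score.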

Fig. 5 LDA evaluation scores

The LDA model so obtained, with 18 topics and a document-topic and topic-word density of 0.31 each, can be visualized as shown in Fig. 4. The pre-processed documents are tagged with the topics that they have the highest probability of belonging to. The tags obtained from the titles of the mined resources and the LDA topic tags are then matched against the pertinent subject matter in the predefined syllabus tree using the cosine similarity score, as explained by Gunawan et al. (2018) and Qaiser and Ali (2018).

The ranked information retrieval process sorts the matched resources in the descending order of their cosine scores in the knowledge base and the content generation algorithm displays the top 10 relevant resources to the user. This process has been represented in Fig. 3.
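The ranked retrieval step described above can be sketched as follows. This is an illustration under stated assumptions, not the deployed implementation: tags are compared as simple bag-of-tags count vectors (the weighting actually used is not specified here), and `cosine_similarity` and `rank_resources` are hypothetical names.

```python
import math
from collections import Counter


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity between two tag lists treated as count vectors."""
    a, b = Counter(tags_a), Counter(tags_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_resources(syllabus_tags, resources, top_k=10):
    """Return up to top_k (score, title) pairs in descending order of cosine score."""
    scored = [(cosine_similarity(syllabus_tags, tags), title) for title, tags in resources.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data (assumption): resource title -> tags from the tagging pipeline.
resources = {
    "A": ["marketing", "pricing"],
    "B": ["hiring"],
    "C": ["marketing", "pricing", "brand"],
}
ranking = rank_resources(["marketing", "pricing", "brand"], resources)
```

Here resource C shares all tags with the syllabus node and is ranked first, while B shares none and is ranked last.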

4.2 Learning quotient model

The amount of material suggested to the user and the decision to upgrade to a higher level of difficulty are based on the learning quotient of the user. The underlying hypothesis is that the learning quotient of users is directly proportional to the scores obtained. The model aims to maximize the amount of learning by the students in a limited amount of time.

The learning quotient model is defined as in (4):

$$ f(t,l) = \frac{T * g(l)} {t} $$
(4)

Where,

T - Actual Time required for the resource,

l - Level of Difficulty,

g(l) - Weight associated with level of difficulty (difficulty index),

t - Time taken by the user

The difficulty indices are dynamically determined such that the distribution of cognitive load into easy, medium, and hard levels is optimal to increase the learning quotient of the user.
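As a minimal sketch of (4), the learning quotient for a single resource can be computed as below. The difficulty index values in `DIFFICULTY_INDEX` are placeholders chosen for illustration only; the actual values are determined dynamically as described in the following sections.

```python
# Placeholder difficulty indices g(l); assumed values, not those obtained in this work.
DIFFICULTY_INDEX = {"easy": 1.0, "medium": 1.5, "hard": 2.0}


def learning_quotient(required_time, user_time, level, indices=DIFFICULTY_INDEX):
    """Equation (4): f(t, l) = T * g(l) / t.

    required_time -- T, actual time required for the resource
    user_time     -- t, time taken by the user (same unit as T)
    level         -- l, one of the keys of `indices`
    """
    if user_time <= 0:
        raise ValueError("user_time must be positive")
    return required_time * indices[level] / user_time


# A medium resource that needs 10 minutes, finished by the user in 20 minutes.
lq = learning_quotient(10, 20, "medium")
```

A user who finishes in less time than required, or who works on harder material, obtains a higher learning quotient.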

The following discussions cover the theoretical basis for the learning quotient equation and supply empirical support for the same.

4.2.1 Theoretical basis

The Constructivist Theory revolves around the idea that the learners actively construct their knowledge upon the foundation of previous learning. Hamat and Amin (2010) gives a detailed analysis of the advantages of a learning-driven e-learning environment based on constructivism. With the combined perspective of the Cognitive and Constructivist Theory, the learning quotient model chooses the best collection of learning resources and cognitive load distribution for the learners.

Intrinsic motivation leads to greater persistence. It is an important factor, especially when the student fails an evaluation test. To promote this, when a student fails, the required learning quotient is recalculated to a higher value and the student is motivated to learn more to increase the current learning quotient to cross this threshold value. The learning quotient threshold recalculation procedure is explained in Section 4.2.4. Once this is achieved, the student is allowed to retake the evaluation test. This increase in learning quotient value is also impacted by the familiarity with concepts when the student gets redirected back to the learning resources page since prior knowledge impacts the learning process.

The learning quotient calculation was initially inspired by the theory in (Seel 2012). It defines a relationship for the degree of learning as stated in (5):

$$ \textit{Degree of Learning} = f\left( \frac{\textit{Time Actually Spent}}{\textit{Time Needed}}\right) $$
(5)

The equation, as mentioned by Dziuban et al. (2016), implies that for a student to attain competence they should have enough time to acquire mastery. Seel (2012) stated that the numerator of (5) relies on the perseverance and the aptitude of the user, and that the aptitude of the user could be measured by the time taken to master a subject. Less time would imply higher aptitude and vice versa.

The proposed model improves on this basic equation by taking into account that the time spent learning depends on the difficulty of the concept studied and the mode of learning used by the student. The time needed is calculated per resource based on the number of words (in the case of articles and books) and the video duration (in the case of videos). The difficulty indices of the resources are dynamically updated for more accurate learning quotient calculation.
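The per-resource time estimate described above might look like the following sketch. The reading speed of 200 words per minute is an assumed default for illustration, not a value reported in this work, and `time_needed_minutes` is a hypothetical name.

```python
def time_needed_minutes(kind, word_count=0, duration_seconds=0, words_per_minute=200):
    """Estimate the time needed (T) for a resource, in minutes.

    Articles and books are timed by word count at an assumed reading speed;
    videos are timed by their duration.
    """
    if kind in ("article", "book"):
        return word_count / words_per_minute
    if kind == "video":
        return duration_seconds / 60
    raise ValueError(f"unknown resource kind: {kind}")


article_time = time_needed_minutes("article", word_count=1000)
video_time = time_needed_minutes("video", duration_seconds=600)
```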

4.2.2 Empirical support

It is important to note that students show large differences in prior knowledge, learning style, preferences, motivation, cultural background, etc. These factors need to be considered for each individual to maximize the learning outcome. An effective way to evaluate the learning outcome of students is to analyze their scores in the tests conducted at the end of every unit. The proposed cognitive model performs well if it can help the students score better.

To evaluate the proposed hypothesis, we conducted a test with a focus group of students from diverse backgrounds with varying knowledge levels. Observing their learning quotient and score trends over time, the following results were obtained, as shown by the graphs in Fig. 6.

Fig. 6 Learning Quotient-Scores relationship coursewise

The results indicate a positive relationship between the learning quotient and the scores obtained, supporting the hypothesis.

The difficulty indices are the final parameter needed to calculate the learning quotient. Using the data on the time taken by the user and the actual time required, they are updated based on a threshold-based classification of the users (Table 1), as discussed in the following section.

Table 1 Feature (score/learning quotient) based user classification

4.2.3 Threshold-based classification of users

At the end of every unit, the evaluation test scores and average learning quotients are recorded for every user, using Algorithm 1. Optimal difficulty indices are determined based on the premise that the learning quotient is directly proportional to the score obtained by the users.

figure a

The classes under which the users fall are determined based on the score and average learning quotient features independently. The users are said to be correctly classified if their score-based classes and average learning quotient-based classes match. Based on this, the misclassification rate is calculated. The number of correct and wrong classifications are saved for the calculation of the misclassification rate as new users complete the evaluation tests.
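The classification check described above can be sketched as follows. The cut-off values are placeholders, since the thresholds of Table 1 are not reproduced here, and the function names are hypothetical.

```python
from bisect import bisect_right

LEVELS = ["beginner", "intermediate", "advanced", "expert"]


def classify(value, cutoffs):
    """Map a value to one of the four classes using three ascending cut-offs."""
    return LEVELS[bisect_right(cutoffs, value)]


def misclassification_rate(users, score_cutoffs, lq_cutoffs):
    """Fraction of users whose score-based and LQ-based classes differ.

    users -- list of (score, average_learning_quotient) pairs
    """
    wrong = sum(
        1 for score, lq in users
        if classify(score, score_cutoffs) != classify(lq, lq_cutoffs)
    )
    return wrong / len(users)


# Placeholder thresholds and toy users (assumptions for illustration).
score_cutoffs = [40, 60, 80]
lq_cutoffs = [0.5, 1.0, 1.5]
users = [(35, 0.4), (55, 0.7), (75, 1.6), (90, 1.8)]
rate = misclassification_rate(users, score_cutoffs, lq_cutoffs)
```

In this toy data only the third user is classified differently by the two features (advanced by score, expert by learning quotient), so one in four users is misclassified.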

If the newly calculated misclassification rate crosses a specified tolerance level, then the difficulty indices are dynamically optimized according to the constraints given in Table 2 such that the final misclassification rate obtained is minimum, using Algorithm 2. Otherwise, the previous difficulty indices are retained. The ranges specified in Table 2 are adapted from Johari et al. (2010) and defined according to the context of use such that the impact of the difficulty index value for the extreme cases of too easy and too hard on the learning quotient calculation (lesser user time leading to very high LQ value and higher user time leading to very low LQ value respectively) is balanced.

figure b
Table 2 Difficulty index range

Using 300 learning quotient values from 60 users and their corresponding score values, optimal difficulty index values for all the three levels were obtained by iterating through the ranges of each of them defined in Table 2.
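The iteration over the difficulty index ranges can be sketched as an exhaustive grid search that keeps the combination with the lowest misclassification rate. The candidate ranges and cut-offs below are placeholders, since the values of Tables 1 and 2 are not reproduced here, and all names are hypothetical.

```python
from bisect import bisect_right
from itertools import product


def average_lq(records, g):
    """Mean of T * g[level] / t over a user's (T, t, level) records."""
    return sum(T * g[level] / t for T, t, level in records) / len(records)


def misclassification_rate(users, g, score_cutoffs, lq_cutoffs):
    """Fraction of users whose score class and average-LQ class differ."""
    wrong = 0
    for score, records in users:
        score_class = bisect_right(score_cutoffs, score)
        lq_class = bisect_right(lq_cutoffs, average_lq(records, g))
        wrong += score_class != lq_class
    return wrong / len(users)


def optimise_indices(users, ranges, score_cutoffs, lq_cutoffs):
    """Try every (easy, medium, hard) combination and keep the best one."""
    best_g, best_rate = None, None
    for easy, medium, hard in product(ranges["easy"], ranges["medium"], ranges["hard"]):
        g = {"easy": easy, "medium": medium, "hard": hard}
        rate = misclassification_rate(users, g, score_cutoffs, lq_cutoffs)
        if best_rate is None or rate < best_rate:
            best_g, best_rate = g, rate
    return best_g, best_rate


# Toy inputs (assumptions): (score, [(T, t, level), ...]) per user.
users = [(35, [(10, 20, "easy")]), (90, [(10, 10, "hard")])]
ranges = {"easy": [0.5, 1.0], "medium": [1.0, 1.5], "hard": [1.5, 2.0]}
best_g, best_rate = optimise_indices(users, ranges, [40, 60, 80], [0.5, 1.0, 1.8])
```

In practice this search would only be triggered when the current misclassification rate crosses the tolerance level; otherwise the previous indices are retained.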

4.2.4 Learning quotient threshold update

Each unit administers an evaluation composed of GATE-level multiple-choice questions to test whether the user has made progress. A user can unlock this test for the current topic only when their learning quotient exceeds a predetermined threshold. This threshold is calculated as the weighted average of the learning quotients of the group of peer users who passed that particular evaluation. This is illustrated in Algorithm 3.

figure c
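The threshold calculation can be illustrated with the sketch below. The weighting scheme is not specified in the text, so the weights are left as an input that the caller supplies; `lq_threshold` and `can_unlock` are hypothetical names.

```python
def lq_threshold(passed_peers):
    """Weighted average of the learning quotients of peers who passed.

    passed_peers -- list of (learning_quotient, weight) pairs; how the
    weights are chosen is an assumption left to the caller.
    """
    total_weight = sum(weight for _, weight in passed_peers)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(lq * weight for lq, weight in passed_peers) / total_weight


def can_unlock(user_lq, threshold):
    """The evaluation unlocks only when the user's LQ exceeds the threshold."""
    return user_lq > threshold


threshold = lq_threshold([(1.0, 1), (2.0, 3)])
```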

4.3 Content generation algorithm

The knowledge base consists of constantly updated versions of data mined from the internet belonging to different modes. The proficiency level of a given user in a course is initially defined based on the result of the diagnostic test.

The user experience pipeline is defined as follows. After account authentication, the user is taken to a profile page, shown in Fig. 7a, with options to choose from 15 different units belonging to three courses of Management Studies. Based on the choice, customized learning materials are retrieved from the knowledge base and displayed according to the user's current level. The resulting page presents three classes of information: recommended articles, books, and videos, as illustrated in Fig. 7b, c.

Fig. 7 Ranked retrieval content display

The user is granted access to progressively advanced content based on their learning gain as they cover more content, using the following mechanism. The amount of time spent on each learning material is measured using a timer attached to each resource, which is triggered when the user starts learning and can be paused for breaks. After a sufficient amount of content is covered, as measured by the learning gain, the user is redirected to an evaluation designed for that proficiency level. If the user passes the evaluation of the current level, they are free to move on to another unit. If not, more in-depth content for the same unit is generated and the threshold learning quotient is increased. Fig. 8a shows the evaluation and Fig. 8b shows the results of that evaluation.

Fig. 8 Evaluation portal for tracking progress

The feedback mechanism shown in Fig. 9 requires the user to rate the relevance of each completed resource and updates a score for the resource in the knowledge base. The knowledge base is sorted based on the updated score values. If this measure of relevance falls below a minimum threshold value, the resource is removed from the knowledge base. This allows only the most relevant and useful resources to remain and ensures the dynamic updating of the knowledge base.

Fig. 9 Feedback per resource for each user
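The feedback loop described above can be sketched as follows. The update rule is an assumption: the resource score is kept as a running mean of user ratings, which is one simple choice and not necessarily the rule used by the platform. The knowledge base is modelled as a plain dictionary and `apply_feedback` is a hypothetical name.

```python
def apply_feedback(knowledge_base, resource_id, rating, min_score):
    """Update a resource's relevance score, prune it if too low, and re-rank.

    knowledge_base -- dict: resource_id -> {"score": float, "ratings": int}
    Returns the remaining resource ids sorted by descending score.
    """
    entry = knowledge_base[resource_id]
    entry["ratings"] += 1
    # Running mean of all ratings received so far (assumed update rule).
    entry["score"] += (rating - entry["score"]) / entry["ratings"]
    if entry["score"] < min_score:
        del knowledge_base[resource_id]
    return sorted(knowledge_base, key=lambda r: knowledge_base[r]["score"], reverse=True)


kb = {
    "a": {"score": 4.0, "ratings": 1},
    "b": {"score": 2.0, "ratings": 1},
}
after_b = apply_feedback(kb, "b", rating=1.0, min_score=2.0)  # "b" drops to 1.5 and is removed
after_a = apply_feedback(kb, "a", rating=5.0, min_score=2.0)  # "a" rises to 4.5
```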

A video of the deployed LMS in operation (LMS Demonstration Video) can be viewed at https://drive.google.com/file/d/1o0TGOjvEdqQCE6-Gt26r-Ftap6cIUNxn/view?usp=sharing

5 Performance metrics for evaluation

The major components of the framework, the content generation block and the learning quotient model, are tested individually, and the usability of the system as a whole is also tested.

5.1 Content generation evaluation: Precision

Using the feedback obtained from users after each completed material, the average precision percentage of the suggested materials is calculated for different numbers of suggested materials in the range of 10 to 50. The average precision percentage was calculated from the users' feedback for two of the modes of learning offered by the platform, videos and articles, as summarized in Table 3.

Table 3 Precision for different modes of learning
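The precision calculation can be sketched as below, assuming each piece of user feedback is reduced to a binary relevant / not relevant flag per suggested material. That reduction and the function names are assumptions made for illustration.

```python
def precision_at_k(relevance_flags, k):
    """Percentage of the first k suggested materials that were rated relevant."""
    top = relevance_flags[:k]
    if not top:
        raise ValueError("no suggestions to evaluate")
    return 100.0 * sum(top) / len(top)


def average_precision_percentage(feedback_by_user, k):
    """Mean of precision_at_k over all users."""
    values = [precision_at_k(flags, k) for flags in feedback_by_user]
    return sum(values) / len(values)


# Toy feedback (assumption): 1 = rated relevant, 0 = rated not relevant.
feedback = [[1, 1, 0, 1], [1, 0, 0, 0]]
avg = average_precision_percentage(feedback, k=4)
```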

5.2 System evaluation: Statistical analysis

5.2.1 T-score analysis

Two focus groups, A and B, containing 25 users each were formed. A diagnostic test was conducted before each user started learning from the platform. Group A used the platform with the learning quotient model. Group B was provided with the same materials independently. At the end, a common assessment was conducted for both groups, and the test scores of users from both groups were compared. The results are shown in Fig. 10.

Fig. 10 Comparison of average scores between the focus groups

The above experiment can be statistically summarised as follows: Null Hypothesis: The improvement in scores is not significant.

Alternate Hypothesis: The improvement in scores is significant.

$$ \textit{t-score} = \frac{ \mu_{1} - \mu_{2}}{\sqrt{v^{2} \left( \frac{1}{N_{1}} + \frac{1}{N_{2}}\right)}} $$
(6)

Where μ_1 and μ_2 are the sample means,

v^2 is the pooled sample variance,

N_1 and N_2 are the sample sizes, and

the t-score is a quantile with N_1 + N_2 − 2 degrees of freedom.

Significance Level = 0.05 (For P-value)

The null hypothesis is rejected in favour of the alternate hypothesis when the t-score exceeds the critical value for N_1 + N_2 − 2 degrees of freedom at the chosen significance level, or equivalently when the P-value is less than the significance level (0.05 here); otherwise the null hypothesis is retained. Using the data from the two focus groups and (6), a t-score of 2.93 and a P-value of 0.0028 were obtained. With 48 degrees of freedom, the t-score exceeds the two-tailed critical value of approximately 2.01, and the P-value is well below 0.05. A significant difference is therefore observed between the mean scores obtained by the focus groups. This supports the system's ability to effectively aid the learning of users.
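As a worked illustration of (6), the pooled-variance t-score can be computed with the Python standard library as follows. The sample values are toy numbers, not the focus group data, and `pooled_t_score` is a hypothetical name.

```python
import math
from statistics import mean, variance


def pooled_t_score(group_a, group_b):
    """Equation (6): two-sample t-score with pooled sample variance.

    Returns (t, degrees_of_freedom).
    """
    n1, n2 = len(group_a), len(group_b)
    # Pooled variance v^2: weighted mean of the two sample variances.
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Toy scores (assumption), three users per group.
t, dof = pooled_t_score([2, 4, 6], [1, 2, 3])
```

The resulting t-score is then compared with the critical value of the t-distribution for the returned degrees of freedom.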

5.3 System evaluation: Usability test

After completing a section of a course, user feedback was collected on navigability, usability, and user satisfaction with the e-learning platform. The usability of the platform received a score of 7.6/10 as evaluated by 50 users.

5.3.1 Learning gain

The academic growth shown by the users can be quantitatively defined using Learning Gain. It measures the improvement from the pre-test to the post-test relative to the maximum improvement that was possible, as shown by (7).

$$ \textit{Learning Gain} = \frac{\textit{Post} - \textit{Pre}}{100 - \textit{Pre}} $$
(7)

Where Post is the Post-test score and Pre is the Pre-test score obtained by the user.
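Equation (7) can be sketched directly; scores are assumed to be percentages out of 100, and `learning_gain` is a hypothetical name.

```python
def learning_gain(pre, post):
    """Equation (7): (Post - Pre) / (100 - Pre), with scores out of 100."""
    if pre >= 100:
        raise ValueError("pre-test score must be below 100")
    return (post - pre) / (100 - pre)


# A user who moves from 40 to 70 realises half of the possible improvement.
gain = learning_gain(40, 70)
```

A negative value would indicate that the post-test score fell below the pre-test score.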

For each of the three courses presented in the proposed model, a focus group of 20 students was directed to take up an evaluation test before and after using the learning platform for the pre-test and post-test data respectively. The learning gain results obtained using the equation above are shown in Table 4.

Table 4 Learning Gain Course-wise

It can be inferred that the learning gain values obtained in all three courses are positive. This indicates that the learning model has had a positive impact on the learning process of users with high confidence.

6 Comparison with previous work

There has been extensive work in the field of e-learning; however, only a few studies have included a large number of features and carried out extensive testing.

6.1 Feature based comparison

The developed adaptive learning management system has the following features:

  1. Dynamic Web Scraping

  2. Evolving knowledge base

  3. User profile: pedagogical constraint

  4. User profile: learning style

  5. User profile: preferred mode of learning

  6. User profile: learning quotient

  7. User clustering

  8. Content Generation Pipeline

  9. Learning Path Recommendation

  10. User Feedback

  11. Learning Quotient Threshold to enhance motivation

This model is compared with other e-learning systems on these eleven aspects. The papers taken into consideration are the enhanced learning model by Chen et al. (2020), e-learning with data mining clustering by Kausar et al. (2018), the fuzzy tree-based approach by Wu et al. (2015), the smart recommendation system by Tang and McCalla (2005), and the learning styles based model by Amit and Singh (2018). The comparison results are tabulated in Table 5, where ✓ in a row implies that the feature is present in the corresponding e-learning system.

Table 5 Feature comparison with existing models

The tabulated results show that, compared to other e-learning systems, the proposed system enhances the learning experience of the users by (1) calculating the learning quotient, or the learning ability, of the user at every stage of the course; (2) increasing the motivation of the users to study and understand more content before taking the evaluation to complete the course, by placing a learning quotient cut-off requirement on each course; and (3) ensuring that the content in the knowledge base is in tune with current web content and user feedback.

6.2 Non-parametric test (Mann Whitney U test) based comparison

The proposed system is quantitatively evaluated as elaborated below. The analysis was conducted on two groups of 25 students belonging to different years and majors in college. Since this compares the performance of students who used the system with that of students who did not, this result corresponds to the post-test result. A two-tailed Mann-Whitney U test was performed on this data. The result is shown in Table 6.

Table 6 Summary of two-tailed Mann-Whitney U test

Weaver (2015) considers samples with size greater than 15, and without outliers, to have an approximately normal distribution, so the z-score can be used to obtain the p-value. The z-score was found to be 4.7397, which gave a p-value less than 0.0001. This result is significant at p < 0.05.
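The normal approximation mentioned above can be sketched as follows. This is a simplified illustration without tie or continuity corrections, using toy data rather than the study data; `mann_whitney_z` is a hypothetical name. The sign of z depends on which U statistic is used, so only its magnitude is compared with the reported value.

```python
import math


def mann_whitney_z(sample_a, sample_b):
    """Two-tailed Mann-Whitney U test via the normal approximation.

    Returns (U, z, p). No tie or continuity correction is applied.
    """
    n1, n2 = len(sample_a), len(sample_b)
    # U for sample_a: count pairs where a beats b, with ties counted as one half.
    u_a = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in sample_a for b in sample_b)
    u = min(u_a, n1 * n2 - u_a)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    # Two-tailed p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Toy samples (assumption): every value in the second group exceeds the first.
u, z, p = mann_whitney_z([1, 2, 3], [4, 5, 6])
```

With samples this small the approximation is shown only to make the arithmetic visible; the text applies it to groups of 25.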

With an approximately normal distribution, t-testing was also performed on this data. This result is compared with those obtained by Wu et al. (2017), Alsadoon (2020), and Chang et al. (2016). Wu et al. (2017) reported tests involving two groups of 30 students around the age of 21 to measure the efficiency of their system. Alsadoon (2020) collected two groups of 65 students each from different majors to test their platform. Chang et al. (2016) conducted a study using 58 first-year students in experimental and control groups. The test lasted for 8 weeks with 3 sessions per week covering topics in the subject Visual Basics. The comparison is summarized in Table 7.

Table 7 Statistical comparison with existing models

It can be inferred that the proposed system has a higher t-score result and a much lower p-value compared to the existing literature. This shows that this system provides greater learnability to users as compared to traditional and existing e-learning systems.

7 Conclusion

This paper implemented a learner-centered adaptive learning management system that automates the work done by instructors in three courses pertaining to Management Studies. It creates a custom course for every student based on their existing level of knowledge and their expectations, and continuously updates the suggested course material based on their learning speed and preferred mode of learning. The web crawler is constrained to scrape data from reputed websites endorsed by a human expert, and hence the sample space of mined data is of good quality. The ranked retrieval algorithm that displays data to the users exhibits an average precision of 80 and 86 percent for videos and articles respectively against the prescribed syllabus, and thus renders only highly relevant content. The learning quotient model dynamically updates the difficulty index values to attain a minimum misclassification rate, thus fulfilling its objective of enhancing the learnability of the user. The quantitative testing of the entire platform showed a significant improvement in the performance of the users employing the platform. The limitations of the current model include the restriction of the modes of learning to books, videos, and articles, and the lack of subjective diagnostic and evaluation questions due to automated scoring.

To the best of our knowledge, this work is the first to examine the benefits of a learning quotient model for measuring learnability, compared to other approaches, in this context. Despite some drawbacks, the current results show potential ways to increase the effectiveness of learning by incorporating intrinsic motivation and continuous feedback. Future research should resolve these limitations and study the effect of the learning quotient model on custom-made databases for multiple subjects and on a production-scale sample of students.