Introduction

The practice of fast, detailed, and regular feedback is a fundamental element of the assessment process in teaching and learning, whether in face-to-face or distance education settings (Ross et al., 2006). Automated assessment, in addition to facilitating this practice, provides rapid feedback for both the teacher and the student, potentially reducing the workload on teachers when grading assessments designed with closed-ended (multiple-choice or objective) and open-ended (essay or subjective) questions (Liu et al., 2016; Bukie, 2014).

An open-ended question is understood to be one in which the student is required to produce a written response in a natural language. Despite these questions being the most effective way to assess cognitive aspects related to creation, synthesis, and metacognition (Airasian et al., 2001), providing rapid feedback in the automated assessment of open-ended questions, known as Automatic Short Answer Grading (ASAG), is a computationally challenging problem to solve (Burrows et al., 2015).

According to Ramesh and Sanampudi (2022), most advancements in research on open-ended questions in automated assessments involve machine learning models, where the feedback is reduced to a percentage value indicating how correct the response to the open-ended question is (e.g., 75% correct). Consequently, despite providing some form of feedback, this strategy offers insufficient information that does not elucidate cognitive deficiencies or provide guidance on how the student can mitigate such deficiencies through clear and content-relevant learning guidelines.

Furthermore, literature works such as Silva et al. (2023) and Erdt et al. (2015) highlight one of the significant challenges in constructing Educational Recommendation Systems (ERS): the evaluation of ERS effectiveness. This challenge necessitates methods that go beyond traditional performance metrics (such as accuracy) and should encompass the impact on users’ learning performance. Therefore, the approach presented in this work seeks to mitigate this problem.

In the Brazilian context, the Common National Curriculum Base (BNCC) (da Educação, 2018) aims to establish a set of essential learning guidelines for students in Basic Education, ensuring their educational rights in accordance with the National Education Plan (PNE). Pedagogical decisions within the BNCC emphasize the development of competencies in students, encompassing knowledge, skills, attitudes, and values, equipping them to apply these elements in addressing complex demands of daily life, exercising citizenship, and preparing for the world of work. The BNCC takes into consideration the particularities of each area of knowledge, considering the objects of knowledge, the characteristics of the students, and the specifics of each stage of schooling.

Each area defines specific competencies to be developed within the formative pathways. In addition to competencies, each discipline presents a set of specific skills linked to various objects of knowledge. These skills are organized into thematic units and are defined in different areas, such as (i) Natural Sciences and their Technologies, (ii) Human and Applied Social Sciences, (iii) Mathematics and its Technologies, and (iv) Languages and their Technologies.

In light of the above, this work proposes an educational recommendation system approach that integrates characteristics of pedagogical guidelines, such as the development of competencies and skills based on the BNCC, contributing to a better fit between the ERS recommendations and the pedagogical needs of students, as well as improving the speed and quality of the feedback from the automatic evaluation of open-ended questions. The approach also utilizes ontologies generated from teacher and student responses, ontology alignment algorithms, and personalized action recommendation algorithms to minimize each student’s cognitive deficiencies.

To achieve this, the presentation of the work is organized into five additional sections. Section 2 outlines related works. Section 3 provides details on the methodology employed. Section 4 systematizes the proposed approach. Section 5 discusses the main findings. Finally, Section 6 presents concluding remarks and suggestions for future work.

Related Work

In the literature, there are numerous related works on automated assessment systems for open-ended questions. Ramesh and Sanampudi, for instance, provide a literature review on the subject (Ramesh & Sanampudi, 2022). Ajetunmobi and Daramola (2017) illustrate an approach for determining the final score by examining the similarity correspondence between student responses and teacher content, based on the Wu-Palmer ontology similarity measure (Wu & Palmer, 1994). The OpenNLP library was used for sentence detection, phrase chunking, part-of-speech (POS) tagging, and morphological and syntactic analysis of texts. However, this work does not provide details regarding the quality of the created ontologies or the format of the response feedback. The final score is derived from the Wu-Palmer measure as follows: maximum score for values greater than 0.6; average score for values between 0.4 and 0.6; and zero for values less than 0.4.
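
As a rough illustration of the scoring rule described above, the sketch below (our own minimal Python example, not the code of Ajetunmobi and Daramola) maps a Wu-Palmer similarity value onto the three score bands; the wup_similarity call is the standard NLTK WordNet API, and the word-level comparison is a simplification of their ontology-level matching.

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

def wup(word_a: str, word_b: str) -> float:
    """Wu-Palmer similarity between the first noun synsets of two words (0 if none found)."""
    syn_a, syn_b = wn.synsets(word_a, pos=wn.NOUN), wn.synsets(word_b, pos=wn.NOUN)
    if not syn_a or not syn_b:
        return 0.0
    return syn_a[0].wup_similarity(syn_b[0]) or 0.0

def band_score(similarity: float, max_score: float = 1.0) -> float:
    """Score bands described in the text: >0.6 -> maximum, 0.4-0.6 -> average, <0.4 -> zero."""
    if similarity > 0.6:
        return max_score
    if similarity >= 0.4:
        return max_score / 2
    return 0.0

print(band_score(wup("graph", "tree")))
```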

In the work by Ramachandran et al. (2015), a graph-based approach was proposed to identify important patterns from texts provided by rubrics and responses from students with maximum scores. On the other hand, Zupanc et al. (2017) proposed sentence similarity networks with 30 different metrics to determine response scores.

There are also studies that assess the performance of Large Language Models (LLMs) on open-ended questions, using metrics such as cosine similarity (Pinto et al., 2023) and accuracy (Freire et al., 2023). These studies highlighted the good performance of the models’ responses after manual evaluation by humans. However, for model responses containing inconsistent information, there is a concern about so-called “hallucinations”, a term used to describe the limitation that leads LLMs to produce responses that appear correct but are not actually accurate.

The work of Gombert et al. (2024) presented a literature review of recent contributions on ASAG, grouping works into categories such as automated essay scoring, content scoring, semantic segmentation, and natural language processing in the German language. Furthermore, Gombert et al. (2024) investigated a transformer-based approach using GBERT and T5 models to evaluate discursive responses in a case study, as well as feedback assessment, aiming to answer the following research questions: “To what extent can we automate the analytical scoring procedures necessary to provide highly informative feedback to students regarding the content of their essays?”; and “How is the highly informative feedback provided perceived by students?”.

Among the limitations of their approach, Gombert et al. (2024) mention the possibility of improving feedback personalization: the feedback should address students’ errors in more detail and give them individual explanations of what they did wrong, rather than just directing them to appropriate lesson content. In this sense, the present work contributes with an emphasis on improving feedback and directing students to relevant content, as detailed in the following sections.

The work of Silva et al. (2023) analyzed primary studies published between 2015 and 2020 to identify research trends, limitations, and opportunities related to educational recommendation systems (ERS). Implementing ERS faces complex challenges arising from the need to align recommendations with users’ specific learning expectations and needs, making personalized recommendation a challenging task (Cazella et al., 2014; Verbert et al., 2012; Buder & Schwind, 2012). Additionally, common technological issues in general recommendation systems, such as cold start and data scarcity, also impact ERS. Alongside these challenges, the problem of overspecialization can lead to user dissatisfaction (Garcia-Martinez & Hamou-Lhadj, 2013; Iaquinta et al., 2008; Khusro et al., 2016). Another important research challenge pertains to evaluating the effectiveness of ERS, requiring methods that go beyond traditional performance metrics (such as accuracy) and include the impact on users’ learning performance (Erdt et al., 2015). A further research direction concerns the presentation format of recommended items. While a few studies, such as Gharibi et al. (2024), have explored this topic in general recommendation systems, specific research on ERS in this area remains scarce (Silva et al., 2023). Further investigation is needed to determine whether there are more suitable ways to present specific types of items to users.

Based on what has been presented, the distinctive aspect of the present work lies in the fact that the assessment of open-ended questions, as well as the recommendation mechanism of the system, follows learning guidelines based on the competencies and skills of the BNCC. It utilizes formal artifacts such as ontologies and their alignments, allowing for greater rigor and subsequent interoperability with other systems. It is important to highlight that this work is an evolution of the work presented in Feitosa et al. (2022). The construction of the recommendation system in this article is based on the agent recommendation formalism proposed by de Souza et al. (2020).

Methodology

The present work used the case study as its methodological approach. Yin (2003) defines a case study as “[...] an empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident”. Furthermore, the author (Yin, 2003) characterizes the case study as an inquiry that: “(i) copes with the technically distinctive situation in which there will be many more variables of interest than data points, and as one result; (ii) relies on multiple sources of evidence, with data needing to converge in a triangulating fashion, and as another result; (iii) benefits from the prior development of theoretical propositions to guide data collection and analysis”.

General Elements of Case Study

This study employs a descriptive strategy. The research question we aim to answer is: (RQ) “How does the proposed ERS apply the BNCC guidelines based on skill and competence descriptors to generate recommendations?”. To address this, we investigate two propositions: (P1) “the ERS allows the identification of the level of proficiency in skill descriptors of students”; and (P2) “the ERS generates satisfactory recommendations for students”. The investigation of propositions P1 and P2 will provide evidence for the RQ answer. The BNCC competence unit investigated is: “Build and analyze computational solutions to problems from different areas of knowledge, individually or collaboratively, selecting appropriate data structures (records, arrays, lists, and graphs), improving, and integrating knowledge”, on the thematic axis “Computational Thinking”.

Data Collection

The data used in this case study were collected from various sources, including official BNCC documents, a textbook on the subject of the thematic axis addressed (available in open format on the web), and a database of questions and answers for open-ended questions, as listed below:

  • (D1) BNCC competencies and skills (da Educação, 2018) and da Educação (2022).

  • (D2) Skill descriptors from an open-format textbookFootnote 1.

  • (D3) Questions and answers from the database (Mohler et al., 2011).

Data Analysis

There is a link between the data and the investigated propositions. The data in D1, D2, and D3 provided support for propositions P1 and P2, as detailed in the case study discussion.

Limitations of the Approach

It is important to highlight some limitations of the proposed approach. The process of building ontologies from texts may present some degree of imprecision. Consequently, the alignment results and the system’s recommendations may be impacted.

Additionally, other limitations are related to the format of the recommendations. In this work, the recommendations are presented as messages reinforcing the contents (didactic and evaluative objects of knowledge, as well as the evaluation feedback report). However, improving the presentation with the use of visual aspects and other usability features can improve the user experience with the system.

Approach Details

The development process of the proposed recommendation system is divided into 5 stages, as illustrated in Fig. 1. In Stage 0 (zero), pedagogical data is configured, which will serve as input for the subsequent stages. This includes information about the course, discipline, competencies, skills, didactic and evaluative knowledge objects, among others, as detailed in the scenario described in this work.

Fig. 1 Stages of the approach

In Stage 0, other important tasks include acquiring specialized knowledge in the form of condition-action rules for action recommendation and designing a mechanism that allows for the definition of current and desired situations, the effects of actions in current situations, and the notion of the problem’s state space for recommendation. This, in turn, enables the use of systematic and local search algorithms for the generation of recommendations in the form of plans, as future work.

Stage 1 is responsible for preprocessing the unstructured text from users, that is, the student and teacher texts. Stage 2 involves constructing the entities, concepts, and relationships of the ontologies. In Stage 3, the ontology alignment process takes place, producing an alignment report. Stage 4 takes this report as input and returns an assessment report containing the correct, missing, and incorrect entities, along with the correspondences and confidence measures between the two ontologies (student and teacher). Still in Stage 4, this assessment feedback is used as input to the recommendation system, which outputs suggestions for reinforcing the study topics the student needs.

In summary, the recommendations are generated considering: (i) the question (open-ended question) and the reference answer crafted by the teacher; (ii) the ontology automatically constructed from the reference answer; (iii) the text of the student’s response to the question; (iv) the ontology automatically constructed from this response; (v) the alignment report between the teacher’s and student’s ontologies; and (vi) a set of learning resources associated with the question’s content, such as texts, videos, and other media that can be used to reinforce the learning of the necessary content for a correct response to the question.
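
To make the data flow between the stages concrete, the following sketch outlines the pipeline in Python. It is a hypothetical outline only: the stage functions are placeholders with invented names, and the set-based "ontology" and "alignment" are deliberate simplifications of Stages 2 and 3.

```python
from typing import Dict, List

def preprocess(text: str) -> List[str]:               # Stage 1: tokenize/clean natural-language text
    return text.lower().split()

def build_ontology(tokens: List[str]) -> Dict:         # Stage 2: derive concepts/relations (placeholder)
    return {"concepts": sorted(set(tokens))}

def align(teacher_onto: Dict, student_onto: Dict) -> Dict:    # Stage 3: ontology alignment (placeholder)
    teacher, student = set(teacher_onto["concepts"]), set(student_onto["concepts"])
    return {"correct": teacher & student, "missing": teacher - student, "incorrect": student - teacher}

def recommend(alignment: Dict, resources: List[str]) -> Dict:  # Stage 4: feedback + reinforcement items
    needs_reinforcement = bool(alignment["missing"] or alignment["incorrect"])
    return {"feedback": alignment, "suggested_resources": resources if needs_reinforcement else []}

teacher_answer = "a graph is a set of vertices connected by edges"
student_answer = "a graph is a set of vertices"
report = recommend(align(build_ontology(preprocess(teacher_answer)),
                         build_ontology(preprocess(student_answer))),
                   resources=["textbook chapter on graphs"])
print(report)
```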

Formalization of Educational Concepts

For the representation of educational resources in the present approach, the following concepts have been adopted, as detailed below (a minimal data-model sketch is provided after the list).

  • Set of Courses: Represents the courses, each of which has a set of associated disciplines, teachers, and students. Denoted as “C”.

  • Set of Disciplines: Represents the disciplines associated with a course. Denoted as “Di”.

  • Set of Specific Competencies: Represents the competenciesFootnote 2 associated with a particular discipline. Denoted as “Co”.

  • Set of Skills: Represents the skillsFootnote 3 associated with a specific competency. Denoted as “H”.

  • Set of Descriptors: Represents the descriptorsFootnote 4 associated with a skill. Denoted as “D”.

  • Set of “Objects of Knowledge”: Represents the knowledge objects that group pairs of descriptors associated with an educational material, be it didactic or evaluative. Denoted as “K”.

  • Didactic Object of Knowledge: Associates a set of descriptors with the metadata of a didactic material (identifier, the material itself, other pertinent metadata). Denoted as “KD”.

  • Evaluative Object of Knowledge: Associates a set of descriptors with the metadata of an evaluative material, used to measure the student’s learning performance (described similarly to “KD”). Denoted as “KA”.

  • Set of Teachers: Represents the teachers associated with a discipline. Denoted as “P”.

  • Set of Students: Represents the students associated with a discipline. Denoted as “E”.

  • Set of “Student Competence Profiles”: Represents all pairs of descriptors and values, following the assessment of a student’s skill descriptors under a specific competency (through “KA”).

  • Values: Represent linguistic variables of Fuzzy sets, corresponding to the evaluation of a descriptor. For example: very low, low, medium, high, very high.

  • Current Competence Profile: “PC” represents the current state of the student after the assessment of their descriptors.

  • Desired Competence Profile: “PD” represents the desired or goal state of a profile to be achieved after recommendations.

  • Teacher’s Response: Represents the teacher’s response (answer key) associated with a “KA”. Denoted as “RP”.

  • Teacher’s or Professor’s Ontology: Represents the ontology of the teacher’s response (answer key) associated with a “KA”. Denoted as “OP”.

  • Student’s Response: Represents the student’s response associated with a “KA”. Denoted as “RE”.

  • Student’s Ontology: Represents the ontology of the student’s response associated with a “KA”. Denoted as “OE”.

  • Ontology Alignment: Represents the result of the alignment of the teacher’s and student’s ontologies. Denoted as “A”.
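
The sketch below renders the taxonomy above as Python dataclasses. It is an illustrative data model under our own naming assumptions, not the implementation used in the system; only a representative subset of the concepts is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Descriptor:            # "D": descriptor attached to a skill
    code: str
    text: str

@dataclass
class KnowledgeObject:       # "K": groups descriptors with a didactic ("KD") or evaluative ("KA") material
    identifier: str
    kind: str                # "KD" or "KA"
    descriptors: List[Descriptor]
    material: str            # the material itself or a reference to it

@dataclass
class Skill:                 # "H": skill linked to a specific competency
    code: str
    descriptors: List[Descriptor]

@dataclass
class Competency:            # "Co": specific competency of a discipline
    code: str
    skills: List[Skill]

@dataclass
class CompetenceProfile:     # "PC"/"PD": fuzzy value per descriptor (e.g., "low", "regular", "high")
    values: Dict[str, str] = field(default_factory=dict)

@dataclass
class Student:               # "E": student with a current and a desired competence profile
    name: str
    current: CompetenceProfile = field(default_factory=CompetenceProfile)
    desired: CompetenceProfile = field(default_factory=CompetenceProfile)
```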

Set of Condition-Action Rules

The following list enumerates the set of “Condition-Action Rules” considering the alignment result “A” related to skill “\(H_x\)”, with its evaluative knowledge object “\(KA_y\)”, of student “\(E_w\)”, at a specific time (scenario) “\(T_z\)”:

$$\begin{aligned} \text {If}\, A(H_x, KA_y, E_w, T_z) = \textsf {empty} \text {, then}\, Rec_{x1}. \end{aligned}$$
(1)
$$\begin{aligned} \text {If}\, A(H_x, KA_y, E_w, T_z) = \textsf {low} \text {, then}\, Rec_{x2}. \end{aligned}$$
(2)
$$\begin{aligned} \text {If}\, A(H_x, KA_y, E_w, T_z) = \{\textsf {regular} \text { or } \textsf {high} \} \text {, then}\, Rec_{x3}. \end{aligned}$$
(3)

In the rules above, \(Rec_{xn}\) are messages presented in the output of the recommendation system, containing feedback information about a particular skill \(H_x\) being assessed. For this purpose, such a message includes: the report resulting from the alignment of ontologies “\(OP_y\)” and “\(OE_y\)”, with correct, incorrect, and missing concepts; as well as the list of “KD”s and “KA”s related to the respective descriptors and assessed skills that require reinforcement in learning.
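
A direct transcription of rules (1)-(3) into code could look like the following sketch (a minimal example with assumed names; the message contents are abbreviated descriptions of \(Rec_{x1}\), \(Rec_{x2}\), and \(Rec_{x3}\) as characterized in the text):

```python
def select_recommendation(alignment_degree: str) -> str:
    """Maps the fuzzy degree of the alignment A(H_x, KA_y, E_w, T_z) to a recommendation message Rec_xn."""
    if alignment_degree == "empty":
        return "Rec_x1: alignment report + entire list of KDs and KAs for the assessed skill"
    if alignment_degree == "low":
        return "Rec_x2: alignment report + KDs/KAs covering the incorrect or missing concepts"
    if alignment_degree in ("regular", "high"):
        return "Rec_x3: alignment report only (no reinforcement list needed)"
    raise ValueError(f"unknown alignment degree: {alignment_degree}")

print(select_recommendation("low"))
```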

Design of the Intelligent Educational Recommendation Agent

The problem of designing the artificial agent pertains to its architecture and program. The agent’s architecture extracts information from perceptual stimuli in the task environment through various sensors, executes the agent’s program with the aim of selecting rational actions, and carries out these actions in the environment through various actuators. The artificial agent can be viewed as a perception-action function that maps perceptions to actions, meaning agent: perceptions \(\rightarrow \) actions. The agent’s program is a representation of this function in a programming language suitable for the agent’s architecture.

Fig. 2 Schematic diagram of an artificial agent program

Fig. 3 Skeleton of an artificial agent program

The formalism used to describe the agent’s program model in this article has been synthesized from works on intelligent agents in Russel and Norvig (2020). Figures 2 and 3 depict its structure as an information processing system decomposed into three subsystems, represented by three functions: the perception subsystem represented by the function see, the internal state subsystem represented by the function next, and the decision-making subsystem represented by the function action. The structure considers that, at any moment k in a total time span K:

  1. The artificial agent’s sensors perceive information from stimuli (perceptions) and measure some properties of the task environment, defined in a set of n possible observable properties by its sensors \(p^k \in P = \{p^1, p^2, \ldots , p^n\}\).

  2. The agent’s program, program: \(P \rightarrow A\), maps information in \(p^k \in P\) to information about an action defined in a set of m possible actions \(a^k \in A = \{a^1, a^2, \ldots , a^m\}\), in three stages of information processing:

     1. Its perception subsystem, see: \(P \rightarrow S\), maps information in \(p^k \in P\) to a possible description of the state \(s^k \in S = \{s^1, s^2, \ldots \}\), which is a computational representation of aspects of the observable properties described in \(p^k\).

     2. Its internal state update subsystem, next: \(S \times I \rightarrow I\), maps the current description of the state \(s^k \in S\) and the previous description of the internal state \(i^{k-1} \in I = \{i^1, i^2, \ldots \}\) to a possible new current internal state \(i^k \in I\), considering some information about the world model.

     3. Its decision-making subsystem, action: \(I \rightarrow A\), maps the current description of the internal state \(i^k \in I\) to an action described in the set of possible actions \(a^k \in A\), based on some information available for decision-making.

  3. The actuators of the artificial agent execute the action \(a^k \in A\) mapped (selected) by the agent’s program and perform it in the environment.

At any moment \(k+1\) of interaction with its task environment, after executing the selected action \(a^k \in A\) and changing from the previous perception \(p^k \in P\) to a new current perception \(p^{k+1} \in P\), the artificial agent initiates another cycle of actions: (1) perceive the environment through sensors; (2) execute the program to represent the observable properties perceived from the environment through the see function, update the internal state through the next function, and select a new action through the action function; (3) perform the new action in the environment through actuators.

The internal state information \(i^k \in I\) describes aspects of the environment that are not currently being perceived in \(p^k \in P\) by the sensors and in \(s^k \in S\) by the see function. Specifically, the next function can calculate the current description of the internal state considering the current description of the state \(s^k \in S\), the previous description of the internal state \(i^{k-1} \in I\), and information about a world model, i.e., descriptions of the effects of possible agent actions on the states of the environment, represented by the Action and Result functions in Fig. 2, and how the environment evolves independently of the actions taken.

Figure 2 encapsulates the four agent program models suggested by Russel and Norvig (2020). The simple reactive agent program selects actions based on \(s^k \in S\) and a set of condition-action rules (C-A Rules). The model-based reactive agent program selects actions based on \(i^k \in I\) and the C-A Rules. The goal-based program selects actions according to \(i^k \in I\) and a description of the desired situation in the environment (Goal). The utility-based program selects actions according to \(i^k \in I\) and a utility function, mapping descriptions of internal states and actions to real numbers (utility).
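
The decomposition into the see, next, and action functions can be sketched as a generic program skeleton in Python, following the description above rather than any published code; the rule format and the identity see mapping are simplifying assumptions.

```python
class ReflexAgentWithState:
    """Model-based reactive agent skeleton: each cycle applies see, then next, then action."""

    def __init__(self, rules, initial_state):
        self.rules = rules                 # condition-action rules (C-A Rules): list of (condition, action)
        self.internal_state = initial_state

    def see(self, perception):
        """Maps a perception p^k to a state description s^k (identity mapping in this sketch)."""
        return perception

    def next(self, state_description):
        """Updates the internal state i^k from s^k and i^(k-1) (world model omitted here)."""
        self.internal_state = {**self.internal_state, **state_description}
        return self.internal_state

    def action(self, internal_state):
        """Selects the first action whose condition matches the internal state."""
        for condition, act in self.rules:
            if condition(internal_state):
                return act
        return None

    def program(self, perception):
        return self.action(self.next(self.see(perception)))
```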

Agent Function

The following algorithm describes the procedures of the recommendation agent. After receiving input information, such as the Current Student Profile data and the responses from the Student and the Teacher, through its perception \(Perception_K\), the next function performs alignments to update the agent’s internal state.

Algorithm 1 agent_function(\(Perception_K\)) returns an action

Therefore, the agent function’s return corresponds to the recommendation message \(Rec_{x1}\), \(Rec_{x2}\), or \(Rec_{x3}\), formed from the alignment report data between the ontologies, containing the correct, incorrect, and missing concepts. In addition to the report, to create a personalized learning path, the KAs and KDs are also listed according to the Fuzzy degree values associated with the alignment results: (i) for the values regular or high, the list of KDs and KAs is not presented; (ii) for the value low, only the elements of the KD and KA list corresponding to incorrect or missing concepts in the alignment are included; (iii) for the value empty, the entire list of KDs and KAs is included.
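
Algorithm 1 itself is not reproduced here; the sketch below gives one plausible reading of it in Python, combining the alignment report with the fuzzy degree to assemble the recommendation message. The field names and data structures are our assumptions.

```python
def agent_function(perception_k: dict) -> dict:
    """perception_k is assumed to contain the student profile, the alignment report,
    the fuzzy degree of the alignment, and the KD/KA lists for the assessed skill."""
    report = perception_k["alignment_report"]            # correct, incorrect, and missing concepts
    degree = perception_k["fuzzy_degree"]                 # "empty", "low", "regular", or "high"
    kds, kas = perception_k["kd_list"], perception_k["ka_list"]

    if degree in ("regular", "high"):
        reinforcement = []                                 # Rec_x3: no reinforcement list
    elif degree == "low":
        wrong = set(report["incorrect"]) | set(report["missing"])
        reinforcement = [k for k in kds + kas
                         if wrong & set(k["concepts"])]    # Rec_x2: only KDs/KAs covering wrong concepts
    else:                                                  # degree == "empty"
        reinforcement = kds + kas                          # Rec_x1: the entire list
    return {"feedback": report, "reinforcement": reinforcement}
```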

Results

The presentation of results has been organized following the process steps defined in the previous section, and a complete scenario is illustrated throughout this section. The configuration information for Step 0 (zero) is detailed in Tables 1 and 2.

In Step 1, the process of automatic ontology construction (Ontology Learning, OL) proved to be a challenging task; as detailed in Al-Aswadi et al. (2020), the literature presents some tools and techniques to assist in this step. Here, we chose to implement our own solution in Python, making use of well-known tools and libraries from the literature, such as NLTK and Stanford OpenNLP, to support the preprocessing of natural language text and the generation of Subject-Verb-Object (SVO) objects. This choice was made due to difficulties in accessing and using tools from the literature, such as Text2Onto (Cimiano & Völker, 2005) and Ontogen (Fortuna et al., 2007): discontinued updates, insufficient documentation for proper handling, and issues with exporting the resulting ontology to OWL or RDF formats.
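
As an illustration of the kind of preprocessing performed in Step 1, the sketch below extracts rough Subject-Verb-Object triples using NLTK part-of-speech tags. It is a deliberately simplified stand-in for the actual pipeline, which combines NLTK and Stanford OpenNLP.

```python
import nltk  # requires: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')

def naive_svo(sentence: str):
    """Very rough SVO extraction: last noun before a verb, the verb, first noun after it."""
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    verbs = [i for i, (_, tag) in enumerate(tagged) if tag.startswith("VB")]
    triples = []
    for v in verbs:
        subj = next((w for w, t in reversed(tagged[:v]) if t.startswith("NN")), None)
        obj = next((w for w, t in tagged[v + 1:] if t.startswith("NN")), None)
        if subj and obj:
            triples.append((subj, tagged[v][0], obj))
    return triples

print(naive_svo("A graph contains vertices and edges."))  # e.g. [('graph', 'contains', 'vertices')]
```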

In Step 2, the SVO objects were converted into an ontology described in OWL, as illustrated in Fig. 4 (a). The conversion process involved mapping the SVO objects to appropriate ontology elements, such as classes, properties, and relationships, in order to represent the knowledge captured from the text.
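
The conversion from SVO triples to an OWL ontology can be illustrated with the rdflib library, under the assumption that subjects and objects become classes and verbs become object properties; this is a sketch of the general idea, not the exact mapping used in the system.

```python
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/answer-ontology#")

def svo_to_owl(triples) -> Graph:
    g = Graph()
    g.bind("ex", EX)
    for subj, verb, obj in triples:
        s, p, o = EX[subj], EX[verb], EX[obj]
        g.add((s, RDF.type, OWL.Class))           # subject and object become OWL classes
        g.add((o, RDF.type, OWL.Class))
        g.add((p, RDF.type, OWL.ObjectProperty))  # the verb becomes an object property
        g.add((p, RDFS.domain, s))
        g.add((p, RDFS.range, o))
    return g

g = svo_to_owl([("graph", "contains", "vertices")])
print(g.serialize(format="xml"))                  # OWL/RDF-XML output
```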

Table 1 Stage 0: Configuration of pedagogical elements
Table 2 Continuation of Table 1

In Step 3, as noted in Euzenat et al. (2013) and Otero-Cerdeira et al. (2015), the ontology alignment process has proven useful in various scenarios, and there are several ontology matching systems, metrics, and algorithms for ontology alignment. In this step, several ontology alignment tools were considered: Wang and Xu (2008), LogMap (Jiménez-Ruiz & Cuenca Grau, 2011), OntoEmma (Wang et al., 2018), Machine-learning-ontology-matching (Bulygin & Stupnikov, 2019), and OntoMatch (Faria et al., 2013). These tools were evaluated based on characteristics such as documentation, ease of access and installation, programming language, and libraries. After conducting some experiments, OntoMatch was chosen among the listed tools for useFootnote 5. OntoMatch was selected because, in comparison to the others mentioned, it returned results without major issues or configuration difficulties, as shown in Fig. 4 (b).
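
OntoMatch's own API is not shown here. The sketch below only illustrates the general idea of label-based ontology matching with a confidence measure, using plain string similarity from Python's difflib; it is far simpler than the tools listed above and is included only to make the notions of correspondences, missing entities, and incorrect entities concrete.

```python
from difflib import SequenceMatcher

def label_alignment(teacher_labels, student_labels, threshold=0.8):
    """Returns correspondences (teacher, student, confidence), plus missing and extra labels."""
    matches, matched_students = [], set()
    for t in teacher_labels:
        best = max(student_labels,
                   key=lambda s: SequenceMatcher(None, t.lower(), s.lower()).ratio(),
                   default=None)
        if best is not None:
            conf = SequenceMatcher(None, t.lower(), best.lower()).ratio()
            if conf >= threshold:
                matches.append((t, best, round(conf, 2)))
                matched_students.add(best)
    missing = [t for t in teacher_labels if t not in {m[0] for m in matches}]
    extra = [s for s in student_labels if s not in matched_students]
    return {"correspondences": matches, "missing": missing, "incorrect": extra}

print(label_alignment(["Graph", "Vertex", "Edge"], ["graph", "vertices"]))
```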

Fig. 4 (a) OWL ontology of the teacher. (b) Alignment report

Finally, in step 4, the results corresponding to the alignment report, as well as the items selected to compose the recommendation messages for the user, are illustrated in Tables 3 and 4.

The Case Study Scenario

To illustrate the case study scenario, an initial configuration was considered, with pedagogical data from a discipline related to “Computational Thinking”, based on information from documents from the BNCC (da Educação, 2018) and reports from study committees of the Brazilian Computing Society (da Educação, 2022). In addition, other information detailed here makes up a simplified scenario to illustrate the proposed approach, as detailed below.

Tables 1 and 2 illustrate an instance of the elements of the educational taxonomy proposed in this work. Among the information presented, the pedagogical details of the competence “Co” stand out, which concern the knowledge necessary for manipulating data structures (item 6 in Table 1). One of the skills “H” related to this competence is associating datasets with their appropriate types (item 7 in Table 1). Further analyzing Table 1, it is possible to observe that the list of descriptors “D” and their didactic (“KDs”) and evaluative (“KAs”) knowledge objects, items 8 and 9 in Table 1, covers the information necessary for the student to develop the mentioned skill.

For this purpose, in Stage 0 (zero), the result of the evaluation of the “KAs” from the empty alignment “A” forms the student’s current competence profile “PC”, as indicated in item 10 of Table 2. Furthermore, their desired competence profile “PD”, item 11 of Table 2, is set to regular or high values for all the evaluated skills in competence “Co”. This indicates that, when the student’s skills are assessed under a competence, the minimum requirement to fulfill that competence is that the set of skills has regular or high values. Finally, item 12 of Table 2 indicates the set of rules defined by an expert, which guides the selection of recommendation messages “Rec” based on the alignment results between the student’s and the teacher’s responses.
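
To connect the profiles to the recommendation rules, the sketch below (with hypothetical names) checks whether a student's current competence profile PC already satisfies the desired profile PD, i.e., whether every assessed descriptor reaches at least the level required by PD. The numeric ordering of the linguistic values is an assumption that merges the scales mentioned in the text.

```python
# Assumed ordering of the fuzzy linguistic values mentioned in the text.
FUZZY_ORDER = {"empty": 0, "very low": 1, "low": 2, "regular": 3, "high": 4, "very high": 5}

def satisfies_desired_profile(pc: dict, pd: dict) -> bool:
    """pc and pd map skill descriptors to fuzzy levels; PD is met when every descriptor
    required in PD reaches at least that level in PC."""
    return all(FUZZY_ORDER.get(pc.get(d, "empty"), 0) >= FUZZY_ORDER[level]
               for d, level in pd.items())

pc = {"D1": "low", "D2": "high"}          # current profile after assessing the KAs
pd = {"D1": "regular", "D2": "regular"}   # desired profile: at least "regular" everywhere
print(satisfies_desired_profile(pc, pd))  # False: descriptor D1 is still below the desired level
```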

Table 3 Stages 1 to 4: complete scenario of the case study
Table 4 Continuation of Table 3

The scenario illustrated in Tables 1, 2, 3 and 4 provides information to validate propositions P1 and P2. Specifically, items 1 to 26 detail the pedagogical elements present in the data sets D1, D2, and D3, as well as the steps required to validate propositions P1 and P2, as discussed below.

To validate proposition P1 (“the ERS allows the identification of the level of proficiency in skill descriptors of students”), we use the results of the alignments between the student and teacher ontologies. These results are used to construct the Student’s Current Competency Profile (PC), which records the level (low, medium, or high) of each skill descriptor for each student. As illustrated in items 10, 16, 17, 20, 21, 24, and 25 of Tables 1 to 4, the evaluation of each KA (Assessment Knowledge Object) derived from the alignment results allows us to record the proficiency level of each descriptor and each skill related to the students’ competencies.

To validate proposition P2 (“the ERS generates satisfactory recommendations for students”), we analyze the recommendations presented in items 18, 22, and 26. These recommendations include the following: the results of the KA assessment report (this report indicates the correct, incorrect, or missing concepts in the student’s answers to open-ended questions); and, the KDs (Didactic Knowledge Objects) and KAs (Assessment Knowledge Objects) that the student needs to focus on to improve their learning.

Conclusion

The proposed approach aims to enhance the quality of feedback in automatic assessments of open-ended questions, thereby reducing students’ cognitive difficulties, by recommending additional study materials for content that has not been effectively assimilated. This approach also promises to improve the efficiency of feedback for both students and teachers while reducing the workload of teachers in the open-ended question grading process.

The proposed approach contributes to the state of the art in Automatic Short Answer Grading (ASAG) research by focusing on content-centric teaching and learning processes, addressing a gap in current solutions. Additionally, it is anticipated that evaluating this approach in comparison to traditional methods will enrich the discussion and stimulate future research on content-centric assessment. Furthermore, an approach that incorporates pedagogical guidelines such as skills and competencies, and is ontology-driven, will facilitate interoperability with other educational solutions in the domain.

The presented approach goes beyond traditional performance metrics and includes the impact of the system’s performance on users’ learning outcomes. This approach addresses a key challenge in ERS evaluation: the lack of a comprehensive framework. Existing methods typically focus on traditional performance metrics, such as accuracy and recall, but these metrics do not capture the impact of the system on users’ learning outcomes.

Some limitations of the approach were also listed; in particular, a more in-depth investigation of the performance of the ontology learning process could contribute to better accuracy in the system’s recommendations. The use of models based on Generative Artificial Intelligence (GenAI) may assist in this process.

In the future, additional functionalities could be integrated into the proposed platform. For instance, automated question generation guided by ontologies could offer benefits to teachers by reducing the time required to create questions. This type of automation can assist in crafting questions that align with the content that students should genuinely be learning, as advocated by Gibbs and Simpson (2004), who outline the conditions under which assessment supports learning.

In our future endeavors, we aim to explore alternative structures for programs of intelligent artificial agents, including the model-based reactive agent, the rule-based agent, the goal-oriented agent, and the utility-based agent (Russel & Norvig, 2020). The model-based reactive agent operates by recommending actions based on a comprehensive model of the current situation and a set of expert rules, which map specific situations (conditions) to appropriate recommendations (actions). In contrast, the goal-oriented agent, instead of relying on predefined rules, derives recommendations in the form of suitable plans based on the desired outcomes. Similarly, the utility-based agent employs a multi-objective utility function to generate these plans, optimizing them according to various objectives. Through exploring these diverse agent architectures, we seek to enhance the efficacy and adaptability of our intelligent systems.