Introduction

During the COVID-19 pandemic, ordinary modes of teaching and learning were suspended. Many institutions of higher education and other organizations changed their teaching approaches and sought to provide a convenient, safe, and flexible educational environment for students. As a result of this sudden change, a new level of interest in distance learning and digital competence appeared in response to the need to deal with the challenges of online learning (Kawasaki et al. 2021; Saide and Sheng 2021). Many educators reported deficiencies in their digital competence, which led to higher workloads and negative emotions during the pandemic alert period (Väätäjä and Ruokamo 2021). The development of digital competence and the distance learning experiences of teachers and students require continuous attention, including support for online participation, the operation of digital tools, the use of a learning management system (LMS), and the retrieval and reading of online resources (Heidari et al. 2021), as the current situation is unlikely to end any time soon and may even recur in cycles (Manca and Delfino 2021). Therefore, instructors, students, and educational institutions need to be sufficiently digitally competent to adapt to rapid changes in teaching methods (Blau et al. 2020).

Unlike K-12 schools, universities offer a high degree of flexibility and heterogeneous curricula, and they provide professional skills development in various domains. For example, some university courses emphasize face-to-face (FTF) interactions or operations; some require students to use information and communication technology (ICT) or apply digital tools to complete learning tasks; and some require students to engage in online learning activities, in some cases even allowing students to take entire courses from home via the internet (Chiu et al. 2021). Many university faculty members and students lack sufficient digital competence to deal with sudden changes in learning delivery, yet few universities have sought to properly prepare their faculty and students to use online learning technologies (Cabero-Almenara et al. 2021; Weber et al. 2018). Scholars have suggested that universities must consider faculty and students' knowledge of educational ICT and digital tools to meet the needs that arise as the educational setting changes (Monroy García et al. 2020). Assessing whether university faculty and students have sufficient digital competence to cope with future changes is therefore an important issue at this stage (Starkey 2020).

Digital competency has been regarded as an important competency for teachers and students in the digital era; universities are an essential stage in the development of digital competency, and university instructors are key to students' digital competency development. Therefore, assessing and understanding the digital competencies of university instructors is an important issue in the field of higher education. Some researchers (Domínguez-Lloria et al. 2021) have used collaborative platforms and questionnaires to assess students' competence with digital tools. Others (Sánchez-Cruzado et al. 2021) used questionnaires to establish whether university instructors were sufficiently digitally competent to cope with the emergency educational model during the COVID-19 pandemic. Because university instructors bear much of the responsibility for helping future professionals adapt to industrial and job market changes in the digital age, the assessment of instructors' digital competencies requires more research attention (Antón-Sancho et al. 2021; Sánchez-Cruzado et al. 2021). Specifically, the quantitative literature is limited by the breadth of the questionnaire items, and manual analysis methods are limited by manpower and time, highlighting the current need for more extensive and effective digital competency assessment solutions (Sillat et al. 2021; Zhou 2021; Mattar et al. 2022).

To address this issue, scholars have argued that a syllabus constitutes a contract between instructors and students that describes not only the domain subject, pedagogy, and resources involved in the course but also the skills students will develop and the tasks that students are expected to accomplish during the course (Parkes and Harris 2002). Hence, the analysis of syllabi can be used to assess the digital competence of instructors and students at universities (Schina et al. 2020). However, syllabus analysis is more time-consuming and labor-intensive than questionnaires and cannot easily be performed at scale (Bensen and Silman 2012). We note that machine learning (ML) technology, which enables computer programs to mimic human recognition and classification of text, is considered an excellent solution for automating text analysis with minimal human intervention (Kadhim 2019; Iatrellis et al. 2020; Golowko 2021). In addition, machine learning, as a data-driven approach, can produce more objective results and mitigate the limitations of traditional analysis (e.g., statistical methods) (Yakubu and Abubakar 2021; Barthakur et al. 2022). However, the feasibility and reliability of digital competency assessment using ML remain open questions. In this vein, we seek to confirm how our approach contributes to the assessment of digital competency by answering the following two research questions (RQs). RQ1: How well can ML evaluate the level of digital competence applied in a course from its syllabus? RQ2: How well can ML classify the levels of digital competence present in courses relative to human evaluation? The answers will help clarify the state of digital competence in universities and give universities the opportunity to respond accordingly and to prepare for the development of high-quality digital competence in higher education.

Literature review

Digital competence in higher education

Digital competence is the set of abilities necessary to use technology to optimize daily life (Ferrari 2013). The European Commission considers digital competence to be a key life skill and has developed the European Digital Competence Framework (DIGCOMP) as a reference framework for introducing and demonstrating digital competence. DIGCOMP identifies the key components of digital competence in the following five areas: (1) information processing, (2) communication and collaboration, (3) digital content creation, (4) security, and (5) problem solving (Carretero et al. 2017). Some studies have used the term digital literacy to describe these competences. Although digital competence and digital literacy are not identical, a growing number of scholars have argued that the distinction between them has been blurred by overlapping definitions and translations (Madsen et al. 2018). It is generally accepted that digital competence is a skill related to digital literacy, media literacy, ICT literacy, information literacy, and internet literacy (Esteve-Mon et al. 2019). For example, in educational settings, students use digital tools to produce and share information, which demonstrates their digital competence. For consistency, we use the term digital competence throughout this study. During the COVID-19 pandemic, the focus on the evaluation and development of digital competence in higher education reached an unprecedented level (Pinto et al. 2020).

Digital competence is an important competency for current learning and future employment. Universities, as cultivators of expertise in a variety of fields, are key to providing students with a quality education (Olszewski and Crompton 2020). Understanding the digital competency currently offered in universities will contribute to both academia and industry. With regard to the former, by obtaining a complete picture of the digital technologies currently used by university faculty and students in their courses, universities will have the opportunity to make further recommendations and invest in research topics that address the digital divide. Regarding the latter, the use of digital technologies is inevitable in the era of Industry 4.0, and the digital competency of the future industrial workforce depends on deliberate, planned development (Bartolomé et al. 2022). As a bridge between learning and employment, universities can prepare students for the future job market and contribute to a digital industrial environment, provided the digital competencies offered in university courses are well understood (e.g., the extent to which the use of digital technologies in curricula across fields meets current industry needs).

Digital competence evaluation and syllabus analysis

Although research on digital competence in higher education is accumulating, the digital competence required of university faculty and students in their respective areas often lacks attention (Vorobel et al. 2021). Recent studies of digital competence in higher education have shown that its evaluation and investigation generally rely on traditional methods and are still in their infancy (Zhao et al. 2021). To cope with this issue, researchers have suggested that educators design learning activities in accordance with their own digital competence and their perception of their learners' competence, as well as in relation to the development of needed skills (Alarcón et al. 2020). If an instructor integrates digital competence into a course, analyzing the syllabus can help reveal the instructor's level of digital competence (Guillén-Gámez and Mayorga-Fernández 2020). Based on the digital competency framework (Mattar et al. 2022) and the perspective of educational practice (Gower et al. 1983; Richardson 1990), it is reasonable to assess the digital competency levels of teachers by analyzing their syllabi, and evaluating digital competence by identifying the instructional media/tools and skill-related learning activities in a course has been considered reasonable (Lucas et al. 2021). For example, instructors may use software for direct instruction in the classroom, conduct computer-assisted instruction, implement educational games, use digital media for communication, or operate LMSs/platforms. All of these practices are evidence of instructors' digital competence levels (König et al. 2020). Some recent studies have investigated digital competence by analyzing syllabi. For example, Dubicki (2019) analyzed 180 syllabi to identify the outcomes of digital competence training at a university. Beuoy and Boss (2019) analyzed syllabi to identify opportunities to support digital competence instruction and develop strategic pedagogy. Their results indicate that analyzing syllabi can effectively produce a big-picture understanding of the development of digital competence in education. Other studies have found that the evaluation of syllabi can produce evidence for identifying the contribution of certain courses to enhancing digital competence (Boss and Drabinski 2014). In other words, analyzing syllabi to evaluate the digital competence of university faculty is a reliable and valid solution.

Method

This study uses machine learning to analyze syllabi and evaluate digital competency. The data for the study were collected by a web crawler and preprocessed. We also elaborate the criteria for assessing digital competency and the metrics used to evaluate the performance of the ML models. These details are described in the subsequent subsections.

Criteria for the assessment of digital competence

The DIGCOMP digital competence framework has been proposed for assessing digital competence (Carretero et al. 2017). DIGCOMP is a critical document for assessing digital competence, and it has been adopted in many contexts (Hernández-Martín et al. 2021). Retrieving and processing information, communication, and creating and managing learning content have been proposed as categories of digital competence related to education (Ferrari 2013). Security and problem solving, by contrast, are skills usually developed or shaped by long-term interaction with the digital environment (Caena and Redecker 2019). In this study, therefore, we focused on the areas of information and data literacy, communication and collaboration, and digital content creation, and we selected assessment criteria from the DIGCOMP framework. Table 1 presents the assessments we used to evaluate whether a syllabus covers the development of digital competence.

Table 1 Assessment of digital competence in an integrated syllabus from DIGCOMP

Data collection and labeling

Using web crawling, we collected 1200 syllabi at random from one university. To avoid excessive variation in the textual information, we excluded syllabi with fewer than 30 words or three sentences. We classified the syllabi into four levels of digital competence, namely, high (H), moderate (M), low (L), or not currently integrated (N). If the syllabus indicated that the instructor would use skills, or that students were required to accomplish activities, described in the area of information and data literacy, and it fit only this area, the digital competence level (DCPL) of the course was set to low (L). If the syllabus mentioned instructors or students performing activities in the area of communication and collaboration, possibly also related to information processing, we considered the syllabus to show a moderate DCPL (M). Similarly, if at least one digital content creation activity was described in the syllabus and both the low and moderate levels of digital competence were implied, the syllabus was labeled as having a high DCPL (H). Conversely, if no digital competence-related description appeared in the syllabus, the course was labeled as not yet integrating digital competence (N). The labeling was conducted by two students with master's degrees in information education from a national university, and the decisions were verified by a professor with a background in information education who had participated in a campus information literacy assessment program at the university. Sample syllabi are shown in Table 2.
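To make the labeling rubric concrete, the following minimal Python sketch expresses the mapping from detected DIGCOMP areas to DCPL labels. The function and its boolean flags are illustrative assumptions; in the study, the areas were identified and the labels assigned by the human coders, not by code.

```python
# Illustrative sketch of the labeling rubric (hypothetical helper, not the study's tool).
# Each flag indicates whether the coders found evidence of the corresponding DIGCOMP
# area in the syllabus.

def assign_dcpl(info_literacy: bool,
                communication_collaboration: bool,
                content_creation: bool) -> str:
    """Map detected DIGCOMP areas to a digital competence level (DCPL)."""
    if content_creation and communication_collaboration and info_literacy:
        return "H"  # high: content creation on top of both lower levels
    if communication_collaboration:
        return "M"  # moderate: communication/collaboration, possibly with information processing
    if info_literacy:
        return "L"  # low: information and data literacy only
    return "N"      # not currently integrated

# Example: a syllabus that only requires retrieving information from a website
print(assign_dcpl(True, False, False))  # -> "L"
```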

Table 2 Sample labeled syllabi

Data preprocessing

Before feature extraction and classification were performed, we removed noise from the raw data, including extra spaces, punctuation marks, numbers, and non-English/Chinese characters. All words were converted to lowercase after tokenization, and stop words, such as “the,” “a,” “an,” and “in,” were removed.
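A minimal Python sketch of these cleaning steps is shown below; the exact pipeline and stop-word list used in the study may differ (scikit-learn's built-in English stop-word list stands in here for the one actually applied).

```python
import re

from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS


def preprocess(text: str) -> str:
    # Keep only English letters, Chinese (CJK) characters, and whitespace,
    # which removes punctuation, numbers, and other noise.
    text = re.sub(r"[^A-Za-z\u4e00-\u9fff\s]", " ", text)
    # Lowercase and tokenize on whitespace.
    tokens = text.lower().split()
    # Drop common English stop words such as "the", "a", "an", and "in".
    tokens = [t for t in tokens if t not in ENGLISH_STOP_WORDS]
    return " ".join(tokens)


print(preprocess("Students will upload 2 reports to the course LMS!"))
```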

Feature extraction

Term frequency-inverse document frequency (TF-IDF) is a very common method of extracting features for text classification with ML. In general, TF-IDF provides more accurate features than other algorithms, simplifying and streamlining text feature extraction (Al-Rimy et al. 2020). TF-IDF determines the importance of keywords in a document set through a weighting mechanism: TF is the frequency of a keyword in a document, reflecting its importance within that document, and IDF discounts terms that are prevalent across many documents. By considering both TF and IDF, the most representative terms in a particular document can be drawn from a large text set. This feature-extraction algorithm is therefore suitable for classifying syllabi.
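As an illustration, the sketch below extracts TF-IDF features with scikit-learn's TfidfVectorizer. The toy documents and the default vectorizer settings are assumptions for demonstration, not the study's configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy "syllabi" after preprocessing (placeholders, not real data).
syllabi = [
    "students upload assignments lms discussion forum",
    "lectures cover classical mechanics weekly problem sets",
    "students write python programs simulate data share code",
]

vectorizer = TfidfVectorizer()                # term frequency weighted by inverse document frequency
features = vectorizer.fit_transform(syllabi)  # sparse matrix: documents x terms

print(features.shape)                         # (3, number of distinct terms)
print(vectorizer.get_feature_names_out())     # vocabulary extracted from the toy corpus
```

Terms that appear in only one of the toy documents (e.g., lms or python) receive higher weights than terms shared across documents, which is the property the classifiers exploit.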

ML model building

Four common ML classifiers were used in this study, namely, support vector machine (SVM), logistic regression (logit), K-nearest neighbors (KNN), and naïve Bayes (NB). SVM finds a maximal-margin hyperplane such that the data on one side of the hyperplane are separated from those on the other side. Given the high dimensionality of text features, SVM can use nonlinear (e.g., radial basis function) kernels for classification and has been recommended for text classification (Joachims 1998). Logit classifiers use logistic functions to model the relationships between features and specific outputs, although many more complex extensions exist; logit is also considered effective for text classification (Alsmadi and Hoon 2019). KNN uses all available data to classify cases based on similarity within the dataset and is considered adequate for text classification (Mowafy et al. 2018). NB is a simple probabilistic classifier based on Bayes’ theorem and is often used as a baseline for text classification (Xu 2018). These classification models were adopted in this study.
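A hedged sketch of how these four classifiers might be instantiated with scikit-learn follows; the hyperparameters shown are library defaults or common choices, not the settings reported in the study.

```python
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import MultinomialNB

classifiers = {
    "SVM": SVC(kernel="rbf"),                    # maximal-margin classifier with a radial kernel
    "logit": LogisticRegression(max_iter=1000),  # logistic function relating features to classes
    "KNN": KNeighborsClassifier(n_neighbors=5),  # similarity-based voting among nearest neighbors
    "NB": MultinomialNB(),                       # probabilistic baseline based on Bayes' theorem
}
```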

We used both a hold-out test set (20% of the dataset) and tenfold cross-validation to evaluate the effectiveness of the ML models. For the tenfold cross-validation, we randomly divided the training dataset into ten equally sized subsets. The classification model was trained with each ML algorithm on nine of the ten subsets (the training folds), with the remaining subset used for validation (the validation fold). The results of the ten iterations for each ML classifier were averaged to reduce the risk of overfitting.
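The sketch below illustrates this protocol, an 80/20 hold-out split followed by tenfold cross-validation on the training portion, using placeholder data and a single classifier; it is a simplified stand-in for the study's pipeline rather than the actual implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

# Placeholder corpus and labels (not the actual syllabus data).
texts = ["access website information report"] * 20 + ["weekly lecture written exam"] * 20
labels = ["L"] * 20 + ["N"] * 20

features = TfidfVectorizer().fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels)

# Ten folds: train on nine subsets, validate on the held-out one, then average.
cv_scores = cross_val_score(SVC(kernel="rbf"), X_train, y_train, cv=10)
print(cv_scores.mean())

# The untouched 20% hold-out set is reserved for the final evaluation.
print(SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test))
```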

Model evaluation

To assess the performance of the ML models, accuracy, precision, sensitivity, the F1-score, and kappa were considered. Accuracy was the percentage of correctly classified syllabi. Precision reflected the ability of an ML model to identify only the relevant syllabi for each DCPL. Sensitivity measured, among the syllabi that truly belonged to a given DCPL, how many were correctly classified. The F1-score, the harmonic mean of precision and sensitivity, provided an overall metric for evaluating each ML classifier. All of the above metrics take values between 0 and 1, with values approaching 0 indicating increasingly unacceptable performance and values approaching 1 indicating increasingly excellent performance. Kappa was used to evaluate the consistency between the ML results and the human classification results on a scale from -1 to 1, with higher values indicating better agreement between the assessments.
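For reference, these metrics can be computed with scikit-learn as sketched below. The label vectors are placeholders, and the macro averaging of the per-class scores is an assumption about how the overall precision, sensitivity, and F1 values were obtained.

```python
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             precision_score, recall_score)

# Placeholder human labels and model predictions for eight syllabi.
y_true = ["H", "M", "L", "N", "H", "M", "L", "N"]
y_pred = ["H", "M", "L", "N", "M", "M", "L", "N"]

print("accuracy:   ", accuracy_score(y_true, y_pred))
print("precision:  ", precision_score(y_true, y_pred, average="macro"))
print("sensitivity:", recall_score(y_true, y_pred, average="macro"))  # recall = sensitivity
print("F1-score:   ", f1_score(y_true, y_pred, average="macro"))
print("kappa:      ", cohen_kappa_score(y_true, y_pred))              # agreement with human labels
```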

Results

The classification effectiveness of the ML models

In total, 548, 303, 139, and 210 courses were labeled DCPL = H, DCPL = M, DCPL = L, and DCPL = N, respectively. We used the test set to evaluate the performance of the ML models and found that the accuracy of the four classification models, i.e., SVM, KNN, logit, and NB, ranged from 0.768 to 0.922, while the kappa values ranged from 0.662 to 0.886. Thus, TF-IDF demonstrated high accuracy and consistency as a feature extraction method for syllabus classification. We then evaluated the ML models using tenfold cross-validation. The tenfold cross-validation results (Table 3) showed that the accuracy of the four classifiers ranged from 0.656 to 0.713, precision ranged from 0.594 to 0.712, and sensitivity ranged from 0.587 to 0.708. These results indicate that more than 71% of the syllabi were correctly classified according to their DCPL by the best ML model. This was better than expected, as university syllabi usually do not follow fixed norms. A possible reason for this success is that TF-IDF identifies differences and similarities among documents primarily through word frequency, and representative, distinguishable keywords simplify the distinctions among documents (McHugh et al. 2020). Syllabi that describe the use of digital skills and tools often include distinctive terms such as internet, e-mail, upload, software simulation, and source code (Stanny et al. 2015). Because these specific words can be distinguished from general vocabulary, their salient features can be extracted to enhance classification.

Table 3 Evaluation of classification models

Precision and sensitivity should also be considered when educational stakeholders review the digital competence reflected in university syllabi. Many universities are providing digital competence development programs for faculty to prepare them for learning environment transitions that may occur at any time. As a result, institutions often investigate whether the digital competence levels of their faculty have increased or changed after a certain period following a training session. A high-precision ML model provides an efficient and effective means for an institution to address this question: the greater the model's precision, the more reliably courses are assigned to their corresponding categories. Institutions can then generate comprehensive reports on the proportion of courses at each level of digital competence and how it has changed. Another potential action is to pursue universal digital competence development on campus by prioritizing the assignment of training resources to the faculty, students, and departments that currently demonstrate lower levels of digital competence. A highly sensitive classifier is particularly appropriate for this purpose, because the more sensitive the model, the better it distinguishes courses that integrate little digital competence from those that have not integrated digital competence at all. In this way, resources to support departments, faculty, and students as they develop higher levels of digital competence can be delivered precisely.

Agreement between ML models and human classification

Beyond the accuracy, precision, and sensitivity of the ML models, we also evaluated the agreement between the human and ML classifications using kappa, a measure of agreement between observers that is often used to compare machine and human judgments. Our results showed that the average kappa for SVM, KNN, logit, and NB under tenfold cross-validation ranged from 0.460 to 0.555. Kappa values above 0.4 are considered desirable when comparing human and machine evaluations (Sakiyama et al. 2008). That is, the ML classifications of the syllabi are consistent with the human classifications. The confusion matrices (Fig. 1) for the four classifiers show that the syllabi at each DCPL were mostly correctly differentiated. We then randomly selected syllabi identified as having no, low, moderate, and high DCPLs and examined their contents. The syllabi classified by the ML models included descriptions related to the assessment criteria used in this study. For example, the syllabus for Engineering Mathematics (DCPL = N) contained no words related to digital skills or tools. Students in the Anthropocene (DCPL = L) course were required to access information on a website, which refers to the area of information and data literacy. Similarly, students in the Bilingual Creative Writing (DCPL = M) course were required to share content via digital media and to communicate and collaborate. Likewise, students in the Computer Programming and Engineering Application (DCPL = H) course were expected to create their own content using a programming language, which relates to digital content creation. These observations provide further evidence that ML can classify syllabi in a manner that agrees well with human evaluation (Table 4).

Fig. 1
figure 1

The average confusion matrix obtained through tenfold cross-validation

Table 4 Examples of syllabi classified by the ML models

Discussion

Using ML text categorization to analyze syllabi

This study was conducted in response to two research questions. RQ1: How well can ML evaluate the level of digital competence applied in a course from its syllabus? RQ2: How well can ML classify the levels of digital competence present in courses relative to human evaluation? To address these questions, we labeled 1200 syllabi and used them to train the ML models to perform the classification. The results show that the four ML models demonstrated high performance on the classification task. Regarding the first research question, ML methods can be used to evaluate the level of digital competence integrated into a course with high accuracy, precision, and sensitivity. Regarding the second research question, our results showed that the ML models could identify syllabi covering different levels of digital competence and could produce classifications that were highly consistent with those produced by human raters.

The results of this study echo previous findings that the field of education generates large amounts of qualitative data and that applying ML methods to text classification can provide accurate, consistent, relevant, and verifiable results to facilitate educational data analysis (Immonen et al. 2015). This study also suggests that even though university syllabi describe content from different domains in unstructured text, ML methods can still provide a reliable, effective, and efficient means of automatically evaluating the level of digital competence integrated into a course. Of the four most popular ML models used for text classification, SVM showed the best performance on the classification task, achieving the highest accuracy (0.713) and agreement (0.555) in tenfold cross-validation. This result corresponds to previous findings that SVM may outperform other classifiers in text classification tasks whose datasets are not very large (< 6000 instances) (Yu and Xu 2008). Therefore, we suggest that SVM be taken as a benchmark for developing further algorithms to reinforce the performance and consistency of the classification model in future studies. It is worth noting that high- and moderate-DCPL courses were misclassified more often than courses at other levels. This may be due to an overlap in the definitions and assessment rules for moderate and high digital competence courses relative to low- and no-DCPL courses. For example, interaction through technologies belongs to communication and collaboration, which indicates a moderate level, whereas content development belongs to the domain of digital content creation and entails a high DCPL. However, both levels call for students to use various digital tools to accomplish learning tasks, and many of the digital tools and terms used in syllabi overlap or are similar, such as PowerPoint, electronic presentations, the use of computers, or reliance on learning systems. Because the text classification relied on extracting significant features from keywords, these similarities may have contributed to misclassifications by the ML models when evaluating syllabi. Further research is needed to address this issue.

Implications for evaluating digital competence based on ML models for higher education

The issue of digital competence in higher education institutions has attracted particular attention in recent years (König et al. 2020). For the purposes of professional knowledge training and career preparation, students' learning and use of digital technologies during higher education are crucial for both their studies and their future lives (Tsankov and Damyanov 2017). Understanding how and to what extent university instructors integrate digital competencies into their classes has thus become an essential issue. However, past methods of evaluating competence have critical limitations that need to be addressed (Zhao et al. 2021). The results of this study confirm that ML-based assessment is not only feasible but also highly consistent with human assessment, which has several implications for assessing digital competencies in higher education institutions. First, the benefits of automated processing provide a solution to the human and time constraints that universities currently face in investigating digital competencies on campus. This high-performance approach can easily be applied to answer questions about digital competence in higher education institutions in real time. For example, are universities offering sufficient courses to develop digital competencies? What percentage of students are enrolled in courses that cover digital competencies? In addition, from a pedagogical practice perspective, the syllabus is one of the documents that demonstrate teachers' technological pedagogical content knowledge (TPACK), and the solution proposed in this study can be used to provide objective evidence for teacher development theory and practice (Loveless 2011). For example, how do university instructors respond to the current digital age in terms of their technological knowledge? What levels of digital competence do instructors demonstrate in their courses, and in what ways? In other words, higher education institutions can use this approach to gain a comprehensive understanding of the digital competency profile of campus faculty and students; prepare to provide appropriate training programs or assistance to students, faculty, or departments; and explore evidence-based strategies to enhance faculty members’ and students’ digital competency levels. Figure 2 presents the conceptual framework.

Fig. 2
figure 2

The conceptual framework for analyzing digital competence in universities

Conclusion and limitations

The ways in which higher education instructors integrate digital competence into their pedagogy are essential to a quality education and constitute a significant pathway for students to gain digital skills, especially during the present COVID-19 crisis. Hence, research that illuminates digital competence in higher education from different perspectives is urgently needed. Previous studies have relied on self-report, time-consuming, or labor-intensive methods to assess digital competence. This study adopted ML methods to evaluate syllabi and proposed a solution for assessing the degree to which digital competence is integrated into university courses. The results suggest that the proposed solution is efficient, effective, and objective relative to conventional methods of assessing digital competence. With it, higher education institutions can more efficiently assess digital competence in practice and develop educational interventions. Taking this practical approach, universities can direct resources toward increasing digital competence levels more effectively and plan for current and future development.

Although the results of this study are promising, they should be interpreted in light of its limitations. First, although the methodology produced the desired results, this study focused on the TF-IDF algorithm; sentence context, semantics, and implicit meaning were not considered. Future work could apply long short-term memory (LSTM) models or other advanced natural language processing techniques to improve performance and narrow the gap between machine and human evaluations of digital competence. Second, this study was conducted at a research university, and it is unclear whether the method would be equally effective at other types of universities (e.g., teaching universities and comprehensive universities). Future research could confirm the generalizability of the method by examining differences in its effectiveness across university types. Finally, a digital divide may still exist between instructors and students, and it would be valuable for future research to further examine the differences between instructors' and students' digital competency levels in order to address this issue.