1 Introduction

One of the most crucial decisions young students face at the end of high school is choosing an academic path. On one side, society asks them to choose something that makes them happy; on the other, the job market has particular career demands that do not always fit students’ skills or vocational interests.

For these students, completing vocational tests gives them a deeper understanding of their skills and interests, helps them make a better career choice, and can help reduce the dropout rate at higher education institutions during the first semesters.

In Latin America, universities offer compulsory first-semester courses that introduce students to university life or explore personal development (see [16,17,18,19]).

In contrast, in the USA, students take courses that explore their interests and address academic deficiencies in English and Math before entering university; these courses are mainly compulsory during the first two years of college [20].

Basic knowledge tests taken before university are standardized at the country or international level; an example of the latter is the Programme for International Student Assessment (PISA) test [22].

PISA is an international test applied to students near the end of high school. It is held every three years and assesses skills in mathematics, science, and reading. However, each country also has its own academic assessment.

In the USA, the SAT Suite of Assessments, developed by the College Board [9], is applied. This test evaluates Reading, Writing and Language, and Math, with an optional Essay.

In China, a similar test is the Gaokao [10], with the difference that it is decisive for obtaining a place at a university. This test evaluates Chinese literature, Mathematics, and English.

In Hong Kong, the equivalent is the Hong Kong Diploma of Secondary Education Examination (HKDSE) [11]. Taken at the end of high school, this exam evaluates Chinese and English languages, Mathematics, and Liberal Studies.

In South Korea, the Suneung [12], known in English as the College Scholastic Ability Test, is applied. It evaluates the Korean language, Mathematics, English, and Korean history, in addition to a second foreign language and a subject of free choice.

In Colombia, this test is known as Saber 11. It is taken in the last year of high school and administered by the Colombian Institute for the Promotion of Higher Education (ICFES, for its Spanish acronym) [26]. Saber 11 evaluates Math, Critical Reading, Natural Sciences, English, and Citizen Competencies.

We propose a career recommendation system based on the Gardner test, using the results of the Saber 11 tests and some variables with family information per student. We validated this system with the results of the EVP2 vocational test. This paper is organized as follows. Section 2 reviews prior research. Section 3 describes the theoretical framework. Section 4 presents our approach. Section 5 explains the implementation in detail. Section 6 shows the results obtained. Finally, Sect. 7 contains the conclusions.

2 Related Work

For this research, we took into account career recommendation systems based on Gardner’s Multiple Intelligences test. Shearer et al. [2] show the practical value of applying Gardner’s multiple intelligences to vocational guidance. Their work helps reaffirm that, beyond the interests a student may present, it is more valuable to consider the strengths and weaknesses obtained from multiple intelligences tests using the Multiple Intelligences Developmental Assessment Scales (MIDAS).

Kaewkiriya et al. proposed a recommendation system based on the design of rules focused on E-learning and Multiple Intelligences using a questionnaire applied to a population sample [3]. They used several algorithms (Naive Bayes, NBTree, and others) for validation, obtaining a maximum prediction percentage of 83.436%.

Two other studies also implemented Naive Bayes: Kelly et al. [1] and E. K. Subramanian et al. [6]. They propose an intelligent system based on multiple intelligences whose predictive engine uses the Naive Bayes model to identify the learning characteristics of each user, such as personal interests, extracurricular and curricular activities, and academic information.

These interests go hand in hand with vocational abilities. Obeid et al. [4] propose a recommendation system based on ontology and improved with machine learning techniques to recommend professional careers to students.

On the other hand, Yadalam et al. [5] propose a career recommender system based on content-based filtering. The system targets engineering students only and relies on the qualities and activities of each student. Based on these data, careers are matched to students using cosine similarity.

Dhar et al. [7] present a comprehensive work on the use of machine learning techniques. Based on academic performance, their recommender system predicts the appropriate academic program for higher studies. Among the algorithms used are K-nearest neighbors (KNN), Decision Trees (DTs), Naive Bayes, Random Forest (RF), XGBoost, Logistic regression, and others. In most of the tests performed, RF obtained the best prediction results.

Like the research presented above, our recommendation system uses machine learning techniques to recommend careers per student. However, in addition to the basic skills in fundamental areas (Math, Critical Reading, Natural Sciences, English, and Citizen competencies), we took into account some family-related variables, and we validated the recommendations with a standard vocational test.

3 Theoretical Framework

This research presents a recommendation system based on Gardner’s multiple intelligences test, validated using the standard EVP2 vocational test.

3.1 Gardner Test

Gardner’s Multiple Intelligences test consists of 80 questions and is derived from the theory of multiple intelligences, proposed in 1983 [8]. The theory holds that a person can be intelligent in a particular area and should not be measured solely by their intelligence quotient (IQ), an estimator of general intelligence resulting from standardized tests.

According to Gardner, for something to qualify as an intelligence, it must satisfy the “signs” of intelligence that he proposes. The eight intelligences are:

  • Musical-Rhythmic and harmonic: Sensitivity to recognize rhythms, tones, melody, and timbre.

  • Visual-Spatial: Refers to the ability to conceptualize and manipulate large-scale spatial models.

  • Linguistic-Verbal: Ability to identify words, know their meaning, order, sounds, inflections.

  • Logical-Mathematical: Ability to conceptualize logical relationships between actions or symbols, helps to be deductive and detect patterns to solve problems.

  • Bodily-Kinesthetic: Ability to use the whole body (or parts of it) to solve problems or create products.

  • Interpersonal: Ability to interact with others, sensitivity to their moods, feelings, temperament, and motivations.

  • Intrapersonal: Sensitivity to one’s feelings, goals, and anxieties.

  • Naturalistic: Ability to recognize and make distinctions in the world of nature.

Fig. 1. Implementation

3.2 EVP2

EVP2 is a systematized professional assessment and guidance scale that provides a graphical profile of interests based on 49 careers. The test consists of 245 questions, with an average response time of 30 min. The instrument has a Cronbach’s alpha of 94% [21].

Table 1 shows how EVP2 classifies the test results. Each career score falls within one of the established ranges, and the test recommends the careers with a high score. It is important to note that this test recommends careers only from a minimum score of 56 points per career.

Table 1. EVP2

3.3 Models

Our recommendation system consists of three machine learning models. We have used these models before, and we describe them below.

K-Nearest Neighbors (KNN). KNN is an instance-based algorithm used for classification or for predicting continuous values. It is one of the simplest algorithms: the prediction for a new point is obtained from its k closest training points [23]. For regression, the prediction is the average of the targets of the k nearest neighbors [15]:

$$\begin{aligned} \widehat{y} = \frac{1}{k}\sum _{i=1}^{k} y_{i}(x) \end{aligned}$$
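
The averaging rule above can be illustrated with a minimal sketch in plain Python. The Euclidean distance, the function name `knn_regress`, and the toy data are assumptions made for illustration; the source does not state which distance or library was used.

```python
import math

def knn_regress(train_X, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest neighbors."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Euclidean distance from the query to every training point (assumed metric).
    distances = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]
    # Keep the k closest points; sorted() is stable, so ties keep input order.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # y_hat = (1/k) * sum of the neighbors' targets.
    return sum(y for _, y in nearest) / k

# Hypothetical data: two test scores per student -> interest score for one career.
train_X = [(50, 60), (55, 65), (80, 85), (90, 95)]
train_y = [40.0, 50.0, 70.0, 90.0]
prediction = knn_regress(train_X, train_y, query=(52, 62), k=2)
```

With `k=2`, the two closest students are the first two, so the prediction is the mean of their targets.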

Decision Tree (DTs). DT is a supervised learning algorithm. It identifies the most significant variable and the value of that variable that best splits the data, and builds a predictive model from these splits [25]. For regression, the quality of a node is measured with criteria such as the Mean Squared Error, the half Poisson deviance, and the Mean Absolute Error, given below, where N_m denotes the training samples in node m and y_i their targets.

Mean Squared Error:

$$\begin{aligned} \overline{y}_m=\frac{1}{N_m}\sum _{i\in N_m} y_i \end{aligned}$$
$$\begin{aligned} H(X_m)=\frac{1}{N_m}\sum _{i\in N_m} (y_i - \overline{y}_m)^2 \end{aligned}$$

Half Poisson deviance:

$$\begin{aligned} H(X_m)=\frac{1}{N_m}\sum _{i\in N_m}\left( y_i \log \frac{y_i}{\overline{y}_m} - y_i + \overline{y}_m\right) \end{aligned}$$

Mean Absolute Error:

$$\begin{aligned} \mathrm{median}(y)_m = \underset{i\in N_m}{\mathrm{median}}\,(y_i) \end{aligned}$$
$$\begin{aligned} H(X_m)=\frac{1}{N_m}\sum _{i\in N_m} |y_i - \mathrm{median}(y)_m| \end{aligned}$$
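
To make the three criteria concrete, the following sketch evaluates each of them on the targets of a single node. The function names and the sample targets are illustrative assumptions, not part of the original implementation.

```python
import math
from statistics import mean, median

def mse_impurity(ys):
    """Mean squared error of the targets around the node mean."""
    y_bar = mean(ys)
    return sum((y - y_bar) ** 2 for y in ys) / len(ys)

def half_poisson_deviance(ys):
    """Half Poisson deviance around the node mean (targets >= 0, mean > 0)."""
    y_bar = mean(ys)
    total = 0.0
    for y in ys:
        # y * log(y / y_bar) is taken as 0 when y == 0 (its limit value).
        log_term = y * math.log(y / y_bar) if y > 0 else 0.0
        total += log_term - y + y_bar
    return total / len(ys)

def mae_impurity(ys):
    """Mean absolute error of the targets around the node median."""
    med = median(ys)
    return sum(abs(y - med) for y in ys) / len(ys)

# Hypothetical targets of the samples that fall in one node.
node_targets = [2.0, 4.0, 4.0, 10.0]
node_mse = mse_impurity(node_targets)
node_poisson = half_poisson_deviance(node_targets)
node_mae = mae_impurity(node_targets)
```

All three criteria are zero for a node whose targets are identical, which is why a tree prefers splits that lower them.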

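XGBoost. XGBoost is a gradient-boosted tree algorithm. It is a supervised learning technique that predicts a value from a set of values, and it generalizes boosting to arbitrary differentiable loss functions [24]. Because the parameters of the model are functions (trees), the objective cannot be optimized with traditional methods in Euclidean space; instead, the model is trained additively, adding at each iteration t the tree f_t that minimizes the following objective [14]:

$$\begin{aligned} \mathcal{L}^{(t)}= \sum _{i=1}^{n} l\left( y_i, \widehat{y}_i^{(t-1)} + f_t(x_i) \right) + \Omega (f_t) \end{aligned}$$

Here, l is a differentiable loss function and Ω penalizes the complexity of the tree. Taking a second-order approximation of the loss and removing the constant terms gives the simplified objective, where g_i and h_i are the first- and second-order derivatives of the loss with respect to the previous prediction:

$$\begin{aligned} \tilde{\mathcal{L}}^{(t)}= \sum _{i=1}^{n}\left[ g_i f_t(x_i) + \frac{1}{2} h_i f^{2}_t(x_i) \right] + \Omega (f_t) \end{aligned}$$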
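
As a worked illustration of the simplified objective, the sketch below computes g_i and h_i and the weight that minimizes the objective for a tree with a single leaf. It assumes a squared-error loss, l = ½(y − ŷ)², and an L2 penalty ½λw² on the leaf weight w; neither choice is stated in the text, and the constant per-leaf penalty is omitted because it does not affect the minimizer.

```python
def gradient_stats(y_true, y_prev):
    """First and second derivatives of 0.5 * (y - y_hat)^2 w.r.t. y_hat."""
    g = [p - t for t, p in zip(y_true, y_prev)]
    h = [1.0] * len(y_true)
    return g, h

def leaf_objective(g, h, w, reg_lambda=1.0):
    """Simplified objective for one leaf with weight w (assumed L2 penalty)."""
    data_term = sum(gi * w + 0.5 * hi * w * w for gi, hi in zip(g, h))
    return data_term + 0.5 * reg_lambda * w * w

def optimal_leaf_weight(g, h, reg_lambda=1.0):
    """Closed-form minimizer of the quadratic above: w* = -G / (H + lambda)."""
    return -sum(g) / (sum(h) + reg_lambda)

# Hypothetical targets and previous-round predictions for three samples.
y_true = [3.0, 5.0, 7.0]
y_prev = [4.0, 4.0, 4.0]
g, h = gradient_stats(y_true, y_prev)
w_star = optimal_leaf_weight(g, h, reg_lambda=1.0)
best_value = leaf_objective(g, h, w_star, reg_lambda=1.0)
```

Because the simplified objective is quadratic in the leaf weight, moving away from `w_star` in either direction increases its value.
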
Table 2. Variables

4 Our Approach

According to the city education department, the Secretaria de Educación Distrital (SED, Secretary of District Education) [27], Cartagena has 398 active schools, of which only 26% (103) are public.

For this research, we obtained the data from a questionnaire and group interviews with 250 senior students in a public school in Cartagena, Colombia. In 2019, the selected school had 2649 students from Pre-K to 11th grade, so the senior students represented about 9.4% of the entire school.

General and family information was collected through interviews with the students. The Gardner and EVP2 tests were applied, and we also considered the Saber 11 results. We used trial and error for variable selection, adding variables to the system and observing how much the performance improved.

The first test used only the Saber 11 results and the professional careers. Then we added family variables one by one: the relationship to the people with whom the students live, and those people’s age, occupation, and level of studies. We mapped these variables to the Gardner test results. The variables used are described in Table 2. For our system, we determined the top five careers according to the EVP2 score.

For this research, we did not consider IQ or other learning measures because they would skew the study’s objective and do not indicate which intelligence a student has most developed. We therefore focus only on multiple intelligences.

5 Implementation

Figure 1 shows the implementation of our career recommendation system. We used three machine learning algorithms to implement it: KNN, DTs, and XGBoost. For training and testing, we divided the data 80/20, following the Pareto principle [13]. Our investigation was carried out in stages, which we describe below.

5.1 Stage 1: Data Preparation

In this stage, we performed data exploration and variable selection. Using trial and error, we added the variables to each machine learning model and noted how the prediction improved as family variables were added.

Initially, we planned to use only the Saber 11 and Gardner test results, but with only these variables the career prediction failed. We therefore checked whether family variables could have an influence. We found that these students have different family structures, and very few follow the father, mother, and children model.

The age of these relatives also has an influence: some students were older siblings and so had more responsibilities. Some of these students commented on their need to study a technical career to find a job immediately. The parents’ level of education strongly influences the student’s decision to choose a technical, technological, or professional career.
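
The trial-and-error procedure can be sketched as a greedy loop that keeps a variable only when it improves the score. The function `forward_select`, the scoring callback, and every number below are hypothetical: in practice the callback would train a model on the listed variables and return its prediction percentage, whereas here a made-up table of gains stands in for it purely to exercise the loop.

```python
def forward_select(candidates, base_features, score_fn):
    """Add candidate variables one by one, keeping each only if the score improves."""
    selected = list(base_features)
    best = score_fn(selected)
    history = [(tuple(selected), best)]  # annotation of every attempt
    for name in candidates:
        trial = selected + [name]
        score = score_fn(trial)
        history.append((tuple(trial), score))
        if score > best:
            selected, best = trial, score
    return selected, best, history

# Invented gains, NOT results from the study; they only drive the example.
MADE_UP_GAIN = {
    "saber11": 0.50, "gardner": 0.10, "relationship": 0.05,
    "age": 0.02, "occupation": 0.03, "education_level": 0.04,
}

def toy_score(features):
    """Stand-in for 'train a model and measure its prediction percentage'."""
    return sum(MADE_UP_GAIN.get(name, 0.0) for name in features)

selected, best, history = forward_select(
    candidates=["relationship", "age", "occupation", "education_level", "unrelated_variable"],
    base_features=["saber11", "gardner"],
    score_fn=toy_score,
)
```

The `history` list plays the role of the annotations mentioned above: it records the score of each attempted variable set, whether or not the variable was kept.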

5.2 Stage 2: Machine Learning Models

We validate the machine learning models using KNN, DTs, and XGBoost because these models are easy to implement. According to our data preparation, the input variables were the Saber 11 and Gardner test results. The output variables were the results obtained in the EVP2 test. For each model, we obtained the top five careers recommended to each student and compared them with the top five careers with the highest EVP2 scores. Sometimes some of these algorithms fail to predict the top one, which is due to the model’s prediction percentage. In the next section, we present the results.

5.3 Stage 3: Recommender System

We use the results of the Gardner test to validate the top five of our system. For this, we took the top intelligences of each student. Using our top five, we compared each student’s intelligences with the intelligences required by each career.

Some careers need up to four of the multiple intelligences to guarantee excellent professional performance, so we reorganized our top five to prioritize the careers that included all the intelligences marked by the student.

This also helped us choose between careers with the same EVP2 score. If one of two careers matched most of the student’s intelligences, the system recommends it before careers that did not cover any of the student’s intelligences.
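
The comparison of the two top-five lists can be sketched as follows. The career names and scores are invented for illustration, and breaking ties alphabetically is an assumption; the text does not say how ties were ordered at this stage.

```python
def top_k(scores, k=5):
    """Return the k careers with the highest scores (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [career for career, _ in ranked[:k]]

def compare_tops(predicted_scores, evp2_scores, k=5):
    """Compare a model's top-k careers with the top-k careers by EVP2 score."""
    model_top = top_k(predicted_scores, k)
    evp2_top = top_k(evp2_scores, k)
    return {
        "model_top": model_top,
        "evp2_top": evp2_top,
        "top_one_match": model_top[0] == evp2_top[0],
        "shared": len(set(model_top) & set(evp2_top)),
    }

# Hypothetical per-career scores for one student.
predicted = {"Medicine": 71, "Law": 64, "Civil Engineering": 80,
             "Psychology": 58, "Accounting": 49, "Architecture": 77}
evp2 = {"Medicine": 75, "Law": 60, "Civil Engineering": 82,
        "Psychology": 55, "Accounting": 58, "Architecture": 70}
report = compare_tops(predicted, evp2)
```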
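
One possible reading of this reordering is sketched below: careers that include all of the student’s marked intelligences come first, then careers are ordered by EVP2 score, and ties in score are broken by how many of the student’s intelligences the career shares. The career-to-intelligence mapping, the scores, and the exact ordering rule are assumptions made for illustration.

```python
def rerank(top_careers, evp2_scores, required, student_intelligences):
    """Reorder a top-five list using the student's marked intelligences."""
    student = set(student_intelligences)

    def sort_key(career):
        needed = required.get(career, set())
        covers_all = student <= needed    # career includes every marked intelligence
        overlap = len(student & needed)   # used to break ties in EVP2 score
        return (not covers_all, -evp2_scores[career], -overlap, career)

    return sorted(top_careers, key=sort_key)

# Hypothetical mapping from careers to the intelligences they require.
required = {
    "Civil Engineering": {"Logical-Mathematical", "Visual-Spatial"},
    "Architecture": {"Logical-Mathematical", "Visual-Spatial", "Bodily-Kinesthetic"},
    "Accounting": {"Logical-Mathematical"},
    "Law": {"Linguistic-Verbal", "Interpersonal"},
    "Psychology": {"Interpersonal", "Intrapersonal"},
}
evp2_scores = {"Law": 85, "Civil Engineering": 80, "Accounting": 80,
               "Psychology": 80, "Architecture": 78}
top_five = ["Law", "Accounting", "Civil Engineering", "Psychology", "Architecture"]
reordered = rerank(top_five, evp2_scores, required,
                   ["Logical-Mathematical", "Visual-Spatial"])
```

In this example, Accounting and Psychology have the same score, and Accounting is placed first because it shares one intelligence with the student while Psychology shares none.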

6 Results

As mentioned above, we used an 80/20 Pareto split for training and testing on the data from 250 students. We also set aside 10% of the students, approximately 25, to validate the recommender system. Table 3 shows the actual values.
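
The nominal sizes of these partitions can be worked out with a short sketch. It assumes that the 10% validation group is removed first and that the 80/20 split is applied to the remaining students; this order, the shuffling, and the seed are assumptions, and the counts actually used are those reported in Table 3.

```python
import random

def holdout_then_split(student_ids, validation_frac=0.10, train_frac=0.80, seed=0):
    """Set aside a validation group, then split the rest into train and test."""
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle (assumed step)
    n_validation = round(len(ids) * validation_frac)
    validation, rest = ids[:n_validation], ids[n_validation:]
    n_train = round(len(rest) * train_frac)
    return rest[:n_train], rest[n_train:], validation

# 250 students identified by hypothetical integer ids.
train, test, validation = holdout_then_split(range(250))
sizes = (len(train), len(test), len(validation))
```

Under these assumptions, 250 students yield 25 for validation, and the remaining 225 are divided into 180 for training and 45 for testing.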

Table 3. Pareto

According to the results of the Gardner test and the Saber 11 tests, we verified the most marked multiple intelligences of each student and applied the machine learning models (KNN, DTs, and XGBoost) to them. For these models, we obtained a prediction of no more than 75.7%: 71.7% for KNN, 68.7% for DTs, and 75.7% for XGBoost.

The next step was to identify, from the EVP2 results, the highest-scoring career per student. Unlike EVP2, we did not establish a minimum score for recommendations. After validating the highest careers, we took the top five per student and compared this recommendation with the one obtained in EVP2.

According to the EVP2 test, the predictor’s correctness is 79.2%. However, when validated with the results of the Gardner test, the recommendation correctness increases to 88.2%.

We mapped our top five to the multiple intelligences of each student. The system considers which careers have in common all of the student’s most marked intelligences, and the predictor rearranges the top five accordingly. For example, in some cases, three intelligence types predominated for a student. We checked which of the top careers required these three intelligence types, and the system recommends such a career as the top one.

Then, we chose to validate only the top one of our improved system against the EVP2 recommendation. For the top one, we obtained a precision of 93.3%. However, in the validation test with 10% of the data, the algorithms we used failed at least three times to predict the top one.

In the few cases where our system did not recommend the career as the top one, it placed it as the second recommendation. We assigned a score of 100 if our top one coincided with the one recommended by EVP2, 80 if the career was in second place, and 60 if it was in third place.
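
The scoring rule can be written as a small lookup by position. The text gives values only for the first three places, so returning 0 for lower positions or for a career that is absent from the list is an assumption, as are the function name and the example careers.

```python
POINTS_BY_POSITION = {0: 100, 1: 80, 2: 60}  # first, second, and third place

def recommendation_score(system_ranking, evp2_career):
    """Score one student by where the EVP2-recommended career appears in our ranking."""
    if evp2_career not in system_ranking:
        return 0  # assumption: the text does not cover this case
    position = system_ranking.index(evp2_career)
    return POINTS_BY_POSITION.get(position, 0)  # assumption: 0 below third place

# Hypothetical ranking produced by the system for one student.
ranking = ["Civil Engineering", "Architecture", "Medicine", "Law", "Psychology"]
first_place = recommendation_score(ranking, "Civil Engineering")
second_place = recommendation_score(ranking, "Architecture")
third_place = recommendation_score(ranking, "Medicine")
```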

7 Conclusion

This research shows the implementation of a recommendation system based on the Gardner test. We verified that it is possible to implement such a system using family variables and the results of basic knowledge tests, which many countries use to guarantee admission to university.

We obtained acceptable scores, with a prediction correctness of up to 93% for the top careers. The system can help support long-term success in the career field. Validating our system against the EVP2 test, we found that EVP2 can sometimes fail to make a recommendation since it only considers careers with scores higher than 60 points. For example, for a student who did not obtain this score in any career, EVP2 cannot recommend.

Unlike the EVP2 test, we consider the multiple intelligences of each student. This allowed us to recommend to each student the careers with the highest scores. We did not establish a score range as EVP2 does; instead, we focused on the student.

As future work, we plan to design a web application for our recommendation system that schools can use during vocational orientation.