1 Introduction

Assessment is a process of checking objectives, in which what students produce at the school or college level is compared to a model [1]. Trotter [2] states that assessment, or evaluation, is a generic term for the set of processes that measure students’ learning outcomes. This process contributes to the effectiveness of teaching, since it consists of observing and interpreting its results. Assessment arises within the teaching-learning process, functioning as a mechanism that verifies whether the intended objectives were effectively achieved [1].

Assessment methods are an essential component of the course materials and curriculum provided to students enrolled in academic programs at higher education institutions around the world. Such assessments provide a better understanding of, and solid evidence on, the effectiveness of the teaching and learning process [3], and help in assessing and improving the academic performance and achievement of the internal stakeholders concerned, including students, instructors, and the higher education institution itself. However, many flaws in the process still require further investigation in both theory and practice. Although several prior studies have assessed the major factors impacting the academic performance, achievement, and success of students within higher education institutions in various nations (e.g., [4,5,6,7,8,9]), few studies have examined good assessment practices in higher education, the relationship between summative and formative assessment, self-assessment and peer assessment, and the role of technology in assessment [10], especially within the context of higher education institutions in Portugal. This constitutes clear evidence of the existing gaps and indicates that several issues related to the effectiveness and outcomes of assessment methods still require attention from scholars, particularly in the current global context.

To address the aforementioned gaps, the current paper analyzes the extent to which assessment methods impact the approval rates of undergraduate students in higher education institutions in Portugal. More specifically, this study examines how assessment methods, namely the course field, laboratories, projects, mini-tests, group work, individual work, frequencies (in Portugal, written tests taken during the semester), exercises, and presentations, affect the academic success of students in a Portuguese higher education institution. By doing so, the present study offers several theoretical and practical implications that could benefit higher education institutions and their internal stakeholders. More specifically, this paper contributes to the existing body of knowledge on the educational process in higher education institutions by concentrating on the connections between certain assessment methods and students’ final grades. It also provides practical guidelines and managerial implications for the concerned bodies in higher education regarding the impact of assessment elements and methods on students’ success.

2 Assessment Methods in Higher Education Institutions

One of the top priorities for educators in higher education is the quality of students’ performance, which differentiates institutions at the local, regional, national, and global levels. Academic success and/or achievement is regarded as a metric reflecting the quality and efficiency of the educational management process and its associated activities [4]. As a result, it is crucial to identify the most important factors impacting students’ academic performance. In this vein, several factors have a significant influence on students’ academic performance, such as status and school background, admission points, and socio-economic factors [11]. Moreover, assessment methods are also considered a substantial predictor of students’ academic success [2].

In their study, Pereira and Flores [12] highlighted that the assessment methods most used by teachers, according to students’ opinions, are oral presentations in group classes, written tests or exams, group reports, practical group work, team project work, and individual reports. The authors surveyed 254 respondents, of whom 165 were integrated master’s students and 89 were undergraduate students, all attending the 3rd year at a Portuguese university. In the same study, the authors indicated that many students believe the assessment would be fairer if at least two different assessment methods were used.

According to several prior studies (e.g., [13, 14]), the assessment of student work is divided into two main categories: formative assessment and summative assessment.

Formative assessment mainly focuses on activities and aims to facilitate learning [15]. In short, students respond to what is asked without a grade being assigned to their response, which reduces the high levels of stress and anxiety [1] usually detected in summative assessment situations. Fernandes, Rodrigues, and Nunes [14] refer to a literature review by Black and Wiliam on formative assessment practices, which shows that: a) the systematic practice of formative assessment improves students’ learning; b) the students who benefit most from formative assessment are those with the greatest difficulties; and c) students who attend classes in which formative assessment predominates obtain better results in external assessment exams than those who attend classes in which the predominant assessment model is summative.

Summative assessment has the function of verifying and quantifying what students have retained. It is therefore a terminal, retrospective assessment, applied at the end of the learning process. Moreover, it places students at different levels according to the results obtained and is therefore a grading assessment [16]. It is in this context that this investigation is centred.

In recent years, some research has focused on students’ and teachers’ perceptions of distinct assessment models and their impact on grades. For example, Segers and Dochy [16] analyzed students’ perceptions of two types of assessment in a problem-based learning environment: a written exam and collaborative work. They report that both types are highly correlated with the final assessment of the course; however, the exam grades were lower than supervisors expected. Although the students had two weeks completely free to work on the exam, they used only half the expected time, even though, according to Segers and Dochy [16], they had been supported and motivated to do it; there were no significant changes. Interestingly, the students’ perception was that the results did not match their expectations.

Meijer, Hoekstra, Brouwer and Strijbos [17] analyzed several models of collaborative learning assessment, namely group assessment, individual assessment, and group assessment combined with intra-group peer assessment. The authors emphasize the need for large-scale studies that can analyze and validate the most efficient assessment tools for collaborative work. Day, van Blankenstein, Westenberg and Admiraal [18] also analyzed the perceptions of students and teachers, but focused on intermediate assessment. The authors found that both teachers and students have a positive opinion of intermediate assessment. However, while teachers point to its potential as an opportunity to evaluate different knowledge, students prefer that intermediate assessment measure the same knowledge as the final exam.

2.1 Data Collection

At this stage, it was necessary to collect, filter, and process the data. The data on course averages between the academic years 2013/2014 and 2017/2018 were obtained through the tool developed in the work described by Miguel, Ramos, Martins and Costa [19], which contained, among other options, the averages of each course at the institution where the investigation took place. Figure 1 shows the layout of the tool by Miguel et al. [19] that allowed the selection of the respective information.

Fig. 1. Layout of Miguel et al.’s [19] tool.

We then manually collected information on the assessment methods present in all annual course plans. Next, the data were interconnected, joining the assessment methods present in the course plans with the respective average classifications of all the years available in the sample. The variables considered for this study were: course name, academic year, average grade, and assessment instruments (project, mini-tests, single test throughout the semester, two or more tests throughout the semester, group work, individual work, frequency, exercises, participation, labs, and presentations).
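As an illustration only, this interconnection step can be reproduced with a few lines of Python; the file and column names below are hypothetical stand-ins for the actual exports, not the ones used in the study.

```python
# A minimal sketch of the data interconnection step (hypothetical file and
# column names; the real data came from course plans and the tool in [19]).
import pandas as pd

plans = pd.read_csv("course_plans.csv")        # CourseName, AcademicYear, weights...
averages = pd.read_csv("course_averages.csv")  # CourseName, AcademicYear, AverageGrade

# Join each course plan with the average grade of the same course and year.
df = plans.merge(averages, on=["CourseName", "AcademicYear"], how="inner")
```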

The data were processed using the SPSS Statistics software (v. 26), where it was possible to verify that the data were well treated, that is, that they contained no wrong values or missing values (Table 1).

Table 1. Missing values.
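The same check can be sketched outside SPSS; assuming the merged frame df from the previous example, the following reproduces the kind of information summarized in Table 1.

```python
# Missing values per variable (the information summarized in Table 1).
print(df.isna().sum())

# "Wrong values": averages outside the 0-20 Portuguese grading scale.
out_of_range = (df["AverageGrade"] < 0) | (df["AverageGrade"] > 20)
print(f"{int(out_of_range.sum())} records with out-of-range averages")
```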

2.2 Statistical Analysis

The study followed the procedures of Marôco [20] for structural equation modeling (SEM). The following variables were considered:

  • Lab – Weight of laboratories

  • Proj – Weight of project

  • MT – Weight of mini-tests

  • 1T – Weight of a single test

  • 2T – Weight of two or more tests

  • GW – Weight of Group Work

  • Pres – Weight of oral presentation

  • F – Weight of frequency

  • IW – Weight of individual work

  • Part – Weight of participation in classes

  • Ex – Weight of exercises

The significance of the effect of the variables described above on the average grades was assessed using multiple linear regression, with parameters estimated by the maximum likelihood method implemented in the AMOS software (v. 25, SPSS, An IBM Company, Chicago, IL). The existence of outliers was assessed by the squared Mahalanobis distance (D²), and the normality of the variables was assessed by the univariate and multivariate coefficients of skewness (Sk) and kurtosis (Ku). No variable showed Sk and Ku values indicating severe violations of the normal distribution (see Appendix 1).
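For illustration, these diagnostics can be sketched in Python rather than AMOS; df and the column names follow the earlier hypothetical examples.

```python
# Outlier and normality diagnostics (a sketch, not the AMOS procedure).
import numpy as np
from scipy import stats

cols = ["Lab", "Proj", "MT", "1T", "2T", "GW", "Pres", "F", "IW", "Part", "Ex"]
X = df[cols].to_numpy(dtype=float)

# Squared Mahalanobis distance (D2) of each observation from the centroid.
diff = X - X.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
print("Largest D2 values:", np.sort(d2)[-5:])

# Univariate skewness (Sk) and excess kurtosis (Ku) per variable; |Sk| > 3
# and |Ku| > 10 are commonly used thresholds for severe violations of
# normality (cf. Marôco [20]).
for c in cols:
    print(c, round(float(stats.skew(df[c])), 2), round(float(stats.kurtosis(df[c])), 2))
```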

Multicollinearity was assessed using the VIF statistic, as described by Marôco [20]. The strong multicollinearity detected among 1T (VIF = 24.8), GW (VIF = 22.5), and F (VIF = 17.0) recommended the removal of two of these three variables (see Table 2).

Table 2. Multicollinearity check.
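As a hedged sketch, the VIF statistics in Table 2 can be computed with statsmodels, which regresses each predictor on all the others; values far above 5-10 signal strong multicollinearity.

```python
# VIF per predictor (a sketch of the check summarized in Table 2).
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X_vif = sm.add_constant(df[cols])  # cols as defined in the previous sketch
for i, name in enumerate(X_vif.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X_vif.values, i), 1))
```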

3 Results

The final model for positive grades was estimated with only the predictors Lab, Proj, GW, Pres, and IW, as this was the only combination whose estimates were all statistically significant (Table 3). Effects with p < 0.05 were considered statistically significant. The model adjusted to positive averages as a function of the use of laboratories, projects, group and individual work, and presentations explains 25% of the variability of the averages observed in the 792 observations analyzed. All trajectories analyzed between the variables are statistically significant (Table 3).

Table 3. Estimates of the model for positive grades.

Figure 2 presents the model with the standardized estimates of the regression coefficients and the explained variability of the average. It can also be observed that the use of laboratories as an assessment element is the only variable with a negative effect on the average value. Group work and the project are the most significant elements in explaining positive grades.

Fig. 2. Model for positive grades.
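Because maximum-likelihood and ordinary-least-squares estimates coincide for a linear regression with normal errors, the model can be sketched without AMOS; the following is an illustrative approximation, not the authors’ exact procedure.

```python
# Fitting the final model for positive grades (a sketch; cf. Table 3, Fig. 2).
import statsmodels.api as sm

predictors = ["Lab", "Proj", "GW", "Pres", "IW"]
positive = df[df["AverageGrade"] >= 10]  # positive averages on the 0-20 scale

fit = sm.OLS(positive["AverageGrade"], sm.add_constant(positive[predictors])).fit()
print(fit.rsquared)  # ~0.25 in the reported model
print(fit.pvalues)   # all effects significant at p < 0.05

# Standardized coefficients, comparable to the path weights in Fig. 2.
z = positive[predictors + ["AverageGrade"]].apply(lambda s: (s - s.mean()) / s.std())
print(sm.OLS(z["AverageGrade"], z[predictors]).fit().params)
```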

To check whether the model results were related in any way to the scientific area of the course, we began by grouping the sample courses by scientific area (Social Sciences, Mathematics, and Technology). About 57.8% of the records (458 averages) belong to the Social Sciences group; Mathematics has 92 records and Technology has 242.

As for the sample distribution (Table 4), only the data in the Mathematics group follow the normal distribution (p > 0.05).

Table 4. Shapiro-Wilk test.

Figure 3 shows the graph of averages by scientific area. The Social Sciences group has the highest average and the Mathematics group the lowest. To assess the existence of a relationship between the scientific area and the average, the Kruskal-Wallis test was applied. The results (p > 0.05) confirm that there is no significant relationship between the scientific area and the course average (Table 5).

Fig. 3. Average by scientific area (n = 792).

Table 5. Kruskal-Wallis test.
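As a sketch, the tests in Tables 4 and 5 map directly onto SciPy; the column name "Area" below is a hypothetical stand-in for however the scientific area is stored.

```python
# Shapiro-Wilk and Kruskal-Wallis tests by scientific area (Tables 4-5).
from scipy import stats

# Normality test per scientific area (Table 4).
for area, g in df.groupby("Area"):
    w, p = stats.shapiro(g["AverageGrade"])
    print(area, f"W = {w:.3f}, p = {p:.3f}")

# Kruskal-Wallis test of the averages across the three areas (Table 5).
groups = [g["AverageGrade"].to_numpy() for _, g in df.groupby("Area")]
h, p = stats.kruskal(*groups)
print(f"H = {h:.2f}, p = {p:.3f}")
```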

4 Conclusions

This study aimed to analyze the relationship between assessment methods and instruments and the final classification of courses. It was possible to conclude that there is no strong relationship, but there is an associated percentage of explained variability, which differs when only positive averages are analyzed.

The SEM analysis yielded a model that explains 25% of the variation in the averages for mean values equal to or greater than 10 on a scale of 0 to 20. This means that the model does not explain most of school success, but it does account for a share of it; other, external factors may explain the remaining 75%. We also found no significant differences in the averages between scientific areas, so the scientific area is not one of the factors that affect the results.

According to Afonso [21], factors such as parents’ education, household income, the socio-professional category of the household, gender, self-regulation, self-efficacy of learning, motivation for study, the relationship with colleagues, and the relationship with teachers are also explanatory factors for school success. Afonso [21] concludes that the socio-psychological model comprising the relationship with teachers, motivation for study, female gender, and household income contains the most important factors for academic success, with an explained variation of 52%. Combining the two studies, the right assessment method together with the socio-psychological factors could explain up to 77% of academic success (25% + 52%), under the strong assumption that the two sets of factors explain non-overlapping portions of the variance. Also, Gil, Martins, Moro and Costa [22] analyzed the academic success of 1st-year students with a 10-year assessment dataset (n = 9652) and concluded that there are 68 predictors of academic success not related to the assessment instruments, including socio-demographic characteristics, social origin, previous academic path, special statutes, and educational path.

In Portugal, students have at least two phases in which to complete a course in the same semester. When they pass in the first phase, they do not have to take the second-phase assessment, which is usually an exam. One limitation observed when data collection started was the impossibility of knowing whether the grade records present in the tool by Miguel et al. [19] belonged to one of these phases or to the two combined. For this reason, we assumed that all grade records referred to the first phase, and it was on this assumption that the averages of each curricular unit (UC) were calculated.

The current research provides various implications for both academics and practitioners. First, this paper adds to the extant literature on the educational process in higher education institutions by underlining the role of assessment methods in determining students’ academic success. It also produces a better understanding of the extent to which assessment methods (i.e., laboratories, project, group work, oral presentation, and individual work) can impact students’ academic success in a Portuguese higher education institution. This paper also adds to the limited number of studies conducted within the higher education context in Portugal (e.g., [23,24,25]). Another prominent contribution of the present paper is theoretical: it develops a model that helps identify the multiple dependency relationships between the studied variables. Furthermore, this paper provides valuable practical implications and guidelines for the concerned managers and other stakeholders of higher education institutions, in Portugal in particular, by providing evidence of the impact of assessment methods and tools on the final grades of students in higher education. This could help in evaluating the effectiveness of the methods used in higher education and strengthens the call for deploying innovative assessment methods to substantially improve the educational process.

It is hoped that this study will contribute to the choice of assessment instruments in courses with low approval rates. As future work, we suggest extending this study by analyzing further external factors that may influence students’ averages in higher education.

Ramsden, Bennett and Fuller [26] warn that it is difficult to combine learning results from different assessment methods, giving the example that the classification of practical work or a project will differ from the classification achieved by the same student in a test or exam. For this reason, assessment systems must become creative in the assessment practices they adopt, making it possible to use diversified and innovative methods.