
1 Introduction

Fostering motivation is important for successful learning in any context [38, 39]. Increased motivation has been associated with several important learning outcomes, such as persistence [37], retention [25], achievement [10], and course satisfaction [13]. It can influence what, when, and how one learns, and has proven to be a significant predictor of performance [33, 34]. Moreover, research shows that motivated learners are more likely to take on challenging activities, stay engaged, be creative, enjoy the learning process, and adopt a deep learning approach [32]. In most online learning scenarios, motivation gains even greater significance: students are often left to their own devices and must learn in a self-regulated manner, lacking the guidance, feedback, and stimuli usually provided by teachers [28].

Students’ motivation can be enhanced through various factors (e.g. interest in the subject) and techniques (e.g. gamification). Several investigations have shown that social comparison (SC) can be beneficial in educational settings as a powerful driver of motivation and an effective source of instructional feedback [9, 15, 36, 40]. Moreover, many experiments have demonstrated that SC can affect different categories of students differently. Olson & Evans [30] found that students high in neuroticism reported a greater increase in positive affect after comparing with under-achieving peers, while students high in openness preferred to compare with those who performed better. Leung [26] highlighted the need to individualise the information provided through SC. In a study on how the position on different types of leaderboards affects learning performance, [3] compared absolute and relative leaderboards and found that students’ motivation and preference differed based on where they stood on the leaderboard.

Yet most existing online learning support environments that leverage SC, while successful in general, disregard students’ individual differences. To escape this one-size-fits-all design, we need to achieve a deeper understanding of the interplay between student motivation and SC. We need to explore the ways in which personal, social, and contextual factors may influence students’ attitudes towards SC and their perception of educational interfaces based on this technology. Such an approach would allow us to organise online learning environments in a way that uses SC to facilitate student motivation as effectively as possible.

The goal of this study is to take one of the first steps in this direction and investigate students’ perception of different implementations of an SC-enabled educational interface in different scenarios. Consequently, the general research question of this study is:

“What is the student perception of SC implementations in online learning environments?”

This question is narrowed down by focusing on the effects of students’ performance and achievement motivation profile on their perception of SC. The three main research sub-questions are:

RQ 1: What is the general student perception of different SC implementations?

RQ 2: What is the relation (if any) between students’ performance and their perception of SC?

RQ 3: What is the relation (if any) between a student’s achievement motivation and their perception of SC?

2 Related Work

SC is a well-explored phenomenon. In his classical work, Festinger [11] stated that people have an inherent desire to evaluate their abilities, opinions, and beliefs, and that this desire is satisfied through SC. Wood [41] extended this definition and concluded that, in addition to self-appraisal, people may also engage in SC to self-improve and self-enhance. SC is spontaneous, effortless, unintentional, and happens relatively automatically, especially when other information is lacking. In educational scenarios, when students receive infrequent and constrained feedback from their teachers, they engage in SC with their peers to reduce uncertainty, evaluate their performance, build their self-concept, and direct their efforts. SC is a major feature of the classroom environment [27]: comparing one’s performance to that of others increases engagement and motivation, as it allows individuals to calibrate their effort and performance expectations.

In technology-enhanced learning, leaderboards and ranking charts have been widely used as SC tools to improve student motivation and engagement [3, 12, 23]. A critical examination of existing Learning Analytics dashboards [24] highlighted that such dashboards use SC to help students understand the behavior of their peers and use it as feedback on how they could improve. However, it was also noted that some designs may shift the focus to competition between learners rather than mastery of the subject. SC visualizations such as Comtella [5] and Knowledge Sea II [1] allowed students to compare their performance with that of their peers. Comparative information can also be provided through widgets on a Learning Analytics dashboard, which has been shown [8] to improve course completion rates in Massive Open Online Courses (MOOCs).

More advanced interfaces build on ideas from Open Student Models [7], such as Progressor [20,21,22], QuizMap [6], Reading Circle [17], and Reading Mirror [4], which combine personalized and social learning to motivate and engage students by nudging them to learn or read more. Mastery Grids [16, 29] is a social educational progress visualization that shows a student’s expertise across various topics and how it compares with the class average. This form of visualization was shown to improve student motivation and engagement. Recent variations [2] have implemented learner-controlled SC through a novel fine-grained peer-group selection interface.

Despite these positive effects, SC has also been reported to have mixed outcomes, as it fosters competition and makes individuals aware of their lack of skill, status, or position relative to others [14]. It has also been observed that, in some cases, such SC can diminish intrinsic motivation and overall performance [18].

This raises the question of which factors determine when SC leads to positive outcomes and when it does not. Mastery-oriented individuals have been found to prefer comparing with someone who outperformed them, because they wanted to self-improve [31]. Performance-oriented individuals, in contrast, were more inclined to self-enhance, which led them to prefer comparing themselves with less competent peers. If implemented incorrectly, SC may have detrimental effects on motivation and engagement [19]. It was also found [35] that students’ own SC preferences do not necessarily align with their best interests.

3 Methodology

The aim of this research is to explore how different students perceive different types of SC, in particular in terms of preference, informativeness, comfort, and motivation, and how their attitude is affected. We are also interested in whether these perceptions differ depending on students’ academic performance. To this end, we constructed prototypes that present SC information, which participants analysed and commented on through a structured questionnaire. This section describes the methodology of this research in detail.

3.1 Questionnaire Design

The goal of the questionnaire is to identify possible relations between students’ motivational profiles and their perceptions of different types of SC implementations. Consequently, the questionnaire is divided into two parts. The first part contains questions about the basic demographics of the participants and a scale to measure their motivation profile. The second part shows participants different SC implementations using predefined prototypes and asks questions about each implementation.

2 × 2 Achievement Goal Framework.

The part of the questionnaire used to measure participants’ motivation profile is the 2 × 2 Achievement Goal Framework by Elliot & McGregor (2001). The wording of the items was slightly modified to better fit the scenario. The framework consists of 12 items rated on a five-point Likert scale and is divided into four subscales, each measuring one of the four goal orientations: performance-approach, performance-avoidance, mastery-approach, and mastery-avoidance.
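As a minimal sketch of how such a scale could be scored, the snippet below averages each subscale’s items per participant. The item-to-subscale assignment and column names are illustrative placeholders, not the actual questionnaire layout.

```python
# Illustrative scoring of the 12-item, four-subscale questionnaire.
# The item-to-subscale mapping below is a placeholder, not the real item order.
import pandas as pd

SUBSCALES = {
    "performance_approach":  ["item_1", "item_2", "item_3"],
    "performance_avoidance": ["item_4", "item_5", "item_6"],
    "mastery_approach":      ["item_7", "item_8", "item_9"],
    "mastery_avoidance":     ["item_10", "item_11", "item_12"],
}

def score_profiles(responses: pd.DataFrame) -> pd.DataFrame:
    """Average the 1-5 Likert ratings of each subscale's items per participant."""
    return pd.DataFrame(
        {name: responses[items].mean(axis=1) for name, items in SUBSCALES.items()}
    )
```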

When students have performance-approach goals, they are focused on demonstrating competence relative to others. In contrast, students with performance-avoidance goals are concerned with avoiding failure in front of others. Students with mastery-approach goals are committed to learning and self-improvement; they are not necessarily motivated by the desire to show their competence to others. Students with mastery-avoidance goals are concerned about their inability to master the subject and may worry that they will not learn all that they possibly could in the class. Students with different types of goals use different strategies to achieve them, which emphasises the need for personalized support in technology-enhanced learning.

High vs. Low Performance.

For the second part of the questionnaire, a between-subject design was employed in order to compare the perceptions of high-performing and low-performing students and determine whether there are any significant differences between the two groups. We refrained from a within-subject design to keep the questionnaire from becoming too long and to avoid potential interaction between the scenarios. Participants were asked to imagine that they took part in a course for which the midterm exam results had just been posted online. We randomly assigned them to either the ’high-performance’ or the ’low-performance’ group. Depending on the group, they either passed the exam with an 8.5/10.0 and ranked in the top ten of the class, or failed it with a 5.4/10.0 and placed in the bottom ten of the class. The effects of students responding to a hypothetical situation were validated and are discussed in Sect. 5.1. To help participants visualise themselves as the student in the scenario, and to give them a clearer idea of what each SC implementation entails, prototypes were used as visual support.

Fig. 1. Upward and Downward Comparison prototypes for High and Low Performance Scenarios.

Prototypes.

The prototypes showed the scoreboard of the midterm exam, which displayed the class rankings, grades, and per-topic performance of five students, including the participant. The performance per exam topic is shown as a percentage and is colour-coded in five colours, ranging from dark red for scores below 50% to dark green for scores above 90%, with yellow representing 70%-80%.

We created four pairs of prototypes for the low-performance and high-performance scenarios each, resulting in a total of sixteen prototypes. The four pairs represent:

  • Anonymous versus public comparison

  • Close/adjacent versus far/non-adjacent comparison

  • Upward versus downward comparison

  • Results of friends only versus results of friends and other students

The prototypes in each pair were labelled Design A and Design B and were accompanied by short descriptions of what each prototype represented. The four prototype pairs corresponding to the participant’s performance group were shown to all participants in the order presented above. A deliberate decision was made to ask participants about their perception of public vs. anonymous comparison first and to keep all the other prototypes anonymous (with the exception of the prototypes that show the names of friends), so as not to let the issue of privacy interfere with their opinion of the other implementations. Figure 1 shows the prototypes related to upward vs. downward comparison for both the high- and low-performance scenarios. We chose positions moderately near the top and the bottom of the class to make them look realistic and relatable; based on the participant’s scenario, the corresponding pair was shown.

The other three pairs are displayed in Fig. 2, but only for the high-performance scenario. We do not include their low-performance counterparts to save space.

Fig. 2. Remaining three prototype pairs for the High Performance Scenario.

Questions About Student Perception.

For each pair of prototypes, four questions were asked to capture participants’ perception of the SC implementation represented by that pair. Each question asked participants to choose between the two designs on a five-point scale: (1) strongly prefer prototype A, (2) somewhat prefer prototype A, (3) consider both prototypes equal, (4) somewhat prefer prototype B, (5) strongly prefer prototype B. The four questions addressed general preference, how comfortable participants would be using such a prototype, how informative they found it, and how motivating they considered it. Each question was followed by an optional open-ended question asking the participant to explain their answer.

3.2 Participants

A total of 56 participants were recruited over a two-week span. 51.8% of participants were male and 48.2% female, with ages ranging from 18 to 29 and a mean age of 23. The majority of participants (69.6%) were Bachelor students, followed by Master students (25.0%) and a small group of high school students (5.4%). 27 participants were placed in the high-performance scenario group and the other 29 were assigned to the low-performance scenario group. Gender, age, and education level were well balanced between the two scenario groups.

4 Results

This section presents the results of our study: we first cluster students’ responses to identify typical motivation profiles and then detail how participants perceived the different SC setups.

4.1 Achievement Goal Orientation Clusters

First, we measured the internal consistency of participants’ responses to the four subscales of the 12-item achievement goal framework using Cronbach’s alpha. The results varied across the subscales. The internal consistency of the performance-approach (\(\alpha \) = .926) and mastery-approach (\(\alpha \) = .806) subscales was very high, whereas that of the performance-avoidance (\(\alpha \) = .508) and mastery-avoidance (\(\alpha \) = .698) subscales was rather low. We therefore explored further ways of grouping participants based on their achievement goals. We performed a hierarchical cluster analysis of students’ responses to identify typical achievement motivation profiles, which yielded three strong clusters. Cross-tabulation between the clusters and the achievement motivation subscales revealed that performance-avoidance and mastery-avoidance were not very useful in distinguishing the clusters, as their answer distributions were scattered across the three groups. These observations agree with the results of the internal consistency analysis of the “avoidance” subscales. Therefore, motivational profiles were created for each cluster based only on the levels of performance-approach and mastery-approach.
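The sketch below illustrates how these two steps (scale reliability and hierarchical clustering) could be computed. The Ward linkage and the variance-based alpha formula are assumptions; the paper does not state which settings were actually used.

```python
# Sketch of the reliability and clustering steps, assuming per-participant
# item ratings and subscale scores are stored as pandas DataFrames.
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a participants x items matrix of Likert ratings."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def motivation_clusters(profiles: pd.DataFrame, n_clusters: int = 3) -> np.ndarray:
    """Hierarchical (Ward) clustering of subscale scores, cut into n_clusters."""
    tree = linkage(profiles.to_numpy(), method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```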

The obtained clusters are summarized below:

  • Cluster 1: 12 participants, scored low in performance-approach (score \(\le \) 2.33) and low to moderate in mastery-approach (score \(\le \) 3.33)

  • Cluster 2: 23 participants, scored high in mastery-approach (score \(\ge \) 3)

  • Cluster 3: 21 participants, scored high in performance-approach (score > 3)

4.2 General Consensus About the SC Implementations

The four pairs of SC implementations described in Sect. 3.1 capture students’ perceptions and preferences across four dimensions that represent key design considerations when developing an SC interface.

Table 1. Descriptive statistics of participants’ responses to the four prototype pairs

Public vs. Anonymous Results.

When participants were presented with these two prototypes, more than half of them (51.8%) strongly preferred the anonymous prototype and a further 21.4% somewhat preferred it. The open questions revealed that this preference mostly stems from participants being uncomfortable with other people knowing their grades and position on the leaderboard. When asked which prototype design they felt more comfortable with, 22.6% of participants felt strongly more comfortable with the anonymous prototype and 35.7% felt somewhat more comfortable with it.

In terms of informativeness, most students considered both prototypes to be equally informative. Both designs were also found to be almost equally motivating. More details are presented in Table 1.

Close vs. Far Comparison.

The close-comparison and far-comparison prototypes present very different views of SC. Nevertheless, the descriptive statistics show that the medians for preference, comfort, informativeness, and motivation all lie around 3, meaning that most students were neutral about both prototypes. The picture changes considerably, however, when we split students into two groups based on their performance scenario; this is discussed in Sect. 4.3.

Upward vs. Downward Comparison.

The responses on upward and downward comparison show that students in general have a slight preference for the upward comparison design. More than half of the participants (55.5%) strongly or somewhat preferred the upward comparison, compared to only 16% preferring the downward option. Almost half of the participants (48.2%) considered upward SC more motivating, compared to only 5% who considered downward SC more motivating. For the questions about comfort and informativeness, participants considered both implementations to be almost equal.

Showing only Friends vs. Showing Friends and Others.

Based on the central tendency of the answers, the general consensus was that both prototype designs were equally preferred, comfortable, informative, and motivating in the eyes of the participants.

4.3 Between-Group Analysis: High Performance vs. Low Performance

To check for any significant difference in SC perception between the high-performance and low-performance scenario samples, Mann-Whitney U tests were performed on students’ answers to the 16 questions evaluating the four perception dimensions over the four prototype comparisons. The results came out significant only for the two questions comparing the close SC and far SC implementations. Students from the high-performance group preferred close SC more than students from the low-performance group. At the same time, students from the low-performance group regarded the prototype implementing far SC as more informative than students from the high-performance group did. For all other questions, no significant differences were found.
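A minimal sketch of this per-question analysis is shown below, assuming the answers are stored in a DataFrame with one column per question and a 'scenario' column distinguishing the two groups; the column names are placeholders.

```python
# Sketch of the between-group comparison over the 16 perception questions.
from scipy.stats import mannwhitneyu

def compare_scenarios(df, questions, alpha=0.05):
    """Run a Mann-Whitney U test per question between the two scenario groups."""
    high = df[df["scenario"] == "high"]
    low = df[df["scenario"] == "low"]
    results = {}
    for q in questions:
        u, p = mannwhitneyu(high[q], low[q], alternative="two-sided")
        results[q] = {"U": u, "p": p, "significant": p < alpha}
    return results
```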

The high-performance group preferred the SC prototype showing the results of peers directly adjacent to the target student on the leaderboard (Mean = 2.4; Median = 2.0; SD = 1.28) more strongly than the low-performance group did (Mean = 3.3; Median = 3.0; SD = 1.22); U = 195.5 (p-value = 0.042). The recurring reason in the explanations given by the high-performing students was that they were not interested in seeing people far away from them, especially far down the ranking. Seeing and comparing the results of people with a similar score, or who are ‘more equal’ to them, felt more interesting. Students in the low-performance group were more evenly split in their opinions regarding the two SC implementations. While some of them felt similarly to their high-performing counterparts and preferred to focus on the results of their ‘neighbours’, others valued a wider overview of the leaderboard for enabling them to better understand how the class at large is doing.

In a similar fashion, low-performance students considered far SC significantly more informative (Mean = 3.7; Median = 3.0; SD = 1.09) than high-performance students did (Mean = 2.9; Median = 3.0; SD = 1.29); U = 183.5 (p-value = 0.023). On this question, the high-performing students did not express a strong unified preference, while the low-performance students clearly saw the interface that allowed them to compare with the results of a wider range of peers as more informative.

Regarding comfort, the majority of both groups felt equally comfortable with both prototypes. However, a noticeable number of students in the high-performance group felt much more comfortable with the close-comparison prototype. These participants elaborated that the far-comparison prototype gave too much attention to the ‘best and worst performers’ or ‘outliers’.

It is worth mentioning that the results on the two remaining questions comparing these prototypes followed a similar pattern. High-performance students felt more comfortable with close SC, while low-performance students showed no strong tendency either way. At the same time, low-performance students felt that the far SC prototype motivated them better, while high-performance students regarded the two prototypes as equally motivating. However, these differences were not statistically significant.

4.4 Achievement Motivation and SC Perception

After identifying three clusters based on the achievement motivation scale in Sect. 4.1, Kruskal-Wallis tests were performed to find out whether there was any relation between a participant’s achievement motivation profile and their SC perception. Two pairwise comparisons turned out significant. When comparing students’ opinions regarding the informativeness of the anonymous vs. public SC implementations, there was a significant difference in answer distributions between Cluster 3 “high in performance-approach” (Mean = 2.11; Median = 2; SD = 0.809) and Cluster 1 “low in performance-approach and low in mastery-approach” (Mean = 3.18; Median = 3; SD = 0.751); H = 18.27 (p-value = 0.001). Students in Cluster 1 found public comparison to be more informative than the other students did.

For the upward vs. downward comparison implementations, there was a significant difference in the answer distributions for the question regarding how comfortable the two implementations are. Students in Cluster 3 “high in performance-approach” (Mean = 3.21; Median = 3; SD = 0.787) were significantly more comfortable with the downward comparison implementation than students in Cluster 2 “high in mastery-approach” (Mean = 2.59; Median = 3; SD = 0.825); H = 10.54 (p-value = 0.016). They were also significantly more comfortable with the downward comparison implementation than students in Cluster 1 “low in mastery-approach and low in performance-approach” (Mean = 2.55; Median = 3; SD = 0.688); H = 11.88 (p-value = 0.024).
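A rough sketch of this cluster comparison is given below, assuming a 'cluster' column with labels 1-3 and one column per perception question. The pairwise follow-up shown here is an assumption: the paper reports pairwise H statistics but does not name its exact post-hoc procedure.

```python
# Sketch of the per-question comparison across the three motivation clusters.
from itertools import combinations
from scipy.stats import kruskal

def cluster_effects(df, question):
    """Omnibus Kruskal-Wallis over the three clusters, then pairwise tests."""
    groups = {c: df.loc[df["cluster"] == c, question] for c in (1, 2, 3)}
    h, p = kruskal(*groups.values())
    pairwise = {
        (a, b): kruskal(groups[a], groups[b]) for a, b in combinations(groups, 2)
    }
    return {"omnibus": (h, p), "pairwise": pairwise}
```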

5 Discussion

This study of students’ perception of different SC design choices can provide useful insights for the developers of effective educational interfaces that use SC. It is quite clear that most students are much more comfortable with anonymous leaderboards than with public ones. A recurring explanation from students was that, although they are interested in finding out where they stand, they do not care about who they are outperforming. They also feel uncomfortable with the prospect of being judged by their peers should their performance be publicly observable and identifiable. These results are similar (both quantitatively and qualitatively) to the outcomes of another recent study that focused on this issue [3].

For close vs. far comparison, there was no strong general trend. However, the between-group tests show that high-performing students have a significantly stronger preference for the prototype that shows the results of students who are similar to them. These results are in agreement with the general trend characterising another aspect of SC, namely upward vs. downward comparison: the answers to those questions show that upward comparison was generally seen as both more preferred and more motivating. Hence, when looking at these two aspects together, an argument can be made that the higher preference for far SC among low-performance students is directly linked to their preference for upward comparison, as for them comparing far means comparing upward. For high-performance students, in turn, upward comparison means comparing near. One important exception comes from the analysis of students’ achievement orientation: students high in performance-approach orientation were significantly more comfortable with downward SC.

Student behavior is complex, and there are indeed major individual differences that explain how students perceive SC interfaces and how they are motivated by their design. This study highlights the need to develop interfaces that adapt to the needs of each individual.

5.1 Limitations and Mitigation

This research is based on a relatively small sample of 56 participants. Future work will address this by conducting similar studies in real courses with a much larger number of participants.

Another major limitation of this research is that the students were presented with a hypothetical scenario. This means that participants might have answered the questions based on a scenario that does not adequately reflect their real-life situation; for example, a top-performing student might have had to answer questions while imagining themselves at the bottom of the class.

To mitigate potentially erroneous conclusions, we conducted an additional analysis to understand how the hypothetical low- and high-performing students compare to the actual performance and expectations of the participants. Participants were asked what they generally consider a satisfactory course grade. With these answers, another round of analysis was conducted by selecting only the data of participants whose assigned hypothetical scenario matched their actual performance or expectations. Thus, we limited the data to actual high-performing students (or those striving to be) who were assigned to the high-performance scenario, and to under-performing students (or students who are satisfied with any passing grade) who were assigned to the low-performance scenario. This resulted in two groups of 12 participants each. Examining the answers of these two groups, it turned out that the answers from true low- and high-performance students were similar to those given by the full hypothetical groups. This led us to conclude that students were in fact quite effective in placing themselves in the required hypothetical scenarios and providing truthful and reliable answers. As a result, this limitation of using scenarios to measure the effect of performance might not be as severe as it seems at first.

5.2 Future Work

Future work should focus on uncovering students’ actual behavior and use of SC implementations by executing longitudinal experiments in the framework of a real course. Such experiments can lead to clearer and more conclusive results when the analysed sample is larger. Another idea worth investigating is the temporal change in students’ preferences or behavior during a course; for example, whether perceptions change after receiving a good or a poor grade for an assignment. The design of SC interfaces is another interesting area to explore: it might be valuable to determine what type of SC visualisation is preferred by students or is most effective in eliciting positive learning behaviour.

Another primary direction of future work is to find out how personal factors and individual characteristics, such as personality type and confidence level, affect a student’s SC perception and behavior. It would be interesting to see what factors make some students more willing and open to engage in SC than others.

6 Conclusion

SC is a powerful motivator, but because of individual differences among students, learners respond differently to what SC information is displayed and how. In this study, we examined students’ preferences with regard to different SC implementations. The four pairs of SC implementations were represented through prototype designs of a hypothetical midterm exam scoreboard of an online course and were presented in an online survey. Each prototype pair represented a choice for a certain SC implementation: keeping the results anonymous or public, showing only upward or only downward comparison information, showing people close to the participant’s ranking or farther away, and showing only the results of friends or of friends and other students. We learnt which types of SC interfaces high-performing and low-performing students prefer and find comfortable, informative, and motivating. We also found that there is an interplay between achievement orientation and preferences regarding the design of SC.

We also investigated the validity of using hypothetical scenarios by comparing the full sample with the participants who actually matched the scenario they were given, and discovered that the differences between imagined and actual high- or low-performance responses might not be as severe as one might expect. We acknowledge that there is still a need to uncover students’ actual behavior in a real course. Identifying students’ perception of SC provides us with insights that can lead to effective design decisions, helping students improve their motivation, comfort, and engagement with learning systems regardless of their individual differences.