Introduction

The fields of data mining, data collection and data analytics are advancing rapidly thanks to the momentum of technology. In higher education, institutions are being challenged to increase their understanding of students’ needs. In response, a number of institutions have developed and implemented diverse technological approaches, which incorporate the use of learning analytics. Learning analytics has been defined as the collection and analysis of data in education settings in order to inform decision making and improve learning and teaching (Siemens, 2011; Van Barneveld, Arnold, & Campbell, 2012), whilst academic analytics is the process used by higher education institutions to support operational and financial decision making (Van Barneveld et al., 2012). It is aimed at governments, funding agencies, and administrators rather than learners and faculty (Baepler & Murdoch, 2010). Educational data mining encompasses both learning analytics and academic analytics (Siemens & Baker, 2012) in order to understand the settings in which students learn (Chatti, Dyckhoff, Schroeder, & Thüs, 2012). Recently, concern has been expressed about the moral tensions (Willis, 2014) and ethical dilemmas associated with the processes of data collection, data mining and the implementation of learning analytics (Drachsler et al., 2015; Ferguson, 2012; Ferguson et al., 2014; Shum & Ferguson, 2012). This gives rise to questions relating to the purpose of data collection and how sensitive data will be handled (Oblinger, 2012). Lack of transparency in data collection and analysis has prompted the development of academic policy, particularly as it relates to the use of analytic data for student retention and support (Open University, 2014; Sclater & Bailey, 2015).

CQUniversity Australia developed an institution-wide engagement system called Early Alert Student Indicators (EASI) using learning analytics to automatically calculate an estimate of student success. The EASI system allows academics to send personalized nudges to students, which can be tailored based on the student’s projected success rate. When reviewing the way in which learning analytics data from the EASI system was being used, three issues emerged.

  • There was an institutional assumption that student data, consensually gathered at enrolment, could be analysed beyond the scope of the original consent.

  • Some academics used the EASI data in a manner inconsistent with the intent of the designers.

  • Academics interpreted the student’s individualised EASI data to label students based on their estimate of success.

Therefore this paper considers the ethical dilemmas arising from the use of learning analytics to support positive student outcomes and the desire of the institution to address attrition and retention rates of students.

The paper has six major sections. In the first section the rationale and description of the EASI project is outlined, highlighting the background of the project and the institution—CQUniversity Australia. The second section exposes ethical tensions that have emerged with the ongoing development of the EASI learning analytics system before addressing, in the third section, an ethical framework based on the work of Slade and Prinsloo (2013). The next section outlines the trends from applying this ethical framework to the EASI project. In the fifth section these trends are discussed in relation to the use of EASI by academics at CQUniversity Australia. In the final section we present the implications of this project and conclude the paper.

Description and background of the EASI project

CQUniversity Australia is a regional university that operates across a number of geographically dispersed campuses in Australia and online. There are upwards of 30,000 students studying courses ranging from Certificate I through to PhD qualifications, and the institution has become a leader in the field of distance or online education. Up to 50% of the student cohort choose to do one or more courses online via the distance mode. The institution also highlights the fact that many of its students are first in family to attend a university, or come from low socio-economic and Indigenous backgrounds. This means that the student cohort represents what was once considered a non-traditional university student. As such, there are perceived obligations or assumptions made in relation to support and scaffolding of academic work to promote or enhance successful learning outcomes and graduate success for these students (Adams, Banks, Davis, & Dickson, 2010).

The university has a duty of care in relation to the student. This can be interpreted as care in the teaching process, pastoral care in the social process and catering to the needs of the student in the context of inclusive practice (Demetriou & Schmitz-Sciborski, 2011; Tinto, 1999). All students have the right to access the curriculum even if this may require adjustments, modifications or accommodations. This is mandated within the legal framework of the Disability Standards for Education 2005 (Attorney General’s Department, 2005) and the Melbourne Declaration on Educational Goals for Young Australians 2006 (Ministerial Council on Education, 2008) and is operationalized through workplace policies at CQUniversity. An extension of the university’s duty of care manifests in an obligation to handle information about the students in a sensitive and secure fashion.

CQUniversity developed a learning analytics system called Early Alert Student Indicators (EASI) to help teaching staff identify those students potentially “at risk” of failure or attrition. Data is drawn from a number of university systems and combined with the student’s activity on the online Learning Management System, Moodle. The intention of the EASI system is to identify potentially “at risk” students so that intervention strategies can be implemented early.

The rationale for the development of the learning analytics system was based on CQUniversity’s Indicators Project, which found that greater activity within the LMS was associated with higher overall grades; it could therefore be assumed that lower activity signalled a higher chance of failure or attrition (Beer, Jones, & Clark, 2009). Furthermore, it was found that the majority of lecturer-student interactions within the Learning Management System tended to occur with high achieving students because they were the most visible in an online environment (Rossi, Van Rensburg, Beer, Clark, & Danaher, 2013). Based on this, the designers of EASI were motivated to seek ways to elevate the profile of lower achieving students so they could be given greater precedence by lecturers, and thus potentially increase student retention (Beer & Jones, 2014). EASI uses learning analytics to establish a judgement or “Estimate of Success” for each student. The system facilitates academic interventions with students via an in-built mail-merge feature.
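The underlying principle—low LMS activity relative to the cohort suggests a higher risk of failure—can be pictured as a minimal sketch. The function, weights and thresholds below are hypothetical illustrations only; the actual EASI model is not published in this paper.

```python
# Hypothetical sketch of an activity-based "estimate of success".
# The weights (posts counted double) and the 1.0 / 0.5 thresholds are
# invented for illustration; they are NOT the EASI algorithm.

def estimate_of_success(logins: int, posts: int, cohort_avg_activity: float) -> str:
    """Classify a student's risk band from simple LMS activity counts,
    relative to the average activity of their cohort."""
    activity = logins + 2 * posts  # weight forum posts more heavily than logins
    ratio = activity / cohort_avg_activity if cohort_avg_activity else 0.0
    if ratio >= 1.0:
        return "on track"   # at or above cohort average
    if ratio >= 0.5:
        return "monitor"    # noticeably below average
    return "at risk"        # very low activity relative to cohort
```

For example, a student with 2 logins and no posts against a cohort average of 20 activity units would be flagged "at risk", which is the kind of signal EASI surfaces to teaching staff.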

The EASI system was piloted in 2013 and adopted by the university in 2014. It was heralded as a great tool in aiding retention of students in an era when attrition rates govern academic strategies, and other universities have expressed an interest in adopting it for their own institutions (Beer & Jones, 2014). Between February 2014 and November 2015, there were 223,979 mail-merge interventions conducted across 898 individual courses. Preliminary and unpublished research by the university revealed a statistically significant correlation between academic interventions and increased student LMS engagement.

Ethical tensions arising from developing the EASI system

The increase in student engagement reflects the university’s and the EASI designers’ intention to enhance student success. However, there are some underlying ethical concerns that have emerged regarding the collection and analysis of student data. As part of the enrolment process all students agree to provide specific personal data and for the university to collect and use subsequent academic data. The university is required to collect certain information for statistical, financial and government purposes. This information includes student name, age, gender, grade point history, enrolment history and study load. In the development of the EASI system, ethical or privacy issues were only cursorily considered, since the university was already collecting data in an official capacity.

The institution regarded this large dataset as its own property and therefore considered itself at liberty to use the data in its own best interests. Because the data had been consensually collected, course designers saw no ethical issue in accessing and using it to help faculty provide better support to students who might be struggling or at risk of failing. However, it could be argued that further manipulation of the data, and the subsequent labelling of students by risk level, raises potential ethical dilemmas concerning the abuse of a power relationship, since the data is not de-identified and the consent initially obtained does not apply (Prinsloo & Slade, 2015; Slade & Prinsloo, 2013).

This paper draws upon the socio-critical perspective of learning analytics, in which the notions of power, surveillance, transparency and transience of student identity are acknowledged (Slade & Prinsloo, 2013). In hindsight, possible issues relating to labelling, surveillance and transience were never raised as even needing acknowledgement, let alone any requirement for revised student consent, perhaps because of institutional assumptions around the perceived ownership of the student data. Most ethical discussions relate to data mining and student privacy (Pardo & Siemens, 2014; Prinsloo & Slade, 2015). The use of student data in research, whether for institutional purposes, publication or other perceived altruistic purposes, requires consideration of ethical practice. Ethics should therefore be considered within a framework where benefit outweighs risk, where there is no exploitation of participants, where justice and fairness are seen to be practised, and where respect is shown, in this particular case, for the welfare of students enrolled in the institution (National Health and Medical Research Council (Australia), Australian Research Council, & Australian Vice-Chancellors’ Committee, 2007).

The growing interest in learning analytics across the higher education sector has led to a plethora of small scale projects that are endeavoring to explore its potential. Likewise, the EASI project was largely exploratory in nature and has spawned a number of associated research publications and presentations (Beer, Clark, & Jones, 2010; Beer & Jones, 2014; Beer, Jones, & Clark, 2012; Beer et al., 2009). At each presentation around the development and operation of the EASI system, an audience member has usually raised a question about potential ethical issues. These questions aligned with growing concerns of the designers in relation to how EASI was actually being used, and to what level students were aware of how their data was being utilized.

The EASI project couples learning analytics information with other institutional data to identify student engagement with learning materials and determine whether individual students are at risk of failing the current unit of study. The most common action taken by teaching staff based on EASI information is to use the in-built mail-merge facility to send students a personalized email offering support or assistance. For some students this might be the first time they were advised that their level of engagement was being monitored and that they were at risk of failure. The designers noticed that in some instances the email message could potentially be demotivating. For example, one email pointed out that the student’s current level of activity was far below where it needed to be and that they were “more than likely going to fail”. In this example the wording of the email is likely to be more harmful than beneficial to the welfare of the student concerned. It shows a lack of empathy for the student despite the intent to support them. In other words, the support comes across as a threat of failure if the student does not immediately rectify the situation by engaging in more activity within the Learning Management System. It can be seen from this example that care is required in how lecturers use the resultant analysis of EASI data in their contact with students. Not only does the use of analytics require ethical examination, but the subsequent action based on the EASI analyses also requires ethical practice in implementing supportive interventions.
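The mail-merge step described above can be pictured as simple template substitution over per-student fields. The template text, field names and course code below are hypothetical illustrations, not EASI’s actual templates.

```python
from string import Template

# Hypothetical nudge template; the wording, placeholder names and the
# course code used below are invented for this sketch.
NUDGE = Template(
    "Dear $first_name,\n"
    "I noticed you have not yet accessed the Week $week materials for "
    "$course. Do you need some help? Please contact me to discuss."
)

def personalise(student: dict) -> str:
    """Fill the template with one student's details, as a mail-merge would
    do for each recipient in turn."""
    return NUDGE.substitute(student)

msg = personalise({"first_name": "Alex", "week": 3, "course": "EDCU11021"})
```

The point of the sketch is that the tone of the intervention is fixed at template-authoring time: a supportive template scales supportively, and a threatening one scales its harm across every recipient.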

Questions relating to the use of EASI information combined with some examples of misuse, plus an increasing number of publications concerned with ethics and learning analytics prompted further research by the EASI team to identify potential issues. While literature around learning analytics is still relatively new, a number of authors have been drawing from other disciplines where the ethical use of data has been more thoroughly interrogated. For example, health, human resource management and national security are areas where extensive research has occurred with regards to the ethical use of data (Slade & Prinsloo, 2013).

Using an ethical framework to understand the use of EASI

The literature points to three types of ethical theories, which encompass a variety of approaches to ethics (Penslar, 1995). These theories are:

  • Consequentialist—which are concerned with the consequences related to actions;

  • Non-consequentialist—intentions of the person making decisions about actions;

  • Agent-centered—concerned with the overall ethical status of individuals.

Each ethical theory emphasizes a different aspect of an ethical dilemma, which can lead to the most ethically correct resolution (Penslar, 1995). In this instance, the decisions being made by academics are based on their own world view and positioning, and therefore will vary. It needs to be recognized that an institutional perspective and the views of individuals within that institution can differ through individual interpretation of ethical practice and values. Many institutions adopt a broad-based values statement that indicates ethical practice, while others develop a more specific framework that guides ethical practice. For the purposes of this paper, the framework outlined by Slade and Prinsloo (2013) forms the basis of the analysis of institutional assumptions regarding data collection and usage, as well as the implementation and use of EASI information by academics, thereby making visible potential ethical concerns and privacy issues when working with learning analytics data.

Slade and Prinsloo (2013) propose a learning analytics ethical framework based on grounding principles and considerations from which context-specific and contextually appropriate guidelines can be developed. This framework aims to better balance the potential for individual harm against greater scientific knowledge with regards to learning analytics. It also acknowledges the inherent unequal power relations between teachers and students, and notes that behaviors are subject to change when agents realize that they are being observed (Slade & Prinsloo, 2013). This aligned with anecdotal observations as the implementation of the EASI system was reviewed.

Slade and Prinsloo (2013) categorized ethical issues into three domains:

  • Location and interpretation of data;

  • Informed consent, privacy and de-identification;

  • Management, classification and storage of data.

Building upon these three domains they developed a number of principles that could be used as a framework around the ethical application of learning analytics:

  • Learning analytics as moral practice;

  • Students as agents;

  • Student identity and performance are temporal dynamic constructs;

  • Student success is a complex and multidimensional phenomenon;

  • Transparency;

  • Higher education cannot afford to not use data.

This framework is reflected in policies and codes of practice now emerging in this field (Open University, 2014; Sclater & Bailey, 2015). The Slade and Prinsloo framework was applied to the EASI system to identify potential ethical or privacy concerns related specifically to this system and its operation. Three major issues emerged during the review and analysis of the EASI system: data ownership, the ethics of surveillance and potential harm to students through labelling. In the following section the outcomes of the interrogation of EASI is presented within the framework outlined by Slade and Prinsloo (2013).

Interrogating the use and implementation of EASI

Within the context of the Slade and Prinsloo framework the analysis found:

  1. There was an institutional assumption that student data, consensually gathered at enrolment, could be analysed beyond the scope of the original consent (data mining and ownership).

  2. Some academics used the EASI data in a manner inconsistent with the intent of the designers (surveillance).

  3. Academics interpreted the student’s individualised EASI data to label students based on their estimate of success (labelling).

Data ownership

At several stages during the enrolment process students are advised that their data are being gathered. The specifics of how that data is to be analysed, and for what purposes, are not articulated. The institutional assumption is that if a student did not consent to the gathering of that data, then the student would not enrol at the university. This includes information such as name, address, date of birth and other identifying features. It also includes information such as grades and academic history. The practice at CQUniversity is that the university warns students in a pop-up box that the information is private. The student must acknowledge this box to continue the enrolment process, and this acknowledgement is used as implied consent to gather the data.

The ethical dilemma at this level relates to institutional power. There is a power relationship between the university and students as learners. When students are asked to provide data, they are obliged to do so or risk having their enrolment cancelled for not satisfying the enrolment requirements. If the university asks for information, a student will feel they have no choice but to comply. Institutions also have the power to then use the data in a number of ways that are not necessarily transparent to the student. So whilst consent has been obtained, that consent does not necessarily relate to the manner in which the data is analyzed, nor has consent been obtained for these subsequent uses. Currently, CQUniversity has the power to collect and analyse student data, but the policy and processes are not transparent in terms of analysis and purpose.

According to Siemens and Long (2011) the ways in which analytics are used by an institution need to be perceived “for the purposes of understanding and optimizing learning and the environments in which it occurs” (Siemens & Long, 2011, p. 34), in other words, for the benefit of student learning. Oblinger (2012) argues the importance of institutions informing students “what information about them will be used for what purposes, by whom and for what benefit” (Oblinger, 2012, p. 12).

Surveillance

The designers of EASI intended the system to encourage academic staff to proactively contact students who might be struggling with their studies. In order to establish whether or not academic staff were using EASI for the purpose originally intended, the email nudges that academic staff sent to students were analyzed. In total, 223,979 email nudges were delivered to 19,915 individual students across 898 course offerings. A text mining exercise was conducted on these emails, followed by a thematic analysis using Leximancer®. Leximancer® is a content analysis tool that calculates the frequency and co-occurrence of concepts within large bodies of text.
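Leximancer® itself is proprietary, but the kind of analysis described here—concept frequency and within-document co-occurrence—can be sketched in a few lines. The example emails and concept list below are illustrative only, drawn from the kinds of messages quoted in this paper.

```python
from collections import Counter
from itertools import combinations

def concept_counts(emails, concepts):
    """Count how often each concept appears across the emails, and how
    often pairs of concepts co-occur within the same email."""
    freq = Counter()
    cooc = Counter()
    for text in emails:
        present = sorted(c for c in concepts if c in text.lower())
        freq.update(present)                    # per-email concept frequency
        cooc.update(combinations(present, 2))   # unordered concept pairs
    return freq, cooc

# Illustrative inputs echoing the nudge wording discussed in the paper.
emails = [
    "Do you need some help with the assessment?",
    "If you do not attempt this assessment, you will fail.",
    "Please contact me, I would very much like to help you.",
]
freq, cooc = concept_counts(emails, {"help", "assessment", "fail"})
```

A thematic tool would do far more (stemming, seeded concept learning, relevance weighting), but frequency and co-occurrence counts of this kind are the raw material from which concept maps are built.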

The vast majority of emails were positive in nature and were intended to motivate or offer support to students. For example “Do you need some help?” and “Please contact me to discuss this situation, as I would very much like to help you”. However other emails could be interpreted as demotivating and contrast with the intended use of the system. For example “If you do not attempt this assessment, you will fail”. The authors also noted that some academic staff were using the email feature provided by EASI to send broadcast emails rather than using the learning management system email facility. For example “Hello everyone, Additional resources relating to Assessment 1 are now available in the…”. Given that the email feature of EASI promotes personalization for potentially struggling students, the use of this facility for broadcast emails falls outside of its intended use.

Academic staff are human and their individual perspectives can result in subconsciously biased interpretations of data (Slade & Prinsloo, 2013). This is further complicated by the fact that the EASI system uses raw data requiring a level of in-context interpretation. The EASI system is used to “nudge” students within the learning context. Siemens and Long (2011) note that using data based on a student’s behavior, such as how many times a student posted on a discussion forum, could mean a return to behaviorism as a learning theory. In the context of a student-centric approach to higher education, this theory is no longer recognised as the most pedagogically appropriate. It is not the quantity of postings, but rather the quality of the content within the post that is more meaningful. There is also a danger of a misdirected intervention from lecturing staff resulting in student demotivation, inefficiency and resentment (Kruse & Pongsajapan, 2012); the direct opposite of the intention, which was to provide better targeted support. The value of a constructivist approach is that it involves the learners, giving them some level of control through a choice to opt in or opt out (Prinsloo & Slade, 2015).

Labelling of students

Slade and Prinsloo (2013) question whether it is appropriate for students to be aware they are being labelled, or to label them in the first place. The Estimate of Success in the EASI system is dynamic and transient; it will change as the level of student engagement changes. This potentially biases and oversimplifies student learning, labelling a student as being at high risk of failure when in fact they may prefer to learn asynchronously and episodically.

Students are labelled based on past decisions, choices, failures and successes. Institutions can very easily perpetuate current biases and prejudices by not taking appropriate care with the dynamic nature of student identity, that is, by labelling students. Slade and Prinsloo (2013) recognize the plurality of student identity and note that “it is crucial that the analysis of data in learning analytics keeps in mind the temporality of harvested data and the fact that harvested data only allows us a view of a person at a particular time and place” (p. 10).

During a student’s time of study within an institution, their identities are in a continuous state of flux as they navigate the higher education landscape. It is important that students be allowed to make mistakes and learn from past experiences without their student profile “etched like a tattoo into… [their] digital skins” (Mayer-Schonberger, 2011, p. 14).

Implications for future practice

EASI Connect continues to be widely used at CQUniversity, despite the ethical concerns. Ultimately, the system continues to do what it was designed to do: provide learning support to students who may otherwise struggle or appear invisible. This approach reinforces the attitude espoused by Tinto (1999) about the benefits and strengths of learning analytics. The volume of nudges recorded and the student feedback received during course evaluations indicate appreciation for an individualised and tailored approach to student learning, particularly among online and distance students. Further, all academics receive ongoing training about the system and its strengths and weaknesses. This training is conducted by the EASI designers and is available one-on-one with academics if required. Finally, the labelling of students is dynamic and changes from week to week, even day to day. The use of the term “indicator” is promoted so it becomes clear that the intent is not to label or personally judge students. From a practical implementation perspective, the EASI Connect system works well and benefits student learning.

However, the issue relating to consent is yet to be resolved. Students are already made aware that their engagement within the LMS is being monitored. The question is whether the students should be able to opt out of this action. The students have already consented to having their data gathered, so they know the university is collecting the information. They also know that the data is being analysed so academics can assess their engagement during term. However, the students do not have the same level of access to their own data, and the consent relates to collection rather than analysis and reuse. It is feasible that the student may not be clear about which data is used or how that data is being individually analysed, as the process of data analysis is not currently transparent.

Prinsloo and Slade (2015) investigated the issue of consent. They determined that having the option to opt in or opt out of the data analysis was too simplistic. Their recommendation was to increase the level of transparency around learning analytics activities, so that students could become part of the process, instead of the process and decisions happening around them. Prinsloo and Slade believed that the benefits of data collection may not become apparent immediately, or that students may not have the required information to provide adequately informed consent. Consent could then become a sliding scale, provided (or revoked) “downstream” of the initial collection (p. 90). This approach would require universities to re-think and reconceptualise the notion of privacy, in terms of timing and focus. Privacy becomes “self-managed” by the student, rather than imposed arbitrarily by the institution (Solove, 2013). This resolution is not easily adopted within a structured hierarchical organisation like a university, where students’ privacy is considered paramount and where the organisation takes seriously its responsibility to manage students’ private data.

Conclusion

There is considerable hype around the potential of learning analytics to positively contribute to learning and teaching (Beer & Jones, 2014). This increases pressure for universities to adopt (or develop) the latest learning analytics systems (Beer & Jones, 2014; Gibson & Tesone, 2001). Whilst the hype around learning analytics is warranted in terms of learning assistance and support for students, the rapid development of various systems ultimately distracts from more deliberate and thorough implementations, particularly in the area of ethics. Such was the case at CQUniversity, where the potential of learning analytics to contribute to a complex learning and teaching environment compelled the development of the EASI learning analytics application. This paper has identified potential ethical issues which, as a consequence of rapid systems development, were overlooked during the system’s implementation.

The ethical dilemmas of ownership, surveillance and labelling are issues of which the university and the system designers are now aware. So, whilst the EASI system is still being used by academics (and this use is increasing), future research will be undertaken to determine how these issues can be appropriately addressed within this university’s context. For example, further research into student and lecturer perceptions of university access to learning analytics is required. A potential resolution is that, when student data is gathered, the institution should clearly and transparently explain how such data is used and analysed. Further, the notion of consent could become a fluid process. Future research could also develop a framework for the collection and use of data analytics that collectively provides a holistic perspective on a student’s individual performance, specifically student progression and completion, including critical momentum points. This framework could then be applied to other institutions, which could yield more sophisticated and nuanced information across the higher education sector.

The identification of ethical issues around learning analytics does not mean an end to the potential benefits that learning analytics systems brings to higher education. What it does mean is that higher education institutions need to be aware of how the implementation of such systems takes place, and the impact on the ethical rights of the individual students.