5.1 Introduction

5.1.1 The State of Data-Driven Student Support

The increasing use of technology to mediate learning is providing unprecedented amounts of data about how educational institutions work and how students participate in learning experiences. At the same time, learning scenarios are becoming increasingly diverse and complex. The areas of educational data mining (EDM) and learning analytics (LA) have emerged to address the question of how to use these data to improve our understanding of learning and enhance the overall quality of the learning experience for the student. Although EDM and LA researchers and practitioners maintain a similar focus (Baker and Siemens 2014), they differ in their approach to data generated in educational settings. Researchers in EDM frequently focus their analyses on the formulation or improvement of data mining algorithms designed to detect and predict important factors in a learning scenario. LA, on the other hand, focuses on how these algorithms can be deployed and integrated in learning designs, used by teachers, and provide tangible improvements for students. However, in their initial stages, both disciplines placed their emphasis mostly on how data can be collected and used by algorithms, and not so much on how these data can then lead to actions that have a positive effect on students.

Prior to the availability of massive amounts of data, the areas of intelligent tutoring systems (Corbett et al. 1997), educational hypermedia (Bra 2002), and adaptive hypermedia (Kobsa 2007; Brusilovsky 1996) used technology mediation to increase the support students receive while participating in a learning experience. But this recent increase in the number of data sources about events and information produced while students learn has prompted the use of new types of algorithms and techniques to achieve these improvements through more comprehensive understanding of how students work in these contexts.

The first initiatives in the LA space were conceived by comparing education with other fields, such as business intelligence, in which massive data sets were processed by algorithms to discover knowledge in a specific context. The term academic analytics was used initially to describe the application of business intelligence techniques to analyze the admission process in higher education institutions (Goldstein and Katz 2005). The objective was to use information about high school transcripts and previous tests to better understand student enrolment and retention during their first year at an institution. Campbell et al. (2007) later defined the steps involved in using student data, the stakeholders in this process, and the ensuing support that could be provided to students. Shortly after these initiatives, numerous decision-making processes in higher education institutions were reconsidered in the presence of data and algorithms. Long and Siemens (2011) further divided this area, applying the name “learning analytics” to initiatives that target improvements for learners at the departmental or course level.

One of the challenges addressed by early initiatives was the detection of so-called students at risk. These students are enrolled in an educational institution but are likely to fail, drop a course, or abandon their studies entirely. Numerous institutions have deployed LA initiatives to detect these students and offer additional support before they decide to abandon their studies (see Norris et al. 2008 for a review), thereby reducing the dropout rate and improving retention.

In more recent years these initial support actions have been extended to address other common difficulties faced by students while participating in a course. These systems are generically known as early warning systems (EWSs) and usually rely on a combination of demographic datasets and data derived from academic environments to identify students who need extra support (Lonn et al. 2012; Jayaprakash et al. 2014). The outputs from EWSs typically include notifications to teachers about which students are at risk (perhaps with suggested ways they could further support these students to stay at university), as well as actions proposed directly to the students (Krumm et al. 2014). This application of LA has since grown to encompass a wide variety of sub-areas that provide student support through a variety of methods (Ferguson 2012b). For example, some initiatives provided the information derived from predictive algorithms directly to students to alert them about the possibility of failing a course (Tanes et al. 2011).

Other initiatives consider the social dimension of learning, using data retrieved from discussion forums to deduce patterns of interaction among students. These patterns are represented as networks, and social network analysis algorithms are used to derive certain features and to visualize their topology (Dawson 2010; Dawson et al. 2010). Students can then be advised to re-assess their participation, or simply to reflect on their position in the network. The text exchanged by students in discussion forums is also a valuable data source for more recent techniques, known as discourse-centric analytics, that seek to detect evidence of learning and language usage patterns associated with positive academic outcomes (Ferguson and Buckingham Shum 2011; De Liddo et al. 2011; Knight and Littleton 2015). The characterization of these discussions makes it possible to provide highly detailed and potentially effective feedback that helps students improve their performance.

Making data available to teachers can assist them in better understanding and designing learning. For example, data visualizations are often used as artifacts to help teachers gain insight into how a learning environment unfolds (Verpoorten et al. 2011; Verbert et al. 2014), but they can also be offered directly to students to help them reflect on their approach to learning (Kahn and Pardo 2016; Corrin and de Barba 2015). Some authors have also identified the need to consider LA techniques during the learning design stages, and propose how to integrate data collection, analysis, reporting, and interventions in a unified workflow (Lockyer et al. 2013; Bakharia et al. 2016). In this case, increasing the quality of learning designs indirectly supports students.

Although these initiatives can all be connected to improvements that affect students, their focus is primarily on the steps to collect, analyze, and report data. Wise (2014) identified the need for the LA community to focus more precisely on the actions derived from the use of data. We argue that considering these interventions as personalized learning support actions is a very effective approach that connects the collection of data to tangible and effective changes in learning experiences, which then translate into quantifiable improvements. For example, dashboards that are available to teachers may well provide valuable insight into previously unobserved aspects of a learning experience. However, the benefit of the initiative is only realized when teachers deploy actions derived from these observations. Indeed, using technology only for the steps of collection and analysis, while neglecting the resulting actions, may seriously limit the overall effectiveness of LA initiatives (Clow 2012).

Many existing approaches to driving actions in response to student data tend to take a one-size-fits-all approach (e.g. Jayaprakash et al. 2014), building models to predict student engagement and success and then applying these models to detect and contact students who deviate from expected patterns. To increase predictive power, these approaches typically seek out large datasets from a range of courses or even across institutions. The innately contextualized nature of different courses means that the variables common across courses and institutions (and therefore usable in such models) are predominantly based on demographics and educational background. At best, this risks limiting our view of students’ ability to their past performance; at worst, it perpetuates stereotypes (Slade and Prinsloo 2013). Further, such analyses ignore the more granular nature of ongoing learning processes. Even when current learning data, such as interactions with the learning management system (LMS), are available, the highly contextualized nature of learning environments and instructional designs emphasizes the risks of one-size-fits-all data-driven approaches (Gašević et al. 2016). Therefore, a key argument of this chapter is that the data that drive support actions must be locally contextualized.

5.1.2 Local Contexts Influencing Data-Driven Student Support

With the costs of higher education for students increasing, and participation widening, there has been an increased and understandable institutional focus on, and government scrutiny of, dropout and attrition rates. In the Australian context, a decrease in government funding to the higher education sector has meant that universities themselves see increasing retention rates as a financial necessity, in addition to the moral imperative most feel to maximize the learning experience and success of all the students they enroll.

Concurrent with these sector-wide structural changes, there has been an increase in the range of available data sources and computational methodologies, which has led many institutions to identify LA as a strategic priority and to invest, sometimes heavily, in software solutions (Colvin et al. 2016). At the time the system reported in this chapter was initially developed, LA was not a priority at our institution but there was already an active network of teaching leaders and central student support staff concerned with the experience of first year students and their transition to university. Their efforts to improve the experience and outcomes of first year students had been informed by the notion of the transition pedagogy.

Building upon extensive research into students’ social and learning experiences by researchers including Tinto (2006), Kift (2009), and Nelson and Clarke (2014), the transition pedagogy articulates the importance of a unified design of the undergraduate first year curriculum and co-curriculum and stresses the role of engaging teachers in proactive, just-in-time academic and pastoral support. It thus highlights the need for a whole-of-institution approach where student success and retention are “everybody’s business” (Kift 2008) including support staff, teachers, and institutional leaders. Unconnected work from any one single area may be un- or even counter-productive. For example, excellent institution-wide support services may be underused or wasted if students are disengaged by impersonal teaching or swamped by poorly designed or aligned assessment regimes. However, timely and personalized feedback and support, directly connected to each student’s own learning data, can positively influence student engagement (Bridgeman and Rutledge 2010).

The transition pedagogy promotes the value of learning communities with active teacher-student interaction. In addition, it highlights the role of formative evaluation, feedback, monitoring, and timely interventions. Given the increasing role of online learning, this requires engagement with data by teachers and course coordinators—those most experienced with the particular stress points in their courses and able to intervene during semester. It also requires this work to be joined up with institutional support and wellbeing frameworks and services. Particularly when enrolments are large and students are taking a wide variety of subjects, including electives and service courses outside the enrolling faculty, ready access to relevant engagement and success data enables effective and personalized interventions at the point of need.

Divorcing teachers from the process through an overly centralized approach has the potential to lead to the use of easily obtainable but generic data. As well as excusing, or even excluding, teachers from the analysis, such an approach is unlikely to reflect the importance and unevenness of the learning experience. Similarly, without some degree of central coordination, efforts can be duplicated or left unaligned with each other and with the support systems. For students, this can cause frustration and disengagement.

5.1.3 Our Approach to Data-Driven Student Support

Here, we present a case study of an LA platform at The University of Sydney, the Student Relationship Engagement System (SRES), that is centered on student-teacher interactions in an attempt to connect teachers with their students through data. We describe the design and development of the SRES, which enables teachers to leverage data that are meaningful to them to provide scalable, contextualized, personalized learning support to students in large cohorts. These cohorts typically consisted of 600–1800 students in a single course, which contributed to our desire not only to reduce the substantial amounts of money lost to attrition but also to improve students’ learning experiences in a normally highly depersonalizing environment (Krause 2005).

In the rest of this chapter, we outline the needs, principles, and philosophies that guided its development, and then provide a description of the system itself. We then highlight some real applications of the SRES and the impact it has had on students. Finally, we conclude the chapter with a discussion of potential limitations and affordances of the current system, and avenues for wider institutional impact and development.

5.2 The Student Relationship Engagement System

The SRES started as a small-scale initiative in 2012 that initially sought to improve the efficiency and accuracy of in situ data collection during face-to-face staff-student interactions. At the time, the LA field was in its infancy and was primarily on a different trajectory; that is, finding algorithmic meaning in masses of pre-existing data. Although our approach also involved data, it contrasted starkly with this trajectory because it presumed that teachers would know the most appropriate data and their meaning, and that they needed a platform to collect, analyze, and act on these data at scale. As such, the SRES started with relatively small datasets created by teachers, and has gradually expanded to provide more ‘traditional’ learning analytics functionality as the data appetites and capabilities of teachers have grown.

5.2.1 Supporting Pressing Needs in Local Contexts

The SRES was initially developed to address a simple need to which most teachers in face-to-face and blended environments are resigned: the perennial scraps of paper or malformed spreadsheets used for gathering attendance and grades. These are usually followed by manual transcription and collation into a central spreadsheet, a process that typically ranges from non-existent to error-prone. Even then, teachers could do little with the spreadsheet apart from providing simple numerical grades to students.

An argument could be made that these data are perfunctory as opposed to pedagogically meaningful (and, by extension, valuable for LA). Although interim grades and other performance data are often ignored by, or unavailable to, LA systems (Clow 2012), large-scale analyses have shown that they can be among the most important predictive variables in models of academic risk (Jayaprakash et al. 2014). Similarly, in the context of face-to-face education, class attendance has been positively associated with improved student outcomes (Rodgers 2001; Massingham and Herrington 2006; Superby et al. 2006), and although it is a data source frequently requested by teachers, it is notoriously difficult to collect (Shacklock 2016; Dyckhoff et al. 2012). Additionally, a large proportion of meaningful student-teacher interaction and assessment may occur outside of the LMS, which is a blind spot for typical LA approaches (West et al. 2015).

Beyond data collection, interventions are a key part of LA (Clow 2012), and it is important that affordances for such actions are closely associated with the data (Jones et al. 2013). In this chapter, we adopt a high-level understanding of intervention, involving “any change or personalization introduced in the environment to support student success, and its relevance with respect to the context” (Macfadyen et al. 2014). While direct student contact is certainly not the only intervention that should arise from LA, the affordances of an electronic system to accelerate this process were critical in our context.

5.2.2 Approach and Philosophy for Design and Development

There appears to be a lack of connection between the capabilities of extant LA tools (which, as we have argued, focus on data collection, analysis, and reporting), and the data needs of teachers to act (for example, by connecting with their students at scale). In light of this, a pressing and tangible need for our teachers was therefore a platform capable of allowing efficient and accurate collection of desirable data, and action based on these data.

To address this, we took a participatory design approach similar to that of others working to design and develop LA that would be practically useful and meaningful for teachers and other staff (Lonn et al. 2013; Dyckhoff et al. 2012). From 2012 on, a basic platform that recorded attendance via a web-based, mobile-friendly interface and saved data to a central database was iteratively designed and refined based on user feedback to become the SRES.

Throughout this process, we followed a set of basic design philosophies to guide development. These were fundamentally LA-contextualized reflections of the attributes of diffusible innovations, in particular the notions of relative advantage, compatibility, complexity, trialability, and observability from Rogers (2003), and we ground the following discussion on these attributes.

Teacher-centered. A truism is that “faculty have, for the most part, relied on their intuition and hunches to know when students are struggling, or to know when to suggest relevant learning resources, or to know how to encourage students to reflect on their learning … these hunches are not going to disappear with the advent of learning analytics, nor are the actions derived from them” (Dietz-Uhler and Hurn 2013). Additionally, given that (i) LA is “not an elixir for ineffective teaching, nor does it reveal an ideal pedagogy” (Pistilli et al. 2014), (ii) teachers have pre-conceived notions of meaningful data about their students (as argued above), and (iii) compatibility of innovations with existing ideas and felt needs are positively related to adoption (Rogers 2003), an LA innovation needs to address the contexts that real-world teachers face.

Part of a possible solution lies in a system architecture that corresponds with teachers’ conceptualization of data and how to work with it, helping to address issues around compatibility with existing ideas. Additionally, a solution should be cognizant of, and tangibly address, concerns around academic workload (Macfadyen and Dawson 2012), yielding a high level of relative advantage and being compatible with felt needs.

Human-centered. Despite the LA field claiming to differentiate itself from EDM by highlighting the centrality of leveraging human judgment (Siemens and Baker 2012), a large proportion of LA work appears to focus on propelling data at algorithms in order to extract meaning. Ryan Baker’s recent propositions of moving these fields towards the amplification of human intelligence are instructive here: “Humans are flexible and intelligent. Humans cannot sift through large amounts of information quickly… But once informed, a human can respond effectively” (Baker 2016).

Lack of human-centeredness in LA also extends beyond approaches to analyses and pervades implementation. A concern that should be raised more frequently is that “the focus of LA appears fixed to an institutional scale rather than a human scale” (Kruse and Pongsajapan 2012). These somewhat condemning perspectives remind us of one of the seminal principles of good practice in higher education, namely encouraging the human relationship between teachers and students (Chickering and Gamson 1987). Solutions addressing this problem must keep humans using the system at the center instead of data and analytics.

Customizable, flexible, and scalable. A substantial amount of learning interactions and data exist outside traditional sources (typically LMS and SIS databases) that LA systems can and do interrogate (West et al. 2015). Beyond the obvious challenges around data warehousing and integration (Bichsel 2012) and despite the best intentions of designers and developers, there may be several pieces of offline or other system data that cannot be automatically integrated. Additionally, teachers often demand the freedom to teach how they wish, which has important implications for the affordances of LA tools (West et al. 2015). Therefore instead of coercing teachers into a system with pre-defined (and possibly limited) data, a different solution lies in building avenues that allow teachers to define and bring in their own local and contextualized data (Graf et al. 2011).

Transparent. In this age where opaque algorithms run so many aspects of our lives, algorithmic accountability has become an important ethical challenge (Diakopoulos 2015). Learning analytics is not immune to this trend. Distrust of data and their analyses can lead to significant barriers for LA adoption (Bichsel 2012). Nevertheless, large-scale deployments of LA systems have typically relied on opaque algorithms to predict student performance (e.g. Arnold 2010; Jayaprakash et al. 2014). A possible solution to avoid such algorithmic black boxes lies in simplifying (perhaps even oversimplifying) the analytics to the extent that it is completely controlled by teachers (Liu et al. 2015). This may help to reduce perceived system complexity, and enhance the ability for teachers to experiment with analytics.

Actionable. In keeping with a human focus, the predominant avenue of intervention arising from LA appears to still be teachers or other staff interacting with students. In an Australian study, personal responses through emails, phone calls, and consultations were the preferred mechanism of data-driven actions (West et al. 2015). In another study, tools that “manage data inputs and generate outputs in the form of actionable feedback” were found to be the most adoptable (Colvin et al. 2016). Even large-scale implementations that involve opaque algorithms eventually involve teachers contacting students based on the outputs of these algorithms (e.g. Arnold 2010; Jayaprakash et al. 2014). In an exemplar of intelligence amplification (Baker 2016), LA provided the means to focus discussions that students had with their academic advisors, and to target help where it was most needed (Lonn et al. 2012). A possible solution includes the provisioning of customizable actions to promote and support teacher-student interactions. These tangible outputs may also help to promote the observability of any LA innovation.

Ethical and secure. An LA system that augments the ability of teachers to provide data-driven student support can help to simultaneously balance ethical and operational issues around irrelevance and intrusiveness. Decontextualization of data and consequent generalizations about students can lead to invalid assumptions and unhelpful data-driven support (Slade and Prinsloo 2014). One possible solution is to leverage the data on students’ studies that teachers already have access to and use (perhaps in an inefficient, distributed fashion). If these data were easy to curate and act upon at scale, such an LA solution may not overstep students’ existing expectations of privacy. This is in keeping with our design philosophy of augmenting teachers’ intelligence and abilities.

Data protection must be a core value in any LA venture and helps to build trust in LA systems (Drachsler and Greller 2016). This may involve ensuring that all student data are encrypted during transit, and stored on secure university-owned and -controlled servers (Slade and Prinsloo 2013). Removing identifiable records after a set timeframe in line with university record retention policies may also help address some concerns over data security. Beyond the critical ethical and legal issues surrounding data security, any negative occurrences could have severe repercussions for the adoption of future innovations (Rogers 2003).

Working with these philosophies, we sought to design and develop an LA system that met real and pressing needs of teachers in our contexts. Our approach was to build a platform that required active input from teachers but provided them the ability to personalize student support at scale and gain insight into their cohorts while saving time in the execution of these processes. We purposely designed the data and system architecture to support these goals.

From a teacher’s perspective, an electronic spreadsheet is one of the most common ways to handle student data—it is inherently customizable and extensible, has no hidden algorithms, and typically represents rows of students with corresponding columns of data. The issue with spreadsheets is that they are not immediately actionable, and deriving meaning at scale is difficult. Nevertheless, this matrix structure of student data seems to be eminently accessible and understandable by teachers and other staff. As Rogers (2003) points out, “[o]ld ideas are the main mental tools that individuals utilize to assess new ideas and give them meaning. Individuals cannot deal with an innovation except on the basis of the familiar. Previous practice provides a standard against which an innovation can be interpreted, thus decreasing its uncertainty” (p. 269). Since this matrix structure of data is familiar and flexible, we opted to ground the data architecture on the idea of students (in rows) and data (in columns representing different variables or features) belonging to courses (in tables).
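To make this structure concrete, the following minimal sketch (our own illustration in Python, not the actual SRES schema; all class, field, and example names are hypothetical) models courses as tables, students as rows, and teacher-defined variables as columns.

```python
from dataclasses import dataclass, field

@dataclass
class StudentRow:
    student_id: str                              # unique institutional identifier
    preferred_name: str                          # teacher-recordable detail
    data: dict = field(default_factory=dict)     # column name -> value

@dataclass
class CourseTable:
    course_code: str
    columns: list = field(default_factory=list)  # teacher-defined columns
    rows: dict = field(default_factory=dict)     # student_id -> StudentRow

    def add_column(self, name):
        """Teachers can specify an unlimited number of columns."""
        if name not in self.columns:
            self.columns.append(name)

    def set_value(self, student_id, column, value):
        """Save a single data point for one student."""
        self.add_column(column)
        self.rows[student_id].data[column] = value

# One table per course: students in rows, variables in columns.
table = CourseTable("BIOL1001")
table.rows["312456"] = StudentRow("312456", "Sam")
table.set_value("312456", "week1_attendance", "2017-03-06 09:04")
```

Keeping every value directly inspectable in this familiar matrix, with no hidden transformation, is what allows the spreadsheet mental model to carry over to the system.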

5.2.3 Flexibility in Importing Students and Data

While connections with enterprise student information systems have allowed some LA developers to leverage institutional data warehouses (Lonn et al. 2013), this was not possible in our context, which encouraged us to design an interface that allowed teachers to import student enrolment information semi-automatically. This required them to download an enrolment list from another (enterprise) university system and upload it to the SRES; we made this process as streamlined as possible in the importer interface (Fig. 5.1). The benefits of this included that staff (i) could combine lists from different courses, (ii) could add non-regular students (such as those from outside the university, as is often the case in bridging courses), (iii) could record other details such as a preferred name, which was not possible using enterprise systems, (iv) could have as many lists (tables) with as many students as they liked, and (v) could work in the system safely without affecting data on other enterprise systems. Obvious drawbacks included the need for semi-manual updating of course lists when enrolments changed, and duplication of some data across multiple university systems.

Fig. 5.1 Data and system architecture of the SRES. The SRES core data model is based on familiar tabular student records. Various user interfaces can bring data into the SRES, such as an importer, mobile application, and a web-based interface. Flexible querying and messaging interfaces and engines power the majority of the data-driven student support, allowing teachers to build and deploy highly personalized interventions. Experimental features (shown as dashed boxes) take advantage of the data that are already stored in the SRES. Teachers can build customized student views to conditionally show students their own data and other information. A machine learning interface guides teachers through preparing and analyzing data within the SRES using various machine learning algorithms in order to uncover hidden patterns

Once the students (rows) were in place in a list (table), teachers could update these as necessary and also specify an unlimited number of columns. The data in these columns could also be brought into the system through an importer interface (Fig. 5.1), which accepted comma- or tab-delimited plain text files [e.g. comma-separated value (CSV) files] and guided the teacher through mapping a student identifier column and data columns.
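A minimal sketch of this import step, continuing the hypothetical data model above (the function name and arguments are our own, not the SRES interface):

```python
import csv

def import_flat_file(table, path, id_column, data_columns, delimiter=","):
    """Map a student identifier column and selected data columns from a
    comma- or tab-delimited file into the course table."""
    with open(path, newline="") as f:
        for record in csv.DictReader(f, delimiter=delimiter):
            sid = record[id_column].strip()
            if sid in table.rows:  # update only students already in the list
                for column in data_columns:
                    table.set_value(sid, column, record[column])

# e.g. import_flat_file(table, "lms_gradebook.csv", "SID", ["quiz_1", "quiz_2"])
```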

5.2.4 Mobile Interface for Staff

One of the primary drivers for developing the SRES was the need for live, in situ data storage and retrieval in face-to-face learning scenarios. To enable this process, we built a mobile-friendly web-based interface that would allow staff members interacting with students to select columns for which they wanted to input data, specify the data entry pattern, and save data for each row (Fig. 5.2). An example of one of the workflows for this mobile interface is given below (a brief code sketch follows the list):

Fig. 5.2 An example of the SRES mobile interface, at step 4 of a timestamp data entry pattern (see text). The upper section of the screen is fully customizable and can display data from other columns. The lower section provides alternative means to identify students, such as scanning 1D or 2D barcodes, and searching

1. A teacher authenticates and accesses a pre-defined column in the SRES by scanning a 2D barcode or following a link on their mobile device.

2. The column receiving the data is shown to the teacher.

3. The teacher identifies a student by (i) scanning a code that uniquely identifies the student (e.g. a student card, or a 2D barcode produced by the SRES), or (ii) searching for the student using identification number, email, or name.

4. Once the student is identified, the interface displays a set of values pre-defined by the teacher for that column, and/or allows entry of a custom value.

5. The value selected or entered is saved to the table.
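As a rough illustration, steps 3–5 reduce to logic like the following (a sketch against the hypothetical data model above; the production interface additionally handles authentication, barcode scanning, and the web UI):

```python
def identify_student(table, scanned_or_typed):
    """Step 3: match a scanned code or a search string to a student row."""
    query = scanned_or_typed.strip().lower()
    for row in table.rows.values():
        if query == row.student_id.lower() or query in row.preferred_name.lower():
            return row
    return None

def record_value(table, column, student_key, entered_value):
    """Steps 4-5: save a pre-defined or custom value for the identified student."""
    row = identify_student(table, student_key)
    if row is None:
        raise LookupError("no matching student")
    table.set_value(row.student_id, column, entered_value)
    return row  # the interface then redisplays this student's other data
```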

Aside from the benefit of capturing information in an efficient and secure way in face-to-face scenarios, the mobile interface allows a coordinating teacher to choose the information displayed after a student is identified (Fig. 5.2). This display could include any user-editable hypertext, as well as data drawn from other columns in the table, and identifying information about the student (such as their preferred name). This functionality proved powerful because it allowed teachers to (i) define the important data related to a column, and (ii) have immediate visibility of these data when interacting with students. In a similar way, Lonn et al. (2012) found it powerful to provide mentors with up-to-date data to inform in-person discussions.

By providing data in situ (indeed, as data were being collected), the SRES can support teachers to build better relationships with their students, and engage with them on a deeper, data-driven level. In a case of data systems that augment and leverage the intelligence of humans (Baker 2016), teachers can initiate the necessary conversations and bring in contextual and other factors to which data may be blind. As Rogers (2003) points out, “[w]ords are the thought units that structure perceptions” (p. 276), leading us to name the system the Student Relationship Engagement System in order to emphasize the centrality of engagement and relationships in data-driven student support.

5.2.5 Querying Interface and Messaging Engine

In addition to the face-to-face data-driven support catalyzed through providing pertinent and accurate data, the SRES was also built to be massively scalable and allow teachers of even very large courses to personalize interactions with their students. As we have described, providing a mechanism for efficient data-driven actions addresses a felt need for such teachers. The course size does not necessarily need to number in the hundreds or thousands; it is already a significant workload imposition to personalize regular electronic contact with a cohort of 70. To be effective, the SRES needed to provide a relative advantage for these teachers compared to the alternatives of manual efforts or even not contacting their students. Indeed, relative advantage is one of the strongest positive predictors of whether an innovation will be adopted, and its contributing factors include the saving of time and effort, immediacy of reward, low initial costs, and economic profitability (Rogers 2003).

To provide this, we built a fully customizable querying interface and messaging engine into the SRES (Fig. 5.1). This allowed teachers to use Boolean and other operators to combine condition statements on data stored in the SRES, similar to advanced search engine queries. To increase compatibility with existing ideas, we mimicked the filtering terminology of spreadsheet applications, and built a graphical user interface where teachers could select any column, choose a comparison operator (e.g. less than, contains, not equals to, is empty, etc.), and a comparison value (Fig. 5.3). These conditions could then be combined to form a complex query. For example, a teacher could query the SRES to find students who had a low performance on a test, and had not attended class, and had not logged into the LMS for a while (Fig. 5.3), while another teacher in a different context could ask different questions.
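The underlying logic can be pictured as follows; this is a simplified sketch in which the operator names mimic the spreadsheet-style filtering terminology, and the operator set shown is illustrative rather than the full SRES list:

```python
import operator

# Named comparisons over column values; numeric comparisons assume the
# column stores numeric values.
OPS = {
    "less than": operator.lt,
    "equals": operator.eq,
    "not equals to": operator.ne,
    "contains": lambda value, target: str(target) in str(value),
    "is empty": lambda value, _target=None: value in (None, ""),
}

def matches(row, column, op_name, comparison=None):
    value = row.data.get(column)
    if op_name == "is empty":
        return OPS[op_name](value)
    return value is not None and OPS[op_name](value, comparison)

def run_query(table, conditions, combine=all):
    """combine=all gives Boolean AND; combine=any gives Boolean OR."""
    return [row for row in table.rows.values()
            if combine(matches(row, *cond) for cond in conditions)]

# e.g. students with a low Test 1 score who missed the week 3 class:
at_risk = run_query(table, [("test_1", "less than", 50),
                            ("week3_attendance", "is empty")])
```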

Fig. 5.3 Screenshot of the simple querying interface showing how filter conditions are built by selecting columns and specifying comparisons

Another advantage was that a teacher (with the appropriate system permissions) could query across columns from multiple tables; for example, allowing a program coordinator to quickly find high-performing students across a number of courses.

The querying interface was closely linked with a messaging interface and engine, where teachers could compose a personalized message to selected students (Fig. 5.4). This interface allowed the user to bring in any data from the SRES database, including user information (e.g. preferred name) to assist in composing the message to each individual student, drawing on information personally relevant to them. The messaging engine was connected to an email server as well as an SMS service; the former is common practice in LA interventions, while the efficacy of the latter is starting to be explored (Goh et al. 2012).
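Functionally, this personalization resembles a mail merge over the queried rows. A minimal sketch, continuing the querying sketch above (the $-placeholder names are hypothetical, and the real interface is graphical rather than programmatic):

```python
from string import Template

def compose_messages(rows, template_text):
    """Fill $-placeholders in the message from each student's own data."""
    template = Template(template_text)
    for row in rows:
        context = {"preferred_name": row.preferred_name, **row.data}
        yield row.student_id, template.safe_substitute(context)

draft = "Hi $preferred_name, your Test 1 mark was $test_1. Reply if you need help."
for sid, body in compose_messages(at_risk, draft):
    print(sid, body)  # preview only; an email or SMS gateway would send these
```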

Fig. 5.4 Screenshot of part of the messaging interface and engine. Fully customizable messages could be personalized using students’ own data, and sections could be variably included based on conditions in the available data

To help teachers gain confidence in this entire process, we built preview functionality into the SRES so that before anything was committed (e.g. messages sent to students), users could preview and download tabulated results of their query, as well as preview each individual personalized message. This also served to enhance the trialability of the system (Rogers 2003) in that users could safely experiment.

5.2.6 System Adoption, Usage, and Impact

From early on in the LA story, one of three key factors proposed for successful analytics initiatives was a flexible technology platform that allowed users to warehouse data from a variety of sources into a common structure and to perform analyses on these data (Campbell et al. 2007). While the architecture of the SRES is not a data warehouse in the traditional sense intended by Campbell et al., our argument here is that allowing teachers to efficiently select, combine, and apply data of their choice, relevant to their contexts, can be a powerful alternative for LA.

As Colvin et al. (2016) noted, “implementers require an analytic tool or combination of tools that manage data inputs and generate outputs in the form of actionable feedback. The capacity to implement is crucially linked to the quality of these tools and the data they rely on and generate… As these increasingly meet the ‘real’ needs of learners and educators, organizational uptake is accelerated” (p. 2). Additionally, West et al. (2015) highlighted an instructive comment that underlines the need to be flexible and context-sensitive: “a lot of the things that you have to do at the moment have to be done manually and they all take time. Anything that can automate the process of that information is beneficial. I suppose there also needs to be some ability to modify it to your own requirements because each course and each cohort of students may differ” (p. 20).

By designing the SRES to encompass the philosophies of flexibility, customizability, and scalability, we have seen considerable uptake in the University of Sydney community, with a variety of applications and impacts. Some representative examples are presented below.

Undergraduate laboratories in the pure sciences. An administrative function of the SRES was to print identity cards, which could be customized to the needs of each course. At the start of each semester, students were given an SRES-generated sticker or card with their unique barcode and other personalized laboratory information (e.g. session, venue, and schedule). This was scanned before or during each laboratory to record attendance and also to initiate conversation between teachers (who perform the scanning) and students (whose relevant data are displayed in the customizable display region of the mobile interface; Fig. 5.2). Marks for laboratory notebooks and reports were also saved directly using the SRES by scanning barcodes on SRES-generated personalized coversheets that students downloaded and printed. Students were typically intrigued by the efficiency and reliability of this approach, which saved hours of staff time in transcribing and correcting records and tracking attendance patterns.

At-scale student support in health and pure sciences. The SRES was used to build and send regular, personalized emails to segments of each cohort. The data that were used to filter and segment the cohorts included attendance recorded through the mobile interface, data imported from the LMS grade book, as well as data imported from third-party adaptive tutorial systems (outside the LMS). One teacher reported that efficiently recording attendance using the SRES was associated with increased attendance at Friday afternoon lectures. Teachers also used the filtering interface to segment cohorts (e.g. into no-, low-, and high-risk categories), and used the messaging engine to send regular personalized emails to all students in each category.

These emails helped to keep students on track, helped them feel connected to their teacher, and gave them an easy way to contact the teacher by simply replying. One teacher reported that most students identified as high-risk early in the semester ended up passing the course, with a considerable reduction, compared to previous cohorts, in students who did not complete compulsory work. Other teachers reported reduced attrition rates and grade distributions shifted towards higher grades.

Heavy personalization in philosophy of science. To personalize messages for a cohort of students with lower average university entrance scores, the teacher used the SRES to import quiz scores from an LMS-exported CSV file, as well as other custom fields that the teacher had generated in an offline spreadsheet. Multiple complementary conditions were generated for each of a number of filters to differentiate the emails that different segments of the cohort would be sent. Students received specific feedback based on their up-to-date achievement in the course, and suggestions on how best to proceed. Using the SRES, the teacher also identified students whom he considered most at risk; these students were then followed up with a phone call from central student support services. The teacher reported a substantial reduction in attrition.

Feedback and follow-up in clinical laboratories. A proposed use of the SRES in clinical settings is for a teaching assistant to record feedback for an individual student as a short piece of text into the SRES, which can then be automatically emailed to the student as part of a customizable message triggered upon saving data. This feedback can then be seen the following week by another teaching assistant working with the same student, via the customizable display in the mobile interface. The teacher suggesting this envisages that students will be more likely to act on feedback if there is an expectation of specific follow-up.

Adoption of the SRES, since its initial pilot in one department and four courses in 2012, has grown to 78 units of study across 19 departments (Fig. 5.5). We believe this successful wider adoption, a result of recommendations by colleagues, is a reflection of the observability of the operation and impacts of the SRES.

Fig. 5.5 Uptake of the SRES at the University of Sydney. Since an initial pilot in 2012, the SRES has been adopted by more teachers in more units of study (courses) and departments, and is being used to provide data-driven personalized student support for an increasing number of students

A number of factors have likely contributed to this: (i) the SRES was designed from the ground up as a teacher-focused platform that addressed a felt need and offered tangible relative advantages compared to existing methods; (ii) its architecture was compatible with how teachers commonly use and manipulate data; (iii) it sought to reduce complexity and enhance trialability; and (iv) regular communication between the developers (who are also teachers) and users meant the system could be updated or extended relatively quickly in response to user feedback.

5.2.7 Experimental Interfaces

As the SRES has expanded in reach, the data appetite of staff using the system has grown. For example, we have seen this in terms of how data may be reported, as well as how data may be analyzed. In keeping with our iterative and teacher-centered design philosophies, we have gradually added new experimental features to the SRES to address emergent needs.

The existing paradigm of delivering data-driven support to students through personalized messages from the SRES characterized a ‘push’ mechanism; that is, teachers set up and activated a personalized message to students. We have recently been trialing a ‘pull’ mechanism where teachers set up a customized ‘student view’ of students’ own data. In a similar way to the personalized messages, teachers use a graphical interface to easily write and format information to be shown to students, and use data that exist within the SRES database both to conditionally display relevant information and to show the data themselves. These ‘student view’ pull-type interventions can then be embedded into the LMS so that students see pertinent information from their teachers when they log in. We see this approach as a rudimentary but highly customizable reporting engine that could help to put learners back in control of their data (Drachsler and Greller 2016).
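A sketch of how such a conditional student view might be assembled (the condition, threshold, and wording are hypothetical examples; in practice the rendered block is embedded in the LMS, typically as HTML):

```python
def render_student_view(row):
    """Compose a student-facing block, conditionally including sections."""
    parts = [f"Hello {row.preferred_name}, here is your progress so far."]
    mark = row.data.get("test_1")
    if mark is None:
        parts.append("No Test 1 mark is recorded for you yet.")
    elif float(mark) < 50:
        parts.append("Your Test 1 mark was below 50: the revision resources "
                     "for weeks 1-3 may help before the next test.")
    else:
        parts.append(f"Test 1 mark: {mark}. Keep up the good work!")
    return "\n".join(parts)
```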

As the SRES encourages teachers to curate local data in one place, i.e. within the SRES, and makes it more efficient for them to do so, the issues with disconnected data silos start to erode. As a result, there are more locally meaningful data available on which to perform analyses. Leveraging this situation, we have started exploring various machine learning approaches to help teachers analyze data stored in the SRES and uncover hidden patterns that may influence their curriculum and teaching approaches. Our philosophy is to provide teachers with an easy-to-use interface to perform fully customizable explorations themselves (Liu et al. 2016).

Specifically, we are leveraging web-based machine learning application programming interfaces (initially experimenting with www.bigml.com) to build decision trees, mine for association rules, and cluster students based on data in the SRES. For example, cluster analysis may indicate that attendance is not substantially different in clusters with lower-performing students, while concurrently highlighting that early summative quizzes and exams may be important in identifying these cohorts (Fig. 5.6). In a similar example, decision tree analyses may reveal particular characteristics of students with different outcomes; for example, demonstrating that online formative quizzes may differentiate students who fail from those who pass (Fig. 5.7). These analyses open a new dimension to the SRES, as the system in its entirety gives teachers the flexibility to bring in the data they want, query and act on these data, and now interrogate these data using typical LA and EDM techniques to uncover hidden patterns.
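As an illustrative stand-in for this web-based workflow (our substitution: the sketch below runs analogous analyses locally with scikit-learn rather than the BigML API, on fabricated placeholder records whose column names mirror Figs. 5.6 and 5.7):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

features = ["test_1", "formative_quizzes", "early_attendance", "early_prework_avg"]
X = np.array([[55, 8, 3, 72],   # fabricated placeholder records,
              [22, 1, 3, 40],   # not real student data
              [78, 9, 2, 90],
              [35, 2, 1, 35]])
passed = np.array([1, 0, 1, 0])  # outcome labels for the decision tree

# Cluster students and inspect the centroids (cf. Fig. 5.6).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for centroid in kmeans.cluster_centers_:
    print(dict(zip(features, centroid.round(1))))

# Fit a shallow decision tree to surface decision points (cf. Fig. 5.7).
tree = DecisionTreeClassifier(max_depth=2).fit(X, passed)
print(export_text(tree, feature_names=features))
```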

Fig. 5.6 Example screenshot of the output of a cluster analysis from within the SRES on data that a teacher has brought into the SRES. The numbers represent the cluster centroids. Test_1 is the first of three mid-semester exams, formative_quizzes are non-compulsory online quizzes, early_attendance is the count of attendance at the first three practical classes, early_prework_avg is the average mark in the first three compulsory online pre-work quizzes, and Piazza is the online discussion forum

Fig. 5.7 Example screenshot of the output of a decision tree analysis from within the SRES, generated through the BigML application programming interface, on data that a teacher has brought into the SRES. This interactive interface, powered by BigML, allows teachers to hover over subsets of their cohort (cursor shown hovering over a group of students who achieved a fail grade) and examine the decision points that the algorithm has identified

5.3 Discussion

The decisions made during the design process of the platform presented in the previous sections have provided deep insights into two very relevant aspects of LA. The first one is the need to explore the space between fully manual and fully automated solutions, addressing the real needs of teachers and focusing on the human elements of learning. The second one concerns the measures that can be adopted at the institutional level to foster the use of these types of platforms and make sure they provide a tangible improvement to all stakeholders. In the remainder of this section we further elaborate on these two areas.

5.3.1 Amplifying Teacher Intelligence

Increasingly, researchers and practitioners in LA are identifying the importance of context, especially when addressing the factors and interventions that impact student success. This context includes aspects such as educational history, instructional conditions, and course revisions, as well as the complex and largely hidden (to machines) realm of students’ cognitive, emotional, and social processes (Gašević et al. 2015, 2016). At the same time, thought leaders in these fields are turning to the idea of amplifying and supporting human intelligence, as opposed to blindly following machine outputs (Baker 2016).

Could it be that, in the age of big data, we are becoming tantalized by data and neglecting the personal teacher-student interactions that are so crucial to learning and teaching? To purposely contort an idiom, are we missing the trees for the forest? To reconcile these, we have described in this chapter an LA approach that addresses teachers’ real needs and aligns with their understanding of their students, courses, data, and student support.

At the same time, there are pressing needs driven from institutional contexts that cannot be ignored. As massification in higher education continues, one-on-one teacher–student interactions have become less common, and personalized student support increasingly challenging. Many LA efforts to date have been focused on trying to algorithmically triage support resources to the most ‘at risk’ students. This focus on maximizing retention in LA has meant there is a real risk that we lose the human element in higher education and replace it with predictive models based on large datasets but perhaps a limited number of variables with dubious meaningfulness. Further, by retaining a narrow focus on the at-risk portion of the student population, we fail to support and maximize the learning experience and outcomes of all students. In LA, we need to continue to push ourselves to develop and use systems that take research-informed actions to support and challenge all students.

We see the SRES as a first step towards a possible solution. Data-driven personalized learning support may be positioned between machine models and personal relationships; by leveraging both machines and humans, it can capitalize on the best of both worlds and be scalable as well as contextualized. To do this well, teachers need a system capable of scaling their natural workflows (which are exceedingly context-dependent) to large student cohorts while providing a level of student personalization. The SRES, which we have presented here, addresses teachers’ needs for efficient and centralized data curation, augmenting their abilities to personalize student support using these data. This is afforded through providing highly customizable push and pull intervention mechanisms.

Arguably, the SRES currently deals with small data as opposed to the traditional view of big data (Berman 2013), and the data points that lie within its matrix-like data architecture may represent aggregated data and therefore mask rich complexity. Other associated risks with this design include the oversimplification of metrics and the possibility of missing potentially meaningful data that the teacher did not consider including, both of which can be partly alleviated through the sharing of good practice. As long as some meaningful data are collected, teachers could use the nascent machine learning interfaces to uncover hidden meaning and possibly use this to inform intervention or learning design decisions. However, all of these are teacher-facing in terms of data collection and reporting, and providing affordances for action.

We envisage that future developments would also include student-facing interfaces that could be customized by the teacher so that their students could input data (e.g. ‘pulse’ data, psychosocial variables, self-assessment of skill attainment, self-reports of perceptions) and visualize data (e.g. performance compared to the cohort, self-progression through tasks) directly to and from the SRES table(s) via a web or mobile interface. This way, the data outputs could be contextualized by the teacher instead of relying on a one-size-fits-all dashboard across all courses. Further, building application programming interfaces into the SRES itself would allow easier data interoperability with other systems, and potentially be able to expose limited datasets for interested students to analyze themselves. Together, these emphasize our focus on practical LA systems that are customizable, flexible, scalable, actionable, and human-centered.

5.3.2 Enabling Scaling-Up of Data-Driven Student Learning Support

In the Australian higher education context, Colvin et al. (2016) noted that sustainable LA adoption relies on (i) an organizational strategy that lays the groundwork for LA, (ii) an implementation capability that integrates teachers’ practices with data and tools, (iii) tools that address real needs, and (iv) a capacity to monitor and improve the quality and usability of implementations. Our journey with the SRES has serendipitously approached this from the bottom up, and has nevertheless seen increasing adoption at our institution.

Now, the increasing data appetite of our colleagues, catalyzed through the use of LA tools such as the SRES, is fueling a number of top-level challenges, including organizational culture and complexity (Macfadyen and Dawson 2012; Macfadyen et al. 2014), automating data workflows and interoperability (Lonn et al. 2013), stakeholder engagement and expectation management (Ferguson et al. 2014), connecting LA with learning design (Lockyer et al. 2013), and developing an ethical framework to benefit learners (Ferguson 2012a). Additionally, as the user base of a technology innovation expands, expectations for enterprise-standard system reliability, user experience design, and user support and training begin to grow. To address these challenges associated with scaling up LA innovations, the framework applied by Arnold et al. (2014) is instructive and corresponds with institution-wide steps that we are taking as LA becomes a strategic priority for the University of Sydney.

Technology infrastructure, analytics tools, and applications. We are brokering connections between data owners and users and central information technology and business intelligence units, looking to characterize and integrate data that are currently collected, and identify gaps in collection of meaningful data. At the same time, we are working towards tools and business processes that allow LA to be embraced by the academic masses, in a range of roles, and making LA not just the domain of data and technology enthusiasts. Part of this involves creating a space for bespoke software development by LA researchers and practitioners.

Policies, processes, practices, and workflows. More widespread use of data to drive timely interventions understandably causes anxieties in staff and students. Alongside issues of invasions of privacy and even surveillance, real-time data are necessarily incomplete and potentially inaccurate. There are therefore legitimate concerns in the ways that data are obtained, held, and used that must be addressed in parallel to the development of software and data collection tools. To address this, we are establishing LA principles that align with legal requirements for student privacy and the values of the institution.

Values and skills. We are working to connect people across the institution with relevant skills—both academic and professional staff, and those in departments and central portfolios. This will aid the evaluation of LA, particularly the technology and methodologies, the allied support services, and the learning support actions used in individual courses. Supporting this will be professional learning around using LA systems effectively and clarifying roles for faculty and staff. This will become increasingly important as more agile access to relevant learning data allows teaching staff to personalize and target support en masse. Unsupportive teachers can cause large-scale damage and disengagement if personalized student support is delivered in ill-thought-out or destructive ways, or if it is based on an overreliance on data that are messy or not meaningful.

Culture and behavior. Connected with building values, we are working to inspire and support LA research and innovation by funding EDM and LA projects, establishing networks and research groups, brokering research ethics arrangements, and connecting the institution with groups such as international societies (e.g. the Societies for LA Research and EDM) and local interest groups.

Leadership. To provide strategic support for all of this, we are establishing governance and strategy groups, as well as providing coordination, leadership, and advocacy for LA at the whole-of-institution level.

Using the SRES as a case for scaling up a bespoke LA innovation, we have already started to think about how organizational resources could be exploited, such as enhancing the connectivity of the SRES with data warehouses, growing its institutional profile, providing professional learning opportunities for teachers about effective pedagogical strategies and learning support actions, and fostering an SRES community of practice.

As an institution, we are seeking to actively encourage innovation in EDM and LA and let a thousand flowers bloom. To support subsequent scaling, we need to have a process to identify which new innovations hold promise for wider use, and how to further support, develop, and implement these at the enterprise level by engaging with institutional infrastructure, resources, and personnel.

5.4 Conclusion

In this chapter we have suggested that the EDM and LA communities need to reflect on how to better achieve the ultimate goal of improving students’ overall learning experiences. We propose increasing the focus on systems that enhance the decision-making skills and support abilities of humans (i.e. teachers, students, and support staff) and truly achieve personalized learning scenarios. The rich set of existing data sources, sophisticated algorithms to discover knowledge, and complex visualizations still need to be considered under a lens that brings the human to the center of the design and formally leverages the effect of technology in their day-to-day operations. Humans are in the best position to establish the required connection between data, algorithms, and educational underpinnings. We propose the notion of personalized learning support actions as the focal point at which contributions should aim in order to make a quantifiable difference. Our argument is that this approach allows for a greater relational connection between students and teachers.

The SRES has been presented as an example of a tool that seeks to connect the existing expertise of teachers with their students’ data-rich learning environments. Our approach centers on the relationship between teachers and their students, both in terms of collecting and curating meaningful local data as well as supporting actions based on these data. This is in stark contrast to prevailing approaches to learning analytics. These predominantly focus on warehousing a plethora of existing data such as from learning management, student information, and library systems, followed by applying statistical and other modeling approaches in order to predict student performance. We posit that these approaches can potentially miss out on the rich pedagogical expertise of teachers, ignore the relationships between teachers and students, and fail to encapsulate local data that teachers may find more meaningful.

Based on these needs, we have discussed the application of a series of human-centered design philosophies, rooted in the notion that teachers need decision-support tools that can accommodate the diversity of context-specific data sources present in learning environments, and the variety of possible vehicles to provide personalized support. The SRES has been deployed at a large higher-education institution in Australia with a significant uptake. Its trajectory has served to highlight the main adoption barriers at both staff and institutional levels, and how these may be addressed.

The future avenues to explore offer a promising landscape in which data, algorithms, staff, and students all interact to effectively combine data richness and algorithmic efficiency with human intelligence to yield tangible improvements in the overall learning experience.