Introduction

The global refocusing of higher education on outcomes is prompting considerable curriculum innovation and the rethinking of teaching and learning practices. It has changed the discourse around degree programmes to foreground explicit learning outcomes, the development of programme-wide attributes and threshold standards. However, with a few exceptions, it has had less impact on assessment, whether for certification or learning purposes. This is surprising as the implications for assessment are profound. In a standards-based approach, what is central is that standards can be assured and appropriate criteria addressed by all students. Such an approach implies that it needs to be made clear what students should be able to do as a result of particular episodes of study (the learning outcomes) and that assessment demonstrates and reports that students have met these requirements. This can be at odds with conventional marking and grading systems that aggregate information by subjects or course units and elide differences of outcome. These systems allow students, in effect, to pass by overachieving in some areas and underachieving in others without ensuring that basic standards have been met. Baik and James (2014) started to identify some of the issues that arise for assessment in the context of learning outcomes. This chapter takes some of these further and works through the necessary implications of what this involves.

As participation in higher education increases throughout the world, and as students are expected to contribute increasingly to the cost of their studies, new expectations are arising about what universities should do and how they should do it. In particular, transparency of purpose and process is being required. If students choose to enrol in a given programme or course unit, it should be of the highest quality and do what it claims to do. This has been manifest by a greater focus on student satisfaction surveys, on explicit statements of learning outcomes and on the use of quality assurance processes. Shortcomings are no longer apparent to a limited group of specialists, but are paraded in national media. International league tables are much consulted for research ratings, which provide an invalid account of the quality of programmes, but ratings which focus on students’ learning experiences are already available in many countries and are being used for national league tables.

Publicly available learning outcomes are a necessary prerequisite for transparency. These take the form of statements about what students will be able to do as a result of successful completion of their studies. They are more explicit than previous lists of subjects or syllabuses and enable judgments to be made of the institution and programmes as well as of their graduates. Assessment judgments about students only have meaning in the context of what a course seeks to do and the extent to which a student has met its requirements. Lists of course content or a traditional syllabus do not sufficiently communicate what completion of a programme or course unit signifies. The adoption of learning outcomes has now become a widespread practice in higher education. While they were first introduced at the level of the unit of study, programme learning outcomes describing entire degrees are becoming more common.

While the assessment of students was not the first area to be the focus of moves towards transparency, it has now become a key consideration. After all, if institutions are to be judged on the quality of their graduates, it is the assessment system that guarantees this quality, and it must be fit for this purpose. And the assessment system is what focuses students’ attention on that which is most important, whatever other claims might be made about a programme.

Scaling up assessment or scaling up good practice in assessment is not only a matter of dealing with increasing student numbers, but addressing the wider context of where a programme is situated within a discipline, an institution, a country and globally. It must be located within a discussion of the macro changes that are occurring globally. Assessment has conventionally been seen within the context of an individual course unit in which assessment decisions commonly take place. Marks and grades may for convenience be aggregated across a programme, but such a process says little about how a student meets the outcomes for a programme as a whole. Scaling up assessment at the course level means looking at the implications for assessment decisions on a wider front than the course unit and examining what the implications are when the learning outcomes for a programme are considered in a global context.

This chapter focuses on the tensions created for institutional assessment policy and disciplinary practice of a standards-based approach. It examines why and how conventional assessment assumptions and practices need to change so that assessment becomes more transparent and defensible in the context of global scrutiny of curriculum provision. It questions whether an attachment to conventional assessment practices is conceptually compatible with a standards-based approach to higher education or perhaps acts to undermine it. It suggests that a new focus on assessment is needed that places programme outcomes as central and the fostering of long term learning as a key feature of programmes. In addition to discussing what requirements assessment practices now need to meet, it also considers how assessment can be used in the fostering of student judgment, how students can be involved in the curation of outcomes and the meaningful portrayals of achievements for different audiences.

Academic Standards in a Global Context

The iconic discourse of education focuses on ‘academic standards’. These have been taken-for-granted as the sine qua non of what it means to be educated. High academic standards are unquestioningly a good thing that students and educational institutions should strive for, but it is less clear to what they specifically refer. Are they general aspirations for excellence or do they mean something in particular? In the slowly moving shifts towards transparency, we are seeing academic standards now less as positional rhetoric—‘we are committed to maintaining high academic standards’. Rather, they are something we plan to meet—‘are our assessment practices up to standard?’—and intend to judge our students by: ‘have you met the standard yet?’ The international discourse of higher education has become that of learning outcomes which map on to academic standards. They are not rhetorical constructions but a tangible element of the design and planning process for courses represented in concrete terms.

Academic standards are generated from many sources. They are intrinsic to the structure of disciplinary knowledge. The nature of disciplines and the frameworks they have built to hold knowledge represent a key reference point. They are supplemented by the consensus views of experts. Most competency standards for professions or occupations are generated from consultation with those who practise in a particular area and who can judge what constitutes competence or capability in that area. Less commonly, standards can arise from empirical analysis of actual professional practices—what can it be observed that practitioners do when they perform in the domain of their expertise?

The New Landscapes of Academic Standards

Interest in standards is widening and transcending disciplines, professions and indeed educational institutions. It has taken on a dual focus. Firstly, assurance processes that enable standards to be monitored have been introduced, and secondly, alignment processes to generate common standards across jurisdictions have been mooted. While there is some overlap in the organizations responsible for these, the former mostly occur at a national level through national qualifications frameworks and national quality assurance agencies (e.g. the Tertiary Education Quality and Standards Agency in Australia, the Quality Assurance Agency in the UK). Each country or jurisdiction has its own framework of qualifications and standards and agencies that ensure that standards of programmes are maintained across institutions.

The second focus mostly takes place in cross-jurisdictional spaces through accrediting agencies, international organizations and specific cross-country projects. There are international accrediting agencies for particular disciplines that operate worldwide. The longest standing is the Association to Advance Collegiate Schools of Business (AACSB). It provides ‘internationally recognized, specialized accreditation for business and accounting programmes at the bachelor’s, master’s, and doctoral level’ (http://www.aacsb.edu/accreditation). University business schools seek accreditation from the AACSB as it provides a well-established and high-status benchmark for quality through its rigorous assessments of programmes.

Other governmental and quasi-governmental bodies are also increasingly players on the international scene. The project on ‘Tuning Educational Structures in Europe’ sponsored by the European Commission:

seeks to identify generic and subject-specific competences for first-cycle degrees within the European Higher Education Area. .... These reference points, which address workload as well as learning outcomes, support the objectives of the Bologna Process to establish compatibility of qualifications across Europe.

Tuning focuses not on educational systems, but on educational structures with emphasis on the subject area level, that is the content of studies. Whereas educational systems are primarily the responsibility of governments, educational structures and content are that of higher education institutions and their academic staff. (http://www.unideusto.org/tuningeu/)

In recent years, the OECD (2016) has been carrying out a feasibility study for the Assessment of Higher Education Learning Outcomes (AHELO). The purpose is to judge if it is practically and scientifically feasible to assess what students in higher education know and can do upon graduation. AHELO aims to be a direct evaluation of student performance at the global level, valid across diverse cultures, languages and different types of institutions. This approach has been trialled in two disciplines, economics and engineering, but further progress has been stalled by a lack of agreement among some key governments. The OECD claim is that:

a full scale AHELO would be a ‘low stakes’ voluntary international comparative assessment designed to provide higher education institutions with feedback on the learning outcomes of their students which they can use to foster improvement in student learning outcomes. (http://www.oecd.org/edu/skills-beyond-school/testingstudentanduniversityperformancegloballyoecdsahelo.htm)

However, institutions that have high success in attracting students worldwide are vulnerable to having their claims exposed by empirical investigation.

An important part of a standards framework is the learning outcomes established in various countries. One of the first of these was the development of the UK Subject Benchmark Statements that:

set out expectations about standards of degrees in a range of subject areas. They describe what gives a discipline its coherence and identity, and define what can be expected of a graduate in terms of the abilities and skills needed to develop understanding or competence in the subject. (http://www.qaa.ac.uk/assuring-standards-and-quality/the-quality-code/subject-benchmark-statements)

Originally developed by the Higher Education Academy, they now form part of the UK Quality Code for Higher Education (QAA, 2016) to which all higher education institutions are expected to subscribe.

A substantial development on academic standards for the disciplines was undertaken by the Australian Learning and Teaching Council in 2010–2011 (ALTC, 2011). Threshold learning outcomes were established through lengthy consultations within each discipline or discipline grouping. These are standards that indicate the threshold or minimum level expected of all graduates of an Australian programme in the given discipline. They have been used by the national regulatory body (TEQSA) as an indicator that can form part of the statutory requirements that all higher education institutions need to meet to gain recognition of their awards. The Australian government has established a Higher Education Standards Framework for Students. These standards represent the minimum acceptable requirements for the provision of higher education in or from Australia by higher education providers registered under the TEQSA Act 2011. The explanatory statement notes that:

The Standards also serve other broader purposes in Australian higher education including:

  1. an articulation of the expectations for provision of higher education in Australia as:

    a. a guide to the quality of educational experiences that students should expect

    b. a reference for international comparisons of the provision of higher education

    c. a reference for other interested parties, and

  2. a model framework which higher education providers can themselves apply for the internal monitoring, quality assurance and quality improvement of their higher education activities. (Higher Education Standards Framework 2015 Explanatory statement)

Relevant to our discussion, it requires that:

Methods of assessment are consistent with the learning outcomes being assessed, are capable of confirming that all specified learning outcomes are achieved and that grades awarded reflect the level of student attainment (Higher Education Standards Framework 2015, 1.5.7)

The Changing Ground of Assessment: Focus on Multiple Purposes

Before turning to the implications of this new architecture of standards for the scaling up of assessment, we need to examine what it is that assessment seeks to do. While its public image of certifying student achievements is often the most prominent, assessment has several distinct tasks to perform, and some of these are in tension with each other. Each purpose influences what students do, what they learn and how they approach their learning whether or not it intends to do so.

The three main functions of assessment are, firstly, to contribute to certifying student performance. This is often known as summative assessment. It represents how an institution judges student performance. It is formally recorded and validated as representing a set of achievements. Secondly, assessment provides students with useful information to aid their learning during their courses. They undertake a number of tasks, and feedback processes enable students to receive information that helps shape their subsequent study within the unit of study. It is commonly known as formative assessment. Finally, assessment builds students’ capacity to make judgments about their own learning. To be effective as a learner, and subsequently as a practitioner in society, students need to be able to determine what they know and don’t know and what they can do and can’t do. If students only experience the judgment of others, then their ability to judge their work for themselves may not be developed. This purpose of assessment is known as sustainable assessment, as it is assessment that helps sustain learning over time and beyond the end of the programme (Boud & Soler, 2016).

Unfortunately, these functions can rarely be pursued simultaneously in any given assessment task. The intrinsic demands made of different purposes of assessment tasks and the information provided to students about their conduct of them differ in time and in content. If we take summative assessment, this needs to occur when students have completed their study of what is being assessed. Such assessment records what a student knows and can do at a particular point of time. To make this judgment while the outcomes being judged are not complete and learning of that matter is still occurring is invalid. It doesn’t produce a worthwhile representation of what the graduate can do, only what they were able to do at some point in time prior to completion of the programme or course unit. This suggests that summative assessment needs to occur in the later stages of units or programmes rather than earlier. This is in conflict with the requirements of formative assessment. If this is left until after a student has completed their studies in a particular area, then it can’t influence their performance as a student. They can’t improve as a result of feedback because the timing of the task is too late in the sequence of study. The purpose of formative assessment needs to be pursued therefore at points in time when students can use useful information about their work when they still have an opportunity to apply it to their learning. That is, formative assessment needs to occur earlier in a unit or programme rather than later.

There is another tension between summative and formative assessment that needs to be considered: the kind of information each generates. Typically, assessment for certificatory purposes generates marks or grades. This may be sufficient as a summary of performance over a series of tasks, but the specific information it communicates about what a student can do, or more importantly for formative purposes, what a student needs to do to improve their work, is minimal. What information do marks like 72% or ‘C’ provide to enable a student to do better? The information content of a mark is very low. To improve their work, students need rich information about what they could do better, together with exemplars and illustrations of good work. If these are provided only as a supplement to marks and grades, they often arrive too late for students to act on, as the task is typically timed too late for formative purposes. In summary, summative purposes need tasks that are loaded towards the end of the semester or the year; formative purposes need tasks loaded early in the semester or year.

The purpose of sustainable assessment is important throughout. Only by practising making judgments of their work on multiple occasions and being given assistance on calibrating their judgments will students develop the capacity to judge different kinds of work (Boud, Lawson, & Thompson, 2013). So, sustainable assessment activities need to be spread across the semester and, indeed, the programme. Success in this process can be determined if students are able to make good judgments about their performance in later summative tasks. If they cannot do this, then the outcome of sustainable assessment has not been met.

We can see from this discussion that one set of tasks cannot meet all the requirements of assessment, no matter how well they are constructively aligned (Biggs, 1996). This is not a simple matter of having a diversity of assessment methods, but of the timing of tasks, the provision of different types of information and the kinds of activity that surround tasks (feedback processes, self-judgments, etc.). That is, assessment needs to be designed in the context of the overall educational design of programmes. In any given instance one purpose may need to be dominant, and this needs to change over the progress of a programme or course unit. Some tasks might be used predominantly for feedback, some predominantly for grading and others predominantly to promote self-regulation. Because grading has become such an all-consuming concern of students, careful design is needed to ensure that all the purposes of assessment fit together and can be successfully and compatibly pursued.

What Are the Implications of the Standards Agenda for Assessment?

Assessment today needs to be seen against the backdrop of the adoption of explicit academic standards and discipline learning outcomes. Within this context of stated learning outcomes, assessment is an activity that judges whether students can demonstrate attainment of these learning outcomes to a given standard. This is needed to assure that the learning outcomes have been met by the time a student graduates, to enable a student to judge how they are progressing in meeting the learning outcomes and to provide useful information to students to enable them to meet these outcomes. Assessment now is necessarily focused on what students can do in relation to these yardsticks.

Addressing Outcomes Is Fundamental

This new focus has powerful implications that have not yet been fully realized in many higher education institutions. The most important is that assessment in the context of learning outcomes needs to start from considerations of how well the outcomes have been met. Emphasis here is relentlessly on how well has a specific standard been addressed: has it been met for purposes of certification, how much of a gap is there between what a student can do now and what they need to do to meet the standard and how will the student be able to judge if they are meeting the standard? The discourse of teachers and assessors needs to be about the characteristics of student work in the language of outcomes and standards, not about marks or doing better than other students.

In terms of the construction of assessment tasks that contribute in any meaningful way to final certification, they need, firstly, to identify the appropriate standards for the tasks students undertake and how they will be applied to the work at hand, that is, the criteria to be used to make judgments. This is not just the notion of criterion-referenced assessment of old (which tended to focus on criteria at the level of a particular unit of assessment or assessment task), but of standards-based assessment (which focuses on programme-wide suites of tasks). Secondly, there need to be assessment methods suitable for judging the particular learning outcomes being pursued. A range of approaches is needed beyond conventional tests and examinations. Thirdly, the balance of assessment approaches must reflect the range of learning outcomes. Overuse of particular methods has to be avoided, so, for example, if examinations are involved, then they must be restricted to judging the learning outcomes that can be best assessed by the particular kind of examination used. Finally, and most important of all, all necessary learning outcomes must be met by all students. The threshold for achievement for each outcome must be reached by all students for them to be able to complete the unit or be awarded the qualification. As Sadler (2015) puts it:

The definitive measure of the adequacy of an institution’s standards is whether the lowest performing students who gain credit for a course achieve higher order objectives to a sufficient degree. (p. 7)

No compensation through overachievement with respect to other outcomes is permitted. Of course, many students will also meet outcomes beyond the minimum and may be recognized for so doing.
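The difference between a compensatory rule (pass on the average) and the conjunctive rule described above (every threshold must be met) can be illustrated with a minimal sketch. The outcome names, the 0 to 100 scale, the threshold of 50 and the function names are all assumptions made for illustration; none comes from the frameworks discussed in this chapter.

```python
from statistics import mean

# Assumed threshold, applied to each learning outcome separately.
THRESHOLD = 50


def passes_by_average(scores, pass_mark=THRESHOLD):
    """Compensatory rule: only the mean across outcomes matters."""
    return mean(scores.values()) >= pass_mark


def unmet_outcomes(scores, threshold=THRESHOLD):
    """Names of outcomes whose threshold has not been reached."""
    return sorted(name for name, score in scores.items() if score < threshold)


def passes_by_threshold(scores, threshold=THRESHOLD):
    """Conjunctive rule: every outcome must reach its threshold."""
    return not unmet_outcomes(scores, threshold)


# Hypothetical student: strong on two outcomes, below threshold on a third.
student = {"critical analysis": 85, "communication": 80, "ethical practice": 35}

print(passes_by_average(student))    # the average hides the shortfall
print(passes_by_threshold(student))  # the threshold rule exposes it
print(unmet_outcomes(student))
```

Under the averaging rule this hypothetical student passes comfortably; under the threshold rule the same student does not, and the report names the outcome still to be met.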

The Metrics of Assessment Relate to Standards

Several things about the form in which assessments are reported follow from the centrality of outcomes to assessment. Does the result of an assessment act relate directly to an outcome and standard? Does a grade signify what a student can or can’t do? Setting a pass mark is not setting a standard. Unless a pass means something real in terms of a particular learning outcome, it is irrelevant whether it is 40%, 50% or 60%. Indeed, marks that do not address an outcome are rendered meaningless in this conception of assessment.

The averaging of marks is particularly problematic. On what basis can marks related to different learning outcomes be averaged? This may be justifiable for different measures of the same outcome, but it is meaningless across outcomes. Such an act would assume that all outcomes are equivalent and that performance in one is the same as performance in another. When marks are averaged, information is lost, and the result no longer communicates information about what a student can do. Grade point averages are residues of a norm-referenced system in which students were compared with each other. These do not translate into a standards-based environment. This is not an argument that marks and grades should not be used at all. But they can only be used when they can be directly related to a desired learning outcome. New ways of thinking about progression are needed that are embedded in the language of standards and outcomes, not an inappropriate normalized metric.

Assessment Reporting Is Only as Fine Grain as the Judgment of Outcomes Will Allow

Consideration of marks raises another important issue. Marks and grades cannot be meaningfully reported to a greater level of accuracy than the judgment can stand. This is not a new observation; we have known for 80 years or so that essays, for example, cannot be marked to percentage level accuracy (Hartog & Rhodes, 1935, 1936), but it has particular significance in an environment of transparency. The question to be faced is: to what level of accuracy can attainment of a particular outcome be judged? If only three or four categories of difference can be ascertained, then this is all that can be legitimately reported. Any finer grain of reporting is spurious and cannot be justified. Just because a test has 100 items, it does not mean that the results can be meaningfully reported to percentage accuracy. It is the attainment of the learning outcomes represented in the test that must be reported, and the level of accuracy for each of these is likely to be considerably less. The traditional Australian university classification of Pass, Credit, Distinction and High Distinction is probably the maximum level of granularity that judgment of most tasks against learning outcomes can stand.
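As a rough illustration of reporting only at the granularity the judgment can bear, the sketch below collapses a raw mark for one outcome into a small number of bands. The cut-offs (50, 65, 75, 85) and the label for the lowest band are hypothetical; in practice each band would need an outcome-specific descriptor rather than a numerical boundary.

```python
from bisect import bisect_right

# Hypothetical cut-offs on an assumed 0-100 scale; not taken from any framework.
CUT_OFFS = [50, 65, 75, 85]
BANDS = ["Not yet met", "Pass", "Credit", "Distinction", "High Distinction"]


def report_band(mark):
    """Report a judgment for one outcome as a band, discarding spurious precision."""
    return BANDS[bisect_right(CUT_OFFS, mark)]


# Two marks that differ by six points are reported identically,
# because the judgment cannot reliably tell them apart.
print(report_band(66), report_band(72))
```

The design choice here is that the finer-grained number never reaches the report: only the band does.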

As Sadler (2009) has argued, it also means that setting a generic set of standards for a programme or course unit is not enough: they need to be localized in terms of the particular outcomes of a programme or unit. The use of terms such as good, superior, and excellent does not indicate a standard or communicate a level. They need to be replaced by specific descriptions of what a grade means with respect to a particular standard. Grade descriptors are required to be outcome-specific: what in particular is required for a particular grade?

What Does a Standards-Based Approach Not Determine?

There are several things that a standards-based approach does not determine. It does not require that standards be unilaterally applied, that is, that students have no say in the standards used. The threshold standards may be non-negotiable, but this says nothing about others. Students should be involved in assessment, through identifying and applying appropriate standards to their work. Nor does such an approach imply that all learning can be predetermined. Outcomes may be set which can be addressed in diverse ways through many different products. What counts is whether the outcomes are met, not the particular form in which they are met. There is scope for creativity and meeting varied needs and interests through assessment tasks. Neither does it imply that all learning can be easily judged or is worthwhile. A limitation of conventional assessment methods is that they are often used to test that which is easy to test, rather than that which is most important to test. A predominance of tests of memorization is an indicator of this. A standards-based framework draws attention to the need for an extended range of approaches that can be used for outcomes that cannot be encapsulated in simple questions.

What Does a Standards-Based Framework Facilitate?

Many desirable educational features are inhibited by norm-referenced systems but enabled in a standards context. Students can more readily track their performance towards meeting major outcomes that are distributed over many course units. When assessment is recorded against each outcome, improvement can be plotted and thus ipsative assessment permitted (Hughes, 2011). Feedback processes are more likely to influence learning when it is clear that information provided on one task related to one outcome can be utilized in subsequent assessments for the same outcome (Boud & Molloy, 2013). Involvement of students in judging their own work is more straightforward when they know what outcomes their work seeks to meet. They may need practice in identifying and utilizing criteria, but there are no additional manipulations of marks and grades that obscure what grades stand for.
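Recording judgments against each outcome, rather than against each course unit, might look something like the following sketch. The record layout, unit and outcome names, the ordinal band labels and both function names are assumptions for illustration only; the ipsative measure shown (change in band between a student's first and latest judgment on the same outcome) is one simple possibility among many.

```python
from collections import defaultdict

# Assumed ordinal bands, lowest to highest.
LEVELS = ["Not yet met", "Pass", "Credit", "Distinction", "High Distinction"]

# Hypothetical judgments in chronological order: (course unit, outcome, band).
records = [
    ("Unit 1", "communication", "Pass"),
    ("Unit 1", "critical analysis", "Credit"),
    ("Unit 2", "communication", "Credit"),
    ("Unit 3", "communication", "Distinction"),
    ("Unit 3", "critical analysis", "Credit"),
]


def trajectory_by_outcome(records):
    """Group judgments by outcome across units, preserving their order."""
    trajectories = defaultdict(list)
    for _unit, outcome, level in records:
        trajectories[outcome].append(level)
    return dict(trajectories)


def ipsative_change(levels):
    """Bands gained (or lost) between the first and latest judgment."""
    return LEVELS.index(levels[-1]) - LEVELS.index(levels[0])


for outcome, levels in trajectory_by_outcome(records).items():
    print(outcome, levels, ipsative_change(levels))
```

Because each trajectory belongs to a single outcome, feedback on an early task can be read directly against later judgments of the same outcome, which a unit-by-unit mark sheet does not allow.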

In summary, a standards-based approach to assessment would necessarily involve:

  1. Reporting against standards and learning outcomes

  2. The generation of as many reports as there are programme learning outcomes, thus tracking achievement by each outcome

  3. Focus on each student meeting every threshold

  4. Avoidance of averaging across different learning outcomes

  5. Students appreciating what a grade signifies in terms of what they are seeking to be able to do rather than simply what they should know

It also permits the issuing of degree transcripts that relate to meaningful units of activity such as programme learning outcomes, rather than an arbitrary division by subject or course unit. The question: ‘what can this graduate do?’ can be reflected directly in the information provided in transcripts.

It may also encompass other features beyond the minimum demanded. For example, higher grades may be awarded for achievements beyond the simple meeting of learning outcomes; particular recognition for outstanding achievement on particular major tasks (like Honours projects) or for meeting additional outcomes (e.g. through the use of digital badges) may be given (Oliver, 2016). Higher thresholds may also be set for admission into advanced classes or for selection into higher degrees. All of these are readily encompassed in a standards-based framework.

Most importantly, a standards-based approach enables students to create multiple validated portrayals of achievements for different purposes. These permit students themselves to scale up beyond the standard portrayals that all students get on graduation so they can present themselves in different ways to different audiences (e.g. for employers, for higher degrees). A necessary addition to programmes to allow this is the use of programme-wide portfolios as repositories for all work and assessments or for those that are essential to their future professional work. These need to be designed to facilitate both summative purposes of assessment and formative ones and be able to be curated by students for different purposes (Clarke & Boud, submitted for publication).

Like any approach to educational programmes, a standards-based approach can be misused to centralize decision-making, limit the exercise of professional judgment and inappropriately control legitimate diversity of approach. The major trap in any outcome-oriented approach is that outcomes are viewed behaviouristically and in an overly operational form. To go to these extremes is to miss what a standards approach is seeking to achieve. It is not to produce students who are clones of each other after completing the same programme, but to represent holistically what they can do at an appropriate level of aggregation and to ensure that minimal outcomes are assured in all areas of importance. It is interesting to observe the application of such an approach in some Australian universities over many cycles. The first iteration tends to be compliance driven, and outcomes are written in stereotypical forms that are not aligned with assessments. This is a pretence of a standards-based approach. The second phase begins to take the process seriously, but if not carefully monitored it can lead to excessive specification of outcomes, many checklists and a proliferation of summative assessments. It is probably a necessary stage in the process as academics come to terms with what a standards-based approach can offer. The third phase, some years later, involves a manageable number of programme learning outcomes and learning outcomes for each unit and the beginnings of good alignment of assessment at both the level of the programme and the course unit and the effective portrayal by students of what the programme has enabled them to demonstrate. As in any major reorientation, unless an institution is prepared to persist through several iterations, then the value of the approach will not be realized.

Progress to Date

Although there are many entire higher education systems that have moved in total or in part to a standards-based or outcome-oriented framework (Australia, Hong Kong, UK), and thus to a more transparent assessment regime, not all of the implications discussed above are yet manifest in all programmes in these jurisdictions, and some have yet to be taken up in major ways. Many of our assessment conventions were created for a different era. The specification of learning outcomes for course units is now commonplace, and the identification of programme level learning outcomes is in progress. When these outcomes are written for the first time, they rarely fully represent what is desired as they may be prompted by the need for compliance rather than an embracing of the framework. However, in the second and third iterations, more robust outcomes can result.

What is much less common is the working through of the implications for marks and grades and the aggregation of results. It is still normal for a single mark to be calculated for a course unit from a weighted average of marks in that unit and for these to be averaged over units in ways that are educationally unjustifiable. Tradition bears particularly heavily on the marking process, and it is often easier to implement change in curricula and learning outcomes than in the assessment process (Deneen & Boud, 2014).

Progress has also been made on the use of learning portfolios, although their use for whole programmes is much less common than it is within course units. It is only as the electronic portfolio becomes commonplace that the digital affordances it provides can enable their use for the purposes described here (Clarke & Boud, submitted for publication). Hard-copy portfolios are too unwieldy to allow for the multiple transformations for different purposes needed for a full scaling up effect on assessment.

Conclusion

As these changes in policy frameworks, assessment policies and curriculum design are adopted, transparency becomes an increasing feature of assessment. Standards and learning outcomes are explicit. The ways in which they are judged can be seen to address them, and the results of assessment acts are clearly related to what students are expected to do. The connection between what is claimed for a programme and what a successful graduate of the programme has achieved is readily available for inspection. Assessment results then can be used as part of quality enhancement and assurance processes in ways that are often not available at present.