Introduction

As discussed in Chap. 4, Iceland was a part of Denmark until the first half of the twentieth century, but was in many ways treated as loosely connected, e.g., in the field of education. Thus, the Icelandic educational system did not follow the evolution of the Danish system in the eighteenth and nineteenth centuries. This becomes particularly clear when comparing the statutes and accountability mechanisms in Denmark (Ydesen & Andreasen, 2014) and in Iceland (Guttormsson, 2008a). The general influence was, however, the same. A surprisingly well-developed notion of accountability was an underlying feature of early action, developed by the Pietist church. This included the use of inspection and testing. In Iceland this lasted in one form or another into the twentieth century, and both these aspects of the accountability mechanisms have been a part of the Icelandic discourse right up to 2020. We will gauge the development and character of these tools, used to ensure quality education, and explore to what extent we see clear evidence of Nordic or other influence. The focus of this exploration will be on evidence and action. We also endeavor to make the point that a historical perspective is crucial when attempting to understand the nature of the accountability efforts during the last few decades. We distinguish between, on the one hand, the basic principles behind attempts to ensure quality education, which we argue are fairly stable, and, on the other, the very different discourses, coming from different directions, that shape the approach during each time period. Before we discuss the twenty-first century, where we start in the 1990s, as the present accountability movement took off, we discuss two major earlier efforts. Therefore, our approach will be historical to a considerable extent, as this enables us to assess what is special and novel about the current push for evidence.
There is no attempt made in this paper to examine the apparent substantial influence of modern rhetoric on education, such as neo-liberal ideas or new public management. Our focus is narrowly on the issue of accountability and evidence with the distribution of responsibility also in the background.

Three Important Purposes of Evidence Use in Education

The current debate on the importance of data and the emphasis on evidence is emerging as new, modern and exciting (Jónasson, 2019), and turns out to be closely interwoven with policy and practice, in particular with the idea of accountability. However, when we look more closely, this connection is not so new. It is recurring and based on clear principles of care and responsibility and the demand of professionalism (which implies delegation of action and responsibility). The emphasis on evidence has grown and faded again in certain periods, partly because the arguments have already served their purpose or have not worked. What is particularly interesting is that the rhetoric surrounding evidence has sounded quite a different tune in different periods. That is why we will visit the previous arguments and actions and allow them to enlighten us in the current situation. We are inspired by historians of education, in particular Tyack and Cuban’s (1995) exploration of the development of US education throughout the twentieth century.

Within the Icelandic educational discourse, the debate on evidence has three strands. The first refers to accountability. This involves assembling evidence to ensure that a system is functioning and upholding necessary standards, essentially defined by those in charge. Accountability requires that various kinds of data are collected to evaluate input and output in order to demonstrate a well performing system or to discover where its weaknesses lie. Student performance, a crucial ingredient of the first strand, has also been used for tracking purposes, which we classify as the second strand. Tracking played an important part in the Icelandic system for the best part of the twentieth century, even though it became steadily less important as we moved towards its close, and it will be marginal in our discussion. The third strand is the process of using evidence for formative purposes, i.e., mainly to guide student learning, but it has not obtained the same status as the accountability strand. Here the focus is on the direction students are given, based on an analysis of their performance (Jónsson, Smith, & Geirsdóttir, 2018). With reference to whole systems, the first strand dominates, as opposed to the use of evidence to guide the learning of individual students, even though the latter seems to be gradually gaining strength.

Evidence, Accountability and Tests in Icelandic Schools

Many types of data are used for evaluation purposes within the compulsory school system in Iceland, and they come from an increasing number of sources, most of which are domestic. We have good access to indicators accumulated by Statistics Iceland. Moreover, a fairly well developed system of internal evaluation is in place, which relies on various data. These are, inter alia, tests and various other data constructed and collected by individual schools or teachers and the national tests, administered to students in grades 4, 7 and 9. Regular questionnaires are administered to students, parents and teachers in compulsory schools as part of the internal evaluation system, mostly delivered by Skólapúlsinn, a privately owned data collector. Some municipalities or individual schools use other tools, some of which are homemade. The system of external evaluation is gradually developing (Ólafsdóttir, 2016) and is based on data from the internal evaluation, existing documents, interviews and in situ observations. In addition, various diagnostic tests are used (Sverrisdóttir et al., 2020) to find students with learning difficulties. Some are used by the teachers (e.g. reading related tests) and some by professionals who work with various expert services or within the schools. The PISA results are by far the most prominent of currently used international data and they have influenced Icelandic educational debate since PISA started. There are also the TALIS data and the indicators provided by Eurydice.

It is relevant to distinguish between input data, which describes what goes into the system, and output data, i.e. what comes out in terms of student gain from the educational process. Those two categories are tightly interwoven, especially when the main concern is quality of education. Then the output often becomes the primary indicator of the sufficiency or quality of the input. Historically, the educational output, notably the ability of children to read, was primarily used to assess the input, i.e., how the families or the clergy or the teachers performed their duties. On this basis, strand one, accountability, for a long time dominated the other two, and apparently still does.

Accountability – And Evaluation in Icelandic Education

Here we focus on different historical periods and perspectives with the aim of shedding light on the development of accountability mechanisms. We will begin with a discussion of the Harboe inspection in the 1740s. Next, we turn to the school and study inspection period in the 1930s. We then briefly examine the status and use of national tests within compulsory education from the 1970s to the early 1990s, during which accountability was temporarily less emphasized and tracking largely disappeared. We then proceed to the last three decades. We will gloss over a host of interesting and important developments and details in the evidence and accountability arena (Proppé, 1999).

The Harboe Inspection

Icelandic education was in the hands of families and the clergy for many centuries and the authorities' emphasis was on religious education. The attention paid to education of the young gradually increased, but very slowly through the centuries. Inspections, e.g. the visitatores in 1307, which led to the defrocking of some priests, and edicts, in particular on confirmation in 1635, bear witness to some concern for education (Sigurðsson, 1842/1994). It is clear, however, that only occasionally was Iceland in synchrony with our governing partner Denmark (Guttormsson, 2008a, pp. 75–89), and then only to a limited extent (on the developments in Denmark, see Ydesen & Andreasen, 2014).

Under the influence of the Pietist movement, a major effort was made to ensure that Icelandic children, boys and girls equally, were taught to read. The Harboe mission (1741–1745) is perhaps the most thorough, and truly external, evaluation of Icelandic education ever undertaken. Harboe and his associates travelled most of the country and checked the knowledge, attitudes and general competence of priests and the reading proficiency and knowledge of youths aged between 12 and 17 years (Guttormsson, 2000, 2008c). This was a serious and ambitious evaluation project that was well known among the lay people, and it led to interest in establishing schools (which had at best a minor effect) and providing teaching materials. Moreover, a number of progressive edicts were issued and the accountability of families and the clergy was clarified in the process. There was also a shift in governance, as more authority was transferred to the secular heads from the bishops (Jóhannesson, 1945). There is no doubt that matters did improve in the following decades and it seems that this progress was, at least partly, attributable to the evaluation process and the subsequent action. A number of other developments were, of course, taking place during the same period and some (perhaps much) of what Harboe intended to happen did not transpire (Ólason & Jóhannesson, 1943). Even though the output variables, reading proficiency and knowledge of the young, counted as major indicators in the process, extensive observations and discussions, especially concerning the state of the clergy, also took place. The effort was about quality, responsibility, and thus governance: what facilities and competences were required and who should be accountable. There is no question that an ambitious and competent external evaluation had taken place and the authorities took (some) subsequent action to improve things.

Evaluation for Equity in the 1930s and 1940s

Iceland underwent gradual urbanization during the first part of the twentieth century, and concurrently the value of education was increasingly being recognized by the people at large. The country was, nonetheless, still a rural country and many children received little education, often in home-schooling settings, taught by teachers who travelled between farms (Guttormsson, 1992). In 1918, 48% of school age children were in these loose school settings, compared to 39% in 1928. There were, in the early decades, two merging currents of thought gaining momentum with respect to the development of education: a growing impatience with the unequal educational conditions in the country, and a very strong argument for using transparent, fair testing procedures, inter alia to demonstrate the apparent inequality. The latter showed a clear US influence on the educational discourse through Steingrímur Arason (Indriðadóttir, 1995), who studied at Columbia University and emphasized written and preferably national tests.

The massive effort developing to enhance the quality of Icelandic compulsory education in the 1920s was partly driven by teachers who demanded fairness and equal provision for all children. This included at least four lines of action (Guttormsson, 1992, 2008d). The first was to strengthen the responsibility of the central government to take action by expanding the inspection role of the National Education Director’s office (Lög um fræðslumálastjórn, No. 474/1930). An educational board was also appointed, tasked with deliberating on the curriculum for compulsory education, among other things. The second was to clarify the responsibility of the school districts around the country. The third was to appoint a number of inspectors, who were normally well respected teachers. Their role included visits to each school district once a year as well as sketching reports intended for the authorities. The fourth was to administer a number of national (written) tests, see Table 9.1.

Table 9.1 List of subjects tested nationally 1929–1937. The ranges 7–14, 12–14 and 10–14 refer to the ages of the children tested

The idea is reminiscent of the Harboe effort nearly two centuries earlier, now of course with a modern twist and with more subjects than reading. The output measure, i.e. the student performance, was again central to the effort and also the teaching, but the facilities were now also perceived as important. Moreover, it became increasingly clear that the intention was to ensure the equality of opportunities, not least with regard to where people lived and thus forms of schooling. The form varied greatly between sparsely populated areas, villages and towns. In the adopted arrangement all pupils were offered the same tests, and the marking should not be swayed by any potential teacher bias. Table 9.1 shows the testing effort that was set in motion, but also partly indicates its fading, even though the financing of the inspectors faded sooner.

The inspector system was dismantled for financial reasons – but was soon revived in a much weaker form, and some of the tests are still in place. There is, however, no doubt that the original plan was very ambitious and extensive. Apparently, it had many of the features of our modern external evaluation system, which we would characterize as soft governance, with no high stakes, as it was not punitive in any way, and the inspectors, who were respected teachers, clearly had a formative role.

National Tests 1974–2020

The testing mechanism established during the 1930s was retained to some extent until 1974 (Proppé, 1999; Sverrisdóttir et al., 2020). Given the original intention, which was primarily to observe the quality of input, summative data was sufficient. It is also noteworthy that the explicit inspection role weakened as the school system continued to develop, but the tests remained in place and gradually acquired a tracking or sorting role, which was largely removed in 1974. However, the tests still kept their place in the 1974 law and a new official role emerged, but not very clearly, i.e., to make the tests serve a more formative function. Now, some decades later, we still seem to be moving, but slowly in that direction. A recent extensive study on various aspects of compulsory education is, however, not conclusive on this score, even though the ways in which teachers assess pupils seem to be quite varied (Sigurgeirsson, Björnsdóttir, Óskarsdóttir, & Jónsdóttir, 2014). Thus, we still have the interesting situation that much data is collected in the school system, in a variety of ways, with little evidence available about its actual use or usefulness.

Table 9.2 shows the proportion (%) of students taking the national tests, some of which were optional, as the grade 10 tests were between 2000 and 2007. The table is meant to indicate three things. First, even though both the number and formal status of the tests oscillated (Sverrisdóttir et al., 2020), their presence persisted. Secondly, the proportion of students taking the tests towards the end of compulsory school did not vary depending on the formal status of the tests. The stakes were essentially low throughout the whole period, even though their value for selecting schools or tracks in upper secondary education persisted and was important for some students. Thirdly, the assessment in grades 4 and 7 was introduced partly for ensuring school quality, but still mainly for formative use by teachers and schools (Ministry of Education, Science and Culture [hereafter MoESC], 1998).

Table 9.2 The percentage of students who took the national tests in the period between 1977 (when the 1974 act came into effect) and 2018

During the last decade of the twentieth century there was a revival of the previous emphasis on quality through evaluation. Even though testing was important and introduced for the early and middle classes in the compulsory school (see Table 9.2), it was not a major priority. Rather, testing consisted primarily of internal evaluation by the schools themselves and then external evaluation of schools that was introduced by law in 1995 (see further discussion in Chap. 4).

As noted above, it is not clear to what extent the national tests, or any tests, are pointedly used for a formative purpose vis-à-vis individual students, but it probably varies among schools and teachers. However, there are indications that they have directly influenced the general curriculum approach in the final classes of compulsory school in the subjects tested (Sigþórsson, 2008; Þráinsdóttir, 2010).

Accountability, Evaluation and International Comparisons in the 1990s – The Initial Steps

In the 1990s, at least three important developments affected the educational discourse in Iceland. First was the development of international and ostensibly coordinated tests (now PISA, see Sellar, Rutkowski, & Thompson, 2017), that (finally) allowed (or at least invited) comparison between countries. The second is the gradual emergence of a neo-liberal rhetoric connected to education, which has influenced education in various ways, both globally as well as in Iceland (Dýrfjörð, 2011; Magnúsdóttir, 2013; Skúlason, 2008). The third relates to the older, but increasingly accepted notion that education should mainly serve the build-up of a powerful economy (e.g. Brown, Lauder, Ashton, Yingje, & Vincent-Lancrin, 2008). The tests together with the competitive element of the neo-liberal ideology created the grounds for an international competition in which states sought to prove who could produce the most economically proficient citizenry.

As a precursor to the PISA study, the first international study that Iceland participated in was the 1991 IEA study on reading skills (Elley, 1994). In all, 32 countries took part and it was interesting that already here the primary comparison in Iceland was with the Nordic countries. For the nine-year-olds, Iceland underperformed compared to the other Nordic countries. The Minister of Education commented on the results, in response to some critical comments. He stated that the Nordic countries set the bar for any meaningful comparison and noted further that even though the nine-year-old children did relatively badly in the test, the same did not apply to the 14-year-olds. Actually, they came on top of the 32 countries, in the part on expository text (Einarsson, 1992; Valgeirsdóttir, 1997). Following the IEA study, Iceland participated in PIRLS in 2001 and 2006 (Jónasson, 2008b). Iceland did comparatively well in those and there was seemingly little debate or discussion about PIRLS among the general public, policy makers, or academics.

Iceland participated in the 1994/1995 TIMSS study and those results had influence on the policy discourse and the general education debate. Public dispute arose on teachers’ education, school development, curriculum, and study material in mathematics and the natural sciences, as Icelandic children did rather poorly in the study (e.g. Aðalsteinsdóttir, 2007; Diego, 1997). The debate centered on the need for better preparation of teachers, both in terms of the length of the teacher education program and the contents of it, as argued by e.g. leaders of teacher education institutions and the teachers’ union. Other points of view were discussed, such as implementing a performance related wage system for teachers, and the need to again track students based on their grades. According to Aðalsteinsdóttir (2007), no actual formal work was done to explore and contemplate the evidence provided by the results of TIMSS. Even so, TIMSS had considerable impact on curriculum-making in Iceland. Indeed, Jónasson (2008b) has argued that no other international study had at that point generated as much direct impact on curriculum-making in Iceland and Sigþórsson (2008) indicates that considerable changes were made to mathematics and natural science curriculum and teaching materials in the aftermath of TIMSS. Furthermore, Bjarnason (1998), who was the Minister of Education at the time, stated that the curriculum in mathematics was being revised to respond to the poor TIMSS results. All this would indicate the social technological influences of international organizations.

A Variety of Evaluations and Reform Discourses

In the above, we have emphasized formal evaluations of the system and the use of testing, some of which are leftovers from the earlier evaluation efforts. There have been numerous other evaluations, notably by OECD in 1986 (OECD, 1987), which was an external evaluation of the complete educational system, with recommendations, based on interviews and visits, but not of the scale of the previous major evaluations. There have also been several evaluations focusing on specific aspects of Icelandic education at the national level, such as of mathematics education (Þórðardóttir & Hermannsson, 2012), and Icelandic (Sverrisdóttir & Valsdóttir, 2012). Perhaps the most ambitious of those was an extensive evaluation of arts and crafts teaching, conducted by Anne Bamford (2011) in 2008/2009. It is very difficult to figure out the impact of these evaluations, except the last noted, which apparently has not been attended to yet. In regular OECD reports on the Icelandic economy, there are often chapters on education, which could thus be classified as an indirect external evaluation of the system located at the national level (e.g. OECD, 2019, see thematic chapters).

The Second Decade of the Twenty-First Century – Moving Towards 2020

Since the first PISA assessment in 2000, Icelandic students have scored lower in reading literacy each time, except in 2009 (Directorate of Education, 2019, Fig. 1.1). The results usually generate considerable debate (though normally temporary) about education and the quality of the Icelandic school system. Comparison to results from the other Nordic countries is always prominent and usually dominates the discussion, at least when the results are presented and sometimes when norms are considered, e.g. when it is suggested that study hours in Icelandic should be increased in accordance with the mother tongue curriculum in other Nordic countries (Directorate of Education, 2019; MoESC, 2011, 2014).

The White Paper published by the MoESC (2014) was substantially and explicitly influenced by the PISA results. One of the main initiatives introduced in the paper was a literacy project, whose importance was underscored by OECD’s international comparison (MoESC, 2014). Consequently, a long-term, national agreement on literacy was signed by the Minister of Education, the Association of Municipalities and Home and School – the National Parents Association. The project received considerable funding from the state’s budget and involved inter alia more emphasis on regular testing throughout the compulsory school level and extensive literacy counselling services located at and coordinated by the Directorate of Education. The main aim of the project, according to the White Paper, was that at least 90% of Icelandic students would reach level 3 in PISA reading literacy, from 79% in PISA 2012. The score was 78% in PISA 2015 and in PISA 2018 it was 74% (Directorate of Education, 2019, Fig. 1.2). Thus there is some way to go.

In addition to partaking in PISA, Iceland participated in TALIS in 2008, 2013 and 2018. The results have, for example, shown that Icelandic teachers receive comparatively little formal feedback on their work, which has been used as an argument for a greater emphasis and structuring of external evaluation of compulsory schools. Results from TALIS have also been used to rationalize the lengthening of the teacher education programs (MoESC, 2016). Further, in a report on teacher professional development, the authors repeatedly use the TALIS surveys to underpin their deliberations (MoESC, 2019).

In addition to literacy, inclusion is a major issue in Icelandic education. The inclusion discourse has taken on many guises for well over a century (Jónasson, 2008c). It started with a focus on special groups, e.g. deaf and blind children, and then gradually included children with learning or behavior problems. The 1946 law on primary education asserted that everybody had a right to education, but for some in special institutions. An inclusive step was taken with the law in 1974, but it needed a real push, which arrived in the form of government regulations in 1991. Even so, inclusion is still a contested issue, both inside and outside the school system (Magnúsdóttir, 2016; Marinósson & Bjarnason, 2014; Sigurðardóttir, Guðjónsdóttir, & Karlsdóttir, 2014). An evaluation was undertaken by the European Agency for Special Needs and Inclusive Education (2017), with extensive consultation, and it produced a number of recommendations. Among the conclusions was that the notion of inclusion was very unclear within the education system and thus took on the status of a floating signifier (Krejsler, 2017), which is problematic, as these issues would benefit from firm understanding and action. The government is currently working on plans to respond to the recommendations (see MoESC, 2019), but it is still unclear how this will play out in the long run. It is most noteworthy how the evaluation part of this discourse has remained totally outside the regime or culture created by OECD or PISA.

The Use of Research

In recent decades, there has been a growing number of research projects that have looked at various aspects of the conduct of education. A number of national and large-scale international research projects have been undertaken, e.g. on compulsory school practices (Óskarsdóttir, 2014), diversity and social justice in education (Lefever & Ragnarsdóttir, 2018), Icelandic language education (Jónsson & Angantýsson, 2018), and upper secondary school practices (Óskarsdóttir, 2018). Large scale national studies have been carried out regularly on student well-being, in particular by Rannsóknir and Greining (n.d.). Iceland has also participated in an international study monitoring health and behavior in school-aged children (HBSC) (e.g. Arnarsson, 2019). In the last 15 years, a great number of PhD studies on education have been undertaken, along with hundreds of master's theses targeting all school levels, many of which contain evaluative or potentially formative material. The available evidence does not indicate that research is much used in policy making or action (Ragnarsdóttir, Jóhannesson, Jónasson, & Halldórsdóttir, 2020). Although a large majority of those involved come from the educational field, either working in the education system or engaged in educating the professionals, there is possibly a challenging disconnection between the worlds of research and practice.

A Return to the Question of Influence in an Evidence-Based World

We have noted the clear rhetorical emphasis on the use of evidence within the Icelandic educational arena. We have highlighted this by exploring several different arenas, which all point in the same direction.

For many decades, data on various aspects of the education system have been accumulated and published by Statistics Iceland. These show, e.g., the growth of the system in terms of student numbers, the money spent per pupil, the number of teachers, the number of pupils with foreign background or how many opt for vocational studies. Most of this data is essentially input data used to describe the system. There is also output data, e.g., indicating how many drop out of a certain level in the system or how many graduate. The data is readily accessible and often presented, but its actual utility in underpinning change is rarely transparent, even though its descriptive use is clear. It is normally output data, e.g. test scores or dropout, that is used to argue that action is needed, but input data is occasionally used to justify its direction, e.g., that the number of hours used to teach the mother tongue is relatively low compared to that in other countries (MoESC, 2014). Moving away from the statistical indicators, there are the national test data for grades 4, 7 and 9 in the compulsory school, which is meant to inform the pupils and their parents, the teachers of course, but also the school leadership, as well as the municipalities about the pupils’ performance. This is supposed to guide their actions, but the extent to which this is used is not clear. In addition, there is various performance evidence produced by individual teachers about their students (see above). As far as individual students are concerned, tracking has largely disappeared and those arguing for a policy on individualized teaching and inclusion have for a number of decades promoted the idea of using tests and other student data for individualized formative purposes.

Thus data, but mainly output data, is meant to underpin action. Both the national indicators and the national tests have a clear institutional base, i.e., Statistics Iceland and the Directorate of Education. This indicates that the attention paid to data, and its use as evidence, is apparently gradually growing in the Icelandic educational culture. When we add to this the various data collected for the sake of internal and external evaluation of schools, and domestic research on youth and education, in addition to data collected by individual teachers as discussed above, we start to glimpse a stronger image of how Icelandic data on education is structured.

The international part is no less important as it seems to be a driver for change. Iceland now increasingly ensures that data on all aspects of the education system, required by UNESCO and Eurydice, is supplied in order to allow comparison with other countries. We have already mentioned participation in international assessment, of which PISA is the dominant one. The White Paper published in 2014 based its goals largely on international data (MoESC, 2014), partly from PISA studies. Thus, the international participation and influence are apparently strong and here we intend to explore them further.

Nordic or Other Influences on Icelandic Education

In order to understand the trends and fluctuations in the use of evidence, we search for indications of external influence in the educational arena, in particular from the Nordic countries. From the discussion in Chap. 4, there is an obvious and natural Danish influence in Iceland, even though it is not always as strong as might be expected on the basis of the close historical ties between the two countries (which were for a long time the same country). The history of Iceland shows multifaceted and close ties with all the Nordic countries over the last two centuries, where the recent ties are perhaps best symbolized by the establishment of the Nordic Council in 1952 (with Finland joining in 1955). But that only tells a very small part of the story. The Nordic communication has existed on multiple levels, but its extent has not been mapped by research, at least not in the arena of education. There are, however, numerous indications that the Nordic interaction continued to grow in various directions toward the latter part of the twentieth century and into the twenty-first. In order to obtain an overview of the development of the relationship in recent decades it may be relevant to focus briefly on different, but overlapping arenas.

We suggest four spheres or arenas for Nordic communication and their influence on Icelandic education, even if we are not able here to disentangle the influences for the different levels of education. Neither does this analysis determine or even indicate what the impact is in terms of practices or policies. But we presume it is substantial in some cases. These spheres are: The national policy arena, which perhaps normally receives the most attention; the administrative arena, involving administrators, also at various levels within the systems; the practical arena, including meetings of practitioners, also at various levels in the systems; and finally the scientific sphere, including academics attending conferences, publishing papers in journals and networking.

In the policy arena, there are principally three related categories of operation. One is the annual Nordic Council meetings among members of parliament from all the Nordic countries. The second is under the umbrella term of the Nordic Council of Ministers, where an elaborate infrastructure has been set up and reaches, inter alia, all levels of education. The third is the Nordic cooperation at the international level, such as within the OECD and UNESCO, but also with various other agencies. In the administrative arena, often closely connected to the policy arena, there are meetings at various levels. Administrators at the ministries meet regularly, and so do those at special governmental agencies, such as the Directorates of Education. The education officers within municipal administration meet and there are also regular meetings of the teacher unions. In addition, there are various ad hoc groups. In the more practically oriented arenas, there are nearly 400 Nordplus programs initiated every year, with probably half related to compulsory education. The programs support various activities, such as visits and Nordic conferences with a practical orientation. Iceland is an active partner in both types of projects. In the scientific or academic arena, there are the conferences (e.g. NERA and Northern Lights), journals and networks. Searching the internet, we find over twenty regularly held Nordic conferences (held every year, or every second or third year) within various fields of education. Similarly, at least twenty educational journals have the term Nordic or Scandinavian in their title. There are several research projects conducted within the Nordforsk framework (e.g. within the Education for Tomorrow Program). Iceland participates actively within all four arenas.

The point is that when we start to describe and analyze the visible and formal ties, and attribute influence, as shown by formal acknowledgement, in documentation or in line with certain formulations, we have to deal with three types of problems. One problem is that the underlying principles, e.g. of a program of evaluation and quality control, can harmonize with different rhetorical or political discourses. That does not, however, necessarily imply that the discourses directly influence the programs. The causal relationship may be difficult to establish. The second is, as has been implied above, that the developments may be influenced by a host of interactions that are not clearly visible, especially if we look primarily at the policy level and neglect the plethora of other levels of active interaction. Such neglect may lead us to undervalue the very substantial other ties that are not formalized within a national institutional structure. Thirdly, it may be difficult to establish which changes planned by policy did in fact materialize in the end, and which of the changes that did take place were largely in line with long-term underlying developments, no less than with expressed policy.

Connections and Influences of International Organizations

Alongside strong Nordic influences, robust ties have been forged with international agencies, particularly the OECD and the EU (through the EEA). At the same time, the Nordic countries are also important participants in both organizations and may often have considerable influence, sometimes successfully promoting coordinated views. This may also hold for other important organizations, such as the Council of Europe. Such influences may be difficult to evaluate, even when they are considerable. Here, we mention three examples of policy changes that have explicitly been influenced by international forces.

First, we note the explicit influences from the OECD, which are usually grounded in large-scale indicators and comparison between nation states. Iceland has submitted data on education for publication in reports and comparisons by the OECD, such as Education at a Glance, since 1996. The comparisons published in these reports have been used in arguments for greater financial contributions to the education system, and have influenced discussion on forms of operation of schools and the content of the mathematics curriculum, to name a few. Several suggestions from an OECD report from 1986 (Jónasson, 2008b, p. 266) have been implemented (some only several years later), such as making the first year of primary school obligatory, adding time to teacher education, and shortening the length of studies leading to matriculation in upper secondary education. It may still be problematic to pinpoint the exact causal links.

Secondly, discussions about the aims and policies of UNESCO have been included in Icelandic educational journals since the foundation of UNESCO (Halldórsson, 1947). Those messages have not translated directly into policymaking and mainstream practices, but recently, following the approval of the Sustainable Development Goals, UNESCO seems to have gained a stronger influence in Iceland as the Government has adopted several priority targets for implementation (Sustainable Development Goals, n.d.). A few compulsory and upper secondary schools have qualified as UNESCO schools (UNA Iceland, n.d.).

Thirdly, some European influences are visible in recent policy changes in Iceland, at all school levels. The government has worked towards identifying a National Qualification Framework (NQF) for all school levels, compatible with the European Qualifications Framework (EQF). Also, the European discourse includes eight key competences for life-long learning, some of which have been included in the Icelandic curriculum formulation (e.g. visible in MoESC, 2013).

Exploring Nordic influence on Icelandic culture, and education in particular, is a fascinating project because it is so multifaceted, and the results are in many ways paradoxical. From one perspective there is wide-reaching interaction, but at the same time the visible influence is less than this might indicate. A large part of the problem is that some of the potentially most interesting influences occur under the surface and their actual impact is therefore difficult to ascertain. The main conclusions of our exploration are fourfold. The first is that our education system is by descent a Nordic system. Right up to the present moment, the Nordic legacy, influence and connections can be found practically everywhere. In the current PISA discourse, it has been suggested, for example, that we assimilate ideas from the Swedish project Matematiklyftet (Directorate of Education, 2019, p. 81). Secondly, in some important ways we have developed differently simply due to geographical and cultural differences, but that by no means applies to all parts of the intricate educational process. Thirdly, there have been quite substantial and visible influences from elsewhere, as discussed above. The fourth conclusion is that the influences depend very much on where one looks, i.e. at policy (e.g. inclusion or tracking) or rhetoric; at system issues or at the school level, curricular or pedagogical issues, where one finds clear signs of influence.

To conclude, from the Icelandic perspective, the inspection and evaluation of Icelandic education, especially in the eighteenth and nineteenth centuries, were simply Danish procedures adapted to the local setting. It is rather obvious that the modus operandi of our schools was Danish even though deliberations about education became Icelandic (i.e. as of an independent state), even while we were a part of Denmark (Briem, 1900, 1901; Guttormsson, 2008b; Sigurðsson, 1842/1994). Nevertheless, the evaluation and inspection regime engineered in the 1920s and 30s came as much from the US, in particular the institution of inspectors and the use of written tests. The essence of the task was nevertheless very similar to what it had been before. The third wave of formal evaluation, which we saw originating in the 1990s, did not come from the Nordic countries, but from the OECD, and OECD countries that were strong on inspection and tests.

Discussion

Governments often act responsibly (also the Danish monarch). They also understand, based on an ongoing discourse, that education is important and it is partly (or wholly) their responsibility to ensure high quality for every pupil (which is perhaps still the most important feature of Nordic educational thinking). The Nordic governments have on the whole, at least up to the twenty-first century, not taken the view that the market will ensure quality education and thus they have instituted an inspection mechanism of the public system that keeps its performance in check. We have noted three essentially similar efforts in Iceland, far apart in time, with essentially the same ingredients, but with different rhetoric and operational rationale.

Returning to the Question of the Impact of Evidence on Policy

We have touched on this crucial issue but cannot really tackle it, mainly because the relevant evidence is not available. This is the question of the use of data in the actual implementation of policy or other guided action (Ball, Maguire, & Braun, 2012; Coburn, 2004). There are ample cases which show that a policy was formed on the basis of inspection, but there is little formal evaluation available of the direct and exclusive long-term impact. A system is always developing, and changes may take place in line with intended policy action but might have taken place anyway. Methodological constraints may thus hamper attributions to the evidence-based policy in question.

The main responses visible are those that stem from demands that action should be taken, that something needs to be done. The extent to which the focus through the centuries is on reading is interesting (and understandable). This was the chief concern in the 1740s and clearly in the 1930s and is a priority in the twenty-first century. In all cases the evaluations or results have produced lively and possibly very important debates, but whether the extent of these or the policies that emerged had an overriding or a long-lasting effect in the classroom has not been established.

The Nordic and International Connections

We have indicated that the numerous and multifaceted Nordic connections work at many levels, and are often only visible at very close quarters. There is no doubt that the basic characteristics of the Icelandic school system are Nordic (Danish), but the system development and details are different in many ways. There are probably four main reasons for this. First, the conditions in Iceland meant that the system started to develop much later than the other systems, and thus had a starting point in a cultural climate different from those that characterized the beginning in the other countries, despite the close cultural and political ties. Secondly, the influence of the rural character of Icelandic society lasted long after the school system took off, which crucially affected its composition and development far into the twentieth century (Guttormsson, 2008a). Thirdly, while the system was being shaped, we received influence from outside the Nordic sphere: from other parts of Europe, and also from the US. The 1920s and 30s were an important formative period, when these influences had their effects. And fourthly, when we moved towards rethinking some of our modus operandi towards the end of the twentieth century, we had come under the strong influence of the OECD and also the neo-liberal turn, rather than looking mainly towards the Nordic countries (even though a similar rhetorical influence was seen there). Thus, in the developments in the later accountability and evaluation phases, we happened to take controlling mechanisms, at least to an important extent, from the US in the first half of the twentieth century, and we did not look to the Nordic countries when forming the last accountability phase in the 1990s and into the twenty-first century.

Having noted that our action plans do not necessarily stem from the Nordic countries, at least not in the accountability realm, it is clear that when we look for the rationale for improvement, we compare ourselves, practically exclusively, to the Nordic countries, in recent years largely using data from PISA (see e.g., MoESC, 2014). This is also clearly evident in the Northern Lights conferences and in the parallel publications.

Three Emerging Questions

In our analysis, three important questions emerged. The first, which is implicit in our approach, is to ask how novel the emphasis on data, evaluation, accountability, tests or other monitoring of student progress truly is. We have shown that this is definitely not very new. A move towards district and school independence in the 1930s was very explicit and went hand in hand with a demand for equality in the delivery of education. It was therefore accompanied by a call for a measurement and inspection mechanism, which was set in motion at the time. Its major ingredients did not, however, last for much more than a decade (and some parts lasted a much shorter time), but given the difference in context, it was initially no less ambitious and extensive than what we have seen in recent years. The analysis of this period, from the perspective of evidence use, firmly suggests that when looking at development in education we have to take the long-term perspective, in order to understand both continuity and change.

The second question that emerged is about the use of all this evidence. How is it used? By whom? The implicit question is how useful it is to spend considerable resources on the collection and presentation of various types of evidence. It is always assumed, of course, by those responsible for the distribution of money, that evidence is important, even crucial, for developing education, in particular helping pupils to get the most out of their education. During both the 1930s and recent decades an enormous amount of data, largely test data, has been collected. The actual use and influence for the development of education is, unfortunately, difficult to ascertain, in particular in the former period. But we suggest that even though our educational facilities, professional knowledge and teaching and learning conditions have been constantly and ambitiously improved, we have little clear indication that the evidence accumulated played a substantial and specific role, except in the superficial (but admittedly important) way of spurring the actors on: we must do better. An interesting development in this connection deserves further study. As expertise and professionalism develop, it is possible that one sees different and relatively closed arenas of expertise forming. Notable examples would be experts analyzing the massive data available (e.g., test data), experts using diagnostic tests (e.g. psychologists or kindred professionals), the teachers who are directly responsible for internal evaluation, but perhaps most interestingly, the vast army of educational researchers at the universities and special research institutes. Some of these may have problems communicating the relevance of the evidence they accumulate to those who would benefit by taking it into account. All these groups are perhaps also hampered by their own lack of understanding that the material they have gives far less direction for action than is thought to be implicit in the evidence collection exercise (Jónasson, 2019).

The third question is about the origin or roots of the notion of using evidence to enhance the quality of education and what mechanisms must be in place for it to work. Where do the ideas mainly stem from? Are we, in the field of Icelandic education, chiefly influenced by the other Nordic countries, e.g. because of close cultural bonds, or are we preoccupied with ideas from other directions? This turns out to be a very complex issue. It is clear that the Icelandic connection with these countries exists at many levels. It is definitely present within the policy arena, but several other arenas seem to be no less important when exploring the collaboration, cooperation or influence. Wherever we look, the connection to the Nordic countries seems to be strong. Nevertheless, focusing on the arena of evaluation and accountability, including testing, we have argued that both the initiative in the 1930s and that of recent decades were clearly much, if not dominantly, under influence from other directions. Thus, to the extent that the Icelandic system of evaluation and testing may be similar to those of the other Nordic countries, it is still partly homemade, but also draws substantial influence from outside the Nordic countries.

Thus, we conclude that when the focus is on the use of evidence, in particular evaluation and testing, we are in many important ways moving towards the ambitious accountability and testing mechanisms that were in place for a while in the 1930s and 40s, and even much earlier. We also conclude that the Nordic influence, despite permeating all levels of our education enterprise, is not the major contributing factor to the way we construct our action plans for the collection of evidence as well as its use.