1 Introduction

Research on the teaching and learning of statistics emerged in the late twentieth century (Zieffler et al., 2018). Much of this early work came from the college level and was spurred by the changing nature of technology and its impact on the discipline and consequently on what was taught and how it was taught. Of note was the work of Tukey (1977), who introduced “exploratory data analysis”, an approach involving exploratory techniques, many of them visual, to uncover patterns, underlying structure, and exceptions in a dataset. Key papers by Cobb and Moore (1997), Moore (1997), Wild and Pfannkuch (1999), Gal (2002), and Cobb (2007) laid the groundwork for further thinking about statistics education and helped the field come to some consensus about important topics related to the development of statistical understanding. The groundwork included the acknowledgement that statistics was not a branch of mathematics but an independent discipline with its own distinctive modes of reasoning and that technology had shifted statistics “from overtly mathematical approaches to data analysis toward computationally intensive approaches” (Wild et al., 2018). Statistical investigations (Wild & Pfannkuch, 1999) and literacy (Gal, 2002) were considered important foundations for learning statistics, including developing dispositions such as a critical stance. Furthermore, in a data-rich, evidence-based world, there was recognition that all educated people needed to understand statistical ideas, whether they were professional users of statistics, consumers of statistics, or decision makers. To enculturate students into the thinking and practice of statisticians, Lehrer and English’s (2018) proposal that students should engage in authentic activities that are approximations to professional practice and co-construct the practice of investigative inquiry alongside conceptual development has gained traction in statistics education research and many curricula (e.g., Ministry of Education, 2007). At the senior level of high school, computer simulation-based approaches such as bootstrapping and randomization tests (Cobb, 2007), rather than mathematical inferential methods, began to be adopted in statistics education. For younger students, exploratory data analysis and statistical modelling tools (Biehler et al., 2013) became more prevalent as statistics education and research sought to provide opportunities for students to investigate multivariate datasets and model situations and to develop statistical reasoning and conceptual understanding. However, statistical literacy in Gal’s (2002) definition, whereby people, in response to diverse media reports, activate a set of critical questions to evaluate and communicate reasoned opinions on claims made, has gained limited attention in research and curricula.

Pfannkuch (2022) contended that statistics curricula are shaped by, and evolve in response to, changes in technology, changes in the discipline, the needs and demands of society, research into students’ learning and reasoning, and statisticians’ calls for reforms. Building on previous research, Pfannkuch (2022), in reimagining curriculum approaches and anticipating possible changes, emphasized the need to foster statistical reasoning and argumentation, including understanding the data context, interrogating data, and using data as evidence for making and supporting claims for both inquiry and advocacy. On the assumption that technology was an integral part of statistics curricula, she argued that curricular approaches should include (1) immersing students in data-rich environments, statistical investigations, and modeling, (2) critically evaluating data-based arguments in diverse media, including arguments about risk, and (3) facilitating accessibility of statistical concepts through interactive visualizations, learning how to scaffold students’ reasoning, and providing coherent conceptual pathways. Technology has continued to advance since Pfannkuch (2022) reimagined curriculum approaches, and hence the landscape of statistics education has undergone considerable change as new directions and learning contexts emerge. Building on the work of Pfannkuch (2022) and the groundwork of the five key papers we identified above, this paper focuses on four emerging trends from 2017 to mid-2022.

2 Method

Using a narrative review based on Delphi methods (Puig & Adams, 2018), we selected an expert group composed of 26 members of the statistics education community from 15 countries, based on their publication history, involvement in projects related to the teaching and learning of statistics, and contributions to International Association for Statistical Education conferences, while being mindful of geographic diversity. These individuals were asked to describe current trends they had observed in the field and to identify interesting and relevant papers related to those trends. The experts we consulted identified over 200 resources as relevant, including articles published in statistics or mathematics education journals, book chapters, conference proceedings, and curricular documents as well as publications in press (e.g., Ridgway, 2022) and projects currently being conducted (e.g., ProDaBi project, https://www.prodabi.de/en/). In addition, we looked at special issues such as those from Statistics Education Research Journal (2017, 2021, 2022), ZDM (2018), Teaching Statistics (2021), Journal of Statistics and Data Science Education (2021), and books such as Ben-Zvi et al. (2018) and Leavy et al. (2018). We focused on future directions for statistics education research and thus included articles based on opinion or principles if the arguments made a strong, evidence-supported case for why the idea was important. To help narrow the candidates for inclusion, we established the following criteria for selecting papers (Wilson et al., 2001). The papers had to be:

  • Published within the period from 2017 to mid-2022.

  • Significant, challenging current thinking.

  • Relevant to the identified trends noted by the expert group.

  • Evidence based.

  • Scholarly.

Studies were discarded for four reasons: (1) they were not directly related to one of the identified emerging trends; (2) they focused on ideas that had been well covered in papers prior to 2017; (3) they did not challenge current thinking; or (4) they were based on a single program or a country-specific problem that was not applicable across the field of statistics education. In the end, we included 50 papers in this review that highlight the relevance and importance of each theme, challenge what should be taught, or suggest new ways of thinking about the teaching and learning of statistics. A particularly important paper for each theme is annotated in the references.

We analyzed the 24 responses received using a reductive approach by way of inductive category formation (Mayring, 2014). A first pass through the suggestions, consistent with an open coding technique drawn from grounded theory (Strauss & Corbin, 1998) and referring to the five key papers and Pfannkuch (2022) as described above, enabled us to identify broad categories for emerging themes. Through many discussions and more detailed text analysis, we identified four categories, with the first three aligned to Pfannkuch’s reimagined curricular approaches: Data Science, “Visibilizing” Statistical Concepts, Social Statistics, and New Learning Contexts. These trends are driven by the impact of technology as well as by the dual idea of data consumer and data producer, recognizing that to participate as an informed citizen in today’s world students must experience both. By data consumer we mean those who interact with data-based information produced by someone else’s efforts, and by data producer we mean those who engage in empirical investigations, interpret their own data, and report their conclusions (Gal, 2002). Figure 1 illustrates these emerging trends with associated subthemes.

Fig. 1 Emerging trends identified by the expert group of statistics education researchers

As indicated in Fig. 1, the data science trend is characterized by continued advances in software, the connection to computational thinking, and a renewed emphasis on modeling, in particular predictive modeling. The visibilizing statistical concepts trend reveals a strong research focus on young learners, the power of interactive visualizations, and continued attention to inference. The social statistics trend highlights the importance of attending to risk, critical literacy, and communication, whether in the media or for awareness and advocacy. Trends in the contexts for statistical learning include new methods of data collection and analysis, different forms of data, new representations, and curricular changes. The discussion below describes the four emerging trends in more detail and highlights the relevant papers.

3 Emerging trends

3.1 Data science

The increasing capacity of technology to collect massive amounts of data and to organize and manage these data has led to an increasing demand for what is called data science, which was identified as an emerging trend by over half of those we consulted. While most agree that data science is a blend of statistics, mathematics, coding, and context, it is not well defined (e.g., Gould, 2021). Gould argues that data science encompasses all of statistical reasoning and synthesizes major aspects of computational thinking, “an approach to problem solving that attempts to make problems computable” (p. 516) by breaking a problem into smaller steps that can be executed by a machine. Gould (2021) described a data science course offered to secondary students that engaged them with real data and introduced them to coding, arguing that coding can compactly and efficiently communicate statistical models while acknowledging the tension for students between computational thinking and statistical thinking. Supporting the importance of computing, Horton and Hardin (2021) strongly urged the integration of statistics and computing at all levels in their introduction to a special issue of the Journal of Statistics and Data Science Education on Computing in the Statistics and Data Science Curriculum. Further clarifying the relationship between statistics and data science, Erickson et al. (2019) and Gould (2021) described elements of data-specific thinking that characterize what data science adds to statistics. For example, data collection has expanded from random sampling and random assignment to accessing secondary data, data that were primarily collected for another purpose, often unknown.

While researchers continue to use the data investigation process PPDAC (problem, plan, data, analysis, conclusions; Wild and Pfannkuch, 1999), several other organizing structures, closely related to PPDAC, have been suggested for implementing the data science process. Building on the work of practicing data scientists, Lee et al. (2022) identified key practices, processes, and dispositions integral to data investigations resulting in a six-phase framework for implementing data science: frame problem, consider and gather data, process data, explore and visualize data, consider models, and communicate and propose action. Attending more to the pedagogy involved in teaching data science, Burrill and Dick (2022) used a set of design and implementation principles for scaffolding understanding when collecting data by simulations to investigate herd immunity. From the perspective of the increasing collection of data about every aspect of people’s lives, Utts (2021) addresses general principles related to ethics and what educators can do to instill sound ethical behavior in students to avoid adverse impact on people and society. Raising some of the same concerns as Utts, Weiland (2018) argues from the perspective of both the data consumer and producer that students also need to learn about and control their personal data trail.
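
To give a concrete sense of the kind of simulation-based data collection that activities such as Burrill and Dick’s (2022) herd immunity investigation involve, the sketch below simulates outbreaks in a population with varying vaccination coverage. It is not their classroom material; the contact model, parameter values, and function names are illustrative assumptions only.

```python
import random

def outbreak_size(n=1000, coverage=0.8, contacts=10, p_transmit=0.3, seed_cases=5):
    """Simulate one outbreak in a population where a share `coverage` is immune.

    Returns the total number of people ever infected (hypothetical toy model).
    """
    # 0 = susceptible, 1 = immune; a few susceptible people start off infected.
    status = [1 if random.random() < coverage else 0 for _ in range(n)]
    new_cases = []
    for person in range(n):
        if status[person] == 0 and len(new_cases) < seed_cases:
            status[person] = 2          # 2 = infected (ever)
            new_cases.append(person)
    total = len(new_cases)
    while new_cases:
        next_cases = []
        for _case in new_cases:
            for _ in range(contacts):   # each case meets `contacts` random people
                other = random.randrange(n)
                if status[other] == 0 and random.random() < p_transmit:
                    status[other] = 2
                    next_cases.append(other)
        total += len(next_cases)
        new_cases = next_cases
    return total

# Compare typical outbreak sizes across vaccination coverage levels.
for coverage in (0.0, 0.5, 0.8, 0.95):
    sizes = [outbreak_size(coverage=coverage) for _ in range(20)]
    print(f"coverage {coverage:.2f}: mean outbreak size {sum(sizes) / len(sizes):.0f}")
```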

In addition to the introduction of computational thinking with the emergence of data science, technology is essential for the management of large data sets. Wild (2018) discussed how to combine software such as TinkerPlots™ and CODAP (Common Online Data Analysis Platform, a free online tool for data exploration and visualization), which have point-and-click interfaces specifically designed for learning statistics, with software that relies on the user communicating with the device through a language (coding). Given the expanding needs for teaching data science at the secondary level, researchers have adapted packages such as R or Python. For example, students in the Mobilize Introduction to Data Science project (Gould, 2021) analyzed data using RStudio and Shiny apps. From a slightly different perspective but still making use of a simplified approach to coding, Fergusson and Wild (2021) discussed APIs (application programming interfaces), pieces of code from different sources that can be put together like building blocks to enhance applications, as a method for making data from an online database accessible to a broad range of secondary students.
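
The general pattern behind such API-based access, requesting records from a web service and turning the response into a rectangular dataset students can explore, can be sketched in a few lines of Python. The endpoint URL, query parameters, and field names below are hypothetical placeholders, not Fergusson and Wild’s actual tool.

```python
import requests
import pandas as pd

# Hypothetical endpoint: any JSON-returning web API could stand in here.
URL = "https://example.org/api/observations"

response = requests.get(URL, params={"variable": "temperature", "limit": 500})
response.raise_for_status()              # stop early if the request failed

records = response.json()                # a list of dicts, one per observation
data = pd.DataFrame(records)             # rectangular data ready for exploration

print(data.head())                       # first rows for a quick look
print(data.describe())                   # simple numeric summaries
```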

The importance of modeling is supported by Lee et al.’s (2022) framework for the data science process, which identifies modeling as a key component. From the statistical perspective, Pfannkuch et al. (2022) argued that statistical modeling provides a way to bridge data, chance, and context as a holistic approach for developing key statistical ideas. They described the research knowledge base before 2018 with respect to statistical modeling, giving examples of research related to data modeling and to modeling data observations with random generating devices to produce simulated distributions that mimic real data distributions. With the advent of data science, however, algorithmic predictive modeling, in which a procedure such as regression or a decision tree transforms the data and the focus is on whether the model is effective in predicting outcomes (Fleischer et al., 2022), has become prominent. Fleischer et al. (2022) and Podworny et al. (2022) have embarked on the multigrade ProDaBi curriculum project in which secondary students worked with samples from large data sets to design a “training” model: beginning with hands-on experiences using data cards to build decision trees (algorithms for classifying data according to specified conditions), they tested the model with another sample and then used classification rates to evaluate the model. Students transitioned to CODAP and eventually used Jupyter Notebooks (a programming environment related to Python that enables them to write and execute code and that appears menu driven to the user) to apply an algorithm to generate decision trees and to evaluate their results (see https://youtu.be/9ol9HuTlXLw). Their instructional sequence is similar to the principles advocated by Burrill and Dick (2022). From the same perspective, Horton et al. (2022) described a form of supervised learning, or predictive modeling, by providing an overview of four approaches, based on the use of different technologies, to engage students in thinking about classifying email messages as spam or non-spam.
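
The train-test-evaluate cycle described above can be illustrated with a short scikit-learn sketch. This is not the ProDaBi code; the bundled dataset, tree depth, and split proportion are illustrative assumptions standing in for the project’s own data and materials.

```python
from sklearn.datasets import load_breast_cancer      # stand-in data for the illustration
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)

# Hold back part of the data: the model is "trained" on one sample
# and evaluated on another, mirroring the cycle described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)   # a shallow tree stays interpretable
tree.fit(X_train, y_train)

predictions = tree.predict(X_test)
print("classification rate on unseen data:", round(accuracy_score(y_test, predictions), 3))
```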

Several researchers have engaged teachers with algorithmic models. Zieffler et al. (2021) involved secondary teachers in algorithmic modeling activities designed to introduce classification trees (decision trees where the variables are discrete) and highlighted aspects of the statistical modeling process that appeared to be understood by the teachers and aspects that were challenging, in particular evaluating a model. They argued that algorithmic models offer opportunities to expand multivariate thinking, something typically not done in the past when activities involved small, carefully designed data sets. Also working with secondary teachers, Fergusson and Pfannkuch (2018) illustrated how they introduced the teachers to predictive modeling involving simple linear regression models. The teachers used data from an API to develop and train their prediction models, which they then tested on a different set of data.
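
The develop-train-test workflow those teachers followed can be sketched with a simple linear prediction model. The data below are synthetic stand-ins (the teachers worked with real API data), and the variable names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data only: a roughly linear relationship with noise.
x = rng.uniform(0, 10, size=200)
y = 2.5 * x + 4 + rng.normal(0, 2, size=200)

# Train on the first 150 observations, hold back 50 for testing.
x_train, y_train = x[:150], y[:150]
x_test, y_test = x[150:], y[150:]

slope, intercept = np.polyfit(x_train, y_train, deg=1)   # fit y ≈ slope*x + intercept
predicted = slope * x_test + intercept

rmse = np.sqrt(np.mean((y_test - predicted) ** 2))        # prediction error on unseen data
print(f"model: y = {slope:.2f}x + {intercept:.2f}, test RMSE = {rmse:.2f}")
```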

Because data science is such a new field, research has tended to focus on understanding what data science is and on related curricular materials. How students develop an understanding of algorithmic predictive modeling, how to integrate computational thinking into the curriculum effectively, and how to support teachers as they engage their students in data science activities are questions that need answers as we move forward.

3.2 Visibilizing statistical concepts

Researchers continue to show the difficulties students have in understanding statistical ideas and, in particular, they are identifying statistical concepts and argumentation that have been absent in traditional curricula and instruction (Biehler et al., 2018). That is, researchers are making visible previously unknown concepts such as posing a statistical investigative question (e.g., Arnold & Franklin, 2021) or creating dynamic interactive visualizations of concepts to make them visible to learners (e.g., Burrill, 2018), which we call “visibilizing” statistical concepts. For example, statistical inference procedures and the concept of sampling distribution were traditionally introduced in the final year of high school, but technology has made it possible for students to be introduced much earlier to inferential ideas (e.g., van Dijke-Droogers et al., 2020) and to a richer and bigger array of fundamental concepts such as sampling variability, sample distribution, and uncertainty. As Makar and Rubin (2018) demonstrated, statistical inference ideas and concepts can be nurtured and grown across the curriculum toward non-mathematical simulation-based statistical inference methods. Capitalizing on advances in technology, researchers have attended to inference by developing student reasoning and concepts through “visualizations, simulation, and powerful problem contexts” (Makar & Rubin, 2018).
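
As a concrete illustration of the simulation-based inference approaches referred to above, the sketch below resamples a single sample many times so that the sample-to-sample variability of a statistic becomes visible. The data values are invented for illustration, and the method (a basic bootstrap of the mean) is only one of several simulation-based routes the cited studies take.

```python
import random

# Illustrative sample only: 30 measured values from some classroom investigation.
sample = [12.1, 9.8, 11.4, 10.3, 13.0, 9.5, 10.9, 11.7, 12.4, 10.1,
          11.2, 9.9, 12.8, 10.6, 11.0, 10.4, 12.2, 11.5, 9.7, 10.8,
          11.9, 10.2, 12.6, 11.1, 10.7, 11.3, 9.6, 12.0, 10.5, 11.6]

def mean(xs):
    return sum(xs) / len(xs)

# Bootstrap: resample with replacement many times to build a simulated
# sampling distribution of the mean.
boot_means = []
for _ in range(5000):
    resample = [random.choice(sample) for _ in sample]
    boot_means.append(mean(resample))

boot_means.sort()
low, high = boot_means[125], boot_means[4874]   # middle 95% of the simulated means
print(f"sample mean {mean(sample):.2f}, plausible range roughly {low:.2f} to {high:.2f}")
```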

The adoption of interactive visual approaches for learning statistics has led researchers to understand more about students’ reasoning and conceptual processes. For instance, because simulation-based inference made 17–18-year-old students’ conceptions visible, Case and Jacobbe (2018) were able to identify the difficulties students had in coordinating “the population …, the distribution of [a] single sample, and the distribution of statistics collected from multiple samples”. Similarly, Noll et al. (2018), using narrative theory, found that when students constructed models through interactive visual means, they were influenced by the story of the situation (guessing music notes correctly) and hence created and preferred models that were narrative in nature. The use of interactive visual data analysis tools to reveal students’ intuitive reasoning is captured in Dvir and Ben-Zvi’s (2018) analysis of two 12-year-olds’ reasoning from a scatterplot. Their analysis revealed that the students held conjectures about what the data would show, which they compared and assessed against the data they observed. Dvir and Ben-Zvi believe that focusing on students’ acts of comparison will be helpful in enhancing their learning.

Visibilizing statistical concepts and actions in very young learners is a burgeoning area of research (e.g., Leavy et al., 2018), an area that had received little attention since the early 2000s (e.g., Watson, 2006). The use of visual means to assist very young students to engage, reason, and interact with data is exemplified in the work of Leavy and Hourigan (2018), who explored 5–6-year-old students’ inscriptions when collecting data by tracking the appearance of zoo animals in a video. They contended that the inscriptions served as a record of the event and represented the beginnings of abstract thought. Within the predictive modeling arena, Oslington et al. (2018) illustrated how high-ability 6–7-year-olds used self-portraits drawn by children in Kindergarten and Year 3 to develop a rule-based classification model. The students tested their model on larger sets of self-portraits and developed their own illustrations to support the rule-based model. They used graphical representations, reflective statements, and tabular displays to inform their judgements regarding the strengths and weaknesses of their models.

Introducing data modeling or statistical investigations to younger students is emphasized in the work of Lehrer and English (2018), who synthesized diverse research studies investigating the potential of inducting elementary grade children into the statistical practice of modeling variability in the light of uncertainty. Data modeling allows young students to co-construct concepts such as distribution, center, and variation and to reason visually using a combination of pencil-and-paper and data analysis tools. Fielding-Wells (2018) explored this direction as a precursor to more formal work with inference, providing an example of how this worked with 10–11-year-old students engaged in constructing paper airplanes. Software platforms such as TinkerPlots have also allowed young students to create visual models rather than equations, using random generating devices or factors that account for variation in order to produce simulations that mimic real data distributions. Kazak et al. (2018) showed the beginnings of this statistical modeling process in their study involving 11–12-year-old students, who collected and modeled the distances jumped by different sized origami frogs.
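
The kind of “random generating device” model such platforms support can be sketched in a few lines: each simulated observation is a typical value plus chance variation, and students compare the simulated distribution with the real one. The frog-jump setting echoes Kazak et al. (2018), but the numbers and the normal-noise assumption below are illustrative, not their model.

```python
import random

# A toy "data factory": each simulated jump is a typical distance for the
# frog plus chance variation, mimicking the shape of the observed data.
def simulated_jump(typical_cm=40, spread_cm=8):
    return typical_cm + random.gauss(0, spread_cm)

simulated = sorted(simulated_jump() for _ in range(100))

# Compare simple summaries of the simulated jumps with the class's real data
# to judge whether the model captures the centre and spread of the distances.
median = (simulated[49] + simulated[50]) / 2
print("simulated median:", round(median, 1), "cm")
print("simulated range :", round(simulated[0], 1), "to", round(simulated[-1], 1), "cm")
```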

Because technology can remove learning barriers (e.g., constructing graphs by hand, carrying out mathematical procedures), all students can engage in activities that are authentic to the discipline, access statistical ideas from a very young age, and experience reasoning from and about data and modeling (Lehrer & English, 2018). With students using digital tools or other forms of visualization in meaningful contexts, researchers have identified and explicated new statistical conceptual underpinnings as well as students’ intuitive reasoning and conceptions, or productive resources (Findley & Lyford, 2019), on which to build better conceptual infrastructure across the curriculum. Technology-rich visual learning environments are opening new avenues for researchers to explore and better understand students’ reasoning, actions, and conceptual development and to identify new concepts that need to be part of the teaching repertoire, concepts which often require the invention of new language for articulating reasoning and argumentation.

3.3 Social statistics

Statistical literacy, which was elucidated by Gal (2002) from the perspective of societal needs, was recognized but prompted only limited action in researching and implementing curricular changes (e.g., Callingham & Watson, 2017). The ProCivicStat Partners (2018) project, however, recognized both the urgency of introducing Gal’s ideas into the curriculum to strengthen democracies and the fact that the nature of statistical literacy was changing. They argued that the future of the world is at a critical juncture and that statistics educators must take up the challenge of improving student understanding of statistics about important societal concerns such as inequality and injustice. Building on Gal’s work, they claimed that students need the knowledge, skills, and disposition to engage with civic statistics from a wide variety of information sources, where data are often “multivariate, aggregate, dynamic, and communicated through rich text and data visualizations and embedded in a social context” (p. 5). They suggest students need to develop a habit of mind to interrogate data-based information, to know what questions to ask, to communicate a reasoned opinion about whether the evidence provided is convincing, to discuss the implications of the findings, to make decisions in the presence of uncertainty, and to advocate for social action. Their research resulted in the provision of instructional resources and conceptual frameworks for educators to design tasks based on the skills needed for today’s world (Ridgway, 2022).

Aligned with ProCivicStat Partners’ (2018) concerns about improving student understanding of statistics in the media is the lack of attention to risk in the curriculum. Engel and Wilhelm (2021) highlighted this need, saying, “The Covid-19 crisis has impressively raised the general awareness that our social coexistence and political decisions are essentially based on data, the weighing of risks and thus on probability estimates”. Brown et al. (2021) concurred, arguing that students should develop risk-know-how to protect themselves “against misinformation and misperception and encapsulate the concepts that equip people to access and deal [with such] information”. There is a noticeable gap in research in this area at the school level (Bargagliotti et al., 2020). Martignon et al. (2022) delved into the area of risk literacy, focusing mainly on conditional probability, and illustrated how the use of specific iconic and interactive dynamic representations can make the ideas accessible. The understanding of risk, however, is a bigger field than conditional probability and relative risk, and work needs to be done on mapping out important ideas and a developmental pathway for enhancing students’ notions of risk. A good place to start could be the risk-know-how framework of Brown et al. (2021), which identifies eight focal points for making sense of risk and includes the importance of knowing what is being discussed with respect to concepts such as numerator, denominator, or population and ideas such as recurrence intervals.
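
One way such representations make conditional probability concrete is to recast probabilities as natural frequencies, counts of people in an imagined population, so that a question like “of those flagged by a screening test, how many actually have the condition?” becomes simple arithmetic. The sketch below is a generic illustration with invented numbers, not an example taken from Martignon et al. (2022).

```python
# Illustrative screening scenario (all numbers hypothetical):
# 1% of 10,000 people have a condition; the test detects 90% of true cases
# but also flags 9% of healthy people.
population = 10_000
with_condition = int(population * 0.01)           # 100 people
without_condition = population - with_condition   # 9,900 people

true_positives = int(with_condition * 0.90)       # 90 flagged correctly
false_positives = int(without_condition * 0.09)   # 891 flagged incorrectly

# Natural-frequency reading: of everyone flagged, how many really have the condition?
flagged = true_positives + false_positives
print(f"{true_positives} of {flagged} flagged people have the condition "
      f"≈ {true_positives / flagged:.0%}")
```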

Given that citizens form opinions based on information from diverse communication channels, Gal and Geiger (2022) updated Gal’s (2002) work and researched current statistical demands in the media. A main finding was that the primary means of communicating statistical information was text-based, that is, written and spoken text rather than graphs, and that students need to identify and comprehend statistical information embedded in text, which is often conveyed implicitly and in everyday language. Regarding everyday language, ProCivicStat Partners (2018) stated that learners should be able “to deconstruct the diverse rhetorical and argumentative styles they will encounter when reading and interpreting statistical messages”. Urgent attention, however, needs to be given to the increasingly worrying societal issue of disinformation. Souza and Araújo (2022) suggest educators need to broaden their perspective on how to interrogate data in an era of fake news and consider how to alert students to methods of communication that merely appear evidence based. They deconstructed the speeches of a journalist and the techniques he used to continually spread statistics-based disinformation about the pandemic and to shape public opinion, including the specific language and tone of voice used to engage people emotionally and persuade them to adopt his beliefs about the pandemic.

The necessity for statistics education to address social justice issues is raised in the work of several researchers. Weiland (2017) conceived of the data consumer and data producer as “reading the world” and “writing the world”, which he classified together as statistical literacy. He argued that critical literacy, such as “interrogating social structures in the world” and “actively influencing and shaping” (p. 41) those structures, should be combined with statistical literacy to form a new conception of statistics education as critical statistical literacy. Other researchers suggest that the purpose of students learning about statistical inquiry should be broadened to include the social context and how to advocate and argue from findings. From a socio-cultural perspective, Zapata-Cardona (2018) argued that students should conceive of themselves as being in an inextricable relationship with the world outside and their culture. In this view students are immersed in tasks where the data context allows them to “develop awareness of their surroundings and participation in the world” and so become critical citizens. To illustrate, Zapata-Cardona described a lesson with 12–13-year-old students who investigated a question about the nutritional value of the food they brought to school for lunch. In the analysis she contended there were nascent signs that students were not only developing statistical knowledge but were also becoming aware of their world by personalizing their findings with comments such as “we are overconsuming calories”; that is, students were learning to become critical citizens.

Becoming aware of the world in which they live and adopting an evidence-based critical stance is taken further in Souza et al.’s (2020) research. Based on the theory of creative insubordination, they urge educators to realize the potential of statistics education to provide students with the tools to become political activists. Souza et al. demonstrated how two students investigated an issue from their environment, namely school canteen food. The teacher supported the students in repositioning their concerns about not liking the food into a social concern about food wastage. The students used the findings from their investigation to advocate for change. Souza et al. contend that statistics educators need to assist students to develop the skills “to challenge the status quo, address inequity, and promote change”. Rubel et al.’s (2017) research extends the idea of students becoming aware of their surroundings by considering the geo-mapping of data, visual displays that have traditionally not been part of statistics curricula but which can graphically show spatial patterns on maps. By geo-plotting their photographs of shops in their community, the students realized that, compared with other neighborhoods, their community had different patterns, for example, more pawn shops and fewer bank facilities. Considering spatial data provided opportunities to open students’ eyes to inequities in society and to groups who have power.

Such research reinforces the urgent need for educators to consider their role in assisting students to become knowledgeable citizens and advocates by learning how to investigate data-based situations as a producer and consumer of data. Multivariate thinking, nurturing the disposition to question and interrogate data-based information, and recognizing the need for data to judge a situation are some aspects where a research focus is needed. Moreover, there is an urgent need for research into how people interpret and understand statistical information embedded in text, both written and spoken. For students to navigate the world of data in which they live, curriculum developers and researchers need to pay attention to critical statistical literacy from a social statistics perspective.

3.4 New contexts for learning

The continuing evolution of technology offers new contexts for studying how students learn, for developing understanding, and for engaging students in statistical investigations. As noted by many of those we consulted, data today are multivariate and may consist of a variety of formats: images, text, sounds, dates, coded symbols, and locations. For example, Lee and Wilkerson (2018) discuss new forms of data accessible through emerging technologies. They specified four forms of data: data that are (1) collected through automated means, (2) algorithmically-generated, (3) non-quantitative, and (4) curated and publicly-available. These include data coming from the use of wearable sensor technologies, computational log data, such as clicks on websites or keystrokes on a personal computer, and remote laboratories. For each form, they discussed the implications for classroom practice, teacher preparation, and educational research. As another example, the Oslington et al. (2018) study described above used student art as data.

Several researchers recognize the potential of technology to create new forms of visualization. Andre and Lavicza (2019) identified opportunities to leverage technological innovations, offering examples of how to integrate data visualization and Open Data from primary through secondary school. Engel et al. (2020) illustrated how technology today provides tools for data visualization that allow the user to explore data without requiring deep mathematical knowledge and how interactive data visualizations can be used to promote conceptual understanding far beyond the graphical representations typically part of the school curriculum. They described data visualization as a rapidly evolving mix of science and art that opens up new avenues for communicating both in video and print, particularly with respect to data about the social and economic well-being of humans and the realization of civil rights. From a very different perspective, Rubel et al. (2021) looked at data representations from an equity viewpoint. They used examples related to the pandemic, defined dimensions of critically reading data, and described how “data visualization’s interrelated formatting, framing, and narrating might resonate with privileged perspectives, uphold certain hierarchies of power, perpetuate particular values over others, and guide towards certain decisions” (p. 265).

To improve and understand the conceptual infrastructure needed across curricula, researchers themselves are beginning to adopt technology-based analytical techniques for qualitative data to probe not only students’ reasoning but also their actions when using data analysis tools. Frischemeier and Biehler (2018) used computer-supported data analysis and stringent coding to determine pre-service teachers’ reasoning and actions when they were comparing groups. Gould et al. (2017) described the coding of videos of teachers’ reasoning and actions as they progressed through a statistical investigation centered around participatory sensing data in an interactive dashboard data analysis environment. Using discrete Markov chains, the researchers graphically depicted the strategic decisions the teachers made, which revealed the “importance of frequent questioning and crafting productive statistical questions” (p. 305). By delving into research data in a unique way, Gould et al. reinforced the notion that developing students’ disposition to ask questions during statistical investigations is important, which Arnold and Franklin (2021) illustrate in their work. Other promising methods of uncovering previously inaccessible student thinking and reasoning are the use of eye-tracking technology to collect data and machine learning algorithms for data analysis. By eye-tracking 16-year-old students’ strategies when interpreting histograms, Boels et al. (2019) found that students used a case-value plot interpretation, which is well known in research, and a computational strategy, which is rarely reported.
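
The core move in a discrete Markov chain analysis of coded behaviour, estimating how often one coded action follows another, can be sketched briefly. The action codes and sequences below are hypothetical; this is not Gould et al.’s (2017) coding scheme or data.

```python
from collections import defaultdict

# Hypothetical coded sequences of actions observed during investigations,
# e.g. Q = poses question, P = manipulates plot, I = interprets, C = concludes.
sequences = [
    ["Q", "P", "I", "Q", "P", "I", "C"],
    ["P", "P", "I", "Q", "P", "I", "I", "C"],
    ["Q", "P", "P", "I", "C"],
]

# Count observed transitions between consecutive actions.
counts = defaultdict(lambda: defaultdict(int))
for seq in sequences:
    for current, nxt in zip(seq, seq[1:]):
        counts[current][nxt] += 1

# Convert counts to estimated transition probabilities.
for state, nexts in sorted(counts.items()):
    total = sum(nexts.values())
    probs = {s: round(c / total, 2) for s, c in sorted(nexts.items())}
    print(state, "->", probs)
```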

Pfannkuch (2018) argued for reimagining the statistics curriculum, and other researchers agree. The importance of preparing students to deal with real data to address real questions has led educators such as Gould (2021) to claim there is a need to rethink secondary-level statistics education. Erickson et al. (2019) argued that specific data moves such as filtering, merging two data sets, or making hierarchy should be introduced into the K-12 curriculum to prepare students for working with large data sets. At the middle school level, a study by Wilkerson and Laina (2018) raised issues about what should be in the curriculum when students repurposed a large publicly available data set to investigate questions about their local communities, reasoning about data and context through storytelling and the explicit construction of hand-drawn or digital inscriptions. The repurposing raised issues related to data collection, selection, and manipulation, such as recalculation, summarization, and data merging and purging, but provided opportunities to connect to students’ lives and to bring a new perspective to the role of statistics in classrooms. Burrill (2020) makes suggestions related to both mathematical and statistical content for integrating data/statistical literacy into the school curriculum.
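
A few of the data moves named above, filtering, merging, and grouping with summarizing, can be shown with pandas on a toy example. The tables, column names, and values are hypothetical; the point is only what each move does to the data.

```python
import pandas as pd

# Hypothetical class data: one table of students, one of their recorded screen time.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "grade": [7, 7, 8, 8],
})
screen_time = pd.DataFrame({
    "student_id": [1, 1, 2, 3, 3, 4],
    "minutes": [45, 60, 30, 90, 75, 120],
})

# Merging: join the two data sets on a shared identifier.
merged = screen_time.merge(students, on="student_id")

# Filtering: keep only grade 8 students.
grade8 = merged[merged["grade"] == 8]

# Grouping and summarizing: mean minutes per grade (a simple hierarchy over the cases).
summary = merged.groupby("grade")["minutes"].mean()
print(summary)
```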

The emphasis since 2018 on multivariate data and on computational thinking has opened new horizons for the teaching and learning of statistics, which has consequences for what should be taught. It also raises questions about how students learn in these new contexts and how researchers can utilize new contexts and tools to better understand the ways in which students learn.

4 Conclusion

Statistics education research began to flourish in the 1990s alongside the recognition that statistics was an independent intellectual discipline with its own unique ways of thinking and arguing. The changes wrought by technology in the last few years, however, have been so profound that school curricula and thus research need a major paradigm shift to educate future generations (Andre & Lavicza, 2019). Furthermore, statistics education needs to recognize its unique position in school curricula for developing students’ awareness of exploitation, injustices, and inequities in society. Technological advances have thrust data science to the forefront, a major trend that will irrevocably change the statistics education landscape. At the same time the pandemic has provided the impetus to bring social statistics into prominence as a major trend to address. New learning contexts and visibilizing statistical concepts are two other major trends underpinned by the increasing prevalence of a wide array of tools to integrate into the learning environment, which open up many avenues to explore. Across the trends we emphasized that curricula need to enculturate students into multivariate and computational thinking, to deal with new forms of data, and to develop their disposition to ask questions.

We acknowledge that the trends are based on the perspectives of the experts we approached, which are reflected in our final choice of articles. We recognize that every country is at a different stage in the development of its school statistics curricula, in its access to technology, and in the research that it is possible to conduct. However, as many of the articles show, it is still possible to introduce data science ideas using unplugged activities with data cards (Podworny et al., 2022) and social justice issues in a pencil-and-paper environment (Zapata-Cardona, 2018).

Research is needed to inform curricula over the next five years based on the four trends we have identified, trends that challenge what should be taught in statistics, how it should be taught, and the role of statistics in school curricula. We hope that the trends we have identified will provide a catalyst for rethinking school statistics and for identifying research opportunities. Some avenues for future research, including questions that could be considered, are:

  • More in-depth studies on introducing a coherent pathway of data science ideas across the curriculum. As data science becomes integrated into curricula, what new tools and techniques will enable learners to use secondary data efficiently and purposefully, and make sense of and reason from such data? Given the increasing amount of content necessary to prepare students for a future in which data and statistical reasoning are essential, what are feasible strategies for incorporating these ideas into the curriculum? Should statistics curricula go beyond the boundary of a single subject and if so, how might this affect learning opportunities? As new methods for data collection emerge, what ethical principles do students need to activate and use?

  • More insight into fostering student statistical argumentation, advocacy, and risk-know-how. What environments encourage learning how to make claims in a data-rich world, and how to critically evaluate written and spoken data-based evidence in diverse media and risk arguments? What is the impact of engaging students in social statistics? In the long term, how do student views change about the world in which they live? What is the developmental path for understanding risk? How do we include risk in the curriculum, so all students have sufficient knowledge to understand how to make informed decisions? How do we build students’ intuitive conceptions into productive conceptions for understanding risk?

  • Visibilizing how students interact with and interpret data and conceptual visualizations. How do learners interpret and interact with non-traditional interactive data visualizations? In technology-rich visual learning environments, what reasoning, actions, and conceptual development are occurring, and what new concepts need attention? What new language for articulating reasoning and argumentation might be needed? When young learners are introduced to core statistical concepts, how are these concepts cultivated across the grades and connected and deepened in students’ reasoning processes? How do we support the development of software such as TinkerPlots, make it widely available, and shape research into the learning opportunities provided by such software?

  • Impacts of new learning contexts and the development of learning content and pedagogical pathways. How do learners interact with new forms of data? As new learning contexts emerge, in what ways do statistical thinking, reasoning, and critical statistical literacy develop? What professional development preparation in terms of content, pedagogy, and technology do teachers need to engage with and teach new learning contexts? How can teachers from a wide range of countries and backgrounds be upskilled quickly?

As Pfannkuch (2018) stated, curricula are shaped and challenged by technology changes, new societal perspectives, research findings on student reasoning, and the constant evolution of statistical knowledge and practice. The advances in technology since 2017 are profound and challenging. We anticipate these advances will continue at an increasing rate and will require a concerted and rapid effort by researchers and curriculum developers to stay abreast of current and future developments that will be relevant to students’ lives.