Introduction

In recent years, the unprecedented convergence of substantial computing power, vast volumes of data, and enhanced machine learning algorithms has led to notable advancements in Artificial Intelligence (AI) technologies. AI is a field of study within computer science. It involves the development of computers and machines that can take on capabilities usually associated with human intelligence, such as learning, reasoning, adapting, and self-correcting (Dobrev, 2012; Kok et al., 2009; Simmons & Chappell, 1988). Given AI’s pervasive presence in contemporary society, it has become increasingly imperative to integrate AI education into curricula, with the aim of equipping individuals with the competencies needed to navigate and engage with an AI-driven world (Eaton et al., 2018; Pedro et al., 2019). Therefore, the inclusion of AI teaching and learning from the earliest stages of education is paramount (Heintz, 2021). The incorporation of AI in education has garnered considerable attention, particularly in its application across diverse academic disciplines, including language subjects (Pokrivcakova, 2019), mathematics (Gadanidis, 2017), biology (Perrakis & Sixma, 2021), physics (Cheah, 2021), and beyond. AI teaching assistants (Kim et al., 2020) and learning agents (Petersen et al., 2021) have been utilized in online education, while ChatGPT has demonstrated the capability to offer personalized and interactive learning experiences (Baidoo-Anu & Owusu Ansah, 2023). Additionally, machine learning (ML) is a significant field of study in this context (Alam, 2022). These elements enable tailored guidance, interactive experiences, and the use of AI technologies to enhance the learning process for individuals.
While some scholars have undertaken studies examining the integration of AI in primary and secondary education (Akgun & Greenhow, 2021; Amo et al., 2020; Xu & Ouyang, 2022), a comprehensive and systematic review of the application of AI technology in science subjects has yet to be formulated. Thus, the need persists for a rigorous examination and analysis of the current trends and practices employed in the teaching and learning of AI within science education at the primary and secondary levels across global educational contexts.

To address this critical gap in the literature, this paper proposes an extensive investigation aimed at comprehensively examining and analyzing the development and implementation of AI in science education at primary and secondary levels. By examining a wide range of case studies, scholarly articles, and relevant educational initiatives, this research endeavor seeks to offer robust conclusions and evidence-based recommendations to inform and guide future educational initiatives within this domain. Ultimately, the findings of this study will contribute to the advancement and refinement of AI educational practices, ensuring the effective integration of AI technology in science education, and facilitating the preparation of future generations for a society increasingly shaped by AI-driven innovations.

Literature Review

Artificial Intelligence and AIED

AI, as a scientific-technological domain, has emerged relatively recently, spanning only a few decades. Its name was coined in 1956, and since then, it has evolved through collaborative efforts from diverse disciplines such as computer science, mathematics, philosophy, neuroscience, and psychology, which have converged to advance its interdisciplinary focus (Kok et al., 2009; Simmons & Chappell, 1988; Wang, 2019). The overarching goal of AI is to comprehend, model, and replicate human intelligence and cognitive processes through the development of artificial systems. Presently, AI encompasses a wide range of subfields, including machine learning, perception, natural language processing, knowledge representation and reasoning, and computer vision, among others (Sabharwal & Selman, 2011).

The interdisciplinary field of Artificial Intelligence in Education (AIEd) focuses on the application of AI to enhance instructional and learning processes, with the ultimate goal of transforming and promoting educational system advancement (Holmes et al., 2023). Traditionally, the instruction of AI knowledge has been limited to university-level courses targeting students with backgrounds in computer science and information and communication technology (ICT). However, in recent years, AI education has expanded to include diverse study backgrounds within university programs (Kong et al., 2021), and even at the primary and secondary education levels (Kandlhofer & Steinbauer, 2021; Tedre et al., 2021). Experts emphasize the importance of developing an AI education curriculum at the early stages of education to equip future generations with a comprehensive understanding of the technology that will permeate their daily lives (Su & Zhong, 2022; Touretzky et al., 2019). Such an initiative would enable children and teenagers to grasp the fundamental concepts of AI, including its potential, limitations, and societal and economic impacts (Li et al., 2022; Su et al., 2023). Moreover, AI education in primary and secondary science settings can also cultivate interest in the field, inspire students to pursue further studies in AI, and potentially nurture future creators and developers of AI technology (Ali et al., 2019; Heintz, 2021). The integration of AI technologies in education aims to revolutionize instructional and learning design, processes, and assessment methodologies.

Research on the application of AI in different disciplines demonstrates its potential to transform and enhance educational experiences. From STEM education (Jang et al., 2022) to language learning (Pokrivcakova, 2019), mathematics education (Gadanidis, 2017), computer science (Di Eugenio et al., 2021), history (Bertram et al., 2021), arts (Kong, 2020), and music education (Zulić, 2019), AI holds promise in providing personalized, interactive, and innovative learning opportunities. Despite its increasing popularity, the integration of AI into primary and secondary science education encompasses a broad range of implementation approaches, and the number of empirical studies remains relatively limited. This makes it difficult to draw definitive conclusions about the effectiveness and best practices of AI implementation in science education.

Existing Review Studies of AIED

In recent years, the field of AIEd has gained significant attention, as evidenced by several notable studies (Chen et al., 2020; Holmes et al., 2023; Hwang et al., 2020; Ouyang et al., 2022). However, existing literature reviews in the AIEd domain have focused mainly on trends, applications, and the effects of AIEd, primarily from a technological standpoint (Chen et al., 2020; Tang et al., 2023; Zawacki-Richter et al., 2019). Furthermore, these reviews have centered on particular educational levels, fields, and contexts, such as higher education (Zawacki-Richter et al., 2019), e-learning (Tang et al., 2023), special education (Drigas & Ioannidou, 2013), STEM education (Xu & Ouyang, 2022), and language education (Liang et al., 2021). There remains a significant gap in the literature regarding the investigation of AI’s application in general science education contexts, specifically at the early stages of education, and the integration of AI technologies with detailed learning outcomes and overall effects. It is crucial to address this gap and conduct a comprehensive literature review to systematically evaluate multiple facets of AI and its overall impact on science education, given the complex nature of AI applications and technologies.

To fill this research gap, our study aims to extend previous literature reviews by focusing on early-stage science education and shedding light on the current global strategies employed to integrate AI into educational settings. This study seeks to provide valuable insights into the pedagogical approaches, learning outcomes, and overall effectiveness of AI integration in science education.

Theoretical Framework

General System Theory (GST) is a comprehensive theoretical framework that postulates the existence of diverse organ systems with dynamically interacting elements and mutually interdependent relationships (Rapoport, 1986; Von Bertalanffy, 1950). At the core of GST is the fundamental principle that a system transcends the mere summation of its constituent elements, encompassing emergent properties and interactions (Drack & Pouvreau, 2015; Von Bertalanffy, 1968). To comprehend the intricate nature and universal principles governing systems, GST emphasizes the adoption of a holistic approach, encompassing internal elements, their functional relationships, and external influences (Crawford, 1974). This theoretical framework has found extensive application across various domains, including the physical, biological, social, and educational realms, enabling the exploration of diverse systems (Drack & Pouvreau, 2015; Kitto, 2014). As an illustration, Chen and Stroup (1993) advocated for the integration of GST as the foundational theoretical framework in science education reform, fostering the integration of subjects like physics, biology, and chemistry and discouraging their compartmentalization. Building upon this perspective, we argue that GST provides a novel holistic lens to comprehend the integration of artificial intelligence (AI) technologies in science education. Within the GST framework, an educational system can be conceptualized as an organic entity composed of five fundamental components: subject, information, medium, environment, and technology (Von Bertalanffy, 1968).

In the context of the General System Theory (GST), the components of an educational system can be understood as follows. Firstly, the subject component pertains to the individuals within the educational system, including instructors and students, who engage in constant and adaptive interactions. Secondly, information encompasses the knowledge that is shared and constructed among the subjects, such as learning content, course materials, and knowledge artifacts. Thirdly, the medium component refers to the means or channels through which information is conveyed and subjects are connected within the system. Fourthly, the environment represents the underlying context that influences the functioning of the entire educational system. Lastly, technology, including AI techniques, is an external element that often impacts the operations and functions of the educational system.

Research Purposes and Questions

Given the need for comprehensive trend reviews of the literature on AI in science education (AISE) at the early stages of education, focusing on the development of teaching tools, pedagogical strategies, and associated learning outcomes, this study explores the trends in AISE studies published between 2013 and 2023. The primary objectives are to identify potential development trends and uncover distinctive characteristics within this domain. With the increasing diversity of topics and technologies in research that combines science education with AI, quantitative and qualitative analyses become crucial for gaining a deeper understanding of the field. The study’s methodology follows Galvan and Galvan’s (2017) three main steps for producing a literature review, namely searching, scanning, and writing. The study aims to address the following research questions (RQs) to shed light on the current state of AI in science education:

  1) What were the research trends on AI in science education in terms of yearly distribution and citations?
  2) What were the evolving trends of research developmental paths and prominent topics in AISE over time?
  3) What were the most prominent keywords and their corresponding themes in AISE research?
  4) What were the productive journals that contribute to AISE research?
  5) What were the prolific institutes, countries/regions in AISE research?
  6) What were the overall effects in terms of learning outcomes of AISE in the early stage?

These research questions were formulated with reference to previous, similar studies that aimed to understand the research profile in this area. A critical bibliometric review was conducted to rigorously analyze, evaluate, and synthesize studies pertaining to the above-mentioned research questions.

Methodology

Database

A rigorous critical review was conducted to analyze, evaluate, and synthesize studies pertaining to the aforementioned research questions. To ensure the quality and reliability of the literature sources, the Web of Science (WoS) database was chosen as the primary data source. WoS is widely recognized for its comprehensive coverage and scientific evaluation of specific research fields. Additionally, the Scopus database was utilized to complement the search process. Scopus provides an integrated search facility and encompasses primary bibliographic sources from reputable publishers such as Elsevier, Springer, ACM, and IEEE, among others. It offers comprehensive coverage of journals and top-ranked conferences within various fields of interest. Given the relatively limited body of literature on the subject, the search was not restricted to specific journals or regular conference proceedings. The search strategy involved examining titles, keywords, and abstracts of relevant papers published between 2013 and 2023.

Search Strategies for Articles

The search strategies were formulated according to the specific requirements of the bibliographic databases. In line with the research questions, four types of keywords were used as the search terms. First, keywords related to AISE and specific AI applications were added (i.e., “artificial intelligence” OR “AI” OR “AIED” OR “machine learning” OR “intelligent tutoring system” OR “expert system” OR “recommended system” OR “recommendation system” OR “feedback system” OR “personalized learning” OR “adaptive learning” OR “prediction system” OR “student model” OR “learner model” OR “data mining” OR “learning analytics” OR “prediction model” OR “automated evaluation” OR “automated assessment” OR “robot” OR “virtual agent” OR “algorithm” OR “machine intelligence” OR “intelligent support” OR “intelligent system” OR “deep learning” OR “AI education”). Second, keywords related to science were added (i.e., “science”). Third, keywords related to education were added (i.e., “education” OR “learning” OR “course” OR “class” OR “teaching”). Fourth, keywords related to education level were added (i.e., (“secondary school” OR “middle school” OR “primary school” OR “elementary school”) NOT (AB=(“higher education” OR “university” OR “college” OR “undergraduate” OR “kindergarten” OR “early childhood”)) NOT (AB=(“literature review” OR “systematic review”))).
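The four keyword groups described above can be assembled into a single Boolean search string programmatically, which helps avoid unbalanced parentheses or unclosed quotation marks. The following is a minimal sketch only: the helper names (`or_group`, `build_query`, `is_balanced`) are hypothetical, the use of AND to join the four groups is an assumption (the text does not state the connector), and the exact field tags accepted by each database may differ.

```python
# Keyword groups copied from the search strategy described in the text.
AI_TERMS = [
    "artificial intelligence", "AI", "AIED", "machine learning",
    "intelligent tutoring system", "expert system", "recommended system",
    "recommendation system", "feedback system", "personalized learning",
    "adaptive learning", "prediction system", "student model",
    "learner model", "data mining", "learning analytics",
    "prediction model", "automated evaluation", "automated assessment",
    "robot", "virtual agent", "algorithm", "machine intelligence",
    "intelligent support", "intelligent system", "deep learning",
    "AI education",
]
SCIENCE_TERMS = ["science"]
EDUCATION_TERMS = ["education", "learning", "course", "class", "teaching"]
LEVEL_TERMS = ["secondary school", "middle school", "primary school",
               "elementary school"]
EXCLUDED_LEVELS = ["higher education", "university", "college",
                   "undergraduate", "kindergarten", "early childhood"]
EXCLUDED_REVIEWS = ["literature review", "systematic review"]


def or_group(terms):
    """Quote each term and join the terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query():
    """Join the four inclusion groups (AND is an assumption) and append the
    two abstract-level exclusions using the AB= tag shown in the text."""
    included = " AND ".join(
        or_group(group)
        for group in (AI_TERMS, SCIENCE_TERMS, EDUCATION_TERMS, LEVEL_TERMS)
    )
    return (f"{included} NOT AB={or_group(EXCLUDED_LEVELS)} "
            f"NOT AB={or_group(EXCLUDED_REVIEWS)}")


def is_balanced(text):
    """Return True if every opening parenthesis has a matching close."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


query = build_query()
print(query)
```

Generating the string from lists keeps each group easy to audit and makes it straightforward to adapt the same terms to the syntax of a second database.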

Inclusion and Exclusion of the Articles

Following the literature scanning phase, data collection adhered to the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Moher et al., 2009). The PRISMA flow chart, as outlined by Moher et al. (2009), was employed to systematically map the identified articles, as well as those included and excluded from the review, along with the reasons for exclusions (refer to Fig. 1). Adhering strictly to the PRISMA method, which serves as the preferred reporting framework for systematic reviews and meta-analyses (Moher et al., 2015), ensured the systematic collection of target studies for review. To capture the diverse nature of AI and its role in science learning activities, the abstracts of the articles retrieved from the initial search were thoroughly reviewed and filtered against the four inclusion and exclusion criteria shown in Table 1. These criteria were applied to ensure the relevance and suitability of the selected articles for the review. Subsequently, full-text articles were accessed and thoroughly examined to ensure their alignment with the objectives of the review.

Table 1 Inclusion and exclusion criteria

After the initial round of searching, a total of 487 articles from the Web of Science (WoS) and 253 articles from Scopus were retrieved. Following a review of titles and abstracts, 649 records were excluded as they did not fall under the category of Education & Educational Research or were not journal articles. Consequently, the number of articles was reduced to 103 based on the predefined inclusion and exclusion criteria. To further confirm the eligibility of the chosen studies, all of them were screened manually by the researchers, and consensus was reached through team discussions and expert review to ensure inter-rater reliability. The selected papers are marked with an asterisk in the reference list and are listed in Appendixes 1 and 2. The first author examined all of them, and the second author independently reviewed approximately 30% of the articles. The inter-rater agreement between the two authors was 92%. Next, the full text of the articles was thoroughly reviewed by the first author to ensure they met all the inclusion criteria. As a result, a total of 76 articles that met the established criteria were identified for this bibliometric review and subsequent in-depth analysis. Figure 1 illustrates the selection process and the methods employed.
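An inter-rater agreement figure such as the 92% reported above can be computed from the two raters' screening decisions. The text does not say which statistic was used, so the sketch below assumes simple percent agreement and also shows Cohen's kappa, which corrects for chance agreement, for comparison. The decision lists are hypothetical illustrative data, not the authors' actual ratings.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement: probability that both raters pick the same label
    # if each chose independently according to their own label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical decisions for 25 double-screened articles (illustration only).
first_author = ["include"] * 15 + ["exclude"] * 10
second_author = ["include"] * 14 + ["exclude"] * 10 + ["include"]

agreement = percent_agreement(first_author, second_author)
kappa = cohens_kappa(first_author, second_author)
print(f"Percent agreement: {agreement:.0%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

In this made-up example the raters disagree on 2 of 25 articles, so percent agreement is 92%, while kappa is lower because part of that agreement would be expected by chance.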

Fig. 1 PRISMA procedure for the article selection process

Coding Scheme

After completing the screening process, a comprehensive review was conducted for each selected article, during which pertinent information was meticulously extracted. To facilitate the classification of articles and address the research questions effectively, the content analysis method, as advocated by Cohen et al. (2002) and Zupic and Čater (2015), was employed. Consistent with the tenets of General System Theory (GST), a meticulously devised coding scheme was developed, encompassing the constituent elements of the educational system. This coding scheme enabled a systematic examination of articles pertaining to the integration of artificial intelligence (AI) in science education (for detailed information, refer to Appendix 2). The coding scheme comprised several fundamental components: subject (including educational level, sample size, and measured learning outcomes), information (i.e., learning content), medium (i.e., educational medium), environment (i.e., educational context), and technology (i.e., AI techniques/software). For the purpose of this review, the information component focused primarily on science content, whereas the environment component centered on the experimental context. Notably, the medium component was excluded from the coding scheme, as it did not constitute the primary focus of this research endeavor. To ensure the utmost rigor, experienced researchers independently conducted the coding of data presented in the table and diligently flagged any instances of inconsistencies encountered. Subsequently, these discrepancies were meticulously discussed in face-to-face meetings, fostering a collaborative environment wherein consensus was achieved through open deliberations among the researchers.

Codes for the Learning Outcomes of AISE

In accordance with the learning outcomes taxonomy proposed by Xie et al. (2019), the reviewed studies were categorized by learning outcome. The learning outcomes were assigned codes that encompassed seven categories: affection, cognition, skills, behavior, correlations, others, and no experimental results (e.g., review/conceptual articles). The category “affection” further comprised eight sub-categories, namely technology acceptance/learning intention, learning attitudes/expectation of learning engagement, learning motivation, self-efficacy/confidence, interest/satisfaction, cognitive load, learning anxiety, and students’ opinion/learning experiences. The second category, “cognition,” included three sub-categories: learning achievements, high-order thinking/competence, and collaboration/communication. The remaining five categories were self-contained and did not require further subdivision. For this review, the category of no experimental results was omitted, and a customized coding scheme was developed (refer to Appendix 2), wherein the specific meaning of each code was elaborated.
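The taxonomy described above can be represented as a simple mapping from categories to sub-categories, which makes the coding scheme easy to check and reuse when tallying studies. This is a sketch under stated assumptions: the variable and function names are hypothetical, and the short codes actually used in Appendix 2 are not reproduced here because they are not given in the text.

```python
# Seven categories as listed in the text; empty lists mark categories
# that have no sub-categories.
LEARNING_OUTCOMES = {
    "affection": [
        "technology acceptance/learning intention",
        "learning attitudes/expectation of learning engagement",
        "learning motivation",
        "self-efficacy/confidence",
        "interest/satisfaction",
        "cognitive load",
        "learning anxiety",
        "students' opinion/learning experiences",
    ],
    "cognition": [
        "learning achievements",
        "high-order thinking/competence",
        "collaboration/communication",
    ],
    "skills": [],
    "behavior": [],
    "correlations": [],
    "others": [],
    "no experimental results": [],
}


def customized_scheme(taxonomy, omitted=("no experimental results",)):
    """Return a copy of the taxonomy without the omitted categories."""
    return {name: list(subs) for name, subs in taxonomy.items()
            if name not in omitted}


def is_valid_code(scheme, category, sub_category=None):
    """Check that a (category, sub-category) pair exists in the scheme."""
    if category not in scheme:
        return False
    if sub_category is None:
        return True
    return sub_category in scheme[category]


scheme = customized_scheme(LEARNING_OUTCOMES)
print(sorted(scheme))
```

Validating each assigned code against such a structure is one way to catch typos or out-of-scheme labels before frequencies are calculated.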

Codes for the Effects of AISE

The symbols shown in Appendix 1 (i.e., +, =, +=, X, +!) summarize the outcomes of the reviewed studies. The majority of studies yielded positive findings, indicating the effectiveness and usefulness of AI in science education (+). Some studies presented neutral findings (=), while others reported mixed results (+=), signifying a combination of positive and negative outcomes. Additionally, certain studies reported positive findings but did not provide inferential statistics to support their claims (+!). For a detailed representation of these results, please refer to Appendix 1.

Data Analysis

This review study employed a bibliometric analytical approach, which has been widely used to identify trends in AI in Science Education (AISE) studies (Chen et al., 2019; Jia et al., 2022; Martí-Parreño et al., 2016). Bibliometric analysis utilizes mathematical and statistical techniques to quantitatively analyze the bibliographic features of a specific body of literature (Pritchard, 1969). It is regarded as an effective and scientific method to discover and visualize patterns within a collection of knowledge, unveil trends in a research field, and analyze the underlying structures, evolution, and dynamic aspects of research (Trinidad et al., 2021). By employing bibliometric analysis, this study aims to visualize research characteristics and trends in various aspects, including subject domains, keywords, themes, and contributors in terms of countries or regions and institutes (Aktoprak & Hursen, 2022; Zou et al., 2022).

In this study, CiteSpace, a widely used Java application, was employed to visualize and analyze trends and interrelationships within the scientific literature (Chen, 2004). CiteSpace generates co-occurrence knowledge maps that show the connections among knowledge areas, documents, or authors (Small, 1999). Data on yearly publications and citations sourced from Web of Science (WoS) and Scopus were entered into Microsoft Excel, and trend charts were generated to illustrate the distribution of yearly publications and citations over time. The co-occurrence knowledge mapping technique in CiteSpace was used to represent the developmental paths, emerging trends, and underlying keywords in AISE research. By examining the frequency of keywords across studies, co-occurrence knowledge mapping quantitatively assesses the prominence and popularity of specific research topics and domains, shedding light on their centrality within the scholarly discourse (Pei et al., 2021).
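The idea behind keyword co-occurrence mapping can be illustrated with a few lines of code: count how often each keyword appears across articles, and how often each pair of keywords appears in the same article. This is only a conceptual sketch, not a description of how CiteSpace works internally; the function name and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count keyword frequencies and pairwise co-occurrences.

    `records` is a list of keyword lists, one per article. Keywords are
    lower-cased and de-duplicated within each article before counting.
    """
    frequency = Counter()
    pairs = Counter()
    for keywords in records:
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        frequency.update(unique)
        # Each unordered pair within one article counts as one co-occurrence.
        pairs.update(combinations(unique, 2))
    return frequency, pairs


# Hypothetical author keywords for three articles (illustration only).
records = [
    ["robotics", "computational thinking", "primary school"],
    ["Robotics", "programming", "computational thinking"],
    ["intelligent tutoring system", "primary school"],
]

frequency, pairs = keyword_cooccurrence(records)
print(frequency.most_common(3))
print(pairs.most_common(1))
```

In a co-occurrence map, the keyword frequencies would typically determine node size and the pair counts would determine the links between nodes.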

Moreover, the clustering function in CiteSpace was used to identify the main themes of AISE research. Clustering groups high-frequency keywords into distinct thematic clusters, giving a coherent representation of the dimensions of AISE scholarship. The resulting cluster maps provide an overview of the primary themes and enable the identification of interconnections and patterns within and across thematic clusters.

Furthermore, the timeline view function in CiteSpace was used for progression analysis, tracing the evolution of AISE research within specific themes over time. This temporal perspective makes it possible to discern the shifting emphases, emerging subdomains, and evolving research directions that have shaped AISE scholarship over the study period (Rosvall & Bergstrom, 2010; Song & Wang, 2020). Additionally, co-citation analysis, coupled with cluster mapping, was used to examine the citation patterns of influential research at the level of individual researchers, institutes, and countries/regions. Examining citation patterns and the interconnections between different groups gives insight into the key contributors and influential entities driving research within specific topics (Park & Shea, 2020), as well as the collaborative networks and knowledge dissemination dynamics within the broader AISE research community. To analyze the data, the year of publication, journals, prolific authors, institutes, contributing countries/regions, pedagogical strategies, and outcome domains involved in the AISE studies were categorized, and the relevant frequencies were calculated.

Results and Discussions

The results are presented in the order of the research questions (RQs), of which there are six in total. Specifically, RQs 1–5 focus on the bibliometric information and the methodology of the reviewed studies, while RQ 6 highlights the learning outcomes and reports the effectiveness of the reviewed AISE research based on the coding scheme. The detailed coding of the key dimensions in the reviewed AISE empirical studies can be found in the Appendixes.

Bibliometric Analysis of AISE

Yearly Distribution of AISE

Figure 2 provides an overview of the annual publication trends in AISE research from 2013 to 2023, as observed in both examined databases. The findings reveal a fluctuating upward trajectory in AISE research output over time: the number of relevant publications rose and fell in waves. The volume of publications reached its initial peak in 2017, following gradual growth from 2014. A considerable decline in the number of studies is then evident during the 2018–2019 period. The onset of the COVID-19 pandemic in late 2019, which had significant repercussions on research activities and scholarly outputs across disciplines, may have contributed to the later part of this decline, although it cannot account for the drop in 2018. The number of publications then gradually rebounded, reaching its second and third peaks in 2020 and 2022, respectively. The volume of publications for the most recent year (2023) appears lower than in the preceding years because data collection for this review concluded in June 2023; publications released in the latter part of 2023 were not included in the analysis. Based on the observed trends, the volume of relevant literature in the field of AISE is expected to continue to rise.

Fig. 2 Number of publications between 2013 and 2023

Developmental Paths of AISE

To gain a deeper understanding of the identified clusters, CiteSpace employs a process that involves extracting noun phrases from various sources such as titles (T), keyword lists (K), or abstracts (A) of the articles that cite a specific cluster. The resulting clusters are then evaluated based on two important metrics: modularity (Q) and weighted mean silhouette (S) scores. These metrics provide insights into the structural properties and homogeneity of the network. A modularity score (Q) greater than 0.3 indicates a significant clustering structure. The weighted mean silhouette (S) score reflects the homogeneity of a cluster, with higher scores indicating greater consistency among cluster members, assuming the compared clusters have similar sizes. A silhouette score above 0.5 is considered reasonable, while a score above 0.7 indicates highly convincing clusters. In terms of uniqueness and coverage, the LLR (Log-likelihood ratio) algorithm typically yields the best results (Table 2).
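To make the modularity threshold concrete, the sketch below computes Q for a small toy network and applies the cut-offs quoted above. It assumes the standard Newman modularity for an unweighted, undirected network, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside cluster c, d_c is the total degree of its nodes, and m is the total number of edges; whether CiteSpace computes Q in exactly this form is an assumption. The function names, the toy network, and the labels "weak" and "questionable" for values below the thresholds are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an unweighted, undirected edge list.

    `community` maps each node to its cluster label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("The network must contain at least one edge.")
    intra_edges = defaultdict(int)   # L_c: edges with both ends in cluster c
    degree_sum = defaultdict(int)    # d_c: summed degree of nodes in cluster c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra_edges[community[u]] += 1
    return sum(intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def describe_clustering(q, s):
    """Apply the thresholds quoted in the text to Q and the mean silhouette S."""
    structure = "significant" if q > 0.3 else "weak"
    if s > 0.7:
        homogeneity = "highly convincing"
    elif s > 0.5:
        homogeneity = "reasonable"
    else:
        homogeneity = "questionable"
    return structure, homogeneity


# Toy network: two triangles joined by a single bridging edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

q = modularity(edges, community)
print(f"Q = {q:.3f}")
print(describe_clustering(q, 0.8))
```

For this toy network each cluster has 3 internal edges and a total degree of 7 out of m = 7 edges, so Q = 2 × (3/7 − 0.25) = 5/14, which is above the 0.3 threshold.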

Table 2 Cluster summary of the label

The clusters identified in this study are assigned numerical labels in descending order of cluster size. As Fig. 3 shows, the largest cluster, labeled #0, is related to collaborative learning. The second-largest cluster, labeled #1, pertains to robotics. Other notable clusters include #2 computer science education, #3 intelligent tutoring system, #4 primary school, #5 vocabulary, #6 regression model, #7 theory of planned behavior, #9 improving classroom teaching, #10 explanation, and #12 explanation. These clusters provide a categorization of the identified research topics within the AISE domain and offer valuable insights into the main areas of focus within this field.

Fig. 3 The keyword clustering map of AISE between 2013 and 2023

According to Mazov et al. (2020), a research trend refers to the collective action of a group of researchers who begin to focus their attention on a specific scientific topic. It typically arises when the interests and needs of the research community align with the current scientific advancements. The visualization of timelines and time zones provides a representation of the progression of keywords over time, offering insights into the development of research themes (Chen, 2016). In Figs. 4 and 5, the time zone visualizations demonstrate the significant evolution of Artificial Intelligence in Science Education (AISE) integrated into technology-supported learning research from 2013 to 2023.

Based on the analysis of Figs. 4 and 5, the development process of this research can be categorized into three stages.

Fig. 4 The timeline map of AISE between 2013 and 2023

Fig. 5 The timezone map of AISE between 2013 and 2023

During the first stage (2013–2014), the field of AISE research experienced an initial surge in attention in 2013. This was followed by a focus on clusters #3 intelligent tutoring system and #6 regression model. Table 2 provides relevant cluster labels, revealing that this time period witnessed an influx of research on the application of intelligent tutoring systems in various educational contexts, particularly in science education. Theoretical studies also emerged, exploring topics such as learning satisfaction and expectations in open learning environments. For instance, researchers developed different intelligent tutoring systems with the goal of enhancing science learning for primary school students through interactive multimedia environments. Several studies, including Dolenc and Aberšek (2015a), Segedy et al. (2013a), and Ward et al. (2013), reported significant learning gains among students who used these systems. Additionally, Avsec et al. (2014) conducted a study using a regression model to examine student satisfaction, focusing on their attitudes toward open learning (OL) usage. The aim was to transform teaching and learning by integrating learning science and emerging technologies.

These studies illustrate the early efforts in leveraging intelligent tutoring systems and regression models to improve science education outcomes, both in terms of student learning gains and satisfaction in open learning environments.

During the second period (2014–2019), the number of studies increased significantly, peaking in 2017. This period also saw a diversification of clustering results, including #1 robotics, #2 computer science education, #7 theory of planned behavior, #10 explanation, and #12 explanation. The timeline graph reveals a shift in research focus towards robot-based and computer-based learning and teaching, accompanied by an exploration of various learning theories and research methods, with an emphasis on programming and computational thinking. Research in these areas has produced a rich body of literature. Numerous studies combined educational robots and programming platforms to implement project-based learning, with a primary focus on developing skills related to science, technology, engineering, arts, and mathematics (STEAM). These studies demonstrated significant improvements in concepts and skills related to computational thinking (CT). For instance, Elizabeth Casey et al. (2018), Julià and Antolí (2016), Nemiro et al. (2017), and Witherspoon et al. (2018) investigated the use of educational robots and programming in project-based learning, highlighting the superior development of CT concepts and skills in students. Other studies, such as Gomoll et al. (2018) and Sisman et al. (2019), focused on enhancing the use of humanoid robots in educational settings and assessing their impact on classroom transformation. These studies emphasized the role of students’ attitudes towards robot-based education in increasing their interest and knowledge in science subjects. Furthermore, Shiomi et al. (2015) explored the effects of social robots on children’s interest in science.

Based on these findings, it can be predicted that the future hot topics in AISE research will continue to be closely related to the specific applications of different types of robots in educational contexts. The integration of robotics holds promise for enhancing student engagement and knowledge acquisition in science education.

During the third period (2019–2023), several clusters emerged, namely #0 collaborative learning, #4 primary school, #5 vocabulary, and #9 improving classroom teaching. Researchers in this period built upon previous studies and developed various techniques and algorithms to support the integration of AI in science learning. There was a particular focus on collaborative learning and problem-solving processes, using multiple data analysis techniques to improve science teaching in the classroom. The research areas during this period primarily revolved around life sciences, engineering sciences, and other related disciplines. Educational robots and intelligent tutoring systems remained prominent, following the technological approaches of previous periods. Notably, however, keywords related to Artificial Intelligence (AI) began to appear in the research only in 2020 and have increased rapidly since. This indicates that AI in science education has recently gained widespread attention from the academic community and is currently undergoing extensive exploration. Nguyen et al. (2023) demonstrated the potential use of AI techniques, specifically data mining, in predicting collaborative learning success by examining regulatory patterns and constructing learning outcomes and learning performance. Although learning analytics has faced various challenges, researchers have recognized the benefits of game-based learning environments that incorporate implicit assessments or build predictive models (Lu et al., 2023). Furthermore, researchers have explored various artificial neural network (ANN) models (Çetinkaya & Baykan, 2020; Zhai et al., 2022), deep neural networks (DNN) (Min et al., 2020), decision trees (Biehler & Fleischer, 2021; Göktepe Körpeoğlu & Göktepe Yıldız, 2023), visual recognition (Wu & Yang, 2022), image classification (Martins et al., 2023), and Bayesian networks (Jiang et al., 2023). These algorithms have provided valuable insights into students’ learning and progress in primary and secondary science education, indicating an emerging trend in the field.

Overall, the third period reflects the ongoing exploration and development of AI techniques in science education, with a focus on collaborative learning, problem-solving, and the application of advanced algorithms to enhance teaching and learning outcomes.

Research Topics

The keyword co-occurrence analysis resulted in a keyword co-occurrence map (Fig. 6) comprising 354 nodes and 1198 co-occurrence links spanning the period between 2013 and 2023. The density of the keyword co-occurrence network was 0.0192, which indicates how interconnected the keywords are. In this map, nodes represent keywords, and the size of each node corresponds to the frequency of the respective keyword (Chen et al., 2012; Chen, 2017). The links between keywords indicate their co-occurrence relationship, and thicker links signify a closer relationship between the associated keywords (Chen et al., 2015; Rawat & Sood, 2021). Fig. 6 shows that certain keywords, such as “students,” “science education,” “robotics,” “artificial intelligence,” and “machine learning,” have higher frequencies and exhibit denser and thicker links among them. These keywords represent areas of higher research interest and influence in the field of AI in science education. Additionally, keywords like “collaborative learning,” “computer-aided instruction,” “computational thinking,” “automated feedback,” “STEM education,” “learning perception,” and “learning achievement” or “satisfaction” also demonstrate significant associations with the core keywords. This indicates that scholars have positioned science education alongside these keywords, which represent focal areas for research. The prominence of keywords related to robotics, artificial intelligence, and machine learning underscores the significance of both hardware technologies and software algorithms in the effective use of AI in education. It suggests that advancements and innovations in these areas are essential to bringing about fundamental changes in education through the integration of AI technologies.
On the other hand, the low density of the co-occurrence network indicates that research on AI in science education is currently scattered and has not yet coalesced around a concentrated research technology or a specific research direction. This suggests that there is room for further consolidation and integration of research efforts in this domain.
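The relationship between keyword frequencies, co-occurrence links, and network density can be illustrated with a minimal Python sketch. It assumes an undirected network without self-loops, for which density is 2E / (N(N − 1)); whether the bibliometric software used in this review applies exactly this definition is an assumption. The keyword lists and the function names `cooccurrence` and `density` are hypothetical and serve only as illustration; only the node and link counts (354 and 1198) come from the text.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count keyword frequencies (node sizes) and pair co-occurrences (link weights)."""
    node_freq = Counter()
    link_weight = Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # ignore repeated keywords within one paper
        node_freq.update(unique)
        link_weight.update(combinations(unique, 2))  # each unordered pair once per paper
    return node_freq, link_weight


def density(n_nodes, n_links):
    """Density of an undirected network without self-loops: 2E / (N(N - 1))."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_links / (n_nodes * (n_nodes - 1))


# Hypothetical keyword lists for three papers (illustrative only, not the review's data).
papers = [
    ["science education", "robotics", "students"],
    ["science education", "machine learning"],
    ["robotics", "students", "computational thinking"],
]
freq, links = cooccurrence(papers)
toy_density = density(len(freq), len(links))

# Node and link counts reported in the text.
reported_density = density(354, 1198)
print(round(reported_density, 4))  # 0.0192
```

Under this definition, the reported node and link counts reproduce the stated density of 0.0192, meaning that only about 2% of all possible keyword pairs are actually linked.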

Overall, the keyword co-occurrence map offers valuable insights into the key themes and relationships in AI in science education research, highlighting prominent research areas and underscoring the potential for further collaboration and development in the field.

Fig. 6 Keyword co-occurrence map of AISE between 2013 and 2023

Productive Journals

Figure 7 provides an overview of the top productive journals in which the selected papers on AI in science education (AISE) research were published, along with the corresponding number of papers published in each journal. The analysis reveals that a significant proportion of the papers in this field were published in several prominent journals.

The International Journal of Social Robotics emerged as the leading journal for AISE research, with a substantial number of papers published. This indicates the growing interest and focus on the intersection of social robotics and science education. ETR&D-Educational Technology Research and Development, International Journal of Technology and Design Education, Education and Information Technologies, Frontiers in Psychology, British Journal of Educational Technology, Journal of Science Education and Technology and IEEE Transactions on Learning Technologies also demonstrate notable contributions to the literature in this field.

The presence of these journals highlights the multidisciplinary nature of AISE research, spanning educational technology, psychology, design education, and learning technologies. These journals provide platforms for researchers to share their findings, insights, and advancements in integrating AI into science education. The prominence of these journals also suggests that they are recognized as reputable outlets for disseminating AISE research. Scholars and practitioners interested in AI in science education can refer to these journals to access a wide range of studies and stay updated on the latest developments and trends in the field.

Overall, the distribution of publications across these top journals signifies the significance of AISE research and the collaborative efforts of researchers from diverse disciplines to advance the understanding and application of AI in science education.

Fig. 7 Top productive journals of AISE between 2013 and 2023

Prolific Institutions and Countries/regions

An analysis of international cooperation in AISE research sheds light on the relationships between different countries and their impact on the field (Rosvall & Bergstrom, 2010; Small, 1999). Figure 8 provides insights into the contribution and cooperation of various countries in the realm of AISE research from 2013 to 2023. The analysis reveals that a total of 29 countries/regions from around the world have actively participated in research related to AISE.

Fig. 8 Prolific countries/regions of AISE between 2013 and 2023

The geographic distribution of the reviewed studies depicted in Fig. 8 highlights the dominant role of the United States, which is the leading contributor with 31 publications, about 40% of the research in this field. This indicates the influential position of the United States in shaping the direction and advancement of AISE research, where it serves as the core for research collaborations. Following the United States, Turkey and the People’s Republic of China (PRC) stand out as significant contributors, with 8 and 5 publications respectively. Canada and Taiwan also demonstrate notable involvement, with 4 publications each. South Korea, Slovenia, Japan, Finland, and Spain have each contributed 3 studies. The analysis suggests that the pivotal status of the United States in AISE research is primarily due to its higher publication frequency and significant research collaborations. The collaborations and partnerships formed by researchers and institutions within the United States have contributed to the growth and development of the field.
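The share attributed to the United States can be checked with a short Python sketch using the publication counts stated above. The denominator is an assumption: the sketch divides by the 76 reviewed studies, and because a single study may involve authors from more than one country/region, the listed counts need not sum to that total.

```python
# Publication counts by country/region as stated in the text.
counts = {
    "United States": 31,
    "Turkey": 8,
    "People's Republic of China": 5,
    "Canada": 4,
    "Taiwan": 4,
    "South Korea": 3,
    "Slovenia": 3,
    "Japan": 3,
    "Finland": 3,
    "Spain": 3,
}

TOTAL_STUDIES = 76  # assumption: shares are taken over the 76 reviewed studies


def share(count, total=TOTAL_STUDIES):
    """Return the percentage of the total represented by count."""
    return 100 * count / total


us_share = share(counts["United States"])

# Rank countries/regions from most to fewest publications and print their shares.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
for name, n in ranked:
    print(f"{name:<28}{n:>3}{share(n):>7.1f}%")
```

Under this assumption, the United States accounts for roughly 40.8% of the reviewed studies, which is consistent with the figure of about 40% given in the text.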

Overall, the analysis of international cooperation in AISE research underscores the global nature of this field and the collective efforts of researchers from various countries. While the USA maintains a central position, the contributions from other countries indicate the widespread interest and collaborative initiatives in exploring the potential of AI in science education. Such international cooperation fosters knowledge exchange, cross-cultural perspectives, and the enrichment of research outcomes in the field of AISE.

Between 2013 and 2023, 78 institutions from across the globe contributed to AISE research (see Fig. 9). Leading institutions such as Carnegie Mellon University, Vanderbilt University, National Taipei University, the University System of Georgia, and the University of the Basque Country have relatively high productivity and co-authorship links. In Fig. 9, the colour of each node indicates chronology. The network consists mainly of regional group nodes and isolated nodes, and its few connections are concentrated in a small number of nodes, indicating poor connectivity. Although several small groups with different backgrounds have formed in this area of research, there has been no large-scale collaboration or interaction between institutions in different regions.

Fig. 9 Prolific institutions of AISE between 2013 and 2023

Top AI Techniques

In Fig. 10, an analysis of 76 studies reveals the popularity and evolution of various AI technologies in the field of AISE. Among these studies, the most widely utilized technology was educational robots, with a count of 39. This indicates the significant emphasis on integrating robots into educational settings to enhance science learning experiences. The educational robot category encompasses diverse types, including social robots (Shiomi et al., 2015; Yueh et al., 2020), programming robots (Gkiolnta et al., 2023; Salas-Pilco, 2020; Witherspoon et al., 2018), human-centered robotics (Bernstein et al., 2022; Gomoll et al., 2018), and chatbots (Deveci Topal et al., 2021).

Fig. 10 Distribution of AI techniques of science education between 2013 and 2023

Machine learning and data mining emerged as the second most prominent technology, with a count of 16. This category encompasses a range of techniques and algorithms, such as Bayesian networks (Dettweiler et al., 2017; Hagger & Hamilton, 2018; Jiang et al., 2023), genetic algorithms (Yin et al., 2016), natural language processing (Aldabe & Maritxalar, 2014), computer simulation functions (Magana et al., 2019), visual recognition (Wu & Yang, 2022), convolutional neural networks (Chen et al., 2017), decision trees (Biehler & Fleischer, 2021; Göktepe Körpeoğlu & Göktepe Yıldız, 2023), and image classification (Martins et al., 2023). The use of machine learning and data mining in AISE research demonstrates the increasing recognition of the potential for analyzing and deriving insights from large datasets to support science education.

Intelligent tutoring systems (ITS) garnered attention in 8 studies. ITS integrate intelligent tutors (Dolenc et al., 2015; Dolenc & Aberšek, 2015a; Ward et al., 2013), visual systems (Polyak et al., 2017), mentor agents (Dede et al., 2017), or conversational agents (Segedy et al., 2013a) to provide adaptive learning and personalized support for primary and secondary students. The focus on ITS highlights the importance of tailoring instructional content and feedback to individual learners, promoting more effective and engaging science learning experiences.

The detection and prediction category, with 7 studies, encompassed affect detection (Almeda & Baker, 2020), performance prediction (Çetinkaya & Baykan, 2020; Lu et al., 2023), activity prediction (Järvelä et al., 2023), and related tasks. These studies aimed to develop algorithms and models to detect and predict learners’ emotional states, performance levels, academic achievements, and grades. This line of research contributes to understanding students’ learning progress and providing targeted interventions and support.

Automation, consisting of automated feedback (Cutumisu et al., 2017; Lee et al., 2021), automated educational assessment (Qian & Lehman, 2018; Saha & Rao CH, 2022; Zhai et al., 2022), and auto-subtitle systems (Malakul & Park, 2023), was explored in 6 studies. This category reflects efforts to automate certain aspects of the learning process, streamlining administrative tasks and providing timely feedback to students, thereby enhancing efficiency and effectiveness in science education.

The evolution of these AI technologies in AISE research shows an overall growing tendency in their usage. Educational robots and ITS were employed at an early stage, demonstrating their established presence in the field. On the other hand, technologies such as deep neural networks (DNN), genetic algorithms, Bayesian networks, clustering algorithms, natural language processing, multiple regression analysis, visual recognition, image classification, and various detection and prediction methods gained attention at later periods, indicating the emergence of new avenues for exploration and innovation in AISE research.
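The category counts reported above can be tallied with a short Python sketch, which checks that the five categories account for all 76 reviewed studies and expresses each as a share of the total. The counts are taken from the text; the dictionary layout and variable names are illustrative only.

```python
# Number of studies per AI technique category, as reported in the text.
technique_counts = {
    "Educational robots": 39,
    "Machine learning and data mining": 16,
    "Intelligent tutoring systems": 8,
    "Detection and prediction": 7,
    "Automation": 6,
}

total = sum(technique_counts.values())

# Percentage share of each category, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in technique_counts.items()}

# Print a simple text table with a bar whose length equals the study count.
for name, n in technique_counts.items():
    print(f"{name:<34}{n:>3}{shares[name]:>7.1f}%  {'#' * n}")
print(f"{'Total':<34}{total:>3}")
```

The counts sum to 76, so each reviewed study falls into exactly one category, and educational robots alone account for just over half of the studies.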

Table 3 provides further insights into the characteristics of the reviewed 76 studies in AISE. It reveals information about the educational levels of learners, sample sizes, and the duration of the studies. Among the reviewed studies, 31 of them specifically mentioned the educational level of learners at the primary level, indicating a focus on elementary school students. Meanwhile, 44 studies focused on the secondary level, targeting high school students. Only one study encompassed both primary and secondary levels, suggesting a limited number of studies that encompass a wider age range. Regarding sample size, the most common category was studies with a large-scale sample size of more than 80 learners, which accounted for 40 studies. This suggests that researchers often aimed to gather data from a substantial number of participants to ensure statistical robustness and generalizability of findings.

Table 3 Distribution of sample size, research length and educational level

In terms of study duration, the majority of studies (36) had a duration of less than two weeks. This indicates a preference for shorter-term interventions or experimental sessions in AISE research. Additionally, 23 studies had a duration ranging from 2 to 10 weeks, suggesting a moderate-length intervention period. On the other hand, 17 studies had a duration of more than 10 weeks, indicating a smaller proportion of studies with longer-term implementations.

Overall, the findings suggest that AISE studies tend to focus on specific educational levels, with a predominant emphasis on secondary level learners. Additionally, researchers commonly opt for larger sample sizes, implying the desire for reliable and representative data. The prevalence of shorter study durations suggests a preference for concise interventions, while studies of longer duration are less common.

Content Analysis of Learning Outcomes and Overall Effects of AISE

Measured Learning Outcomes Analysis in AISE

Affection

As shown in Table 4, a significant number of the reviewed studies examined the educational effects of AI technologies on students’ affective perception, particularly learning attitudes/expectation of learning engagement (N = 15), interest/satisfaction (N = 7), and learning motivation (N = 6). In terms of learning attitudes and expectations, 15 studies explored the impact of AI technologies on students’ attitudes toward learning and their expectations of learning engagement. These studies generally reported positive attitudes and expectations toward the integration of AI technologies in science education. Researchers such as Dettweiler et al. (2017), Hagger and Hamilton (2018), and Witherspoon et al. (2018) provided evidence linking students’ autonomous motivation towards science activities and the positive impact of AI technologies. Similarly, studies conducted by Göktepe Körpeoğlu and Göktepe Yıldız (2023), Sisman et al. (2021), and Üçgül and Altıok (2022) examined the variables influencing students’ attitudes towards science education and highlighted the positive effects of integrating AI technologies.

Furthermore, the application of AI technologies also contributed to students’ interest and satisfaction in science learning. Seven studies specifically investigated the impact of AI technologies on students’ interest and satisfaction, revealing positive outcomes. For instance, Avsec et al. (2014) examined critical factors affecting students’ satisfaction and found that students had significantly positive perceptions of using robotics as a learning-assisted tool. Malakul and Park (2023) concluded that an auto-subtitles system in English educational videos enhanced students’ learning comprehension, reduced cognitive load, and increased satisfaction.

These findings indicate that the integration of AI technologies in science education has the potential to promote positive attitudes, motivation, interest, and satisfaction among students. The studies reviewed provide support for the beneficial effects of AI technologies in enhancing students’ affective perception towards science learning.

Cognition

Most studies have investigated the impact on learners’ cognition with the involvement of AI technologies. In Table 4, the impact on learners’ learning achievement is the most studied (N = 21), followed by high-order thinking/competence (N = 7) and collaboration/communication (N = 5).

The most commonly studied aspect was learners’ learning achievement, with 21 studies specifically exploring its impact. These studies aimed to compare the learning outcomes of students in situations where AI technologies were involved, in order to assess whether the technologies enhanced learners’ achievements. For example, Hoorn et al. (2021) conducted experiments using different designs of social robots and found that these robots directly improved students’ scores.

Another aspect of cognition that was frequently examined was high-order thinking or competence, which was the focus of 7 studies. These studies investigated the impact of AI technologies, particularly in the context of programming education, on students’ ability to engage in higher-order thinking processes. Researchers such as Noh and Lee (2020), Pou et al. (2022), and Witherspoon et al. (2017) evaluated the effectiveness of robotics programming curricula in developing students’ computational thinking (CT) knowledge, which is considered a form of high-order thinking. These studies aimed to bridge the gap in CT research and assess the impact of AI technologies on students’ ability to understand computer science concepts more easily.

In addition, 5 studies examined the impact of AI technologies on learners’ collaboration and communication skills. These studies explored how the integration of AI technologies in educational settings facilitated collaborative learning and improved students’ communication abilities. The specific effects varied across the studies, but the overall findings highlighted the potential of AI technologies to enhance collaboration and communication among learners.

These findings indicate that researchers have primarily focused on investigating the impact of AI technologies on learners’ cognition, particularly in terms of learning achievement, high-order thinking/competence, and collaboration/communication. The studies reviewed provide insights into the potential of AI technologies to enhance these cognitive aspects of learning and highlight the importance of further research in these areas.

Skills

The integration of AI technology in science education has also been found to strengthen students’ skills, including programming skills. Among the reviewed studies, 18 specifically examined the impact of AI technology on learners’ skills, particularly in the context of programming. These studies assessed how the involvement of AI technology, such as robotics, in the learning process influenced students’ programming skills. The curriculum for the robotics development process was often used as a framework for teaching and assessing these skills. By engaging with AI technologies, students had the opportunity to practice and enhance their programming abilities, gaining hands-on experience in designing, building, and programming robots. The findings from these studies highlighted the positive effects of AI technology integration on students’ programming skills. The practical application of programming concepts within the context of robotics provided a meaningful and engaging learning experience, allowing students to develop and refine their programming abilities.

Overall, the studies in this area demonstrate that the incorporation of AI technology, particularly in the form of robotics, can effectively contribute to the development of students’ programming skills in science education. This emphasizes the potential of AI to enhance students’ practical skills and prepare them for the demands of an increasingly technology-driven world.

Others

A smaller number of the reviewed studies focused on validating behavior and correlations of learning outcomes, particularly through the application of interaction data mining techniques, predictive models, and other machine learning approaches. These studies explored the relationships between learners’ behaviors, interactions, and learning outcomes using data mining techniques and machine learning algorithms. By analyzing the patterns and correlations within the collected data, researchers sought to validate the effectiveness of AI technologies in predicting and improving learning outcomes.

For example, researchers may have used data mining techniques to analyze students’ interaction data with AI-based learning systems or educational robots. They may have employed predictive models to forecast students’ learning achievements or to identify behavioral patterns that are associated with successful learning outcomes. The focus of these studies was to provide empirical evidence supporting the efficacy of AI technologies in predicting and influencing learning outcomes. By leveraging machine learning algorithms and data analysis techniques, researchers aimed to validate the connections between learners’ behaviors, interactions, and their ultimate learning achievements.

While the number of studies in this area may be smaller compared to other research topics, their contributions are significant in terms of understanding how AI technologies can effectively analyze and leverage learner data to validate learning outcomes and identify correlations between behaviors and achievements. This validation helps strengthen the evidence base for the use of AI in education and informs the development of more targeted and effective learning interventions.

Table 4 Distribution of measured learning outcomes analysis in AISE

Overall Effects of AISE in the Early Education Stage

This review also summarized the overall effects of AI applications in AISE research. From an educational perspective, as reflected by the symbols, 67 of the 76 reviewed articles reported positive educational effects and findings when applying AI techniques in science education. Of the remaining articles, 6 reported mixed results, 2 concluded with negative findings, and 1 concluded with neutral findings. Figure 11 shows the distribution of these effects.

Fig. 11 Distribution of the effects of AISE between 2013 and 2023

The findings from the six articles reporting mixed results suggest that the impact of AI technologies on learning outcomes can vary across contexts and settings. These studies highlight the complexity of the relationship between AI interventions and learning outcomes, as well as the influence of individual differences and other factors. Shiomi et al. (2015) found that the social robot Robovie did not influence the scientific curiosity of the entire class, although curiosity increased in the individual children who asked Robovie science questions. Similarly, Luo et al. (2020) reported both successes and failures for 2 participants in a quantitative study of the CT-integrated science unit developed for that study. Segedy et al. (2013b) used a virtual agent named Betty to teach science topics. Both the experimental group (with Contextual Conversation feedback) and the control group showed significant pre-to-post test learning gains, but the difference in learning gains between the groups was not statistically significant. Noh and Lee (2020) found that inquiry-based scenarios improved students’ inquiry skills and subject knowledge, whereas study motivation and computational thinking did not improve. Hoorn et al. (2021) found that affective bonding tendencies may occur but did not significantly contribute to learning progress, even though students’ achievements improved. In a study of precision education (PE) using ITS, the quantitative data were analyzed with nonparametric statistics, which did not show significant differences (Liu, 2022). Most of the studies that presented mixed results fell into the educational robots category. A possible explanation is that the schools had used robots in learning previously, so the effect on emotional or cognitive learning outcomes had faded.

The two articles reporting negative results further emphasize the variability in the impact of AI technologies on learning outcomes. Rosi et al. (2016) found that the presence of a humanoid robot did not result in significant learning improvement. Dolenc et al. (2015) found that students using an individualized and adaptive ITS showed poorer reading comprehension than students reading the same text on paper.

Overall, these studies demonstrate that the effects of AI technologies on learning outcomes are nuanced and can be influenced by various factors. The mixed and negative results highlight the importance of considering contextual factors, individual differences, and the specific nature of the AI interventions when examining their impact on learning. Further research is needed to understand the conditions under which AI technologies can effectively enhance learning outcomes and to identify best practices for their implementation in educational settings.

Conclusions and Implications

Although AI in education (AIEd) has attracted wide attention in educational research and practice, few studies have investigated the applications of AI in the science education context. To gain a comprehensive understanding of the integration of AI in science education, this study reviewed the empirical studies of AISE in primary and secondary schools that were published between 2013 and 2023. Grounded upon a bibliometric literature review, we investigated and analyzed the development trends in the teaching and learning of AI in science education worldwide, in order to offer conclusions and recommendations for future educational endeavours.

To answer the first question, we found a gradually increasing trend in AISE research over the past decade. For the second question, the research development paths and prominent topics in AISE showed a clear shift in AI applications over time. Scholars have incorporated popular AI technologies (i.e., educational robots, ITS, data mining, machine learning, and algorithms) into primary and secondary science education to mine, contrast, and examine their different educational effects. Five categories of AI applications were identified, namely educational robots, machine learning/data mining/algorithms, ITS, automation, and detection/prediction, which were frequently applied to science learning content. The most prominent keywords in AISE, such as science education, robotics, artificial intelligence, and machine learning, have higher frequencies and thicker or denser links with keywords such as collaborative learning, computer-aided instruction, computational thinking, automated feedback, STEM education, learning perception, and learning achievement or satisfaction. To some extent, this reflects the link between AI technology and the subject, medium, and environment of science education. Regarding the fourth and fifth questions, the top productive journals included the International Journal of Social Robotics, Educational Technology Research and Development, International Journal of Technology and Design Education, etc. In addition, the USA was the most prolific country, and many of the most prolific institutions were located there. Finally, regarding the last question, in terms of learning outcomes of AISE in the early education stage, a majority of studies revealed the educational effects of AI technologies on students’ affective perception, followed by cognition and skills. In this review, we have highlighted the overall positive educational effects of applying AI technologies in science education.
However, it is crucial to note that the integration of AI in science education, particularly at the primary and secondary levels, remains superficial. This shallow integration extends to educational assessment, where the potential of AI has yet to be fully realized. One of the most glaring gaps is the lack of a comprehensive curriculum system that incorporates AI into the evaluation of science education. This absence suggests that, while AI has shown promise in enhancing educational outcomes, its application is still at a nascent stage. Possible reasons range from a lack of resources and expertise to institutional barriers that hinder the adoption of AI technologies.

This review has two limitations, which also provide valuable insights for future research directions in AISE. Addressing them can help improve the quality and comprehensiveness of future studies. (1) Biases in the searching and screening process: Although we used well-known scholarly databases and relevant keywords, the searching and screening process may still have been biased. Some studies might focus more on the technological aspects of AI than on its application in the educational context. (2) Lack of investigation into the mutual relationships between elements: The review used a GST framework to examine multiple elements related to AISE, but the analysis did not explore the mutual relationships between these elements. Understanding the complex relationships and interactions among different elements, such as technology, pedagogy, and learner outcomes, is crucial for gaining a deeper understanding of the application of AI in science education.

Future Study

In summary, future research should employ a more comprehensive and systematic search process to minimize biases. This could involve multi-disciplinary collaborations to ensure that a wide range of perspectives and methodologies is considered. There is also a need to explore the mutual relationships between different elements of AISE, such as the interplay between curriculum design, teaching methods, and assessment techniques. Understanding these relationships could provide insights into how to optimize the integration of AI in science education. Given the shallow integration of AI at the primary and secondary levels, especially in educational assessment, future studies should focus on identifying the barriers to deeper integration and proposing actionable solutions. Researchers should also consider the ethical and practical implications of integrating AI into science education, including issues related to data privacy, algorithmic bias, and educational equity. By addressing these limitations and focusing on these recommended areas, researchers can contribute significantly to the advancement of AI technologies in primary and secondary school science education. This, in turn, will promote more intelligent and effective teaching and learning practices, better preparing students for the challenges and opportunities of the 21st century.