1 Introduction

New Zealand (NZ) schools base their pedagogy on Constructivist and Sociocultural theory [1], and children regularly conduct information search during their educational pursuits. The NZ Curriculum and NZ educators refer to this as inquiry-based learning, and this is a core value of the curriculum. Thus, in NZ classrooms as well as classrooms around the world, children are conducting regular Internet searches for both educational and personal information needs. Yet, the systems for information search used by children are typically those that have been developed with adults in mind [2]. We are not alone in arguing that children’s Internet search using these systems therefore requires investigation. For example, van der Sluis and van Dijk [3] report that these adult-oriented systems do not suit children’s information seeking needs because they require complicated search procedures and return results that might not be appropriate for a child’s needs. Bilal and Gwizdka [4] also suggest that further research is needed into how and why children (re)formulate search queries.

This article specifically addresses the need to further understand children’s information search and their use of information search tools within a child’s educational pursuits. Though there is a wide variety of technologies being used within today’s classrooms and homes, it is not clear how these technologies are facilitating information seeking for children, nor is it clear if children are able to use these systems effectively. Further research, including that presented here, is required to clarify the way children use information search tools and how tools can be developed to support children’s educational needs. This article significantly extends the previous reporting [5, 6] of a small portion of the results of this study.

The goals for the investigation reported here were (1) to review how children use a search engine that is reported [7] to be regularly used by children in NZ schools, (2) to identify issues and successes of children’s information seeking using the search engine, and (3) to observe the information seeking behaviour relating to the creation of search queries.

We structured our investigation around the following research questions (RQ).

RQ1:

What query types do children use for internet search?

RQ2:

Does children’s search behaviour differ depending on how the task is worded?

RQ3:

How do children use query support features of internet search engines?

RQ4:

How do children explore search results?

We conducted an observational study with school children at primary and intermediate schools in the Waikato region of New Zealand to gain insights into their practice of information searches during educational pursuits. This study took the form of a task-based usability observation with children in their school environments followed by a small exit interview.

This article is structured as follows: Sect. 2 discusses the related work, while Sect. 3 reports on the study setup. Our analysis of these observations and interviews provides evidence of the specific issues encountered by children (Sects. 4.2 and 4.3). We discuss our findings and draw conclusions for the design of search interfaces that are supportive of child-appropriate information-seeking behaviour (Sects. 5 and 6).

2 Related Work

Despite the broad literature on children’s information search and retrieval, there is still much to learn about how children conduct searches using contemporary search interfaces in educational settings. We first attempted to structure the related work around our research questions (see Sect. 1). However, because several studies span more than one of our research questions, we cluster the related work by topic and provide a summary of the insights, structured by research question, at the end of this section.

2.1 Searching and learning

Information is the basis for human learning, and thus, learning occurs while students engage in a search process. Vakkari et al. [8] observed that a student’s level of knowledge about a topic predicts their ability to create successful search queries. Perspectives on searching and learning often follow one of two paths: either focusing on a “learning to search” perspective or on a “searching to learn” perspective. The former is about students’ search skills and closely related to information literacy [9], while the latter deals with students’ research activities in the context of class activities and assignments [10]. While both have their place in the classroom environment, our study here deals with a “searching to learn” context as we are observing children implementing their learned or taught search strategies without intervention from ourselves as educators or researchers.

Ghosh et al. [11] conducted a study into the relationship between searching and learning. They see information seeking as a response to problematic situations. Their study used a framework for learning to explain search behaviour: 31 student participants were asked to perform search tasks over a given time period, to keep records of the consulted resources, and to write a summary at the end of each task. Four learning-oriented tasks were used, with the following principles: remember and understand, apply, analyse, and evaluate. The study found, with statistically significant results, that learning is an important outcome of searching. This type of learning by exploring topics is supported in pedagogy based on Constructivist and Sociocultural theory [12], which is used in NZ classrooms.

Rieh et al. [13] distinguish four ways of conceptualising learning with respect to search activities: (1) learning as context for searching, distinguished from contexts of work and ordinary life, (2) learning as conceptual change, changing or confirming existing knowledge structures, (3) learning as interactive intention, or sub-goal, in searching, and (4) search tasks as elements in the taxonomy of learning: remembering, understanding, applying, analysing, evaluating, and creating. They stress the importance of defining, for a given study, the researchers’ construct of learning. The structure of our study refers back to the concepts of Rieh et al.; for details, see Sect. 3.

2.2 Children vs adult search behaviour

It is well known that children’s search behaviour differs greatly from that of adults [14,15,16,17,18,19]. In the school environment, search tasks may be triggered through assignment tasks [20], and search strategies used in such a context may not express a child’s choice [21]. Molin-Juustila et al. [22] observed that children participating in ICT studies typically bring voices of ‘others’ beyond their own, i.e. behave not necessarily according to their own preferences but those of parents and teachers.

Many of the adult-oriented systems that children use do not suit their information-seeking needs. The reason is that these systems require complex knowledge about search and query formulation, and often provide results that do not answer children’s information needs [3]. The standard response to this problem is the development of specialised child-centred information retrieval (IR) systems, web interfaces, and digital libraries [23,24,25,26,27]. These systems are often research-based prototypes and date visually very quickly and, naturally, do not receive the ongoing support typical of a commercial search engine. We found that dedicated child-centred systems are not used in NZ classrooms, and many are no longer available.

Jochmann-Mannak et al. [28] stressed that one should not underestimate the influence of the Google search engine on children’s search behaviour. From their comparison between children’s software interfaces and Google, they concluded that well-meant, child-friendly designs may not work for children, and suggested instead adding value to the Google interface, e.g. by making the Search Engine Results Page (SERP) simple to scan.

2.3 Children’s use of information retrieval systems

While there has been a recent surge in interest in children’s Internet search [4, 13, 29], most available results refer to earlier studies that used web search engines and digital libraries with less robust search algorithms than available today [28, 30,31,32,33,34,35]. In addition, many prior studies considered transaction logs instead of user observations [36, 37]. These investigations have included larger samples of anonymous log data from naturalistic inquiries of information search logs [37] from which information needs and purposes are difficult to ascertain. Similarly, a number of studies that include qualitative as well as quantitative analysis predominantly investigated search in the home rather than in an educational setting [32].

Among the studies that focus on an educational setting, Eickhoff et al. [15] studied the search behaviour of elementary school children, while Gossen et al. [18] compared children’s and adults’ search behaviour in a laboratory setting. Reuter and Druin [27] observed children using a child-specific digital library to search for and select books to read. Wu and Hung-Chun [26] observed children using novel 2D and 3D interfaces of a children’s digital library to analyse search performance and success related to memory and spatial visualisation. Cole et al. [38] surveyed and interviewed Grade 8 children (aged 13 to 14 years old) in a private school in Canada, with a specific focus on exploring Kuhlthau’s ISP Model as it applies to these children. Rieh et al. [13] undertook work that focused on search in a learning context and search as a learning task. They observed a contradictory picture: studies reporting that students struggle to phrase queries or are reluctant to rephrase them [39] vs studies reporting that students overestimate their search skills [40]. Instead of critically evaluating students’ skills, Rieh et al. [13] take a different approach: they focus on the underlying concepts of retrieval systems and argue that systems have been developed to support the acquisition of factual knowledge but do not facilitate other types of learning [41].

Duarte-Torres and Webber [33] performed a large query log analysis to identify challenges in the search behaviour of children. They highlight issues such as the following: (1) encountering ads that appear to be query results, (2) being presented with result pages that are beyond their reading skills, and (3) unsuitable or unclear query suggestions. They observe that children typically select higher-ranked results, which are then explored for a short time. Using a set of queries created by children, Anuyah et al. [42] conducted a series of simulated searches on two adult-oriented and two child-friendly search engines. Their findings suggest that child-friendly search engines were effective in limiting the display of inappropriate resources, but did not necessarily return educationally relevant results. They found that both child-oriented and adult-oriented search engines returned webpage results that were above the children’s reading level.

In addition to search results being beyond children’s reading skills, children have been observed to begin reading much of the presented SERP in the order that it is presented. Younger children are less likely to review the entire SERP before selecting a website to visit [43]. Until recently [43, 44], few studies have investigated children’s use of the presentation features of a SERP (i.e. the title, URL, and snippet information) and these authors show there is still work to be done in this area. One of the reasons may be that such studies are management and labour intensive [45].

Anuyah et al. [46] compared general purpose interface elements in IR systems with child-specific elements, focusing on query suggestions for children [47, 48]. Only three of the eight children in Anuyah et al.’s [46] study seemed to have noticed the different query suggestions. It was not clear from the study report if children were using query suggestions unprompted. Usta et al. [49] explored the re-finding of online material in a K-12 educational setting through a log study. They found that children seemed to use less re-finding search behaviour. Unlike Usta et al., we did not consider any repeat queries, as our focus is inquiry-based learning, which predominantly explores topics that are new to students.

2.4 Vocabulary problem

The vocabulary problem is a widely accepted IR problem for both children and adults. A lack of vocabulary or domain knowledge is known to hinder query formulation [32]. Spelling issues have been widely discussed as contributing to the query formulation problems of children [31, 32]. Spink et al. [50] observed new entrant children (i.e. first year of school) conducting web searches in the classroom. They noted the children’s frustrations and difficulties with spelling due to their emerging literacy and spelling skills. Typical investigations of the vocabulary and spelling issue in the fields of HCI/ergonomics have focused on typing, mouse input, and keyboard input difficulties [37, 51]. Druin et al. [31, 32] and Hutchinson et al. [52] observe that the keyboard can be frustrating for children as a query construction tool.

Alternative interfaces such as browsing interfaces and speech-to-text interfaces have been explored. In digital libraries, these have proven successful, for example in the International Children’s Digital Library [52] and for eBook browsing [53, 54]. However, in Internet search we often see a different result; for example, Jochmann-Mannak et al. [28] reported how mouse input using mine-sweeping and browsing interfaces can also prove frustrating because they involve simultaneously reading and comprehending the interactive information on the screen while interacting with the mouse. Children’s speech recognition has also been shown to perform poorly, and interruptions from voice assistants can disrupt children’s thought processes during search and use [55]. Speech-to-text interfaces also have issues for classroom use, given the boisterous nature of the classroom and the fact that previous investigations have shown New Zealand children using desktop computers in classrooms rather than laptops, tablets, or mobile devices with built-in microphones. We therefore focus in our study on the use of computers and external hardware that is available in typical classroom settings.

2.5 Queries and query reformulation

Many studies use different classifications of queries and reformulations (i.e. the rephrasing of a query after initial results) when exploring search engine use, for example [4, 5, 56, 57]. While Rose and Levinson [58] and others discuss classification, none of these explore the suitability of such classifications for children’s search behaviour. We now discuss a number of studies that investigated the use of search queries and reformulation.

2.5.1 Natural language vs keyword queries

Duarte-Torres et al. [59] conducted a query log study on content for children using 485,561 queries (10,252 unique queries). The queries included in this study may have been phrased by children or by adults on behalf of children. They found that the queries for child-related content significantly differed from the queries of the complete log: child-related queries were longer, used natural language constructs more often, and contained more questions. They also observed significantly longer query sessions for child-related content (i.e. more queries per session). Kammerer and Bohnacker [51] explored children’s use of natural language in search queries, with special focus on the quality of the outcome. From their study with 21 children, they observed that natural language search queries were more successful than keyword queries. White et al. [57] explored the use of natural language constructs in a log-based study. Queries with question intent were found to use either natural language structures or keyword form. While their query log study used a taxonomy to distinguish the different query forms for question-intended queries, they found little difference in query result quality. As a conclusion, they recommend that “searchers should only be utilising keyword queries.” They did not discuss why searchers may use or prefer natural language in their search queries.

Bilal and Gwizdka [4] conducted a laboratory-based study that investigated the searching behaviours of 24 children aged 11–13 years old conducting information search tasks using Google. They employed eye-tracking tools in the laboratory and used a modified Google search engine that omitted sponsored links and advertisements and was limited to seven results per page. Bilal and Gwizdka [4] used a simple coding of keyword, natural language question, and phrase queries only. Without exploring changes in query type, Bilal and Gwizdka report that the Grade 6 children created more question queries and the Grade 8 children created more keyword and phrase queries. Finally, Yarosh et al. [60] investigated children’s and adults’ information seeking using voice assistants. They suggest that a searcher needs to provide a significant amount of contextual information when using natural language in order for the interface to provide an appropriate response. Further, they noted that it can be quite difficult for children to construct appropriate questions that include the necessary contextualisation for these types of interfaces. Children also struggled with query reformulations using these voice interfaces.
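The log-based measures mentioned in this subsection (query length, share of question queries, and queries per session) can be illustrated with a minimal sketch. It is not the procedure used in any of the cited studies: the log format of (session, query) pairs, the function names, and the question heuristic (a leading question word or a trailing question mark) are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: a small illustrative set of English question words;
# the cited studies do not specify their detection rules here.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "which"}


def is_question(query: str) -> bool:
    """Heuristic: a query is a question if it ends with '?' or
    starts with a question word."""
    words = query.lower().split()
    if not words:
        return False
    return query.strip().endswith("?") or words[0] in QUESTION_WORDS


def summarise_log(log):
    """Summarise a hypothetical log of (session_id, query) pairs."""
    sessions = defaultdict(list)
    for session_id, query in log:
        sessions[session_id].append(query)
    queries = [q for qs in sessions.values() for q in qs]
    if not queries:
        return {"mean_query_length": 0.0,
                "question_share": 0.0,
                "queries_per_session": 0.0}
    return {
        # Mean number of whitespace-separated terms per query.
        "mean_query_length": sum(len(q.split()) for q in queries) / len(queries),
        # Fraction of queries flagged by the question heuristic.
        "question_share": sum(is_question(q) for q in queries) / len(queries),
        # Mean number of queries issued per session.
        "queries_per_session": len(queries) / len(sessions),
    }
```

Comparing such summaries for a child-related subset against a complete log would mirror the kind of contrast described above, although a heuristic this simple will misclassify some queries.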

2.5.2 Query reformulation

Significant numbers of studies have analysed query reformulation either manually or automatically. Jansen et al. [61] employed n-gram modelling to identify query reformulation in a log analysis study of more than 1 million queries. Rieh and Xie [56] defined three facets of query reformulation: content, format, and resource. While their classification is suitable for automatic query analysis in large log studies, its conceptual abstraction may make it difficult to map against concrete reformulations [62]. Most of the laboratory-based studies reported here use a manual coding scheme, while large log studies typically do automatic analysis.

Sanchiz et al.’s [63] work with adults found that older adults took longer to create their initial search queries and to reformulate search queries, as well as taking longer to evaluate SERP lists. Liu et al. [64] also explored query reformulation in adults (studying 48 participants each working on 6 search tasks) and distinguished three tasks according to their structure: simple fact finding, hierarchical finding of multiple characteristics of a single concept, and parallel search for multiple concepts. They developed a taxonomy of query reformulation, which they used to automatically detect the reformulations in their study. They found a significant effect of task type on users’ query reformulation, with parallel tasks containing the largest number of reformulations. More than half of all reformulations were due to the previous search results not containing any useful pages. Finally, about half of all reformulated queries were effective in finding suitable results.

Marchionini’s [65] seminal work on the searching behaviour of children in an electronic encyclopedia found that younger searchers undertook more query reformulation than older searchers. Bilal [30] observed that keyword searches by children were either too narrow or too broad for their information need and therefore required reformulation. Similarly, Druin et al. [31] and Fails et al. [66] found that children’s use of long natural language structures in search queries often required reformulations. Usta et al. [16] analysed the query log of the children’s web-based educational system Vitamin during a single month in 2013. They also found a high proportion of repeated queries. Rutter et al. [67] investigated query reformulations used by 8- and 9-year-old children in a UK primary school. This in situ study analysed 12 children carrying out search tasks during a school lesson using observations, search recordings, post-task interviews, and a teacher interview. Children in this study were shown to use the ‘did you mean’ and auto-complete functionality of the search engine. When reformulating, children also reused previous queries, corrected errors in previous queries, and made queries more specific. Weber and Jaimes [36] analysed a query log for query focus and query type. They found that children and younger people (aged 5 to 23) showed a prevalence of music, gaming, and educational content. Younger people were also observed to issue fewer focused queries and a larger number of diverse queries. Finally, they were found to use queries suggested by the search engine more often.

2.6 Related work summary

We here summarise the insights from related work using the themes of our four research questions.

Query types used by children (RQ1)

Many of the existing works on query types are somewhat dated and thus refer to older information seeking environments. There are also a number of studies reporting log-based investigations that do not provide insight into the context in which a search was undertaken. More recent studies were carried out in laboratories rather than in a school environment familiar to the children. Across these studies, keyword and natural language constructions have been identified for both initial queries and query reformulations. However, few studies have reviewed the nature or detail of the natural language constructions that children used. We thus identify a need to revisit the details of query creation and reformulation by children during educational pursuits using modern internet search engines (ISE).

Task formulation (RQ2)

While related work provided insights into educational search behaviour, the influence of the construction or wording of search tasks by teachers on children’s internet search behaviour was not explored.

Use of support features (RQ3)

A number of works have explored the use of additional features of information seeking interfaces by children—be they used successfully or unsuccessfully. Given the constant evolution of information seeking interfaces, this remains a relevant topic of investigation.

Search result exploration (RQ4)

Recent work has used eye-tracking to revisit children’s information seeking triage and SERP interactions. Because these studies are labour-intensive, they are typically performed in a laboratory with simplified search interfaces (not in a live school environment) and very few have been carried out. Specifically, more work is needed on which features of a SERP children use to decide which websites to visit, and on which websites are visited, when, and how.

3 Method and procedures

Researchers have approached the classification of search and seek tasks differently. There has been some focus on type, nature, dimensions, and goals of a task [21]; for example, Marchionini [65] distinguishes open and closed tasks, and Bystrom and Hansen [68] identify two levels of work: the information seeking task and the search task. For our study, the work of Borlund [69] is particularly relevant. Borlund explored the use of simulated work task situations for evaluating interactive information retrieval. A simulated work task is a short textual description that presents a realistic situation in which an information need motivates a test participant to search a given IR system. Borlund observed a lack of tailoring of the simulated work task situations to the test participants (especially a lack of pilot testing and refining the simulated work), and a lack of detailed reporting on the simulation task used. These simulated situations are widely used in studies of IR systems, including studies with children [4] and the study reported in this article. While Borlund [69] does not explicitly address studies with children, their extended list of requirements for the use of simulated work task situations still applies, if not more so, to children. We discuss these details in Sect. 3.4. In another recent work, Rieh et al. [13] stress the importance of defining, for a given study, the researchers’ construct of learning. In this article, we work in the context of a structured learning environment (similar to Concept 1 according to Rieh et al. [13]); while learning through searching was part of the teachers’ pedagogy, our simulated work tasks aimed at finding information as a part of the learning process rather than at learning complex concepts.

We conducted structured observations of how children search for digital information using an Internet search engine. This study consisted of user observations and a brief exit interview following a method similar to studies of children’s information seeking reported in the literature [30, 70]. The observations were conducted by a single researcher in a one-to-one situation on location at three schools with a total of 50 children.

We use the remainder of this section to detail our method, beginning with detail regarding our participant recruitment and background (Sect. 3.1), the in situ study environment (Sect. 3.2) and our equipment and apparatus for data collection (Sect. 3.3). We next detail our simulated work tasks (Sect. 3.4), and exit interview (Sect. 3.5) which were developed to be specifically relevant to the NZ education context in which we were studying and appropriate for in situ use with children in the age ranges that we were working with. We end our detail of the method and procedures with a brief description of the data collection and analysis procedures that we followed (Sect. 3.6).

3.1 Participants

Our study was conducted in the Waikato School District, which is located in the central North Island of NZ. The NZ primary education system is comparable to that of the US and UK primary education systems; however, we offer here a brief review of the educational system in which these studies were undertaken. New Zealand government-funded schools are typically separated into primary schools (catering to new entrant Year 1 through Year 6), intermediate schools (catering to Year 7 and 8), and high schools (catering to Year 9 through 13). Our work presented here focuses on students in Years 5&6 at primary level and Years 7&8 at intermediate level. These Year 5&6 children are typically 9 to 10 years old, while Years 7&8 children are typically 11 to 13 years old. The NZ education system also classifies schools according to the socio-economic status of homes within a school’s catchment zone—the decile rating system. A decile 1 rating indicates a high proportion of low socio-economic homes in the catchment zone.

Participants included children from two primary schools and a single intermediate school. The three schools that participated in this study are the same as those that participated in our related studies with children [71] and teachers [72]. These three schools have decile ratings of 4, 5 and 9, respectively. When inviting participant schools to contribute to our studies, schools from across the decile rating system were approached; however, we have not gained participation from a school with a decile rating lower than 4. The research team has observed, during studies over numerous years, that these schools have comparable technology access. Interview studies [72] also suggest that similar teaching practices relating to information seeking are conducted in all three schools.

Each study session began with a brief demographic survey of each participant. The participant’s age, gender, school year, and school were recorded in the researcher’s field notes, and an anonymous participant ID was assigned. Ethical approval to conduct this study was sought from the University Ethics Board before contact with any school. We received signed informed consent from the principal of each school, signed informed consent from the parents of each child, and verbal consent from each child at the beginning and end of each session. The teachers whose pupils were involved were also briefed on the project before it began.

3.2 Environment

NZ primary and intermediate schools incorporate a broad range of teaching spaces. The most common spaces in which children use computing equipment for inquiry practices are school libraries, classrooms, computer laboratories, and adjoining workroom spaces (with partial or full partitions or walls). We observed that children in these three schools have access to all of these types of spaces for conducting information seeking for their educational purposes. Our in situ observation studies were carried out in library, classroom, and adjoining workroom environments at the three schools as arranged by the school principal. While a participant was excused from their class and worked with the researcher, other students, teachers, librarians, and staff worked in the spaces conducting their typical daily routines.

It is not unusual for children and multiple staff to be working on different learning tasks within the same space at a school in NZ, and therefore the researcher’s presence caused only minor disruption to the school. Other children in the vicinity were not able to overhear the researcher verbalising the task descriptions as these children were engaged in other tasks and supervised by teachers and librarians. We can therefore rule out possible contamination effects. We acknowledge that there is the potential for some distraction to the participant during our study depending on the learning tasks being completed by other students. For this reason, we place little emphasis on time to complete a task in the analysis of our results.

Each child participated in a single session with the researcher, with a session typically taking between 30 and 45 minutes per participant. This resulted in the researcher being able to conduct up to three observation sessions during a school day. Not all days were suitable for working with a student during every available teaching session, and not all days were suitable for visiting the school should sporting or cultural events coincide. The researcher spent two to three weeks in each school and became a familiar face in the school and learning environment amongst teachers and students.

Due to the impact of working in shared spaces, children were not asked to use the think aloud protocol. Additionally, it has been argued that children under the age of 12 struggle with the think aloud protocol during research studies [73], and therefore, this method was not considered for our study. Children were noted to behave in a range of ways during their observation sessions; children chatted with the researcher, described what they were doing, or remained quiet while they conducted a task.

3.3 Apparatus

The children used computers supplied by their school, which they would therefore be accustomed to using. Each school provided a different workstation; however, all were desktop Apple Mac or Windows PC computers. The Google Chrome web browser was used for all sessions, and it was determined that all children had prior experience with this web browser. The browser cache and history were cleared by the researcher before each session to ensure that no user history influenced the individual sessions. No additional safe browsing or Google Safe Search features were implemented by the researchers. We expect that the schools will have net safe protocols in place for the entire network; however, no site blocking was apparent to the users for any of the searches completed during this study.

Sessions were video and audio recorded (over the participant’s shoulder), and additional handwritten notes were taken. A digital video camera with microphone was set up on a tripod behind the chair of the student, facing the computer screen. The camera was zoomed to an appropriate viewpoint and recording began before the student was invited to begin working with the researcher. Once the apparatus was prepared, the next student to participate was identified and collected from their classroom by the researcher.

The length of time to complete a task was recorded using video data; however, no in situ timing was conducted, so that the participant would not see the researcher taking timings, which might have influenced the child’s behaviour during observation. To calculate the time to complete a task, the video footage was marked with a start point when the researcher began to verbally give a task instruction or at the point the researcher asked the participant to read the instruction provided. The task was considered complete when the participant deemed it to be complete, that is, when the participant verbalised to the researcher their answer, or their perceived success or completion.
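The timing rule can be illustrated with a minimal sketch. It assumes that the two video markers are available as seconds from the start of the recording; the function name and the example timestamps are hypothetical and are not taken from the study data.

```python
def task_duration(start_marker_s: float, completion_marker_s: float) -> float:
    """Return the task time in seconds between two video timeline markers.

    start_marker_s: point at which the researcher began giving the task
        instruction (or asked the participant to read it).
    completion_marker_s: point at which the participant declared the task
        complete.
    """
    if completion_marker_s < start_marker_s:
        raise ValueError("completion marker precedes start marker")
    return completion_marker_s - start_marker_s


# Hypothetical markers: instruction starts at 02:05, child reports done at 07:40.
duration = task_duration(2 * 60 + 5, 7 * 60 + 40)
print(f"Task time: {duration:.0f} s")
```

Because both markers come from the recording, the timing can be derived after the session without the participant seeing any stopwatch.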

3.4 Simulated work tasks

We note in the related work that research on searching and learning often takes either a “learning to search” (information literacy [9]) or a “searching to learn” (research activities in the context of class activities and assignments [10]) perspective. Our study and the tasks were designed to be as educationally appropriate as possible within a simulated work task study. Teachers in NZ classrooms are known to teach digital and information literacy as part of their preparation for children to undertake inquiry-based tasks in the context of class and assignment activities. The study reported here was therefore most concerned with the undertaking of the classroom or assignment task, rather than observing or inquiring about previous teaching received. We report a separate study elsewhere that sought student [7] and teacher [72] insights into the teaching of information search and information literacy.

Simulated work tasks in a study are those tasks that are not set in the context of an actual educational procedure, but rather created for a particular study. Five requirements for simulated work task experiments were defined by Borlund [69]; we discuss here how we have integrated these into our method. (1) Tailoring to participant situation: we tailored our testing situation to children’s typical educational experience by developing tasks that are age, subject, and educationally appropriate, and conducted within a natural environment with familiar technology for the child participants (see Sects. 3.2 & 3.3). (2) Include personal base line: we began each session with a personal interest task, inviting the children to investigate a topic of their own choosing when given the opportunity to investigate a sporting, musical, or book interest (see Sect. 3.4). (3) Switching tasks: while Borlund recommended alternating between personal and simulated tasks, we chose to limit personal tasks because we were working with children and wanted to control the cognitive load of long experiment sessions. (4) Pilot study: an initial pilot investigation was carried out with a small group of both adults and children before beginning the study reported here. This pilot ensured that the language used in the script and the tasks was appropriate, that the tasks were achievable within a reasonable time-frame, and that technology considerations and recording apparatus would be able to operate as expected. (5) Report on situation: the simulated work task situations are described in Table 1.

Table 1 Five search tasks

3.4.1 Five tasks

The observation study consisted of a set of five search tasks conducted by each student in the order listed (see Table 1). These tasks were developed to be educationally appropriate for children in these school year levels through collaboration with teachers from the schools that we were working with as well as advice from an educationalist. The teachers were asked about recent assignments and topics that had been discussed in class, and we also considered the current global situation to identify relevant events to which tasks may refer. Based on a discussion with the educationalist, we created an initial task list. This was then verified with teachers and the educationalist, before being finalised.

We follow the task nomenclature used in the NZ curriculum. Prior to the implementation of the study, a selection of teachers at these schools participated in an interview study during which common topics of investigation for the students’ learning, in line with the NZ curriculum, were discussed [72]. The task types (Verbal vs. Written, Open vs. Closed, Instructions vs. Queries) were developed based on prior interviews with teachers about instructing children on information seeking in the classroom [72].

Table 2 Exit Interview

All five tasks required the children to create search queries and therefore allowed for investigation of the types of queries that children use in modern internet search engines (RQ1). While children entered initial search queries, reformulated queries, or triaged the SERP list, we were able to observe how they use support features of ISEs (RQ3). After submission of a search query, children were required to triage the results lists and visit the resulting websites; therefore, how children explore search results (RQ4) was also observable through all five search tasks.

3.4.2 Instruction

Before the tasks were introduced, the children were encouraged by the researcher to conduct themselves as they would have done had a teacher set the topics of investigation for them during an ordinary school task. While this is perhaps aspirational, the intention was to create a structured test environment that was neither intimidating nor unusual. It was explained to the participants that we were interested in how they searched for information on a computer and what they found easy or difficult. Further, it was emphasised that we were not testing them or investigating their success or failure.

We structured the study such that it would allow insights into how a child treats verbal instructions differently to written instructions (RQ2). That is to say, are queries created differently if the instructions are given in verbal or written form. Tasks 1, 2 and 3 were read aloud to the child with all instructions given by the researcher. Tasks 4 and 5 were given to the children as a printed hand-out.

Similarly, we structured the search task instructions using two distinct typologies whereby a task instruction was posed to the child as a question or as an instruction. Again, we were interested to identify if children constructed queries differently if the task was posed as a question or as an instruction. Tasks 1, 2, and 4 were posed as questions, while Tasks 3 and 5 were posed as instructions.

We denote the types of instructions given for the 5 tasks, by using the labels Open Task, Verbal Question, Verbal Instruction, Written Question and Written Instruction, respectively (see Table 1).
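The task design described in this section can be summarised as a small lookup structure. This is only an illustrative sketch: the `Task` class, its field names, and the `tasks_where` helper are hypothetical names introduced here, while the delivery and form of each task follow the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Task:
    number: int
    label: str
    delivery: str  # "verbal" (read aloud) or "written" (printed hand-out)
    form: str      # "question" or "instruction"


TASKS = [
    Task(1, "Open Task", "verbal", "question"),
    Task(2, "Verbal Question", "verbal", "question"),
    Task(3, "Verbal Instruction", "verbal", "instruction"),
    Task(4, "Written Question", "written", "question"),
    Task(5, "Written Instruction", "written", "instruction"),
]


def tasks_where(delivery: Optional[str] = None,
                form: Optional[str] = None) -> List[int]:
    """Return the task numbers matching the given delivery and/or form."""
    return [
        t.number
        for t in TASKS
        if (delivery is None or t.delivery == delivery)
        and (form is None or t.form == form)
    ]


print(tasks_where(delivery="verbal"))   # tasks read aloud
print(tasks_where(form="instruction"))  # tasks posed as instructions
```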

3.4.3 Task completion

Task completion and success were decided by the participant. The student reported they had finished or completed the task and was thanked by the researcher before the next task was administered. The researcher did not analyse, grade, or investigate the quality of answers or successful completion by participants. The goals for this investigation were to consider use of the search engine and creation of search queries, and therefore, measures of success were not analysed during this study.

3.5 Exit interview

Once the participant had completed all five tasks, the researcher conducted a short exit interview with each participant. The interview was kept brief, and the overall interaction with the children was well within a typical class exercise time.

Each child was asked the questions in the order listed (see Table 2). Audio and field notes data were collected during the exit interview portion of the study.

We showed each child a printed visual example of a query suggestion (see Fig. 1) and asked if they had seen one of these during their tasks. We then asked the participant if and when they use these spelling or query suggestions during Internet searching (Questions 8 and 9 of Table 2).

During the interview, we showed children a printed visual example of a search results page (see Fig. 2). We used this visual when we asked Questions 10, 11, and 12 of Table 2. We indicated on the printed example where the related searches were and asked the children if they use these related searches suggestions during Internet searching (see Questions 10 and 11 of Table 2). We also showed the children the Title, URL, and Descriptor text formatting and asked which text they read when making their decision to visit a website (see Question 12 of Table 2).

3.6 Data collection and analysis

The principal investigator conducted quantitative and qualitative analysis via post-observation coding of the video, audio, interview and field notes data that were collected during the user observation and interview portions of the study. Basic demographic information was gathered at the beginning of each participant’s study session. Video with audio and field notes were taken during observations of the students’ interactions with the computer. These observations were followed by an exit interview, which was analysed as a combination of audio as well as written notes. The field note data were predominantly used to clarify observations from the video and audio.

Initial coding was undertaken by the principal investigator and a codebook was iteratively developed. A second researcher reviewed the codebook after approximately twenty percent of the participants had been coded. Inter-rater reliability was not undertaken in this investigation due to the size of the research team.

For our video analysis, we coded head movements, mouse and hand movements, text entered onto the screen, and mouse clicks as well as mouse location information. A large part of the coding was concerned with the students’ queries, their construction, reformulation and quantities. For this, each coding step was developed based on the specific analysis performed. For example, to quantify the reformulation of queries, we coded the number of times a student submitted a query with adjusted wording but not those adjustments that were entered but not submitted. Similarly, we coded head movements as follows: each time a child moved their eyes from the screen to the keyboard or mouse during search query entry, we marked the video timeline. These timeline markers were tallied to give a per child count for each search task.
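The tallying step can be sketched as follows. This is a minimal illustration, assuming each timeline marker has been exported as a (participant ID, task number) pair; the export format, function name, and example markers are hypothetical.

```python
from collections import Counter


def tally_head_movements(markers):
    """Count timeline markers per (participant, task).

    markers: iterable of (participant_id, task_number) pairs, one pair for
    each marked head movement during search query entry.
    """
    return Counter(markers)


# Hypothetical markers, not study data.
markers = [("FY5_1", 1), ("FY5_1", 1), ("FY5_1", 2), ("MY5_1", 1)]
per_child_task = tally_head_movements(markers)

for (participant, task), count in sorted(per_child_task.items()):
    print(participant, task, count)
```

A `Counter` returns zero for any participant and task combination without markers, which matches the case of a query entered without a head movement.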

Fig. 1

Query suggestion visual shown to participants. Google and the Google logo are registered trademarks of Google Inc., used with permission

Fig. 2

Related searches visual shown to participants. Google and the Google logo are registered trademarks of Google Inc., used with permission

We used ELAN (see Footnote 1) to perform qualitative annotation of video and audio moments and intervals. ELAN was also used for audio annotation of the exit interview.

For greater clarity, the coding details will be described in Sect. 4.2 in the context of each analysis step in which its results are used.

4 Results

We structure the results of our study in three sections. We present the results of the brief demographic survey (see Sect. 4.1), followed by the quantitative results of the user observation study (see Sect. 4.2), and finally, we present the findings of the exit interview (see Sect. 4.3).

4.1 Participants’ demographic results

We begin our discussion of the results with an overview of the participant sample. Three schools (2 primary schools and 1 intermediate school) agreed to participate in this study. Each school’s principal was asked to identify 16 children (8 male, 8 female) from their school to participate in this study. We requested participants who were in Years 5&6 (primary school children), and Years 7&8 (intermediate school children) at the time of the study. We received 16 participants from School A, 18 from School B and 16 from School C.

In total, 24 boys and 26 girls participated in this study. All children were aged between 9 and 13. It was noted that children came from a range of classes in each school; that is to say, the children from each school who were invited to participate were not selected from a single teacher’s class. Throughout this article, we refer to individual students by an anonymous ID that encodes gender, year level, and a unique identifier, e.g. FY5_1 is a Year 5 female student. Table 3 provides a reference to the demographic data relating to our participants.
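The ID scheme can be decoded mechanically. The sketch below is illustrative only: it assumes that the prefix `M` denotes a male student (the text only spells out `F` for female) and that the trailing number is the unique identifier; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class Participant(NamedTuple):
    gender: str
    year: int
    index: int


# Gender letter, "Y" plus year level, underscore, unique identifier.
_ID_PATTERN = re.compile(r"^([FM])Y(\d)_(\d+)$")


def parse_participant_id(pid: str) -> Participant:
    """Decode an anonymous ID such as 'FY5_1' into its components."""
    match = _ID_PATTERN.match(pid)
    if match is None:
        raise ValueError(f"unrecognised participant ID: {pid!r}")
    gender = "female" if match.group(1) == "F" else "male"  # 'M' assumed male
    return Participant(gender, int(match.group(2)), int(match.group(3)))


print(parse_participant_id("FY5_1"))
```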

Table 3 Participant Sample

Children at these three schools are taught in composite classes, which is the term used in New Zealand to describe two year levels that are taught by a single teacher. Children in Year 5&6 are taught together, and children in Year 7&8 are taught together. This is common in NZ schools, and therefore, it can be expected that all children in a particular Year 5&6 classroom will have had the same information literacy, digital literacy, and inquiry teaching. Similarly, children in a Year 7&8 classroom will also have had the same teaching during that year. It is also common in NZ schools that, where a school has multiple classes of students at the same composite level, teaching resources and practices will be shared across the school. We therefore expect that students coming from different classrooms will have had similar learning opportunities. It was the nature of this composite class teaching practice that provided the impetus for sampling students from both Year 5&6 and Year 7&8 when identifying participants for inclusion in our studies.

4.2 Simulated work tasks results

We discuss here the results of our observation study of children’s Internet search during the five search tasks using the Google search engine. At the beginning of each section, we briefly describe the coding techniques used for the data displayed. Due to the small numbers of participants, we do not attempt to give statistical significance to the numbers that we present. A preliminary investigation of student gender, age, year level, and interactions related to verbal or written instruction has revealed no significant observations; these aspects are therefore not analysed in detail in this article.

4.2.1 Physical interactions

Before we begin the analysis of children’s search activity, we highlight observations about the physicality of children’s interactions. Children were observed in a number of physical actions when using the search engine interface, which were recorded in the field notes and while coding the video footage. These observations may give clues as to the reasons for some of our findings.

Tables in this article present the results per school year and, additionally, per school-year-level composite (grey header), for those results with sufficiently large numbers (see Table 4, for example). Where numbers are smaller, only the two school-year-level composites were reported. The additional colour formatting was created separately for each column (smaller numbers are blue, progressing through grey and yellow to larger numbers that are green); the colours imply no value judgement.

Children were observed to look at their fingers when typing, showed signs of struggling with spelling (e.g. slowing down and hesitating with typing), were noted to be looking for letters on the keyboard, and were seen to check what they were typing on the screen, both during and after text entry for a search query. Many of the children did not touch type, nor did they typically keep their hands on the keyboard or mouse during searching or reading. This required the child to regularly visually assess the location of the keyboard or mouse, moving their eyes from the computer screen to the input devices and back. When children created search queries, we counted the number of times a child looked up from their fingers to review what they were entering into the search box for each task. We did this only for the first query entered for each task.

Table 4 Averaged keyboard-to-screen head movements

All children looked up from the keyboard to the computer screen before sending a query for the majority of their tasks. Three children in total were able to complete one of their queries without a head movement, but no children completed more than a single query entry without a head movement. Y5&6 children made a total of 576 such head movements (average 3.39 per child per task, max 10, min 0) while Y7&8 children made 261 head movements (average 3.26 per child per task, max 9, min 0), see Table 4. The nature of the task did not appear to correlate with the total number of head movements. While a child was looking at the screen to check their entered query, they were observed to notice query suggestions or query expansions that could be used to extend their query.
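The per-child, per-task averages can be reproduced from the reported totals. The short sketch below assumes group sizes of 34 Year 5&6 and 16 Year 7&8 children, which are inferred from the 170 and 80 initial queries reported for five tasks per child; the function name is hypothetical.

```python
TASKS_PER_CHILD = 5

# (total head movements, number of children); group sizes are inferred.
GROUPS = {
    "Y5&6": (576, 34),
    "Y7&8": (261, 16),
}


def mean_per_child_per_task(total: int, children: int,
                            tasks: int = TASKS_PER_CHILD) -> float:
    """Average count for one child on one task."""
    return total / (children * tasks)


for group, (total, children) in GROUPS.items():
    mean = mean_per_child_per_task(total, children)
    print(f"{group}: {mean:.2f} head movements per child per task")
```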

During this study, children pointed to the screen with their finger, for example, at images in the sidebar (as if in surprise). They also appeared to be using their finger or mouse as a visual guide when scanning the SERP list, reading the sidebar or pull box content. Navigation and menu items were also moused-over by children even when not clicked. We also observed that children highlighted texts they were reading, both as visual markers of information to come back to and as a visual reference for reading.

4.2.2 Use of queries and construction techniques

We coded the queries that children constructed as follows: a query was recorded once the child pressed the Enter key or clicked the search button; no query was recorded when a child started typing a query but changed their mind before pressing the Enter key. When Google’s auto-complete feature (query suggestion) was used by the participant, we counted this as a single query only.
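This counting rule can be expressed as a small sketch over an event log. The event names and the example log are hypothetical, and the sketch assumes that selecting a query suggestion submits the query directly, so that it contributes exactly one query.

```python
# Events that submit a query (hypothetical event names).
SUBMIT_EVENTS = {"enter_key", "search_button", "suggestion_selected"}


def count_queries(events):
    """Count submitted queries in a list of (event_kind, text) pairs.

    Text that was typed but abandoned before submission is ignored.
    """
    return sum(1 for kind, _text in events if kind in SUBMIT_EVENTS)


# Hypothetical log for one task, not study data.
events = [
    ("typing_abandoned", "urnus"),          # typed, then deleted: not counted
    ("enter_key", "uranus rings"),          # counted
    ("suggestion_selected", "uranus rings facts"),  # counted once
]
print(count_queries(events))
```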

Number of queries

We present a breakdown of queries per task in Table 5. Because the number of children in the year levels varies, we also present the average (mean) number of queries in Table 6. Year 5&6 children together created a total of 360 queries, while Y7&8 children made only a total of 190 queries. Overall, the 50 children worked through 250 tasks, posting 550 total queries. In total, the average number of queries for Year 5 children was 11.67 (min 6, max 22), for Year 6 children was 9.74 (min 6, max 21), for Year 7 children 11.63 (min 7, max 18) and for Year 8 children 12.13 (min 6, max 24). Taken together, the average number of queries constructed by Year 5&6 children was 10.59 and by Year 7&8 children 11.88. Thus, Y7&8 children entered on average about one query more than Year 5&6 children.

Table 5 Total number of queries per task
Table 6 Average number of queries per task

As can be seen from the colour coding of the rows in both tables, the younger children produced the fewest queries (average of 1.53 vs 2.5) for the Open Task (see data columns 3; see Footnote 2), while the older children produced the fewest queries for the Written Instruction (see data columns 6).

Query types

We explored which search strategies the children used to create queries and, from the set of observed queries, identified the following five query-construction types: Natural language sentences (NLS), Natural language questions (NLQ), Simplified searches or Keywords (KW), Two-part Searches (2PS), and Query enhancements (qualifiers and refiners). The results reported in this section are for both initial query formulations and query reformulations. The children did not construct any Boolean search queries (e.g. using operators AND and OR) and only used the terms “and” and “or” as part of NLS and NLQ.

Natural language queries

We recorded as queries using natural language constructs (natural language queries) those queries that were constructed using language too complex to be considered a keyword query. This included the use of punctuation and non-keyword text within the query string. We further distinguished within natural language structures those that appeared to be natural language sentences (NLS) and those that had the pattern of a natural language question (NLQ). An example of a NLS is the query by FY5_1: “facts about mount cook”. This query was coded as a NLS because it contains the non-keyword text “about” and its construction differs from the simplified keyword search entered by MY5_1: “mount cook facts”. An example of a NLQ is the query by MY5_1: “how many rings are there on uranus”, which was created as a reformulation of the initial keyword query “uranus rings”.
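To make the distinction concrete, the following sketch approximates the three basic query types with simple rules. It is not the coding procedure used in the study, which was performed manually by the researcher; the word lists and the function name are assumptions chosen so that the example queries quoted above fall into the reported categories.

```python
import re

# Assumed word lists; a real codebook would define these more carefully.
QUESTION_WORDS = {"who", "what", "whats", "where", "when", "why", "how",
                  "which", "is", "are", "do", "does", "did", "can"}
NON_KEYWORD_WORDS = {"about", "on", "of", "the", "a", "an", "in", "to",
                     "and", "or", "there"}


def classify_query(query: str) -> str:
    """Return 'NLQ', 'NLS' or 'KW' for a query string (heuristic sketch)."""
    text = query.strip().lower()
    tokens = re.findall(r"[a-z0-9]+", text)
    if not tokens:
        raise ValueError("empty query")
    # Questions: a trailing question mark or a leading question word.
    if text.endswith("?") or tokens[0] in QUESTION_WORDS:
        return "NLQ"
    # Sentences: punctuation or non-keyword text inside the query.
    if re.search(r"[.,;:!]", text) or any(t in NON_KEYWORD_WORDS for t in tokens):
        return "NLS"
    return "KW"


for q in ["facts about mount cook", "mount cook facts",
          "how many rings are there on uranus", "uranus rings"]:
    print(f"{q!r}: {classify_query(q)}")
```

Such rules will misclassify edge cases that a human coder would resolve from context, so they are best read as a summary of the category definitions rather than a replacement for manual coding.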

Table 7 summarises the number of children who created a NLS query (data columns 1–3) or a NLQ query (data columns 4–6) for the given search task. We observe that if a task was set as a question, the child was likely to enter a question; see the rows for Verbal Question (36/50 children, 72%) and Written Question (42/50 children, 84%). For Verbal Instructions and Written Instructions, this ‘mirroring’ of the task pattern as NLS was not as pronounced (25/50, 50% used NLS for verbal and 21/50, 42% used NLS for written instruction tasks). On average, the younger children made more natural language queries than the older children.

Table 7 #Children creating NLS and NLQ

Keyword queries

We counted the times that children shortened the question or instruction given for the task into a simplified set of keyword search terms (see Table 8).

Table 8 #Children using KWs

For example, MY7_4 searched for “possum”, FY7_1 searched for “possum habitat” and FY7_2 searched for “uranus rings”. The children more often used simplified queries when the task was not set as a question but rather as an instruction. Greater numbers of children at all year levels created simplified queries when exploring the Open Task (23/50, 46%), Verbal Instruction (32/50, 64%) and Written Instruction (24/50, 48%) than when exploring the two question-based tasks. On average, the younger children made fewer keyword queries than the older children. For all tasks other than the Verbal Instruction task, children created more natural language queries than keyword queries.

Two-part searches

The questions developed for both the verbal question task and the written question task required the child to find information about two inter-related features of a topic. This approach is similar to the one used in Bilal’s [30] studies of children’s use of the now defunct Yahooligans!. For example, the verbal question asked, “Where do possums live, and are they a pest in New Zealand?” We were interested to see if students typed exactly what they were given, or if they simplified the search into two-part queries (see Table 9 with horizontal colour coding).

Table 9 #Children separating questions vs. using only first part to answer question

The left block of Table 9 presents the results for children who separated this task into two search queries and entered both queries in order to complete this task. 21/50 (42%) children conducted the Verbal Question task in two parts, while 22/50 (44%) children conducted the Written Question task in two parts. The right block presents the results for children who, having separated the search queries into two parts, were able to complete this task without entering the second query.

In addition, seven of 50 children (14%) attempted to search for the whole Verbal Question in a single query; for example MY5_4 who searched: “where do possums live and are a pest”. Nine of 50 (18%) children attempted to search for the whole Written Question in a single query. All of these children were in Year 5&6.

Query enhancement techniques

When learning to search, primary school children are taught to use two query enhancement techniques: query qualifiers and query refiners. We report here on the use of these techniques during query construction of both initial and reformulated queries.

Query Qualifiers

The children are taught to use query qualifiers, such as the addition of “for kids” to assist with identifying pages that are aimed at children. Additionally, they are encouraged to “find facts” about a broad concept during an investigation. As part of their educational practice, teachers teach search queries that append the word “facts” as another query qualifier. We were interested to see how many students used the child-focussed query qualifiers “facts”, “for kids”, or “for children” (see Table 10).

Table 10 Children using child-focussed qualifiers

While our Verbal Instruction and Written Instruction tasks explicitly stated that a child should “find facts about” a given concept, the remaining three tasks did not. The term “facts” was used in high numbers in the instruction tasks (36/50 and 21/50) by children of all ages. The query qualifiers “for kids” or “for children” were used a total of 11 times during our 250 tasks. The qualifiers “facts” and “for kids” or “for children” were more typically used by the Year 5&6 students.
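Detecting these child-focussed qualifiers in a query string can be sketched with whole-word matching. The function name is hypothetical, and the third example query is invented for illustration rather than taken from the observations.

```python
import re

# The three qualifiers named in the text, matched as whole words or phrases.
QUALIFIER_PATTERNS = {
    "facts": re.compile(r"\bfacts\b"),
    "for kids": re.compile(r"\bfor kids\b"),
    "for children": re.compile(r"\bfor children\b"),
}


def find_qualifiers(query: str) -> list:
    """Return the child-focussed qualifiers that occur in the query."""
    text = query.lower()
    return [name for name, pattern in QUALIFIER_PATTERNS.items()
            if pattern.search(text)]


print(find_qualifiers("mt cook facts"))
print(find_qualifiers("uranus rings"))
print(find_qualifiers("possum facts for kids"))  # invented example query
```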

Query refiners

Children were also taught to use query refiners or query refinement strategies, such as methods for shortening queries to keywords by avoiding “small words”, avoiding punctuation marks, specifying the information need within a broad concept, and Advanced Search.

Very little punctuation was used during query formulation. Only three participants used a question mark at the end of a question, for example MY6_2: “what’s uranus?” and MY8_4: “new zealand are possums realy pests?” [sic]. No students used a full stop. One student used a comma: “Mount cook, south island new zealand” [FY7_4].

In our interview studies both students and teachers at these same schools reported using Google Advanced Search features as a method for simplifying the searches returned by Google. In our observations, no child chose to use any of the advanced features of the search engine.

4.2.3 Query reformulations

We explored how children proceeded when their first search query did not resolve their information need. We analysed the types of queries that children created, and how they adjusted these queries. We coded the number of children who reformulated queries as well as the query types at each step. An example of a reformulation was the search by MY5_1 who created an initial KW query “uranus rings”, yet after some searching of the results list reformulated this search with the NLQ “how many rings are there on uranus”. Thus, we were able to calculate the number of reformulations a participant made (in this case a single reformulation), as well as the type of query that they constructed (an initial KW) and reformulated query (a NLQ reformulation). We use the query types that we identified in Sect. 4.2.2.

We differentiate between reformulations and two-part searches. If children broke a question-based task into two parts, this was not counted as a query reformulation but is considered in Sect. 4.2.2 as a two-part search (2PS). For example, FY6_2 rephrased the initial query formulation of “possums” (KW) to “are possums pests in new zealand” (NLQ); these were coded as initial query and reformulation, respectively. In a third step, she created another search using the NLQ “where do possums live”, which was counted as a 2PS.
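The coding of one task can be represented as a list of labelled queries and summarised as follows. This is an illustrative sketch: the `CodedQuery` class, the role labels, and the `summarise_task` helper are hypothetical, and the role of each query is assumed to have been assigned by the human coder. The example reproduces the FY6_2 sequence described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodedQuery:
    text: str
    query_type: str  # "KW", "NLQ" or "NLS"
    role: str        # "initial", "reformulation" or "two_part"


def summarise_task(queries: List[CodedQuery]) -> dict:
    """Summarise initial type, reformulations and two-part searches."""
    initial = next(q for q in queries if q.role == "initial")
    reformulations = [q for q in queries if q.role == "reformulation"]
    two_part = [q for q in queries if q.role == "two_part"]
    return {
        "initial_type": initial.query_type,
        "n_reformulations": len(reformulations),
        "reformulation_types": [q.query_type for q in reformulations],
        "n_two_part": len(two_part),
    }


fy6_2 = [
    CodedQuery("possums", "KW", "initial"),
    CodedQuery("are possums pests in new zealand", "NLQ", "reformulation"),
    CodedQuery("where do possums live", "NLQ", "two_part"),
]
print(summarise_task(fy6_2))
```

Keeping the two-part role separate from the reformulation role prevents the second half of a split question from inflating the reformulation counts.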

Number of query reformulations

Table 11 shows the number of children in each composite class that reformulated a query in at least one of the five tasks. We found that both primary school children and intermediate children showed similar patterns of query reformulation use. Only five of fifty students (10%) conducted all five of the search tasks without reformulating a query. All other students (90%) rephrased at least one of their queries in order to complete the five tasks.

Table 11 #Children reformulating searches per task

Because each child was required to create at least one query per task, there were 170 initial queries made by Year 5&6 children and 80 initial queries created by Year 7&8 children. Year 5&6 students reformulated 75 of their 170 (44.12%) initial queries (see data column 1 in Table 11), while Year 7&8 students reformulated a total of 37 of their 80 (46.25%) initial queries (see data column 2 in Table 11). Therefore, a total of 112/250 (44.8%) queries were reformulated during this study.
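The reformulation rates follow directly from the reported counts, as the short check below shows. The dictionary layout and the function name are illustrative choices only.

```python
# (reformulated initial queries, all initial queries) per composite class.
COUNTS = {
    "Year 5&6": (75, 170),
    "Year 7&8": (37, 80),
}


def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


for group, (reformulated, initial) in COUNTS.items():
    print(f"{group}: {percentage(reformulated, initial):.2f}%")

total_reformulated = sum(r for r, _ in COUNTS.values())
total_initial = sum(i for _, i in COUNTS.values())
print(f"Overall: {total_reformulated}/{total_initial} "
      f"({percentage(total_reformulated, total_initial):.1f}%)")
```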

Types of initial query that required reformulations

Table 12 (left block) shows the number of children who reformulated searches after making an initial keyword query. Table 12 (middle block) shows reformulations for initial NLQ queries and Table 12 (right block) for initial NLS query patterns. It can be seen that reformulations were used most often when the initial query was a KW. 66 of the 112 (58.93%) reformulations were made after an initial KW query, 24 (21.43%) of the reformulations were made after an initial NLQ query, and 22 (19.64%) reformulations were made after an initial NLS query.

We observed that children mostly retained the search formulation pattern while reformulating queries. That is, when a child created an initial KW query, their reformulation was most likely to be another KW query, and so for NLQ or NLS initial query constructions, respectively.

Table 12 First query reformulations
Table 13 Number of query reformulations per task

A minority of reformulations used query patterns other than those used in the initial query. Of the 66 initial KW queries that were reformulated, 20/66 (33.3%) were reformulated into another pattern (either NLQ or NLS queries). Of the 24 initial NLQs that were reformulated, only five (20.83%) resulted in queries other than a NLQ. Finally, of the 22 initial NLS queries that were reformulated, 10 (45.45%) resulted in other query types. The majority of those reformulations (7/10, 70%) used a NLQ rather than a keyword query construction (3/10, 30%).

Overall, children who changed query types did so on their first reformulation and changed types only once. We observed that keyword queries were used most often initially and also dominated the queries that needed reformulation. Natural language queries were used more in reformulated queries than in initial queries (i.e. children added question words to unsuccessful keyword queries). The queries for Task 3 (Verbal Instruction) received the most changes (68), more than twice as many as for any other task. The reason may have been that this task also received the most initial keyword queries (27/50). Both Tasks 3 and 5 (Verbal and Written Instructions) resulted in very few initial queries using natural language questions (0/50 and 3/50, respectively).

Multiple reformulations

Having discussed changes in query type, we now discuss changes in query content. Table 13 compares the number of queries that were reformulated per task (see data column 1) to the number of reformulations per task (see data column 2) and the average number of reformulations per task (see data column 3). We note that the verbal question task underwent the fewest reformulations. A reason may be that children often initially created an NLQ for question tasks, and generally NLQ queries were the least reformulated query type (see Table 12).

We observed that 53 of the 112 (47.32%) query reformulations were not changed further after the initial change, while the remaining 59/112 (52.68%) queries were changed two, three or more times. For example, when attempting to answer the first part of the Verbal Question, MY5_3 created a KW (“mount cook”) query and then reformulated this into a NLQ (“where did mount cook get its name from”) and again reformulated this as a second NLQ (“how did mount cook get its name”), then resolved his query to the NLS (“facts on mount cook”) and finally the NLS (“fun facts on mount cook”).

Query reformulation through query elaboration

Children were also observed to reformulate searches with query elaboration techniques they had been taught, such as query qualifiers or query refiners (see Sect. 4.2.2). The initial query constructions that were most typically reformulated using query qualifiers or query refiners were keyword queries. Children were observed to do this by adding a query qualifier (e.g. FY8_4 reformulated “mt cook” to “mt cook facts”) or by converting that search to a natural language query (e.g. FY6_4, who reformulated “mount cook” to “how was mount cook formed”).

4.2.4 Final/successful queries

Table 14 shows the types of the final, or “landing”, queries (after possible reformulations). These are the queries for which the children decided that they had completed the search tasks. We refer to these final queries as successful queries from here.

Table 14 Type of final query selected by children

We observed that Tasks 2 and 4 (verbal and written question) were predominantly solved by the use of natural language questions as the final query. Verbal and written instruction tasks were likely to be completed with a KW or NLS query. This indicates that the construction of a search task or inquiry by a teacher when set as a question or a natural language task may result in different query construction techniques and Internet or information search practices by the student.

Table 15 compares the query types of both the initial queries and the final or successful queries. When comparing the results for the final queries with the initial query types, we observe the predominance of initial KW queries versus the successful number of natural language queries, particularly NLQ.

Table 15 Initial queries compared to successful queries

Table 15 differs with regard to previous colour formatting. Here we created the colour formatting separately for each row as opposed to each column (smaller numbers are blue, progressing through grey and yellow to larger numbers that are green); the colours imply no value judgement. Thus, the colour formatting shows that NLS were the least used initial query and least used successful query.

4.2.5 Query formulation using search engine features

We investigated which features of the search engine the children use, including system interventions such as query suggestions and expansions, related searches and spelling assistance.

Query suggestions (query expansions)

46 of 50 (92%) children used query suggestions at some point during the observation studies; only four of fifty (8%) children did not use a query suggestion during any of their observed searches. These four children were all Y5&6 students. The 30 Y5&6 children who used query suggestions used them slightly more often (142 times by 30 children, average 4.7) than the 16 Y7&8 children (72 times by 16 children, average 4.5). Year 5 children used query suggestions 65 times (average 4.33, min 0, max 10), Year 6 children 77 times (average 4.05, min 0, max 9), Year 7 children 40 times (average 5, min 1, max 10) and Year 8 children 32 times (average 4, min 2, max 6). We broke this down further into the separate tasks (see Table 16). For the Written Question, all age groups used query suggestions most often. This may have been due to the use of the proper noun Uranus and the children being unfamiliar with its spelling.

Table 16 #Children using query suggestions

Table 16 and tables that follow are again colour formatted by column (smaller numbers are blue, progressing through grey and yellow to larger numbers that are green); the colours imply no value judgement.

We further analysed the query suggestions the children used to identify if these functioned as extensions of an existing query, or if an automatic query suggestion led to an alternative query to what the child was already typing. For example, MY8_4 entered “how long till thecricket [sic] world cup 2015” and then chose “where will the cricket world cup 2015 be held” from the list of query suggestions. This is clearly a different question to the one originally typed by the child. This case was coded as an alternative query, while the option “how long till the cricket world cup 2015” would have been coded as a query extension. We therefore coded a suggestion as a query extension when a child selected a suggestion that offered a correction to their spelling, or when the suggestion completed the text the child was attempting to enter. Query alternatives were coded when a child began entering a query and then clicked on an improved query offered by the search engine.
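The coding described above was carried out manually by the researchers. Purely as an illustration, the distinction can be approximated with a simple prefix check in Python. The function names `normalise` and `code_suggestion` and the normalisation step are assumptions made for this sketch and are not part of the study's method. The check covers completions and missing-space slips only; other spelling corrections, which the study also coded as extensions, would need some form of fuzzy matching.

```python
def normalise(text):
    """Lower-case and drop all whitespace so 'thecricket' matches 'the cricket'."""
    return "".join(text.lower().split())


def code_suggestion(typed, selected):
    """Label a selected suggestion as 'extension' or 'alternative'.

    Hypothetical heuristic (an assumption of this sketch): a suggestion that
    continues the typed text counts as an extension; anything else counts
    as an alternative.
    """
    if normalise(selected).startswith(normalise(typed)):
        return "extension"
    return "alternative"


# The MY8_4 example from the text.
typed = "how long till thecricket world cup 2015"
label_a = code_suggestion(typed, "where will the cricket world cup 2015 be held")
label_b = code_suggestion(typed, "how long till the cricket world cup 2015")
```

For the MY8_4 example, the first call labels the chosen suggestion as an alternative and the second labels the unchosen option as an extension, in line with the coding described above.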

We found that the majority (188/214, 87.85%) of used query suggestions observed were extensions as opposed to query alternatives. Of the 142 query suggestions used by Y5&6 students, 121 (85.21%) were query extensions. Of the 72 query suggestions used by Y7&8 students, 67 (93.06%) were query extensions and five (6.94%) were query alternatives. We observed that 18/34 (52.94%) Y5&6 children used alternative queries, compared with only 4/16 (25%) Y7&8 children. The more frequent use of query alternatives by younger children may be due to lower confidence in query formulation.
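The shares of extensions and alternatives follow directly from the reported counts. The following Python sketch recomputes them to two decimals; the dictionary layout and the helper name `extension_share` are illustrative assumptions, not part of the study's analysis tooling.

```python
# Counts of used query suggestions as reported in the text.
# The data layout and names are assumptions made for this sketch.
suggestions = {
    "Y5&6": {"total": 142, "extensions": 121},
    "Y7&8": {"total": 72, "extensions": 67},
}


def extension_share(total, extensions):
    """Return (extension %, alternative %), each rounded to two decimals."""
    ext = round(100 * extensions / total, 2)
    alt = round(100 * (total - extensions) / total, 2)
    return ext, alt


# Per-group shares, then the overall share across both groups.
shares = {group: extension_share(**counts) for group, counts in suggestions.items()}
overall_total = sum(c["total"] for c in suggestions.values())
overall_extensions = sum(c["extensions"] for c in suggestions.values())
overall = extension_share(overall_total, overall_extensions)
```

Here `overall` is computed from the summed counts (188 extensions out of 214 suggestions), not by averaging the two group percentages, which would weight the smaller Y7&8 group too heavily.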

Spelling (“did you mean?”)

We counted how many children clicked a spelling correction as presented in the Google SERP (i.e. the suggested query shown at the top of the search engine result page). Only eight children used the provided spelling correction, which used to be prefaced by “did you mean” at the time of the study (now changed to “search instead for”). One Year 5, two Year 6, one Year 7, and four Year 8 children used the spelling correction. Spelling correction in Google was used a total of 11 times by these 8/50 (16%) children during the total number of studies completed. One Year 8 female was noted to use the spelling correction twice and one Year 8 male was noted to use the spelling correction three times. All of the remaining six children used the spelling correction only once.

Related searches

We counted how many children used the Related Searches feature in the SERP of Google. No students at Y5&6 level and only three students at Y7&8 level were observed to use this feature.

4.2.6 Search task fulfilment

Finally, a number of observations were made that are linked to task fulfilment but are outside of query formulation and reformulation such as triaging and link selection.

In-page triage & reading

When selecting webpages from the SERP list, children would often spend very little time comprehending or reading the content of the page. Many were observed to click the back button or delete key to return to the SERP very quickly. Alternatively, a cursory and very rapid scroll to the bottom of the page and to the top of the page again, or part thereof, was made before a rapid use of the back button. When questioned about these interactions, children claimed to not know why they had made the decisions to retreat. When a reason was given, it was often claimed that the site did not “look professional” or “safe”, or it was a known website that the child did not trust.

At times, after poor scanning of a web page, children were noted to leave a page declaring information was not present (even though a more careful reading would have revealed the sought answer). This seems to indicate that the children’s scanning, reading, and comprehension ability is understandably still developing, while they are under the impression of already being fluent.

Search result exploration

Table 17 shows the average number of links visited per task. We found that older children visited significantly more websites from the search result lists to complete each task. Based on these numbers, we can assume the younger children were not performing triangulation during their search result exploration. Because we were not using a think aloud protocol, we do not infer that the older children visiting more websites is a sign of triangulation by these children.

Table 17 Avg. #visited links per task

First link

We examined how many children visited the top result in the search results list as the first link they clicked. Of the 50 participants, 20 (40%) clicked the top result for the Open Task, 26 (52%) for the Verbal Question, 15 (30%) for the Verbal Instruction, 32 (64%) for the Written Question, and 23 (46%) for the Written Instruction. While a number of students appeared to read the content of the sidebar or pull box, very few students clicked the link associated with the sidebar or pull box as their first result visit (six in total for all tasks clicked the sidebar link first, and seven in total clicked the pull box link first).

We also examined how many children visited Wikipedia as the first result. 13/50 (26%) clicked Wikipedia first for the Open Task, 5 (10%) for the Verbal Question, 18 (36%) for the Verbal Instruction, 6 (12%) for the Written Question, and 8 (16%) for the Written Instruction. In total, a Wikipedia link was visited 99 times by children during their search tasks. Only 11 (22%) students did not visit a Wikipedia link during any of their tasks.

Across all five tasks, 9/50 (18%) children in total visited an advertisement or sponsored link as the first result they clicked for one of the set tasks. Use of sponsored links is discussed further below.

Repeat result visits

We analysed the number of times students visited the same result for the same query. We counted only the times before a search query was adjusted, and only for the first query created for each task. A total of 14/50 (28%) children visited the same result twice during their first query creation. These were not just the younger children, but 4/15 from Year 5, 4/19 from Year 6, 2/8 from Year 7, and 4/8 in Year 8. Children also visited links from a pull-box or sidebar and then revisited the same website via the search results list. There was not an opportunity for the researcher to question the students about this second visit to a result. Video analysis revealed that at times the selection of the result is followed by quick use of the back button, or sometimes a verbal statement by the student acknowledging that they have visited this result previously or that they did not mean to visit this result.

Sponsored links

Using a live search engine in the natural school environment meant that we could analyse selection of sponsored links during results exploration. This type of search results list link use was analysed due to the potential for confusion for young readers. A total of 9/50 (18%) children selected an advertisement or sponsored link for the first result that they visited during one of their set tasks. All of these children were the younger primary school children (4/15 Year 5 and 5/19 Year 6 children).

In total, 20/50 (40%) children visited advertisement or sponsored links during their studies. The children visited these links expecting that they would contain useful information-based content related to their search needs (often for the task about Mt Cook); however, they typically found that these links were advertising for accommodation or tourism. 6/15 Year 5 children visited advertisements (8 visits total), 5/19 Year 6 children visited advertisements (7 visits total) and 3/8 Year 8 children visited advertisements (5 visits total). Given sponsored link visits by children in both the earliest and the latest year studied, it appears from these numbers that advertisements and sponsored links can indeed be confusing for children at all levels of our study.

Time on task

Using the video footage, we coded the length of time a search took from the time that the child began to type to the point at which the child indicated or declared that they were satisfied that they had completed the task. We rounded the results to the nearest quarter of a minute due to the variability present amongst participants and in this recording method. The Y7&8 children took on average less time than Y5&6 children to conduct the Verbal Question (3 vs 3.5 minutes) and Verbal Instruction tasks (2.75 vs 3.50 minutes) as well as the Written Question task (2.25 vs 2.75 minutes). Only for the Open Task (4 vs 3.5 minutes) and the Written Instruction (3.25 vs 2.75 minutes) were the Y7&8 children observed to take longer than the Y5&6 children. For single year levels, the Year 8 children took the longest to complete both the Open Task (4.25 minutes) and the Written Instruction (4 minutes). We noted that during this additional time exploring the Open Task, the Y7&8 children delved deeper into the subject they were investigating compared to the Y5&6 children. Often the older children identified a personal information need during their searches. These personal information needs were identifiable by the researcher when the children formulated or reformulated queries with specific lines of inquiry. For example, MY7_3 used the specific query “when was soccer invented” for the Open Task, identifying soccer as his favourite sport. A second example was FY8_4, who began her inquiry with piano music as her favourite musician for this task. To investigate this task she started with the query “piano songs sheet music” and after some exploration of the websites returned by this query, she chose to reformulate her query to “Beethoven piano” and then “Beethoven facts”. For the remaining three tasks, either the Year 5 or Year 6 children had the longest search times.

Across the year levels, verbal tasks tended to take longer to complete than the written tasks for both questions and instructions. Question tasks also appeared to take longer than instruction tasks for students to complete. Perhaps this is because question tasks required a known answer to a known question while children could determine when they had completed finding “enough” facts or information about a given topic with the instruction tasks.

When we compare the length of time per query to conduct a task, Year 7&8 children made more queries in total when conducting the Open Task, and in turn spent longer on this task than the younger children. It is interesting to note that the average number of minutes to complete a task was similar for younger and older children, i.e. the older children were able to visit more websites in a shorter amount of time than the younger children.
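The quarter-minute rounding applied to the time-on-task coding above can be sketched as follows. This is a minimal illustration in Python; the function name `round_to_quarter_minute` and the sample durations are hypothetical and do not come from the study data.

```python
def round_to_quarter_minute(seconds):
    """Convert a duration in seconds to minutes, rounded to the nearest 0.25.

    Note: Python's built-in round() resolves exact ties to the nearest even
    integer, so a duration exactly halfway between two quarter-minute marks
    may round down rather than up.
    """
    minutes = seconds / 60
    return round(minutes * 4) / 4


# Hypothetical durations in seconds, for illustration only.
example_a = round_to_quarter_minute(200)  # 3 min 20 s
example_b = round_to_quarter_minute(170)  # 2 min 50 s
example_c = round_to_quarter_minute(135)  # 2 min 15 s
```

Rounding at this granularity means that two searches differing by a few seconds may be reported with the same duration, which suits a recording method whose start and end points are judged from video.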

4.3 Exit interview results

From the interviews, we have yet another layer of insightful data for analysis.

Sense of ease when searching

Question 2 and Question 4 (see Table 1) asked What was easy when searching today? and What can be easy when searching on a computer? Children oozed confidence when answering Question 2, predominantly talking about broad internet use, including websites as well as seeking and using information on the Internet. For example, FY7_3 stated, “it was pretty easy just to click on Google and search.” Children described searching the Internet as “quick” and “easy” which seems to be emotionally significant for those interviewed. Children also described numerous websites, such as Wikipedia, YouTube, BBC-Kids as being easy or helpful. Children described the wealth of information on the Internet and the speed and ease of the Internet as a resource. For example, FY7_2 noted, “when there are lots of websites, so if it doesn’t work, you can look at another one.” FY8_3 described appreciating this wealth of information; “when there were lots of websites that had the information, so I could confirm it.”

Children did not discuss in great detail information seeking strategies or search habits when answering these two questions. Of the few search and search engine related points of discussion, children described search engine features such as pull-boxes, sidebars, and SERP entry descriptions. When discussing what was easy, six of 50 children discussed pull-boxes and two discussed sidebars as making their tasks easier during their observation sessions. For example, FY5_5 stated, “when I wrote how high is Mt Cook, and [the answer in the pull-box] just came up there [pointing at the screen].” FY5_3 said it was easy “When [the Google auto-complete] corrects you.” Additionally, children noted that Google seems just to work. For example, FY5_3 stated, “When I wrote ’for kids’ and it came up straight away with what I wanted.”

Additionally for Question 2, it was not uncommon for children to interpret this as asking about a particular task that they had completed during their session with the researcher. Tasks, where the children had some prior knowledge of the topic, were often reported as easier such as Mt Cook (5/50). This is highlighted by MY7_3 who stated: “it was easier to find facts when I already knew some stuff about it, like Mt Cook, which I already knew was the highest mountain in New Zealand.” One student also reported the Uranus task as being easy.

Sense of difficulty when searching

Question 1 and Question 3 asked What was hard when searching today? and What can be hard when searching on a computer? When describing difficulties that they experienced, children noted spelling (3/50), typing and scanning or reading (3/50) to be difficult at times. While three students noted spelling as a difficulty, two further students noted spelling when using a computer as easy. Auto-correction features were the reason children gave for finding spelling easy, while manual entry of a search query was the reason given for difficulty with spelling.

Reading and triaging of information was identified by the children as an issue both within the SERP and within the resulting web pages. For example, FY5_7 described that she would “scan through the text and find the info. Sometimes websites don’t have headings, headings which make it easier to find information.” MY6_8 described disliking “when too many websites don’t have what you want, and you keep wasting time reading lots.” Children also noted that finding information within a web page was difficult. Reading, skimming, scanning, and triaging in-page is complex for children. This difficulty was a feature of the discussions for all of the age groups included in this study. For example, MY8_1 stated: “finding exactly what I wanted to find. You had to go through, scroll through.” MY8_1 also remarked that “sometimes people don’t put the right information on their websites.”

Query creation is a difficult task for children. 13 of 50 children described difficulty constructing queries. For example, MY5_5 found it difficult to know “what to write into the search box.” FY5_5 noted “when you can’t find the answer to your question. When you first write the fact, and it does not come up. I have to write the fact differently.” MY6_7 simply replied with “choosing the right thing to type in.” And unfortunately, Google, still, doesn’t read minds as explained by MY6_5: “To get the particular little bit of information that you want. It sometimes doesn’t give you what you want. It can’t read your mind. You have to search the right thing and tell it what you want.”

Children also noted difficulties with identifying the correct website to visit from the list (16/50). Alongside these observations, children also discussed finding answers or information for their problems on the resulting websites to be difficult. For example, FY6_11 stated, “it’s really hard to find the information sometimes.” MY6_8 discussed difficulty “finding the right site and knowing what to click.” FY5_5 noted it can be a problem when “you make the search too long, and it comes up with different things than you want to know.” MY5_6 said there was “heaps to choose from. Sometimes you have to change your question and make it more specific.” When the researcher asked him to elaborate, MY5_6 could not describe how to make it more specific. 13 out of 50 children discussed difficulty changing or altering search terms when a search term wasn’t producing results.

Additionally when asking what was easy or hard when searching today it was not uncommon for children to interpret this as asking about a particular task that they had completed during their session with the researcher. Typically the task about the Cricket World Cup (5/50) or Uranus (4/50) was identified, while only one student reported the possums or Mt Cook tasks as difficult. We hypothesise that this difficulty with certain tasks is due to a lack of prior knowledge or lack of interest in these topics by the participant.

Only two children of the fifty that we interviewed explicitly stated nothing was difficult for them when searching or using a computer.

Decision making when creating a search

Question 5 asked How did you decide what to type into the search box today?

18/50 children described using keywords when answering this question. For example, FY7_3 stated, “The keywords are what you are supposed to search. Not the whole, you know, you wouldn’t really want to put all of that in, so just the main points.” She continued, “You probably wouldn’t want to put all of it in because you would get less results if it is longer.” FY7_1 stated, “I used to type in the whole question, and Mum and Dad helped me and told me just to use keywords.”

20/50 children described using questions to create a search. For example, FY8_4 said “I just typed in the question. Sometimes, I won’t type in the whole question I will just type in something I want an answer to.” She continued, “the simpler the question is, the simpler the answer is. If it is too complex, [the search engine] doesn’t process it well.” FY7_1 noted that “if keywords do not work then I use the whole question.” 5/50 children described typing simply what was asked of them by a teacher, or in this instance the researcher. For example, MY8_1 stated, “I searched what you told me” and MY8_2 said, “I used your words.”

Additionally, children discussed sentences and attempting to put the search into their own words. MY6_5 said, “I tried writing things in other words. What would be the best sentence to write.”

Seven out of 50 children could not, or chose not to answer this question.

Search behaviour and search engine use

Questions 6 and 7 asked When a teacher sets an inquiry, what is your process? and Is there anything else about searching for and using information you would like to tell me? The answers to these questions were typically scant with little detail or deep discussion by the children. We presume this is likely due to the open nature of the question. These two questions did not provide new insights that differed from the results published in [7] and therefore are not reported here.

Questions 8 and 9 asked Did you see example search queries while you were typing today? and Do you use these example queries? When? We showed each child a printed visual example of a query suggestion (i.e. query suggestions appearing as drop down suggestions from the search box) and asked if they had seen one of these during their tasks. All but one child claimed that they saw a query suggestion during the task session and that they do indeed use query suggestions when conducting Internet searches.

Questions 10 and 11 asked Have you ever seen a list like this one before? and Do you use these related queries? When? We showed children a printed visual example of a related searches suggestion and asked children if they had seen one of these during their tasks. Fifteen of the 16 Y7&8 students reported having seen such a suggestion during this session, while only fourteen of the 34 Y5&6 students said they had seen one. We asked if they use these related searches suggestions during Internet searching. No Y5&6 student claimed to use Related Searches during their normal search habits. Of the three students who were observed to use related search during the study, only one of these students reported using this feature during their typical search activities; the remaining two students described using this feature only occasionally. Four more Y7&8 students (and only two Y5&6 students) reported sometimes using related searches.

Again for Question 12 the children were shown a visual and asked The image above is what a search result list might look like, what coloured text did you use/read today when you were making decisions about which website to visit? This question was intended to give insight into children’s search result exploration habits. The majority of the children (46/50) reported reading the blue text, or title of the search entry, first. Two Year 6 children reported reading the grey text, the short description of the search entry, first. One Year 5 and one Year 8 child reported reading the green text for the URL first. 35 children also reported reading the grey description text once they had read the title of the search result entry. One of the Year 6 children who had first read the grey text stated that she next read the green URL. Only two children reported reading a third part of the search result entry: one Year 6 child reported reading the green URL last, while FY6_2 reported reading the blue title text as the third part she would consider when making a decision.

Typical explanations regarding why children read the text they do and how they make their decision to visit a website often revolved around if a child was able to identify their topic or keywords within the title. For example, MY8_3 stated “I usually look at the blue, but never the green. I see if my information could be in there (points to the grey text).” Similarly, FY6_10 said “I read the blue text and I also read the grey text to see if it mentions anything.” Children were also able to describe the importance and usefulness of the various parts of the search result entry for them. For example, MY6_6 said “I use the blue text to see what website it goes to and the grey text is just a little bit of a brief to tell you about what is in the website and what information is given” and MY5_1 stated that he reads “the blue first, and if they’re all the same ones [if all of the blue text is the same], I read the grey”, he went on to state “because the green is just the website.”

5 Discussion

The study described in this article scrutinised the query-construction phase as well as the search-term-adjustment phase of children’s Internet search. After exploring study limitations, we discuss our findings with regard to our four research questions posed in Sect. 1 and in comparison with related work.

5.1 Limitations

Most studies identified in the literature working directly with children (i.e. face to face rather than log studies) had a relatively small data pool. While our participant selection was limited to only three schools, our study is one of the larger ones with regard to the numbers of participants, search queries, and directly observed interactions. Due to the relatively large cohort of participants, we believe that our findings contribute to the investigation of children’s query formulation and reformulation behaviours.

In this study, we worked with a single Internet search engine. Google was selected as it was the search engine familiar to all children in our previous work [7]. We do not believe that the results pertaining to query formulation or SERP list triage would be drastically different depending on the search engine used. We did not pre-screen participants for parameters such as reading level, prior web searching skills, or topic knowledge. This was mainly to limit the burden for the children and the length of time required of them. Alternatively, requesting access to school records to obtain such data would have required extended ethical consideration by parents and schools. We acknowledge that there was the potential for some distraction to the participants during our study from other learning tasks being completed by students in the class. For this reason, we placed little emphasis on the time to complete a task in the analysis of our results.

5.2 Addressing RQ1: Query types used by children

We note a high use of natural language queries by children in all grades studied (see results in Sect. 4.2). In our observation, children across the age levels studied tended to create more natural language queries than keyword queries. The majority of final queries (i.e. successful query after possible reformulation) used by the children were found to be question- or sentence-based queries. We thus observed a preference for natural language queries compared to the lesser-used keyword queries. As a consequence, final queries were less often keyword queries when compared to the number of originally posted queries. This means that more children fulfilled their information need through natural language queries. Duarte-Torres et al. [59] similarly found in their query log study that queries relating to child-appropriate topics were longer, used natural language constructs more frequently, and more often contained questions. Our study results reinforce their findings and confirm the use of natural language queries by children (which Duarte-Torres et al. [59] had only inferred were queries performed by children based on the query topics available in their anonymous logs).

We found that keyword queries required more reformulations than natural language queries to be successful (see Sect. 4.2.3). Similarly, early related work reported that children experienced difficulties when constructing keyword queries [30, 74]. Often these difficulties have been linked to children’s lack of vocabulary or lack of cognitive structure [30, 32]. However, Vanderschantz et al. [7] found that children believe that keyword searches are more appropriate than natural language queries (i.e. favoured by their teachers), which may explain why they often start enquiries with keyword queries. In 2015, White et al. [57] actively discouraged natural language queries in general, even though their log study showed similar result quality for both keyword and natural language queries. This observed tension between teachers’ expectations and the children’s preferences is in line with Molin-Juustila et al.’s [22] observation that children participating in ICT studies typically bring voices of ‘others’ beyond their own.

Existing studies also indicated that children were not able to easily identify promising query-reformulation strategies [7, 36]. Our work adds further detail to this finding by observing that children often reformulated queries by adding a query qualifier (such as “for children”) or by converting that search to a natural language query. The first approach indicates a poor understanding of search queries, as the children did not recognise that their original query was lacking specificity (which cannot be achieved through adding simple query qualifiers). The second approach indicates an abandoning of the original query in favour of a natural language query (i.e. replacement instead of reformulation).

About half of all queries by children seem to undergo reformulation: Bilal and Gwizdka [4] found that 52% of children’s queries used query reformulations while in our study 45% of final queries were achieved through (on average 2.1) reformulations. Similar to Bilal and Gwizdka, our detailed analysis of question types went beyond the single classification of natural language query constructs. We found that children used natural language questions (102) more often than sentence constructions (72), which confirms the observations by Bilal and Gwizdka [4] (56 questions vs 50 phrases). Different to Bilal and Gwizdka, we further analysed changes in query type during query reformulation. We found that many initial keyword queries were rephrased into natural language queries, leading to an overall dominance of successful natural language queries.

Finally, we observed differences in the number of queries created at the different age levels. We also noted that while older children visited more links to answer a query, the average number of minutes to complete a task was similar for younger and older children. This means the older children were able to visit more websites in a shorter amount of time than the younger children. This may support the arguments of both Gossen et al. [75] and Foss and Druin [17], who similarly observe differences between younger and older children and point to the need for children’s search-interfaces to change according to the children’s development of cognitive and fine motor abilities.

5.3 Addressing RQ2: task formulation

We set up our study in such a way that it allows for insights into the effects of task wording. We found that tasks posed as questions more often resulted in natural language questions (72% for verbal and 84% for written questions) while tasks posed as instructions more often led to queries phrased as natural language sentences (50% for verbal and 42% for written instructions). This observation should be useful when considering the way in which a learning task is being posed to children.

5.4 Addressing RQ3: use of support features

The use of the full Google Search Engine (vs a simplified test interface) allowed us to analyse the use of supporting features such as spelling correction, query suggestions, “did you mean” and related-search features. This type of query support has been sparsely reported in the literature for children’s search strategies. Weber and Jaimes [36] found in their query log analysis that children and young people used query suggestions more than adults. While their aggregation of data from 5 to 23 year olds does not allow detailed analysis for children alone, our observations of children aged 9 to 13 support their finding: 92% of observed children used query suggestions and completion support (see Section 4.2.5).

Both Anuyah et al. [46] and Fails et al. [66] explored children’s preferences in query suggestions. In contrast to their study results, we observed a very low use of the related search feature in general (3 of 50 students). When the query suggestion feature was used at all, children used it for correcting their spelling or choosing the completion of their query (as compared to an alternative query creation). Similarly, Druin et al. [31] found little use of related searches on Google; however, they also observed low use of spelling suggestions, which differs from our findings. As argued by Druin et al. [31], a reason may be that children did not use live spelling and query suggestions because they were looking at the keyboard when typing. By contrast, we observed that most students in our study (46 of 50) took advantage of the search box query suggestions (see Section 4.2.5). We hypothesise instead that the reason for the low use of the related-search feature is that Google places this service at the bottom of the results page, which requires scrolling.

5.5 Addressing RQ4: search result exploration

Unlike other studies [4, 44], our setup was in situ, i.e. in the school environment on computer systems the children were familiar with, using the full Google search engine to allow naturalistic observations. This setup allowed us to make observations about children’s interaction with sponsored links and advertisements. Twenty of the 50 students were observed to follow sponsored links / advertisements as if they were search engine results (see Section 4.2.6). These observations confirmed the prediction by Duarte-Torres and Webber [33] that children may face potential issues when encountering ads that appear to be query results.

Both Duarte-Torres and Webber [33] and Vanderschantz et al. [7] reported that children were likely to select higher-ranked results, if not the top-most result in a search list. Vanderschantz et al. [7] also reported that children were taught to scan the result list for Wikipedia references. Children in our study were indeed observed to select the top result, be it a search result or an advertising link, for 116 of 250 queries posted, and numerous children (39 of 50) visited Wikipedia (see Section 4.2.6). This observation is further confirmed in our exit interviews (see Sect. 4.3), where children reported difficulties identifying the correct website to visit given the information presented in the SERP list.

Additionally, we observed 14 of 50 children mistakenly revisiting links from the result list that they had already visited through the sidebar or pull-box (see Section 4.2.6). This observation gives new insights that are not covered in other work. This aspect of the result triage process requires further research.
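For readers comparing these observations, the counts reported in Sects. 5.4 and 5.5 can be restated as percentages. The following is a minimal sketch only: the counts are copied from the text above, while the dictionary labels and the helper name `percentage` are our own illustrative choices and not part of the study instruments.

```python
# Counts as reported in the discussion above, given as (observed, total).
# Labels are illustrative paraphrases, not terms defined by the study.
observations = {
    "used the related-search feature": (3, 50),
    "used search box query suggestions": (46, 50),
    "followed sponsored links or advertisements": (20, 50),
    "queries where the top result was selected": (116, 250),
    "visited Wikipedia": (39, 50),
    "revisited an already visited link": (14, 50),
}


def percentage(observed: int, total: int) -> float:
    """Return observed/total as a percentage, rounded to one decimal place."""
    return round(100 * observed / total, 1)


for label, (observed, total) in observations.items():
    print(f"{label}: {observed} of {total} = {percentage(observed, total)}%")
```

Note that the denominators differ: most counts are per student (50), whereas top-result selection is counted per query (250), so these percentages should not be compared directly.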

We observed physical features of children’s interactions during both query construction and search engine results page triage (see Sect. 4.2.6). We observed that all of the children needed to look away from the screen to interact with either the keyboard or the mouse. Children also used their mouse cursor or their finger to guide their eye when reading web pages or triaging SERP results. We hypothesise that this may be to assist with remembering the location of information that they want to refer back to after additional reading or scanning.

Vanderschantz et al. [7] reported that when a child was satisfied with the answer they planned to further confirm the answer using multiple sources. Our study found very little evidence of children performing such triangulation. Different to Bilal and Gwizdka [4], task completion, result quality, and search success were not assessed as a part of our study. We instead allowed the children’s judgement to stand as it would be in an educational setting.

5.6 Implications for query support

As an outcome of our study reported here, we identify five requirements for Internet search engine interfaces to better support children’s search behaviour.

As discussed in Sect. 5.2, our study found that children predominantly used natural language constructs, which are not explicitly supported in current Internet search engines. We further observed that the way in which learning tasks are phrased has implications for the children’s approach to searching (see Sect. 5.3). As many learning tasks are phrased as questions, this is a query construction type that may need explicit support. We found that children used existing features that support query construction, reconstruction, and search result list triage (see Sect. 5.4). We therefore recommend the development of interface features that explicitly support and encourage the construction of natural language queries, and that prioritise the visualisation of reformulations and related searches to assist with query reconstruction.

Design and presentation of information in a SERP results list may need to be considered differently for children, many of whom need to regularly look away from the screen during text input or triage (see Sect. 5.5). Supporting children to recall what websites they have already visited in a more visual manner may alleviate some of the repeat visits that we observed in our study. Providing SERP triaging tools may also assist with planning of website visits and further minimise confusion regarding websites already visited and websites yet to visit.

Lastly, information in sidebars and pull-boxes was used by children in our study, but the source of this information was not well understood. Investigation of interface design options that highlight the provenance of this information will assist children.

Our five requirements for better supporting children’s internet search are summarised below:

  1. Interface elements that assist with constructing natural language questions and sentences, based on findings from RQ1, RQ2 and RQ3

  2. Assistance for query re-formulation, based on discussion about RQ1

  3. Easier identification of related search, based on discussion about RQ3

  4. Interface elements that support triaging of visited websites, based on RQ4

  5. Clarification of sidebar pull-box sources, see RQ4

Early exploration of an interface to support these requirements can be found in [76].

6 Conclusion and implications

This article presents an investigation into children’s query formulation and search result exploration when using the Google search engine in a New Zealand school context. We carried out an in situ study with a cohort of 50 children aged 9 to 13. Our results strengthen and extend the insights of previous log and laboratory-based studies. Children’s query formulation and result exploration have previously been investigated predominantly in somewhat artificial settings (e.g. in laboratories or with simplified/dedicated interfaces) or as log studies. By contrast, our study was done in a naturalistic educational setting, where both task formulation and live search engine features may influence the children’s query formulation and result exploration. A further novelty of our work is the exploration of how query types change throughout children’s reformulations towards successful queries.

We found that the children copy the manner in which a search task is posed to them. We further observed a change from predominantly keyword-based searches towards natural language queries after reformulation. Reasons for the observed change may be found in the children starting out following their teacher’s advice to search using keyword queries [7, 72], gradually giving way to the children’s own inclinations to pose natural language queries. Observing children using a live search engine highlighted that children were misled by features such as advertising and sponsored links, while not benefiting from the support features (related search and query suggestion). A reason seems to be the physicality of children’s interactions with the computer, which requires them to look at their hands for coordination.

Our findings are directly relevant to teachers of inquiry-based learning as well as digital literacy. Firstly, we identify a need for educating both children and teachers in how to construct search queries best suited for modern search engine capabilities. This may include a need for education in the creation of suitable natural language queries. Secondly, the influence of task phrasing should be considered when developing learning tasks for children, and helping children to define information needs.

We believe that further studies are needed to explore these issues, preferably involving collaborations between researchers from the fields of information seeking, information science, human-computer interaction, and education.

The findings from our study are also relevant for search engine design. While previously a common solution to children’s information search issues was to develop child-friendly designs, interfaces, or software, we follow [28] in their suggestion to instead improve existing systems. We recommend developing system interventions that will equally support children and adults in creating and reformulating search queries. Search interfaces could therefore better serve users by making clear the improved ability of modern search engines to handle natural language, both as questions and sentences. Furthermore, any support for query formulation needs to be clearly visible before typing, while support for reformulation needs to be separate from the search box and must not interfere with text entry.

Finally, future research is needed into the design and presentation of information in a search results list to better support children who need to regularly look away from the screen during text input or triage. While acknowledging that in situ studies such as ours are management- and labour-intensive [45], we would like to encourage future research to embrace such methodology, as it is particularly important for children to work within their known context, resources and devices.