1 Introduction

Children naturally explore their world, are inquisitive, and seek knowledge. In a world where the real and the virtual merge, especially for a child who is learning and playing, the quality of the software application, both the simulation and the user interface (UI), can be the difference between joy and high-impact learning or frustration. In this paper, the term “quality” is defined as the closest and most accurate approximation to reality possible. The field of human–computer interaction (HCI) and computer-mediated learning needs to support children through usable, child-friendly technology. Content for formal and informal learning embedded in virtual reality applications, specifically educational simulations, is defined here as the integration of realistic, high-fidelity, real-time 3D computer graphics and e-learning technologies, intended to make a significant difference in the child’s task of discovery-based learning.

These educational simulations may be implemented with a class of technology commonly referred to as virtual reality (VR) applications, virtual worlds and environments (VE), massively multiplayer online games (MMOGs), and video games, because this class of technology can be used to design and support discovery-based learning activity found to be fun, enjoyable, and engaging. Educational simulations may be implemented in a wide range of technology, with designer controlled settings of the ranges in visual quality and navigational freedom. These settings can range from cartoon-like images and scaffold navigation to photo-realistic imagery and complete 360° navigational freedom. Most academic applications produced for research in the field of educational virtual reality have focused on a wide range of variables, making the isolation of causal factors nearly impossible. The entertainment/edutainment field, while commercially motivated, has used cartoons and low-quality graphics due to the lower production cost, and may unintentionally introduce misconceptions. Thus, where fidelity and freedom have not been explored as critical design features, the question remains:

Do these choices, visual fidelity and navigational freedom, in the design have an impact on user learning activity and on measurable learning outcomes?

It has never been more important to base the design of educational software on empirical research, as the form and function of the tool will influence learning outcomes. Most video games, which may be considered high-fidelity simulations, stimulate the visceral (Dill and Dill 1998) without yielding educational benefits. Given the large number of children playing such games, and the time spent on this activity, it is important to understand the impact this new media can have on the perceptions and cognition of children. We now live in an age where the designer/programmer of such systems can, through these powerful simulations, bias the perception, cognition, memory, and learning of children, significantly influencing their long-term, stable episodic memory and, ultimately, the decisions and actions of the next generation.

This work focused on the design of such a system with the goals and activities of intrinsic learners in mind. It follows the best methods and practices of user-centered design with children (Druin 1999) and ethnographic observations of real-world activity (Harrington 2009, 2010). This research makes a strong case for the ecological validity (Gibson 1979) of educational simulations, which requires real-world fidelity and real-world informational accuracy in both content and sensory representation. This case is in stark contrast to the one made by the edutainment community, where designer-contrived “indirect control” is intentionally used to influence the child’s decisions, behavior, and possibly their learning outcomes (Schell 2005). The goal of this research is to focus on the design factors that support motivation for true, not indirectly controlled, self-directed learning, and to select those design factors that maximize free inquiry and choice in knowledge-seeking behavior. Thus, the main thrust of this empirical investigation is into the UI design parameters that influence self-selected and self-directed knowledge-seeking behavior, in contrast to prior work in edutainment. Given that much of the edutainment research has been highly informal, lacking scientific rigor in empirical and statistical research design, system design, and methodology, it has been difficult to isolate, from the thousands of variables involved, the exact cause and effect in such work. A key contribution of this work is the clear and careful statistical design, supported by a system design that makes it possible to isolate the factors and to be confident in the results. Ceteris paribus means “to hold all else equal,” and that is exactly what this system and statistical design has done.

It began with a simple question, “How do software design parameters impact the quality of the child-computer interaction in support of the user task of intrinsic learning?” Argued here is that the two main parameters of all virtual environments used as educational simulations consist of the quality of the computer graphics, Visual Fidelity, and the freedom of choice in active and investigative knowledge-seeking behavior, Navigational Freedom, commonly referred to as discovery-based learning.

This paper reports a detailed empirical investigation into the design parameters of Visual Fidelity and Navigational Freedom as independent variables and their effect on the outcomes or the dependent variables of Salient Events, Fact Inquiry, Time in System, and Knowledge Gained. Thus, with this goal to isolate the impact of the independent variables on the dependent variables, the system, The Virtual Trillium Trail, was intentionally constructed to support a Two-way ANOVA as a planned orthogonal contrast (POC), and delivered the highest quality of results possible. The findings presented show that High Visual Fidelity has a significant and positive impact on Salient Events and Knowledge Gained. Furthermore, when combined with High Navigational Freedom, the difference between a pre-test and post-test, Knowledge Gained increased by 37.44%, without any external motivators.

2 Theoretical background

2.1 Overview of virtual reality research and education

In the past, educational goals have been supported with the use of negative and positive external motivators, such as points, grades, money, and social factors. Past research investigated different methods, such as framing the problem, employing subliminal user interface tactics, scaffolding, and using constructivism. The purpose of this work is not to debate the usefulness of such tactics as intentional design choices in educational software, but to examine the salient design factors inherent in virtual environments used as educational simulations for discovery-based learning, and to investigate the primary design factors of such systems. The first necessary scientific step is to isolate and define the design variables found in such applications and to establish, irrefutably and free of other confounding variables, their impact on outcomes.

The past results were fragmented and dispersed in their educational goals, directions, and results, with non-overlapping variables and parameters across projects. Some focused on the educational pedagogy of “constructivism” (Winn 1993), others on the role of social factors (Barab et al. 2007), and yet others on the effectiveness of “scaffolding” (Johnson et al. 1999; Roussou et al. 2006). As important as these individual findings are, combinations of results are impossible due to the complete lack of internal validity across projects, including different hardware, software, procedures, methods, and educational measurements. It is a patchwork of past results.

Furthermore, this is not a paper about immersion or presence, as both terms are widely used in the virtual reality community, but are without any universal, uniform, and quantifiable definition. Both terms are usually defined in a post-experience attitudinal survey that measures the subjects’ subjective ranking of that quality. Some papers use the term “immersive” to describe the hardware configuration of a head mounted VR display (Winn et al. 2002), and others use it to describe caves or surround theaters (Jacobson 2008). Research has shown that large displays, such as the CAVE™ (Cruz-Neira et al. 1993) or surround theaters increase the subjective feeling of immersion, further confusing the definition. The subjective rating of “presence” has been cited as a result of immersive VR (Nash et al. 2000; Pausch et al. 1997), thus it is expected that where subjective ratings of immersion are found, we should find subjective ratings of presence.

There is non-parametric evidence that such immersive displays positively impact learning (Jacobson 2008). Jacobson linked the technical implementation of a surround theater to a subjective survey with high ratings of immersion, and in turn to evidence of learning. In addition, higher subjective ratings of presence predicted post-test scores (standardized regression coefficient β = 0.44, t = 2.87, P < 0.05), thus linking immersive head-mounted VR equipment to an educationally powerful learning experience (Winn et al. 2002). However, presence has also been reported with desktop VR implementations (Dede et al. 2006; Winn et al. 2002), and while it is less powerful than in immersive implementations (Winn et al. 2002), the feeling of presence exists. Thus, if a desktop VR application showed learning outcomes, one can postulate that those findings would be magnified in an immersive head-mounted, cave, or surround theater implementation, and would most likely yield higher subjective ratings. The point is that presence and immersion are either a subjective emotional reaction or a description of the output display configuration, both of which are outside the scope of this paper.

In this research, a desktop implementation was selected to test the two software factors of Visual Fidelity and Navigational Freedom on learning, independent of user subjective ratings of either immersion or presence, and independent of the output configuration. Output device size and design are additional variables, which can be added in future research. It is also possible to add other dimensions to the output configuration, to increase feelings of immersion, or presence, such as surround sound, smell, tactile output, and gross motor and sensory simulation with a treadmill or bike.

Enjoyment is well documented as a subjective outcome of games, edutainment, and virtual reality, but not of simulations used for educational applications. However, when the designer has intentionally developed a VR system for enjoyment and education, positive subjective ratings of enjoyment abound (Allison et al. 1997; Barab et al. 2007; Bobick et al. 1999; Dede et al. 2005; Johnson et al. 1999; Roussos et al. 1999; Roussou et al. 2006; Schell and Shochet 2001). This is the research space where the Virtual Trillium Trail resides.

Enjoyment is a known response to applications, VR and not, that allow social collaboration. When social collaboration is combined with VR for education, it has been shown to be a powerful design tool used to encourage participation (Barab et al. 2007; Dede et al. 2005). Barab et al. (2007) and Dede et al. (2005) cannot isolate the causal factors in the system design on their learning outcomes, because it could have been due to the social, collaborative factors, not the quality of the virtual environment. However, this study focused on the individual child and their interactions with the system, alone, so as to better isolate the cause—system design factors—on the learning outcomes. The ideal is to design a system that is both effective in producing learning and is enjoyable. Social collaboration can be added as an additional factor in future research to measure the additional impact enjoyment can have on learning.

2.2 Visual fidelity

Today, the military (Morie et al. 2005) and medical fields (Scharver et al. 2004) make use of immersive, desktop, augmented, and mixed reality applications with haptic and even olfactory feedback capabilities for situation awareness training and procedural task transfer training in mission-critical, manufacturing, and health-care environments. However, there is evidence to support the use of desktop virtual environments for procedural knowledge transfer to real-world activities and for use in reducing learning curves (Aggarwal et al. 2007). Researchers already know the positive impact of fidelity used in medical training (Nehring and Lashley 2009), hence the suggested importance of visual fidelity, isolated from the other signals, as a factor in the design of educational simulations for children with desktop virtual reality.

Over the last decade, research showed evidence of knowledge gains in virtual environments (Mikropoulos et al. 2003; Salzman et al. 1996; Dede et al. 2005), where the more closely the simulation approximates visual reality, even on desktop VR, the higher the gains. For example, a biology simulation was reported to increase test scores by 50% (Mikropoulos et al. 2003), compared to a low-fidelity, older virtual environment implementation based on a fictitious scenario, “River City” (Dede et al. 2005). However, a decisive comparison is difficult due to the lack of similarity in research design across experiments. The noticeable difference in visual fidelity between these systems suggests, but cannot confirm, a tenable hypothesis. Thus, this research designed a system capable of setting that factor, Visual Fidelity, to high and low levels for a rigorous and powerful comparison.

2.3 Navigational freedom

While navigation as an isolated factor with respect to impact on intrinsic learning has not been explored, wayfinding and navigation as a response to frames of view have. Frames of view represent an important design dimension, as context will influence all perception, attention, decision making, action, and ultimately learning. Framing a problem or a situation will influence which signals are attended to and which are acted on, and is commonly used as a design tactic in all media. Frames of view in the Project Science Space (Salzman et al. 1996) showed the power of framing information for educational purposes in virtual environments. Additionally, the navigational properties of virtual environments showed a positive impact on wayfinding knowledge acquisition (Darken and Sibert 1996), promising powerful learning results when spatial information is linked to landmarks. This suggests the power of a design that allows wayfinding with landmarks linked to information for knowledge acquisition. Additionally, 3D model inclusion has a higher ease-of-use rating than flat 2D images (Yoon et al. 2008), indicating that an embedded 3DUI would be easier to use, and thus more desirable for educational simulations for children, than 2DUIs used as an overlay. To build on the theme of the child’s freedom of choice in actively seeking information and knowledge, full freedom to choose a frame of view and to move in any direction was allowed in the high-level Navigational Freedom condition. In contrast, restricting the child to the path reduced the child’s frame of view and freedom of direction in the low-level Navigational Freedom condition.

3 Research framework and methods

3.1 The design components of an educational simulation

The Virtual Trillium Trail (VTT) is a desktop virtual reality environment designed as an educational simulation, as it represents one square mile of biologically accurate scientific plot study data (Fig. 1) (Kalisz 1996–2006). It is a virtual environment based on statistical data visualization, not on fantasy. Unlike educational video games, or edutainment, it does not offer points, money, or a chance to win, nor does it use “indirect control” (Schell 2005) or other external motivators. It is a simulation of a real-world field trip to a local wildflower reserve, with fourth grade ecology educational content and informal learning activity embedded in the system (Harrington 2009; Harrington 2010). The content, from the Forest Community curriculum, was generously provided by the Western Pennsylvania Audubon Society (Beechwood Farms Nature Reserve 2005), and is aligned with the Pennsylvania public school educational standards on ecology. The educational simulation is a scientific visualization, as the terrain is constructed from DEM data, much like Google Earth, and the plant population distributions reflect 36 of the 102 plants documented in biological field studies (Kalisz 2006). VTT is built with a non-compiled, inexpensive, off-the-shelf video game technology (UnReal Technology 2008) and uses the standard input/output devices of mouse, keyboard, monitor, and speakers. The soft user interface elements are tool-tip-like “fact cards” and audio objects, “sprites.” These annotate the plants and locations with facts and concepts from the curriculum, providing access to all information required for the pre- and post-tests. This design choice, to annotate objects directly in the 3D model, was based on prior empirical research and testing of 3DUIs, which showed that UIs embedded within the virtual environment are superior in usability and effectiveness to a flat 2D overlay GUI (Bowman et al. 2003).

Fig. 1

The Virtual Trillium Trail and a fact card, 2007

3.2 Impact of graphics fidelity on learning

Visual Fidelity is an important factor for three main reasons. First, the defining criterion of VR and specifically educational simulations is ecological validity and the visual approximation to photo-realistic reality, and therefore Visual Fidelity is a critical feature. Second, many educational/edutainment applications use low-cost, cartoon-like quality images, or images that are based on fantasy, and thus the image is the artist’s interpretation of reality, possibly introducing misconceptions. Third, many hand-held devices, or collaborative networked platforms and older PCs with low-end processing power make experiencing virtual environments anywhere and anytime possible, but the graphics on these portable and networked devices are of lower quality than those found on high-end desktop PCs or gaming devices. The experiment contrasted low-level Visual Fidelity and the high-level Visual Fidelity, in the software. These factors may ultimately prove to be a design choice, based on exogenous factors, costs, and learning goals, but the intent here is to isolate and understand the two factors as inputs on the primary investigative outputs of discovery-based learning behavior, Salient Events, and differences in test scores, Knowledge Gained. Salient Events is defined as the individual’s choice to stop exploring and to start fact and concept inquiry, and Knowledge Gained as the difference between a pre-test score and a post-test score.

3.3 Scaled and non-arbitrary image values

Visual Fidelity expressed on a scale (Fig. 2) would rank and order images from a low value to a high value of visual fidelity. A simple black and white 2D image is an example of low fidelity. Alternatively, the real-world object viewed on location is an example of high fidelity, the highest standard for fidelity. For this research, it will be necessary to choose two points on the scale that represent low fidelity and high fidelity states in VR.

Fig. 2

The image represents the relative range, rank, and order or scale of Visual Fidelity, thus making it possible to select two variables, set to high and low values of one factor

3.4 High and low visual fidelity

The high-level Visual Fidelity (HF) condition was created by texture mapping the 3D wireframe models with photographs (Fig. 3). The low-level Visual Fidelity (LF) condition was created by texture mapping the same 3D wireframe models with cartoon-like images, while retaining the critical salient attributes of the leaf structure for identification purposes (Fig. 3). All other factors were held constant. Consider the following example of the LF state (Fig. 3) of a 3D computer model, textured with a color image of a Trillium, as one might expect in a cartoon-based edutainment system. The image resembles a cartoon image, as it lacks photo-realistic textural detail. It was created by using Photoshop to filter away the detail of the original photograph. While such systems may be described as artistic or beautiful, that is not the point here. The point is to control the contrast on a measurable scale. Now consider the following example of the HF condition (Fig. 3) of the same 3D computer model texture mapped with the photograph of a real Trillium and shown in the context of the Trillium on the virtual hillside.

Fig. 3

High-level (top set) and low-level (bottom set) Visual Fidelity conditions

3.5 Impact of navigation on learning

Research on navigation in virtual reality as an influence on learning is relatively new. Recent evidence shows positive results for scaffolding, in which navigation is restricted, intentionally instructor-controlled, and programmer-implemented (Roussos et al. 1999). Such systems are explicitly designed to guide a student through material with the student’s awareness and acceptance of that guidance. Such systems are ideal “crash courses” for quick access to important information in a body of knowledge. Additionally, there are situations where evidence supports low-level Navigational Freedom with an implicit rather than an explicit guide. Traditional intelligent tutoring educational applications use such navigational design. These applications guide students through procedural or algorithmic processes by intentionally constraining navigation to content that addresses the user’s personal weaknesses, for focused and efficient improvement (Brusilovsky and Sosnovsky 2005).

In contrast to systems that explicitly control navigation, there are some systems designed to create the illusion of freedom that in reality restrict navigation to designer- and programmer-selected routes. They do not offer true freedom, and the student may not be aware of the game’s underlying structure. Edutainment applications use “indirect control” (Schell 2005), defined as designer-constructed navigational routes. When the designer controls access to information, the player by definition does not have free choice. The player only has the illusion of free choice, which is the antithesis of freedom. The condition in this experiment of low-level Navigational Freedom is a controlled and restricted route.

Educational simulations are different from scaffolded educational systems, traditional intelligent tutoring educational applications, or edutainment systems, as educational simulations focus on true, 360° freedom of navigation to allow open exploration and to respond to individual intrinsic learning behavior. Such systems reflect the real-world activity of children engaged in unrestricted, informal, discovery-based learning, which requires true free choice in navigational activity in both the exploration and the inquiry of the knowledge embedded in the system. There is empirical evidence from activity studies of real-world field trips that students require freedom to explore and, in the process, encounter personally meaningful events, Salient Events (Harrington 2009), which are required to initiate inquiry. Such events may prove to be powerful learning features in such systems for influencing episodic memory. These Salient Events are different from “indirect control” because they are not designer-selected or programmer-created; they are in the virtual world, and the child uniquely responds to them as salient. These design options, scaffolded, intelligent tutored, and high-level Navigational Freedom, may well be complementary, depending on objectives. This experiment contrasted low-level Navigational Freedom and high-level Navigational Freedom (Fig. 4).

Fig. 4

The image represents the relative range, or scale of navigational freedom, thus making it possible to select two variables at the ends representing one factor, set to high and low values

3.6 Scaled and non-arbitrary navigational values

Navigational Freedom expressed on a scale (Fig. 4) would rank and order the degrees of freedom in the user’s choice of routes from a low value to a high value. For example, a linear wizard navigational design is an example of low navigational freedom, and may be an excellent navigational choice for drill and practice. Nevertheless, the choice in navigation is severely restricted to forward only. At the other extreme is a high-end computer game, where the user can fly, swim, and take portals to different levels. High navigational freedom allows active, user-initiated inquiry and free exploration in context, arguably with infinite route dimensionality and choice. For this research, it will be necessary to choose two points that represent the low navigational freedom state and the high navigational freedom state.

3.7 High and low navigational freedom

The low-level Navigational Freedom (LN) condition was a modification of the high-level Navigational Freedom (HN) condition (Fig. 5). Movement in the LN condition was restricted to the path by means of invisible walls. Therefore, in that condition the child could see objects in the distance but could not go there to investigate; free exploration was restricted. However, the child could investigate any object on the path. Movement was not restricted in the HN condition. Thus modified, the same model expressed two different navigation states, or levels.

Fig. 5

High-level (top set) and low-level (bottom set) Navigational Freedom conditions

3.8 Experimental conditions

By combining the design options in each dimension, four truly orthogonal system states result (Fig. 6). This is required for statistical integrity, as the design of the system directly supports the constraints required for the internal validity of a planned orthogonal contrast. We are not comparing four different systems; we are comparing four different internal attribute settings of one system: (1) High Visual Fidelity and High Navigational Freedom (HFHN), (2) High Visual Fidelity and Low Navigational Freedom (HFLN), (3) Low Visual Fidelity and High Navigational Freedom (LFHN), and (4) Low Visual Fidelity and Low Navigational Freedom (LFLN). As one system with two main factorial dimensions, Visual Fidelity and Navigational Freedom are each set to High or Low, providing the levels required for the Two-way ANOVA. One of the unique contributions of this research is that the VTT system design supports a planned orthogonal contrast statistical design for the definitive isolation of those variables and thus of their empirical impact on outputs, something lacking in prior work. Thus, the impact of each factor on learning was measured with high confidence.

Fig. 6

Two-way ANOVA design of the two main factors
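
As a minimal sketch of how the four system conditions arise from crossing the two factors, the following Python fragment enumerates them. The level codes (HF, LF, HN, LN) come from the text; the function name and data layout are illustrative assumptions and do not describe the VTT implementation.

```python
from itertools import product

# Level codes as used in the text; the tuple layout is an illustrative assumption.
VISUAL_FIDELITY = ("HF", "LF")
NAVIGATIONAL_FREEDOM = ("HN", "LN")


def system_conditions():
    """Cross the two factors to obtain the four condition codes."""
    return [f + n for f, n in product(VISUAL_FIDELITY, NAVIGATIONAL_FREEDOM)]


conditions = system_conditions()
print(conditions)
```

Because every level of one factor is paired with every level of the other, each factor can be varied while the other is held fixed, which is what the 2 × 2 design relies on.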

3.9 Main questions

  1. Do different levels of Visual Fidelity have an impact on Salient Events, Fact Inquiry, Time in System, and Knowledge Gained?

  2. Do different levels of Navigational Freedom have an impact on Salient Events, Fact Inquiry, Time in System, and Knowledge Gained?

  3. Is there interaction between the factors of Visual Fidelity and Navigational Freedom, impacting Salient Events, Fact Inquiry, Time in System, and Knowledge Gained?

4 Research design

4.1 Planned orthogonal contrast: two-way ANOVA design

The VTT software was intentionally designed to support the constraints imposed by a planned orthogonal contrast (Fig. 7) as a way to produce the most powerful statistical tests of mean differences (Glass and Hopkins 1996).

Fig. 7

The process, a pre-test, random assignment to one of the four system conditions, and an immediate post-test
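
To illustrate what orthogonality means for a 2 × 2 design, the sketch below writes the two main effects and the interaction as contrast coefficients over the four cells and checks that each contrast sums to zero and that every pair has a zero dot product. The coefficient vectors are the standard ones for a balanced 2 × 2 layout and are an assumption here, as is equal cell size; the paper does not list its coefficients.

```python
# Contrast coefficients over the four cells, ordered HFHN, HFLN, LFHN, LFLN.
# Standard coefficients for a balanced 2 x 2 design (assumed, not from the paper).
CONTRASTS = {
    "visual_fidelity": (1, 1, -1, -1),       # HF cells versus LF cells
    "navigational_freedom": (1, -1, 1, -1),  # HN cells versus LN cells
    "interaction": (1, -1, -1, 1),           # difference of differences
}


def dot(a, b):
    """Dot product of two equal-length coefficient vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_orthogonal_set(contrasts):
    """True if every contrast sums to zero and every pair is orthogonal."""
    vectors = list(contrasts.values())
    sums_ok = all(sum(v) == 0 for v in vectors)
    pairs_ok = all(
        dot(vectors[i], vectors[j]) == 0
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    )
    return sums_ok and pairs_ok


print(is_orthogonal_set(CONTRASTS))
```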

4.2 Independent variables for the two-way ANOVA

The independent variables are Visual Fidelity and Navigational Freedom. Visual Fidelity is a main factor and has two levels: Low (LF) and High (HF). Navigational Freedom is the other main factor and also has two levels: Low (LN) and High (HN).

4.3 Dependent variables for the two-way ANOVA

For each of the four conditions or groups, the dependent variables reported are Salient Events, Fact Inquiry, Time in System, and Knowledge Gained, as they were impacted by the different levels of the main factors.

4.4 Salient events

Reported is the total count of Salient Events in each of the four system conditions: the number of times the student freely stopped to read a “fact card” or freely stopped to listen to an informational audio “sprite.” Thus, a Salient Event is a measure of a freely selected change in student activity from open exploration to object informational inquiry. For example, a student may see a flower’s “fact card,” walk towards it, and stop. A student may see a “sprite,” walk over to it, stand inside it, and listen to the audio recording of a concept. All “fact cards” and “sprites” look the same, so no designer-programmed devices are used to attract interest; what is selected depends on the individual. If the student has a flower “fact card” in view but walks by the card, it is not counted as a Salient Event. If the student has a “sprite” in view and walks by or through it, it is not counted. Only events where the student stops exploring and starts inquiry are counted. Thus, the definition is the count of plant objects and audio recordings that the child selects to investigate. There were 0–100 objects in the model. The Salient Events count measured personal salience in interaction with environmental salience, within a natural context. All “fact cards” and “sprites” had the same visual attributes, a small spear of animated, glittery sparkles, so they should all be perceived equally from a perceptual or attentional standpoint. This is quite different from Schell’s (2005) use of indirect control. The annotation in VTT was intended to be a type of 3D highlighter. The child freely chooses which ones to attend to and which to ignore, exactly the behavior observed and videotaped in the experiment. Each child’s path was personalized and individualistic, in stark contrast to the indirect control work of Schell (2005).
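
The counting rule can be sketched in a few lines of Python. The `Encounter` record and its fields are hypothetical, introduced only to make the rule concrete; the paper does not describe how encounters were logged or coded.

```python
from dataclasses import dataclass


@dataclass
class Encounter:
    """Hypothetical record of one object coming into the student's view."""
    object_id: str
    kind: str       # "fact_card" or "sprite"
    stopped: bool   # the student stopped at the object
    inquired: bool  # the student read the card or listened to the audio


def count_salient_events(encounters):
    """Count only encounters where the student stopped and started inquiry."""
    return sum(1 for e in encounters if e.stopped and e.inquired)


# Illustrative data only, not experimental results.
session = [
    Encounter("trillium", "fact_card", stopped=True, inquired=True),
    Encounter("mayapple", "fact_card", stopped=False, inquired=False),  # walked by
    Encounter("stream", "sprite", stopped=False, inquired=False),       # walked through
    Encounter("canopy", "sprite", stopped=True, inquired=True),
]
salient_count = count_salient_events(session)
print(salient_count)
```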

4.5 Fact inquiry

Fact Inquiry is a total count per object and a total count per system condition. When the child clicks on an object’s “fact card,” the UI responds with the stack of cards. These are like tool-tips, but with multiple cards, stacked and navigable in a linear fashion, much like a PowerPoint presentation. Each click on each “fact card” was recorded as a count. There were 1–6 facts per object. Each fact had the same weight. The total number of facts in the system was 240, which is representative of the entire fact database presented in the system.

4.6 Time in system

The Time in the System is defined as the number of minutes the child chooses to stay in the system and is a proxy for enjoyment. Students may stop at will or continue for the maximum allowed time, 60 min.

4.7 Knowledge gain

Knowledge Gained is a measurement of facts, concepts, and values, expressed as a percentage change between the post-test scores and the pre-test scores. The tests were identical. All facts and concepts tested were in the VTT system. No knowledge of pre-test performance was given to the students. Knowledge Gain = [(post-test score − pre-test score) / post-test score] × 100. The range is 0–100% in value. The rationale is to measure the impact on learning.
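
A short worked example of the formula as stated above, with the post-test score in the denominator. The function name and the sample scores are illustrative assumptions, not data from the study.

```python
def knowledge_gain(pre_score, post_score):
    """Percentage gain as defined in the text: (post - pre) / post * 100."""
    if post_score <= 0:
        raise ValueError("post-test score must be positive")
    return (post_score - pre_score) / post_score * 100.0


# Illustrative scores only: a pre-test of 20 and a post-test of 40.
example_gain = knowledge_gain(20, 40)
print(example_gain)  # 50.0
```

Under this definition a student whose score does not change has a gain of 0, and the value approaches 100 as the pre-test score approaches 0.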

4.8 Hypotheses

The Two-way ANOVA tests main effects and interaction effects for all variables under investigation. For each of the dependent variables (DV), Salient Events, Fact Inquiry, Time in System, and Knowledge Gain, three statistical tests were possible.

H01

μ DV(HF) = μ DV(LF)

Ha1

μ DV(HF) ≠ μ DV(LF)

H02

μ DV(HN) = μ DV(LN)

Ha2

μ DV(HN) ≠ μ DV(LN)

H03

No Interaction.

Ha3

Interaction
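The three tests per measure can be sketched for a balanced 2 × 2 design as follows. This is a minimal pure-Python illustration of the standard sums-of-squares decomposition, not the analysis software used in the study; the function name, dictionary keys, and the tiny data set are hypothetical. It returns F statistics only, and a P value would still require the F distribution from a statistics package.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def two_way_anova_2x2(cells):
    """F statistics for a balanced two-factor design.

    cells maps (fidelity_level, navigation_level) -> list of observations,
    with the same number of observations in every cell.
    """
    levels_a = sorted({key[0] for key in cells})
    levels_b = sorted({key[1] for key in cells})
    n = len(next(iter(cells.values())))
    if any(len(obs) != n for obs in cells.values()):
        raise ValueError("this sketch assumes equal cell sizes")

    grand = _mean(x for obs in cells.values() for x in obs)
    mean_a = {a: _mean(x for b in levels_b for x in cells[(a, b)]) for a in levels_a}
    mean_b = {b: _mean(x for a in levels_a for x in cells[(a, b)]) for b in levels_b}
    mean_cell = {key: _mean(obs) for key, obs in cells.items()}

    # Sums of squares for the two main effects, the interaction, and error.
    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    ss_ab = n * sum(
        (mean_cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in levels_a for b in levels_b
    )
    ss_error = sum((x - mean_cell[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_error = len(levels_a) * len(levels_b) * (n - 1)
    ms_error = ss_error / df_error
    return {
        "F_fidelity": (ss_a / df_a) / ms_error,
        "F_navigation": (ss_b / df_b) / ms_error,
        "F_interaction": (ss_ab / (df_a * df_b)) / ms_error,
        "df_error": df_error,
    }


# Hypothetical toy data with two observations per cell.
toy = {
    ("HF", "HN"): [10, 12], ("HF", "LN"): [6, 8],
    ("LF", "HN"): [5, 7],   ("LF", "LN"): [5, 7],
}
print(two_way_anova_2x2(toy))
```

With four cells of 16 students each, the error degrees of freedom come to 4 × (16 − 1) = 60.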

5 Experiment

5.1 Population

The volunteer sample (N = 64) was drawn from urban, suburban, and rural, public, private, and home-school populations, located in and outside of Pittsburgh, PA. All participants were volunteers, as is required by US Federal Regulations to protect the rights of human subjects in all research. The population is diverse but biased towards an upper socioeconomic profile.

As such, all of the volunteers were interested in computers, video games, nature, and art. The sample was restricted to third, fourth, and fifth grade students and used the legal definition for those grades. Recruitment occurred between March 2008 and June 2008 through schools, PTO e-mail lists, and individual referrals. Additionally, demonstrations of the VTT system to the public at several Earth Day events at nature reserves and a Mother’s Day event at a botanical garden resulted in the recruitment of student volunteers. Three volunteers were refused, as one was in sixth grade, one student was dyslexic, and one was autistic.

5.2 Sample

Random assignment to one of the four conditions was done to ensure internal validity. Each volunteer was assigned a number by using a random number generator without replacement on the integers from 1 to 64, and was then pre-assigned to one of the four conditions prior to the time of the study. The correlation coefficient between volunteer order number and random placement number is −0.05. The pre-experience demographic survey yielded information on user profile, age, grade, self-rank of PC computer expertise, and self-rank on enjoyment of nature. To verify that the four groups were homogeneous prior to the running of the study, a one-way ANOVA was run on the data to compare the variables of Grade in School (M = 4.03), F (3, 61) = 0.5199, P = 0.67; Gender (M = 0.625), F (3, 61) = 1.2392, P = 0.303; and Pre-test Score (M = 21.59), F (3, 61) = 0.9117, P = 0.4407. No statistically significant differences were found among the four groups.
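The assignment procedure can be sketched in Python as follows. The text does not state how placement numbers were mapped to conditions, so the block mapping used here (1–16, 17–32, and so on) is an assumption, as are the function name, the seed parameter, and the labels HFLN and LFHN, which are formed by analogy with the LFLN and HFHN labels used in the Results.

```python
import random

# Condition labels: fidelity (HF/LF) followed by navigation (HN/LN).
CONDITIONS = ("HFHN", "HFLN", "LFHN", "LFLN")


def assign_conditions(n_volunteers=64, seed=None):
    """Map volunteer order number -> condition via sampling without replacement."""
    if n_volunteers % len(CONDITIONS) != 0:
        raise ValueError("volunteers must divide evenly among the conditions")
    rng = random.Random(seed)
    # Draw each placement number from 1..n exactly once (without replacement).
    placement = rng.sample(range(1, n_volunteers + 1), n_volunteers)
    per_condition = n_volunteers // len(CONDITIONS)
    # Assumed mapping: consecutive blocks of placement numbers share a condition.
    return {
        order: CONDITIONS[(number - 1) // per_condition]
        for order, number in enumerate(placement, start=1)
    }


assignment = assign_conditions(seed=2008)
for condition in CONDITIONS:
    print(condition, list(assignment.values()).count(condition))
```

Because every placement number is drawn exactly once, each condition receives exactly 16 of the 64 volunteers regardless of the seed.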

5.3 Materials

5.3.1 Curriculum

The educational content came from the Audubon Society of Western Pennsylvania’s Natural Communities curriculum for the fourth grade (Beechwood Farms Nature Reserve 2005).

5.3.2 User profile survey

Prior to the experience, the researcher administered a paper-based demographic and user profile interview and a Likert scale survey; the same interview and survey were used in a pilot test of the system in comparison to the real field trip (Harrington 2009). See Harrington (2009) for the exact questions used in this study.

5.3.3 Pre- and post-tests

The pre- and post-tests were pure recall for facts and concepts. They were administered by the researcher prior to the system condition experienced and immediately following the experience with the software. The tests consisted of facts, such as “Name all of the wildflowers you know”; concepts as terms to be defined, such as “What is a valley?”; and fill-in-the-blank items as labels to drawings of the forest. Answers were graded as correct or incorrect, consistently, with the use of a rubric. The tests were created with feedback from a fourth-grade teacher specialized in science.

5.4 Procedure

After the IRB consent process was completed, the volunteers were contacted to schedule the experiment. The researcher traveled to their home to administer the demographic survey and the pre-test. An equal number of subjects, 16, was assigned to each condition based on pre-recruitment random number assignment. Then the student received a scripted tutorial, with up to 15 min of training time on the system. Most students required no more than 5 min and were noticeably eager to start. All students received the same instructions: “Go and explore and inquire at will, and stop at anytime.” A unique part of this work is the free will afforded to the students, who were in complete control over the time they used the system, so they could stop at any time. Most chose to play with the software for the full time, and many asked to continue to play with the software after the experiment had ended. All activity was video- and audio-recorded, and notes were logged in a lab notebook. Automatic logs proved to be too noisy, as children become impatient and hit keys multiple times before the computer responds. Instructing them to “only hit a key once” is ineffective. Once a student chose to stop using the software, an immediate post-test was administered. An additional attitudinal survey and a creationist, micro-world study were scheduled for a second visit. After the second visit, they all received a free badge, similar to a Boy or Girl Scout badge, as a thank-you gift.

5.5 Assessment instruments

The materials consisted of a paper pre-experience demographic survey, pre- and post-tests, and a paper post-experience interview and survey. The test content, presentation, and grading rubric key were identical for the pre- and post-tests. The test content was derived from the curriculum, “Natural Communities,” provided by the Audubon Society of Western Pennsylvania. The content tested knowledge of facts and concepts, and also probed personal values with questions and a drawing.

A notebook and a digital video- and audio-recording device were used to capture user activity in situ. The automatic computer logs created by Unreal (UnReal Technology 2008) proved to be too noisy and recorded false positives. Using the video recordings, a manual audit was conducted and the recording of all Salient Events and Fact Inquiry activity was corrected.

5.6 Audit of assessment

An audit conducted on the results compared 108 test items, or close to 20% of total tested material. Two independent teachers and the researcher graded the pre- and post-tests and the logs. Using the Cochran test, results produced evidence that both the pre-test, Cochran’s Q pre-test(2) = 9.8, P = 0.007, and the post-test, Cochran’s Q post-test(2) = 10.3, P = 0.006, are homogeneous, consistent, and reliable.
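For readers unfamiliar with the statistic, Cochran’s Q can be computed from a table of binary grades with one row per test item and one column per grader. The sketch below is a generic pure-Python implementation of the textbook formula, not the software used for the audit; the function name and the small grade table are hypothetical. With three graders the statistic has 3 − 1 = 2 degrees of freedom, matching the Q(2) notation above.

```python
def cochrans_q(ratings):
    """Return (Q, degrees of freedom) for a table of 0/1 grades.

    ratings is a list of rows, one per test item; each row holds one
    correct (1) or incorrect (0) grade per grader.
    """
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of graders")
    col_totals = [sum(row[j] for row in ratings) for j in range(k)]
    row_totals = [sum(row) for row in ratings]
    total = sum(row_totals)
    denominator = k * total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Every item was graded unanimously, so Q cannot be computed.
        raise ValueError("Q is undefined when all items are graded unanimously")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - total * total) / denominator
    return q, k - 1


# Hypothetical grades: four items, three graders.
grades = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
q, df = cochrans_q(grades)
print(round(q, 3), df)  # 4.667 2
```

Under the null hypothesis that all graders mark items correct at the same rate, Q is approximately chi-square distributed with k − 1 degrees of freedom.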

6 Results

6.1 Salient events

6.1.1 Visual Fidelity significantly impacts salient events

The main effect of Visual Fidelity shows that High Visual Fidelity (M = 14.46, SD = 6) produced higher Salient Events counts than Low Visual Fidelity (M = 11.31, SD = 6.37). A Two-way ANOVA produced evidence of statistically significant effects, F(1,60) = 4.35, P = 0.0413. Thus, increasing Visual Fidelity significantly increases Salient Event activity.

Ha1

μ Salient Events (HF) > μ Salient Events (LF)

6.1.2 Navigational freedom trends towards an impact on salient events

The main effect of Navigational Freedom shows that High Navigational Freedom (M = 14.25, SD = 6.99) produced slightly higher Salient Events counts than Low Navigational Freedom (M = 11.53, SD = 5.38). A Two-way ANOVA produced evidence of a strong trend that the Salient Events by Navigational Freedom are different, F(1,60) = 3.23, P = 0.0773. Thus, Navigational Freedom shows a strong trend towards, but does not reach, a statistically significant impact on Salient Events.

H02

μ Salient Events (HN) = μ Salient Events (LN)

6.1.3 No interaction of Visual Fidelity × navigational freedom on salient events

A Two-way ANOVA produced no evidence of interaction, F(1, 60) = 1.48, P = 0.2285.

H03

No Interaction on Salient Events: Visual Fidelity × Navigational Freedom (Graph 1, Table 1).

Graph 1 Salient Events 2 × 2 ANOVA

Table 1 Contingency table of salient events

6.2 Fact inquiry

How did Visual Fidelity and Navigational Freedom impact Fact Inquiry, as measured by the count of cards flipped by the student? A Two-way ANOVA was used to test main effects and interaction effects.

6.2.1 Visual fidelity shows no impact on fact inquiry

The main effect of Visual Fidelity shows that High Visual Fidelity (M = 32.875, SD = 20.85) produced higher Fact Inquiry counts than did Low Visual Fidelity (M = 25.96, SD = 16.88). However, a Two-way ANOVA produced no evidence of a statistically significant effect, F(1,60) = 2.31, P = 0.1338. Thus, Visual Fidelity has no statistically significant impact on Fact Inquiry activity.

H01

μ Fact Inquiry (HF) = μ Fact Inquiry (LF)

6.2.2 Navigational freedom shows no impact on fact inquiry

The main effect of Navigation shows that High Navigational Freedom (M = 31.375, SD = 21.56) produced slightly higher Fact Inquiry activity than did Low Navigational Freedom (M = 27.468, SD = 16.47). A Two-way ANOVA produced no evidence to support this effect, F(1,60) = 0.743, P = 0.3931. Thus, Navigational Freedom has no statistical impact on Fact Inquiry activity.

H02

μ Fact Inquiry (HN) = μ Fact Inquiry (LN)

6.2.3 Interaction of visual fidelity × navigational freedom on fact inquiry

A Two-way ANOVA produced significant evidence of interaction, F(1, 60) = 6.8, P = 0.0115. It appears that varying Visual Fidelity and Navigational Freedom has inconsistent effects on Fact Inquiry activity (Graph 2, Table 2).

Graph 2 Fact Inquiry 2 × 2 ANOVA

Table 2 Contingency table of fact inquiry

Ha3

Interaction on Fact Inquiry: Visual Fidelity × Navigational Freedom

6.3 Time in system

How did Visual Fidelity and Navigational Freedom impact Time in System? Time in System measures the time a student volunteered to explore, inquire, and discover. A Two-way ANOVA tested main effects and interaction effects for all variables under investigation.

6.3.1 Visual fidelity shows no impact on time in system

The main effect of Visual Fidelity shows that Time in System for High Visual Fidelity (M = 46.25, SD = 13.91) is close to identical to that for Low Visual Fidelity (M = 42.18, SD = 3.09), F(1,60) = 1.07, P = 0.3051. Note that the total amount of time was limited in all conditions to 60 min.

H01

μ Time in System (HF) = μ Time in System (LF)

6.3.2 Navigational freedom shows no impact on time in system

The main effect of Navigation shows that High Navigational Freedom (M = 46.4, SD = 15.36) is close to identical to Low Navigational Freedom (M = 42.03, SD = 16.21) for Time in System. A Two-way ANOVA supports this claim, F(1,60) = 1.24, P = 0.2699; thus, Navigational Freedom has no statistical impact on Time in System.

H02

μ Time in System (HN) = μ Time in System (LN)

6.3.3 No interaction of visual fidelity × navigational freedom on time in system

A Two-way ANOVA produced no evidence of interaction, F(1, 60) = 1.62, P = 0.208. It appears that varying Visual Fidelity and Navigational Freedom has consistent effects on Time in System (Graph 3, Table 3).

Graph 3 Time in System 2 × 2 ANOVA

Table 3 Contingency table of time in system

H03

No Interaction on Time in System: Visual Fidelity × Navigational Freedom

6.4 Knowledge gain

6.4.1 Visual fidelity significantly impacts knowledge gain

The main effect of Visual Fidelity shows that High Visual Fidelity (M = 30.95, SD = 14.76) produced higher scores for Knowledge Gain than Low Visual Fidelity (M = 19.99, SD = 13.39). A Two-way ANOVA produced statistically strong and significant effects, F(1,60) = 10.54, P = 0.0019. Thus, Visual Fidelity has a positive and significant impact on Knowledge Gained in educational simulations for independent, exploratory, and discovery-based learning.

Ha1

μ Knowledge Gain (HF) > μ Knowledge Gain (LF)

6.4.2 Navigational freedom trends towards an impact on knowledge gain

The main effect of Navigation shows that High Navigational Freedom (M = 28.24, SD = 16.51) produced slightly higher Knowledge Gain scores than Low Navigational Freedom (M = 22.69, SD = 13.06). A Two-way ANOVA produced only a modest trend to support this effect, F(1,60) = 2.71, P = 0.105. Thus, Navigational Freedom shows a modest trend towards a positive impact on Knowledge Gained that does not reach statistical significance.

H02

μ Knowledge Gain (HN) = μ Knowledge Gain (LN)

6.4.3 Interaction of visual fidelity × navigational freedom on knowledge gain

The Two-way ANOVA produced significant evidence of interaction, F(1, 60) = 4.85, P = 0.0315, of Visual Fidelity and Navigational Freedom on Knowledge Gain.

Ha3

Interaction on Knowledge Gain: Visual Fidelity × Navigational Freedom

The difference between the Low Visual Fidelity, Low Navigational Freedom condition, LFLN (M = 20.93, SD = 13.36), and the High Visual Fidelity, High Navigational Freedom condition, HFHN (M = 37.44, SD = 13.88), is striking. Thus, the most powerful effect on learning is achieved by combining High Visual Fidelity and High Navigational Freedom (Graph 4, Table 4).
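The interaction can be made concrete with the reported means. The following Python sketch assumes equal cell sizes of 16 students, as stated in the Procedure, so that each marginal mean is the average of its two cell means. The HFLN and LFHN values are derived here from the reported marginal and cell means; they are not quoted from the text and are subject to rounding. The function and variable names are illustrative.

```python
def missing_cell(marginal_mean, known_cell_mean):
    """With equal cell sizes, a marginal mean is the average of its two cells."""
    return 2 * marginal_mean - known_cell_mean


hf_marginal, lf_marginal = 30.95, 19.99   # reported Visual Fidelity means
hfhn, lfln = 37.44, 20.93                 # reported cell means

hfln = missing_cell(hf_marginal, hfhn)    # derived, about 24.46
lfhn = missing_cell(lf_marginal, lfln)    # derived, about 19.05

# Simple effect of Navigational Freedom at each level of Visual Fidelity.
nav_effect_high_fidelity = hfhn - hfln    # about 12.98
nav_effect_low_fidelity = lfhn - lfln     # about -1.88

# A non-zero difference between the simple effects is what an interaction means.
interaction_contrast = nav_effect_high_fidelity - nav_effect_low_fidelity

print(round(hfln, 2), round(lfhn, 2))
print(round(nav_effect_high_fidelity, 2), round(nav_effect_low_fidelity, 2))
print(round(interaction_contrast, 2))
```

As a consistency check, the derived cells reproduce the reported Navigational Freedom means to within rounding: (37.44 + 19.05)/2 ≈ 28.24 and (24.46 + 20.93)/2 ≈ 22.69. Under these assumptions, free navigation adds roughly 13 percentage points of Knowledge Gain in the High Visual Fidelity condition and essentially nothing in the Low Visual Fidelity condition.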

Graph 4 Knowledge Gain 2 × 2 ANOVA

Table 4 Contingency table knowledge gain

7 Discussion

The third, fourth, and fifth grade student sample was from a diverse population of urban, suburban, and rural communities. This was a volunteer sample, as is required by Federal Regulations that protect human subjects in research, and as such, all of the students reported high enjoyment of nature as well as above average skill with computers and high exposure to video games. All of the volunteers wanted to participate in a study on virtual reality with computer games and wanted to learn about nature. So, all of the findings must be interpreted in this light.

The High Visual Fidelity condition has a strong and significant impact, F(1,60) = 10.54, P = 0.0019, on Knowledge Gained (M = 30.95, SD = 14.76) when compared to the Low Visual Fidelity condition (M = 19.99, SD = 13.39). There is significant interaction between High Visual Fidelity and High Navigational Freedom, as combining the two produces a larger impact, F(1,60) = 4.85, P = 0.0315, on Knowledge Gained, (M = 37.44, SD = 13.88).

Why would the combination of High Visual Fidelity and High Navigational Freedom have a larger impact than one factor alone? As results for Salient Events showed similar effects, one claim is that High Visual Fidelity entices a student to investigate while the High Navigational Freedom condition allows the child to go at will to the object of interest. On the other hand, in the Low Visual Fidelity version, the child may not see or perceive those objects as interesting, or if he does but finds himself in the Low Navigational Freedom condition, he is prevented from approaching that object.

The next interesting finding concerns the salient objects that resulted in a change of behavior in the student. First, Salient Events are the count of times the student decided to stop navigating and to start inquiring; second, navigation and way-finding are under the child’s free will. Thus, the choice is intrinsic as to which events children go to and about which they inquire. The data show that the High Visual Fidelity (M = 14.46, SD = 6) condition resulted in a significantly greater impact on Salient Event counts than did the Low Visual Fidelity (M = 11.31, SD = 6.37) condition, F(1,60) = 4.35, P = 0.0413. That is to say that in the photo-realistic version, the student decided to stop navigating, and either selected a “fact card” to read or stopped at a “sprite” to listen, more often than in the other condition.

Why would children stop more often in the High Visual Fidelity version, especially since the “cards” and the “sprites” were identical to those in the Low Visual Fidelity version? This is open to future research, but it could be that the visually rich environment and context stimulate more curiosity than do the environment and context of a cartoon-like version. The data show that the High Navigational Freedom (M = 14.25, SD = 6.99) condition resulted in a strong trend and a possible impact on Salient Event counts when compared to the Low Navigational Freedom (M = 11.53, SD = 5.38) condition, F(1,60) = 3.23, P = 0.0773, and this evidence, combined with no evidence of interaction, F(1, 60) = 1.48, P = 0.2285, supports generalization within this student sample profile.

The next interesting finding is that Fact Inquiry was not significantly impacted by either the High Visual Fidelity or the Low Visual Fidelity, or the High Navigational Freedom or Low Navigational Freedom conditions. Thus, these factors had no impact on the inquiry into facts. Much like a PowerPoint presentation, it is independent of the surroundings; once attention focuses on linear detailed information, and frame of view obscures broader contextual information, those contextual factors become exogenous to the problem at hand and thus show no impact. This suggests the dynamic nature of reporting, analyzing, and optimizing the user interface in virtual environments. This finding is not surprising as the fact cards are part of the user interface elements that were held constant across all system conditions.

However, there was significant interaction, F(1, 60) = 6.8, P = 0.0115. One possibility is that the interaction can be explained by learning style, or that Salient Events, which were impacted by the system conditions, are positively and significantly correlated to Fact Inquiry (r = 0.722, N = 64, P < 0.001).

The last finding concerns the total time spent in each condition. Time in System showed no significant main effect of Navigational Freedom, F(1,60) = 1.24, P = 0.2699, or of Visual Fidelity, F(1,60) = 1.07, P = 0.3051, and no interaction, F(1, 60) = 1.62, P = 0.208. However, the design did not allow the students to go over 60 min, and 40% of all students, across all conditions, were forced to stop.

8 Conclusion

While there are many studies on collaborative or guided scientific inquiry in real, virtual, and simulated environments, there are few that study the interplay between the design of the simulation and the navigational aspects of the user interface for learning. The main research aim here was to decompose the simulation and user interface into the design parameters that influence a child’s free choice regarding inquiry and learning of scientific material, as expressed in the Simulated Ecological Environment for Education (SEEE) tripartite model (Harrington 2010). The research design investigated empirical effects of UI parameters found in all virtual environments used as educational simulations on independent exploration of a space and intrinsic learning of educational content.

High Visual Fidelity was significantly better than Low Visual Fidelity for learning activity, measured both as Salient Events mean counts and as Knowledge Gained expressed as mean percent change on test scores. The findings support the use of High Visual Fidelity over Low Visual Fidelity. Therefore, photographs or high fidelity virtual environments should be used over cartoons or low quality virtual environment graphics for the best results in educational simulations.

There is interaction between the High Visual Fidelity condition and the High Navigational Freedom condition, so their combined impact on Knowledge Gained is higher than the impact of either alone. Use High Visual Fidelity and High Navigational Freedom together for the greatest impact, 37.44% on Knowledge Gained. The strongest learning results were achieved by combining the photo-realistic and free navigation features for improved learning, ceteris paribus.

An additional interpretation is that the Low Visual Fidelity High Navigational Freedom and the Low Visual Fidelity Low Navigational Freedom system conditions have similar knowledge gains of about 20%. This is thus a design choice. In other words, a Low Visual Fidelity Low Navigational Freedom system condition should have the same impact on learning as a Low Visual Fidelity High Navigational Freedom system condition, so choose the more cost-effective system as both will have the same impact on learning.

Future work will explore the role of the child’s emotional reaction to educational simulations. These include the personal and subjective reactions of curiosity, perception of beauty, awe, and wonder, and how the system design parameters impact the final outcomes of intrinsic learning, long-term memory, and acts of creation in micro-world studies.