Introduction

In spite of the importance of improving students’ ill-structured problem-solving abilities in higher education, creating problem-oriented learning environments that do this is a challenging task for many instructors. Therefore, this study focused on model-centered instruction (MCI) environments that use authentic cases pertaining to ethical decision making in program evaluation; the purpose of MCI in this study was to facilitate students’ modeling processes for such ethical decision-making cases. As one type of ill-structured problem solving, ethical decision making was chosen for this study because it had not been explored by prior empirical instructional research, yet it is an everyday part of life for a program evaluator.

In general, an ill-structured problem contains unknowns in the initial or subsequent state of affairs and some uncertainty with regard to what might result from particular decisions (Simon 1980; Toulmin 1958). Ethical decision making is often more complex and unpredictable than some other decision-making situations because it occurs in a context where there is a conflict of views, multiple alternatives, and some uncertainty with regard to how the situation can best be resolved. In ethical dilemmas, there is often no single best solution, and one is forced to explore alternatives and arrive at a satisficing (Simon 1980) solution. For this reason, establishing performance criteria in ethical decision making is difficult, as is identifying experts. As a consequence, the method of protocol analysis used in many expert-novice studies and the notion of developing expertise by deliberate practice are not feasible (Ericsson 2006). The appropriate solution might be different for each problem-solving situation, and teaching evaluation standards and ethical principles alone is insufficient. An alternative instructional approach is to help learners develop useful mental models for representative problems for which expert models are available to provide feedback, stimulate reflection, and facilitate the exploration of alternative approaches.

Unlike well-structured problems, ill-structured problems seldom have clear problem statements. Therefore, an initial step in solving an ill-structured problem is to decide whether there is a real problem and then what the nature of that problem is. As problem solvers determine whether there is a problem, they begin to construct a representation of it that contains all of the possible causes and constraints of the problem (Sinnott 1989). This problem-representation phase is extremely important for selecting a solution approach (Voss and Post 1988). In constructing the problem space, problem solvers attempt to locate and select from their memories critical information that fits the context (Voss and Post 1988). Then, after constructing a representation of the problem, problem solvers generate a variety of possible solutions. From these possible solutions, problem solvers choose one and develop a rationale to justify the selection. The problem solvers are constructing their own mental models of the problem to identify and select, or synthesize, a solution based on a representation of the problem (Jonassen 1997). The process of justification requires the problem solvers to identify the various perspectives about the problem situation, provide supporting arguments and evidence for opposing perspectives, evaluate information, and develop a reasonable rationale for the selected solution approach. Reconciling different interpretations of the problem is a critical process in developing a justification (Churchman 1971). In the final step, the problem solvers continuously evaluate and reflect on the problem-solving process. Problem solvers reflect on the strategies they have used in order to evaluate what worked and what failed; thus, they learn from their problem-solving experiences (Bransford et al. 2000). The ill-defined nature of such complex problems makes the monitoring and evaluation processes critical. This step is not essential for many well-structured problem-solving activities, in which the success or failure of the solution is sufficient feedback. If a solution to a well-structured problem is successful, the problem-solving process concludes. If, on the other hand, the solution has failed to solve the problem, the problem solver must repeat the process by re-representing the problem and finding appropriate alternative solutions. This study focuses on ill-structured problem-solving processes, which are more challenging to support with instruction and feedback.

Experts are better problem solvers than novices for several reasons. The main difference is that experts construct richer, more integrated mental representations of problems than do novices (Chi et al. 1981). The representations of experts consist of a large number of interconnected elements (i.e., coherent chunks of information organized around underlying principles in the domain). Grosslight et al. (1991) argue that novice learners often find it easier to assimilate explanations when experts provide their mental representations, but they find problem solving more difficult when they have to create their own mental models independently. Sometimes novices do not have the necessary knowledge of the domain to develop their own mental models for specific problem-solving situations. Therefore, the MCI approach could support the modeling process of learners in the ethical decision-making domain. MCI is an instructional approach that recognizes the criticality of helping inexperienced learners develop expert-like mental representations of complex problems. A specific goal of MCI is for learners, after instruction and experience in solving representative problems, to be able to develop mental models that resemble models created by recognized experts and professional practitioners.

According to MCI approaches, such as model-facilitated learning (Milrad et al. 2003) and MCI (Seel 2003), learning should be situated in an authentic, complex, and dynamic environment, and there should be opportunities for the elaboration of a learner’s mental models and experiences. When confronted with the need to create meaningful models of real situations, students can invent possible solutions on the basis of model construction. There are two types of MCI, which in some cases might be combined: (a) expert modeling (EM), which emphasizes the internalization of expert conceptual models provided to students during instruction (often at the beginning of a unit of instruction); and (b) self-guided modeling (SGM), which emphasizes the process of students creating and expressing a conceptual model without external assistance.

The EM approach emphasizes learning that is oriented either toward the performance and behavior of an expert or toward the adaptation of a teacher’s model and explanation. Mayer (1989) argued that students who are given MCI may be more likely to build appropriate mental models of the systems they are studying and then use these models in generating solutions to problems. Therefore, many studies of mental models have focused on the internalization of a conceptual model provided to students in the course of instruction (Mayer 1989; Seel et al. 2000). In short, EM approaches emphasize the efficiency and effectiveness of instruction (Spector 2008).

In contrast, the SGM approach emphasizes the role of creating one’s own mental models in discovery or guided-discovery learning and problem-solving environments (Kafai and Ching 2004; Kolodner et al. 2004). SGM emphasizes engagement and aims for effectiveness, but it is often less efficient than more structured approaches for many learners.

Therefore, in this exploratory study, the overall goal was to investigate the effects of MCI and to develop a mechanism for determining whether EM or SGM is more appropriate for learning in a particular ill-structured problem-solving situation. The instructional approaches used in this study emphasize the efficiency, effectiveness, and engagement aspects of developing complex problem solving (Merrill and Gilbert 2008) and include relevant measures of those aspects of instruction. The subject area investigated is ethical decision making in program evaluation.

In addition to considering the factors already mentioned, the study also examined how the effects of these MCI approaches vary among learners with different prior knowledge and experience, which is an issue that has not been addressed in this context. Because inexperienced and experienced students learn differently, instructional designers and researchers need to take into account different levels of learner expertise; instruction that is effective for inexperienced learners may not be so effective for experienced learners (Seel et al. 2000). Some researchers have found an expertise reversal effect in which the optimal instructional approach changes as learner expertise increases (Kalyuga et al. 2003). Therefore, types of MCI used with inexperienced learners might differ from those used with more experienced learners; however, no studies have examined how various levels of learner expertise interact with EM or SGM. For this reason, this study examined how the level of the learner’s relevant expertise (inexperienced and experienced) interacted with EM and SGM. We examined the following research questions:

  1. How do the effects of two types of MCI (EM vs. SGM) compare in terms of effectiveness, efficiency, and engagement?

  2. How does a learner’s level of expertise moderate effectiveness, efficiency, and engagement measures with two types of MCI (EM vs. SGM)?

Model-centered instruction: expert modeling and self-guided modeling

Researchers of MCI generally follow one of two approaches, emphasizing either EM or SGM (Seel et al. 2007). Several authors (Johnson-Laird 1989; Penner et al. 1998; Lesh and Doerr 2003) have emphasized the role of SGM for the construction of effective mental models. When students are confronted with the need to create meaningful models of real situations, they can invent significant solutions on the basis of mental model construction (Lesh and Doerr 2003). In self-guided discovery learning situations, the learner searches continuously for information in order to complete or stabilize effective mental models. In this case, learners develop interpretations of the problem situation and create their own initial mental models and then progressively build upon them (Seel 2001; Lesh and Doerr 2003). When learners first begin to solve the given problem, they have or develop initial working models that they try to use to solve it. If they fail to solve the problem, they revise their strategy repeatedly until they succeed in solving the problem. Therefore, SGM occurs as a multi-step process of model-building and revision (Penner 2001).

However, SGM may dramatically increase the probability of stabilizing incorrect, initial mental models (Briggs 1990). In light of this argument, there is some instructional appeal in the idea expressed by several authors (such as Norman 1983) of providing learners, especially inexperienced learners, with a designed conceptual model like an expert’s representation. A study by Rieber and Parmley (1995) indicated that simulations for discovery learning are not very fruitful environments. They found that simply adding a simulation does not necessarily improve knowledge acquisition; explanations, feedback and conceptual models are generally required to facilitate effective learning. Seel et al. (2000) studied the effect of MCI within a multimedia learning environment. Their results indicated that the participants applied their initially-constructed mental models to mastering the learning tasks, and that the models were stable even though they changed after the learning period. Other research from Ifenthaler and Seel (2005) compared self-guided learning and scaffolding-based learning and found no significant differences between the two groups.

For inexperienced learners, SGM is more closely associated with learning by trial and error than with learning by insight. It might, therefore, be easier for an inexperienced learner to assimilate a given causal explanation that has been provided through a conceptual model than to induce his or her own mental model. Moreover, numerous studies (Mayer 1989; Seel 1995; Seel and Dinter 1995) have demonstrated that the presentation of a conceptual model affects the construction of a task-related mental model. Mayer (1989) states that when learners are provided with a conceptual model that illustrates the main components and relationships of a complex system, they are able to build mental models of the systems they are studying and use these models to generate creative solutions to transfer problems. Mayer (1989) also confirms that the presentation of model-relevant information at the beginning of the learning process increases both the quality of comprehension during the learning process and the quality of causal explanations at the end of the learning process.

EM is an instructional strategy for encouraging learners’ engagement in ill-structured problem solving (Jonassen 1999). Seel (1995) also suggests that experts’ models facilitate the construction of an adequate mental model for cognitively mastering the demands of the learning situation. According to Gibbons (2001), experts’ models are structured in terms of goals, actions, motives, decision points, and rationales. Experts notice features and meaningful patterns of information that are not noticed by novices, organize knowledge in ways that reflect a deep understanding of their subject matter, and have varying levels of flexibility in their approach to new situations (Bransford et al. 2000). Experts also spend more time than novices on planning their initial strategies for solving problems (Leithwood and Steinbach 1995), and they are able to retrieve important aspects of their knowledge with little effort (Bransford et al. 2000). Therefore, EM helps learners analyze expert approaches and procedures and can affect the ill-structured problem-solving processes of learners. Pedersen and Liu (2002) indicated that EM helps students to apply effective problem-solving strategies to their work and impacts the quality of their reasoning. Through EM, learners are given an opportunity to observe the cognitive processes of experts, compare them with their own problem-solving processes, and gradually internalize the cognitive processes of experts (Collins et al. 1989).

Effects of levels of learner expertise

In addition to considering an effective instructional strategy, such as MCI, teachers should consider levels of learner expertise. Research on aptitude-treatment interaction has demonstrated that instructional strategies effective for inexperienced learners can lose their effectiveness, and even have negative consequences, for more experienced learners. This reversal in the effectiveness of instructional strategies for different levels of learner expertise has been referred to as the expertise reversal effect (Kalyuga et al. 2003).

Numerous studies have investigated how adjusting the level of instructional guidance affects learners who have different levels of expertise in a domain. For this study, ‘novice’ and ‘advanced’ correspond to ‘inexperienced’ and ‘experienced’. Mayer (2001) confirms in his research that advanced learners need less instructional guidance than do novice learners because they use their prior knowledge to compensate for a lack of support. Tuovinen and Sweller (1999) confirmed that novice students benefited more from well-guided instruction, and they found no differences between well-guided and minimally-guided instruction for advanced learners. In the task domain of calculating distances and projections in coordinate geometry, Kalyuga et al. (1998) studied the interaction between levels of learner expertise and levels of instructional guidance. Post-test results indicated that less-knowledgeable high school students benefited significantly more from well-guided models such as worked examples. However, advanced learners benefited from less-guided instruction, such as self-guided problem solving; because they already had schemas to apply to problem situations, the provision of well-guided models might have led to cognitive overload. A significant interaction between knowledge levels and instructional formats therefore demonstrated that the most efficient instructional format depended on the level of learner expertise. As learner levels of expertise increased from novice to advanced, the performance of the self-guided problem-solving group improved more than the performance of the worked-example group.

MCI research also suggests that the effects of EM and SGM vary according to the levels of learner expertise. SGM takes much more time and effort because SGM occurs as a multi-step process of model-building and revision (Penner 2001); however, SGM achieves better results in problem solving when time is not constrained and when learners have some prior experience in solving similar problems (Alessi 2000). If a domain is very complex and contains many variables, or if learners have no prior experience in building models, the expert-based modeling approach might be suggested (Funke 1991).

Methods

Participants

A total of 86 pre-service and in-service evaluators agreed to participate in the study. Pre-service evaluators were students who had taken program evaluation courses, and in-service evaluators were field evaluators; however, all of the participants were unfamiliar with the ethical decision-making cases used in the instruction. They were assured of the confidentiality of their responses in accordance with standard research practice, and all subjects completed an informed consent form before the experiment took place. However, 24 subjects dropped out during the study, so the remaining 62 participants constituted the final study sample. One of the authors interviewed some participants who dropped out of the study; they reported that the study imposed a high mental load because the ethical decision-making domain is very complex and all of the test questions were open-ended. The number and characteristics of the subjects who dropped out of the two groups were similar; therefore, attrition is unlikely to have biased the findings. In addition, there was no significant difference among the four groups (EM-inexperienced, EM-experienced, SGM-inexperienced, and SGM-experienced) on pretest results, so the four groups were comparable before the treatment. The mean age of the participants was 36.95 years, 64% were female, and 71% were Caucasian-American. As thanks for their participation, the participants were given a $10 Starbucks gift card.

Instructional interventions

Participants in this study were taught how to resolve ethical issues, and the instructional materials were presented entirely online and were self-paced. The instructional interventions were built around the following authentic cases pertaining to ethical decision making in program evaluation.

Cases

The topic of all of the cases used for the pretest, the MCI, and the posttest was ethical decision making in program evaluation. The cases described situations involving misrepresentation or misuse of evaluation results, and they were deemed representative of common ethical dilemmas as well as of ill-structured problems; all dilemmas had multiple acceptable solutions, and they presented alternative decision-making paths. All participants were presented with the same cases (see Fig. 1). After reviewing the case, participants in the EM group were provided with the conceptual models of experts on how to solve ethical conflicts within program evaluation. Participants in the self-guided group received only guiding questions, thus giving them the opportunity to develop their own representations of and solutions to the case; this was an opportunity for learners to develop their own mental models. After the instruction, learners were required to complete a posttest.

Fig. 1 Sample screen of a case

Model-centered instruction: expert modeling and self-guided modeling

Expert modeling instruction

The EM instruction in this study provided learners with conceptual models of experts (see Fig. 2). Commentary sections written by two experts in the publication Evaluation Ethics for Best Practice (Morris 2007) were used as sources for constructing the conceptual models. However, the original texts were too long and were not structured for online instruction, which should be easy to read so that learners can quickly understand the content and focus on key information. Therefore, we summarized the commentaries and designed a layout suitable for online EM instruction. The EM instruction (see Fig. 2) consisted of two parts: (1) guiding principles that might be relevant, and (2) two conceptual models of experts that are relevant to such situations. Through this EM instruction, learners were given an opportunity to observe the cognitive processes of experts, compare them with their own problem-solving processes, and then gradually internalize the cognitive processes of the experts.

Fig. 2 Sample screens of EM instruction

Self-guided modeling instruction

Self-guided modeling instruction (see Fig. 3) gave learners the opportunity to solve the problems with the support of guiding questions, an alternative form of instructional support in place of the expert models provided to the other group. Learners were presented with a case on ethical conflicts within program evaluation and were then asked to answer five questions. The questions addressed each step of the problem-solving process: (1) representing the problem, (2) analyzing the problem, (3) generating solutions, (4) validating solutions, and (5) reflecting on the solution. Through this instruction, learners had the opportunity to develop their own mental model of an evaluation case; they were provided with the guiding principles and standards for the evaluation case but not with the expert conceptual models.

Fig. 3 Sample screens of SGM instruction

Independent variables

Types of model-centered instruction (MCI)

  • Expert modeling (EM): Students were provided with the conceptual models of experts for the purpose of teaching them how to solve ethical conflicts within program evaluation.

  • Self-guided modeling (SGM): Students received minimal guidance (guiding questions) in developing their own mental models for solving ethical conflicts within program evaluation.

Learner’s expertise: inexperienced versus experienced

The criteria for defining experts can vary across domains. As Jonassen (1999) noted, solving ill-structured problems relies on case-based reasoning, or the application of previous experiences; thus, previous experience is the most important factor for defining the level of expertise in ill-structured problem-solving domains. The problems for this study presented ethical conflicts in program evaluation; therefore, work experience in evaluation and ethics was one criterion for defining the learner’s expertise. We also considered the courses participants had taken in evaluation and ethics as a second criterion. All of the participants were classified as either experienced or inexperienced based on whether they had previously completed classes in evaluation and ethics and had some field experience. As a final criterion, we considered the pretest results and classified those who could not answer any questions as inexperienced participants. These inexperienced participants did not have any work experience or coursework; therefore, this final criterion confirmed the classifications from the two earlier criteria: work experience and coursework in evaluation or ethics.
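As a rough illustration only, the sketch below encodes a classification rule of this kind; the parameter names and the exact way the three criteria are combined are assumptions for illustration, not the study’s actual procedure.

```python
# Hedged sketch of an expertise classification rule of the kind described above.
# Parameter names and the way the criteria are combined are illustrative assumptions.
def classify_expertise(has_coursework: bool, has_field_experience: bool,
                       answered_any_pretest_item: bool) -> str:
    # pretest used as a confirming criterion: no answered items -> inexperienced
    if not answered_any_pretest_item:
        return "inexperienced"
    # experienced: prior coursework in evaluation/ethics plus some field experience
    return "experienced" if (has_coursework and has_field_experience) else "inexperienced"
```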

Dependent variables

Instructional effectiveness

Being able to monitor changes in mental models before and after instruction provided us with the necessary insight into the effect of instruction on complex problem-solving processes. Accordingly, changes in mental models should be assessed using valid and reliable methods. Therefore, before and after the MCI (EM or SGM), participants were asked to solve an ethical decision-making case. In order to assess the changes in mental models, the mental models of the participants were compared to the representative mental model of one expert. Since T-MITOCAR (the tool used for analysis; see below) does not allow combining several models, one representative expert was selected for this study. The expert had diverse academic and practical experience extending over 30 years; she had worked as an adjunct assistant professor in Program Evaluation and as a school district administrator in the Program Evaluation office.

Text-model inspection trace of concepts and relations tool

In this study, we used the Text-Model Inspection Trace of Concepts and Relations tool (T-MITOCAR), one of the HIMATT (Highly Integrated Model-based Assessment Tools and Technologies; Pirnay-Dummer et al. 2010) tools for capturing mental models in problem-solving processes. T-MITOCAR is a software tool that is based on mental model theory (Seel 1995). T-MITOCAR can be used with text-based input, such as narratives and descriptions of the problem, to create model representations and associations of learner knowledge (Johnson et al. 2006). First, the identification phase is a simple collection of statements in natural language. Then, the expressions are reviewed for plausibility and relatedness to the subject domain. After the review, a concept parser filters nouns and makes a list of the most frequent concepts; the 30 most frequent terms from the concept parsing are used to make grouped lists. Then, through the verification and confrontation modes, the concepts are compared pairwise with respect to closeness, contrast, combination, and confidence. Finally, the re-representations are constructed in a graphical output.
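To make the general idea concrete, the sketch below derives a small concept graph from free text using a frequency-and-co-occurrence heuristic. This is not the T-MITOCAR algorithm; the tokenizer, the noun-filtering shortcut, the 30-term cutoff, and the co-occurrence window are simplifying assumptions for illustration only.

```python
# Minimal sketch (not T-MITOCAR): build a small concept graph from text by keeping
# the most frequent terms and linking terms that co-occur within a sliding window.
import re
from collections import Counter
from itertools import combinations

def concept_graph(text, max_terms=30, window=12):
    words = re.findall(r"[a-z]+", text.lower())
    # crude stand-in for noun filtering: keep frequent words longer than three characters
    frequent = Counter(w for w in words if len(w) > 3).most_common(max_terms)
    terms = {t for t, _ in frequent}
    edges = set()
    for i in range(len(words)):
        in_window = terms.intersection(words[i:i + window])
        edges.update(combinations(sorted(in_window), 2))
    return terms, edges

sample = ("The evaluator must decide whether the client may revise the report "
          "when stakeholders pressure the evaluator to soften negative findings.")
terms, edges = concept_graph(sample)
print(sorted(terms))
print(sorted(edges))
```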

T-MITOCAR provides six indices for determining the similarities between two models: surface matching, graphical matching, structural matching, gamma, concept matching, and propositional matching. Semantic indices (concept matching and propositional matching) and structural indices (surface matching, graphical matching, structural matching, and gamma) may correspond in single cases; however, empirically they measure different aspects of mental models. Pirnay-Dummer and Spector (2008) suggested that researchers should choose the indices that correspond best with their research question and theoretical foundation. Therefore, we chose gamma as a structural index and concept matching as a semantic index for this study. Previous literature (Spector and Koszalka 2004; Kim 2008) also confirmed that concept matching and gamma are appropriate measures for assessing mental models.

Concept matching compares the matching sets of concepts between two models to determine the use of terms (Pirnay-Dummer 2007). This measure looks directly at the number of words or labels that match when comparing two models. Participants who work in the same knowledge domain would be expected to know similar concepts and terms used as nodes; therefore, this measure can be used for assessing domain expertise.
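As a simple illustration, concept matching can be thought of as a set-overlap computation over the node labels of the two models; the Jaccard-style ratio below is an assumption for illustration, since T-MITOCAR defines its own similarity function.

```python
# Illustrative set-overlap (Jaccard-style) computation for concept matching between
# a learner model and an expert model; T-MITOCAR's actual similarity function may differ.
def concept_matching(learner_concepts, expert_concepts):
    learner, expert = set(learner_concepts), set(expert_concepts)
    if not learner or not expert:
        return 0.0
    return len(learner & expert) / len(learner | expert)

print(concept_matching(
    {"evaluator", "report", "stakeholder", "misuse"},
    {"evaluator", "report", "client", "standards"}))   # -> 0.33...
```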

Gamma measures the density of vertices, which describes the overall connectedness of a graph or concept map; it ranges from 0 (no connections) to 1 (all possible connections among pairs of terms or nodes in the map). Gamma indicates the cognitive structure (i.e., the breadth of understanding of the underlying subject matter). In order to have a good working model, a medium density is expected, since the extremes of having all terms connected or very few terms connected indicate a weak model (Pirnay-Dummer 2007). In a previous study, Spector and Koszalka (2004) confirmed that novice representations show low gamma, while expert representations exhibit higher gamma (around 0.5). Therefore, the gamma measure is appropriate for assessing structural aspects of participants’ mental models in this study, which investigates the different ways in which EM and SGM affect inexperienced and experienced participants.
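The sketch below computes a gamma-like density score as observed links divided by possible links among the concept nodes; treating gamma as simple undirected graph density is a simplifying assumption for illustration, not T-MITOCAR’s exact definition.

```python
# Gamma-like density sketch: observed links divided by possible links among the
# concept nodes of an undirected graph (illustrative assumption, not T-MITOCAR's definition).
def gamma_density(num_nodes, edges):
    possible = num_nodes * (num_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0

# 5 concepts linked by 4 relations -> density 0.4, in the "medium" range expected
# for a good working model
print(gamma_density(5, [(0, 1), (1, 2), (2, 3), (1, 4)]))
```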

Instructional efficiency

According to Tuovinen and Paas (2004), both instructional effort and test effort provide more useful measures of the efficiency of instruction, because even when two students have invested equal instructional effort, different amounts of test effort might be needed to achieve the same performance. Therefore, both types of mental effort (instructional and test) were used to measure instructional efficiency. We also considered the amount of time needed to complete the instruction, because instructional time is related to the difficulty of the instructional materials.
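For reference, Tuovinen and Paas (2004) combine standardized performance with the two effort measures into a single three-dimensional efficiency score, sketched below in its usual form; in the present study the effort and time measures are reported separately rather than combined, so the formula is shown only to clarify the cited approach.

$$E = \frac{z_{P} - z_{E_{\mathrm{instruction}}} - z_{E_{\mathrm{test}}}}{\sqrt{3}}$$

where $z_{P}$, $z_{E_{\mathrm{instruction}}}$, and $z_{E_{\mathrm{test}}}$ are the standardized (z-score) performance, instructional effort, and test effort values.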

Perceived mental effort: instructional effort and test effort

Perceived mental effort reflects the amount of cognitive capacity allocated to a problem-solving task, and it was used as an index for cognitive load in this study. Perceived mental effort was measured by the single-item, 9-point rating scale developed by Paas and van Merriënboer (1994). The scale ranges from 1 (very, very low mental effort) to 9 (very, very high mental effort), and the participants were asked to use this scale to indicate the amount of mental effort they used during instruction (instructional effort) and testing (test effort). Previous studies have shown that this instrument is a reliable measure of perceived cognitive load (Paas and van Merriënboer 1994). The internal reliability of this instrument was 0.96.

Instructional time

The Web-based program tracked the time (in seconds) that the participants needed to complete the instruction. No time limit was set, and the average time participants spent on each problem was approximately 12–15 min.

Instructional engagement

After the instruction, student engagement with the instructional material was measured by a modified version of Keller’s (1993) Instructional Material Motivation Survey (IMMS). This revised IMMS provided a situational measure of the effects of instructional materials on the motivation of learners. We chose five items related to online instructional materials for each of the subscales: attention, relevance, confidence, and satisfaction. The participants answered each statement in relation to the instructional materials they had just studied, and they indicated how true each statement was in relation to their experience. The response scale ranged from 1 (Not True) to 5 (Very True). The internal reliability of this instrument was 0.93.
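A hedged sketch of how such a modified IMMS might be scored is shown below: five items per subscale rated from 1 to 5, with subscale scores taken as item means. The item-to-subscale ordering and the use of means rather than sums are assumptions for illustration.

```python
# Illustrative scoring of a 20-item modified IMMS (4 subscales x 5 items, rated 1-5).
# The item ordering and the use of item means are assumptions, not the study's exact scheme.
from statistics import mean

SUBSCALE_ITEMS = {
    "attention":    range(0, 5),
    "relevance":    range(5, 10),
    "confidence":   range(10, 15),
    "satisfaction": range(15, 20),
}

def score_imms(responses):
    """responses: list of 20 ratings (1-5), in the subscale order assumed above."""
    return {name: mean(responses[i] for i in items)
            for name, items in SUBSCALE_ITEMS.items()}

print(score_imms([4] * 5 + [4] * 5 + [3] * 5 + [2] * 5))
```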

Procedures

When the participants connected to the online study material, they were guided to the consent form, which explained that they would be studying instructional material on the topic of ethical conflicts in program evaluation. Those who agreed to participate in the study were asked to respond to a short demographic survey that included field of study, gender, race, and age. This survey also gathered information about their expertise in ethical decision making in program evaluation, such as experience and courses taken in ethics and evaluation. Then, as a pretest, the initial mental models of the participants were captured by having them solve a case on ethical conflicts. Upon completion of the pretest, the participants were asked to rate the mental effort they had invested in solving the pretest problems. Then, each participant was presented with the instructional material corresponding to his or her treatment group: EM or SGM. Since the instructional material had been designed as a self-paced program, the students were able to control the speed at which they studied the instruction. While students studied the instructional materials, the time spent on the instruction was automatically measured and stored in a database. After they had finished the instructional materials, the instructional effort they had spent on them was measured with the 9-point rating scale. Subsequently, they were asked to complete the instructional engagement questionnaire, which included 20 items for measuring their motivational reactions to the instructional materials. After they had submitted the questionnaires, the posttest appeared on their screens. The posttest problem was an ethical conflicts case, and the process of solving this problem provided information about how the two types of MCI had changed participants’ mental models. After completion of the posttest, the participants were asked to rate the test effort they had invested in solving the posttest problems. When the posttest was completed, the participants were thanked for their participation.

Research design and data analysis

A 2 × 2 factorial research design was used for this study: the first factor represents the type of MCI (EM vs. SGM), and the second factor represents the level of learners’ expertise (inexperienced vs. experienced). Descriptive and parametric statistics were employed to analyze the data gathered for each of the following outcome measures: (a) instructional effectiveness (concept-matching similarity and gamma similarity with an expert); (b) instructional efficiency (instructional effort, test effort, and instructional time); and (c) instructional engagement. Two-way ANOVA and MANOVA were employed to determine whether there were significant differences among the four groups in the effectiveness, efficiency, and engagement measures. Effect sizes (partial eta squared) are also reported.
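A minimal sketch of this kind of 2 × 2 analysis in Python is shown below, assuming a hypothetical data file and column names (study_data.csv, mci, expertise, gamma_sim); it is not the analysis script used in the study, and the MANOVA of the engagement subscales is omitted.

```python
# Hedged sketch of a 2 x 2 factorial ANOVA with partial eta squared, using statsmodels.
# File and column names are hypothetical; this is not the study's actual script.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("study_data.csv")            # one row per participant (hypothetical file)
model = ols("gamma_sim ~ C(mci) * C(expertise)", data=df).fit()
anova = sm.stats.anova_lm(model, typ=2)       # main effects and interaction
# partial eta squared: SS_effect / (SS_effect + SS_residual)
ss_resid = anova.loc["Residual", "sum_sq"]
anova["partial_eta_sq"] = anova["sum_sq"] / (anova["sum_sq"] + ss_resid)
print(anova)                                   # the value in the Residual row is not meaningful
```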

Results

Instructional effectiveness

Concept-matching similarity

No main effect was found for the two types of MCI on the concept-matching similarity measure [F(1, 58) = 0.117, p = 0.733, ηp² = 0.002]; that is, concept-matching similarity was not significantly different across the two MCI groups. As shown in Table 1, the mean score for learners in EM was 0.09 (SD = 0.11), whereas the mean score for learners in SGM was 0.08 (SD = 0.12).

Table 1 Means and standard deviations of dependent measures across groups

Concerning the interaction effect, ANOVA revealed no interaction between the two types of MCI and the two levels of learners’ expertise on concept-matching similarity [F(1, 58) = 0.990, p = 0.324, ηp² = 0.017].

Gamma similarity

There was no main effect for the two types of MCI on the gamma similarity measure [F(1, 58) = 0.005, p = 0.946, ηp² = 0.00]; that is, gamma similarity was not significantly different across the two MCI groups. As shown in Table 1, the mean score for learners in EM was 0.47 (SD = 0.26), and the mean score for learners in SGM was also 0.47 (SD = 0.31).

Concerning the interaction effect, ANOVA revealed no interaction between the two types of MCI and the two levels of learners’ expertise on gamma similarity [F(1, 58) = 1.323, p = 0.255, ηp² = 0.022].

Instructional efficiency

Perceived mental effort: instructional effort

ANOVA revealed a statistically significant interaction [F(1, 58) = 8.939, p = 0.004, ηp² = 0.134] between the two types of MCI and the two levels of learners’ expertise on instructional effort (see Fig. 4), indicating that the effect of EM and SGM differed across experience levels. However, there was no main effect for the two types of MCI on instructional effort [F(1, 58) = 0.076, p = 0.784, ηp² = 0.001], indicating that there was no overall difference between the two groups (EM vs. SGM). As shown in Table 1, the mean effort for learners in EM was 5.42 (SD = 1.61), whereas the mean effort for learners in SGM was 5.45 (SD = 1.61).

Fig. 4 The interaction between MCI and levels of learners’ expertise for instructional effort

Perceived mental effort: test effort

No main effect was found for the two types of MCI on test effort [F(1, 58) = 0.008, p = 0.242, ηp² = 0.000], indicating that there was no difference between EM and SGM on test effort. As shown in Table 1, the mean effort for learners in EM was 5.55 (SD = 1.71), whereas the mean effort for learners in SGM was 5.58 (SD = 1.96). Concerning the interaction effect, ANOVA revealed no interaction between the two types of MCI and the two levels of learners’ expertise on test effort [F(1, 58) = 1.352, p = 0.250, ηp² = 0.023].

Instructional time

ANOVA revealed a significant main effect for the two types of MCI on instructional time [F(1, 58) = 11.015, p = 0.002, ηp² = 0.160], indicating that the participants in the EM approach (M = 503.55, SD = 298.24) used less time during instruction than did those in the SGM approach (M = 848.13, SD = 479.58).

Concerning the interaction effect, ANOVA revealed no interaction between the two types of MCI and the two levels of learners’ expertise on instructional time [F(1, 58) = 0.125, p = 0.725, ηp² = 0.002].

Instructional engagement

A two-way MANOVA revealed a significant overall effect for the two types of MCI on instructional engagement in terms of attention, relevance, confidence, and satisfaction [Wilks’ Lambda = 0.37, F(4, 55) = 23.86, p < 0.001, ηp² = 0.633]. Concerning the interaction effect, MANOVA revealed no interaction between the two types of MCI and the levels of learners’ expertise on instructional engagement [Wilks’ Lambda = 0.95, F(4, 55) = 0.732, p = 0.574, ηp² = 0.051].

Follow-up ANOVAs, using a Bonferroni-adjusted alpha level of 0.05, revealed a significant main effect for the two types of MCI on confidence and satisfaction [F(1, 58) = 57.599, p < 0.001, ηp² = 0.498 and F(1, 58) = 26.185, p < 0.001, ηp² = 0.311, respectively]. In addition, there was a statistically significant main effect for the levels of learners’ expertise on attention and satisfaction [F(1, 58) = 4.925, p = 0.03, ηp² = 0.078 and F(1, 58) = 6.991, p = 0.011, ηp² = 0.108, respectively].

Discussion of research findings

Instructional effectiveness

First, regarding the gamma similarity measure, the ANOVA results did not indicate any significant differences between the EM and SGM groups; all learners exhibited a medium gamma (M = 0.47), which indicates a good working model. According to Pirnay-Dummer and Spector (2008), a low-density gamma (graphs that only connect pairs of terms) can be considered to represent a weak model, whereas a medium density is expected for good working models. These results are consistent with the previous study by Spector and Koszalka (2004), which showed that experts exhibited a gamma of around 0.5 after instruction. Therefore, it can be concluded that MCI is effective in developing good working mental models.

However, in contrast to the results for gamma similarity, concept-matching similarity was very low for both types of MCI: EM (M = 0.09) and SGM (M = 0.08). Although both concept matching and gamma are measures for assessing mental models, they measure different features: gamma reflects structural aspects of mental models, while concept matching reflects semantic aspects (Pirnay-Dummer and Spector 2008). Concept-matching similarity illustrates the differences in language use between the mental model of a participant and that of an expert; therefore, MCI might not be effective for teaching declarative knowledge, such as the facts and concepts needed for problem solving, especially in ill-structured problems. One reason for the low concept-matching similarity could be the ill-structured nature of the problems. Because ill-structured problems can have multiple solutions and various solution methods, they often require learners to make judgments and express personal opinions or beliefs while solving the problem; therefore, different problem solvers might have used different terminology while solving the ill-structured problems in this study. For example, the experts in this study used the technical terms found in the guiding principles of program evaluation, whereas some participants who did not have any prior knowledge or experience did not use technical terms.

Instructional efficiency

Perceived mental effort: instructional effort and test effort

For instructional effort, ANOVA revealed no significant difference between the two types of MCI. Although the descriptive data showed slightly higher instructional effort in SGM (M = 5.45) than in EM (M = 5.42), the difference was not statistically significant. Similarly, ANOVA showed no significant main effect for the two types of MCI on test effort.

However, there was a statistically significant interaction (p = 0.004) between the two types of MCI and the two levels of learners’ expertise. During EM instruction, inexperienced participants (M = 5.06) invested less mental effort than experienced participants (M = 5.80); in SGM instruction, however, experienced participants (M = 4.54) invested less mental effort than inexperienced participants (M = 6.11). Therefore, the effects on cognitive load reversed with increasing levels of expertise. This finding is consistent with the expertise reversal effect (Kalyuga et al. 2003). For the interaction between the two types of MCI and expertise on test effort, the pattern was similar but not statistically significant. This result confirms the findings of a previous study by Kalyuga et al. (2001), in which trainees who studied worked examples showed lower ratings of mental load than did similar trainees in exploratory procedures; when participants became more experienced in the domain, the advantage of the worked-example condition disappeared. Less guidance appears to be better for experienced learners because additional instructional guidance might be redundant, requiring them to integrate it with previous schemas and thus leading to working memory overload. The additional cognitive load may be imposed even if a learner recognizes the instructional materials to be redundant and decides to ignore that information as best he or she can; redundant information is frequently difficult to ignore. For this reason, a minimal-guidance format might be more beneficial for these learners because they are able to construct their own mental models by extracting knowledge from previous schemas. Kalyuga et al. (1998) and Yeung et al. (1998) found that experienced learners studying a minimal format reported lower estimates of mental load compared to formats with redundant information. Therefore, it seems reasonable to conclude that levels of learner expertise should be considered in order to design efficient instruction for ill-structured problem solving and, specifically, ethical decision making in program evaluation.

Instructional time

The two types of MCI (EM and SGM) differed significantly in instructional time. Specifically, participants in the EM group (M = 503.55) invested less time than those in the self-guided learning group (M = 848.13). Since SGM instruction provides minimal guidance, learners may have lost direction in their learning, which could have led to longer instructional time.

ANOVA revealed no interaction effect between the two types of MCI and the two levels of learners’ expertise. This may have occurred because, regardless of the type of MCI employed, the experienced learners tended to use less time than the inexperienced learners, even though the time difference was not statistically significant. In EM instruction, experienced learners (M = 429.93) used less time than inexperienced learners (M = 572.56). In SGM instruction, the difference between the two groups was smaller than in EM, but experienced learners (M = 807.54) still used less time than inexperienced learners (M = 877.44).

Instructional engagement

Many researchers (Keller 1983; Schunk 1991; Schunk et al. 2008) consider engagement to be an important factor in successful instruction because it can influence what, when, and how students learn. Student engagement with the two types of MCI was measured by a modified version of Keller’s (1993) IMMS consisting of four subscales: attention, relevance, confidence, and satisfaction.

Two-way MANOVA revealed that the two types of MCI had a significant overall effect on instructional engagement in terms of attention, relevance, confidence, and satisfaction. The descriptive analysis showed that learners in EM instruction exhibited higher engagement than those in SGM instruction on all four subscales: attention (EM: M = 3.8; SGM: M = 3.5), relevance (EM: M = 3.9; SGM: M = 3.7), confidence (EM: M = 4.0; SGM: M = 2.8), and satisfaction (EM: M = 3.3; SGM: M = 2.1). Concerning the interaction effect, MANOVA revealed no interaction between the two types of MCI and the levels of learners’ expertise on instructional engagement.

The follow-up ANOVA results indicated a significant main effect of the two types of MCI on confidence and satisfaction. Learners in EM instruction (M = 4.0) exhibited higher confidence than those in SGM instruction (M = 2.8). Regardless of learners’ initial status, learners in EM instruction benefited from the opportunity to observe the mental models of experts: for inexperienced learners, EM may have provided a chance to learn the cognitive processes of experts, whereas for experienced learners it may have been a good opportunity to compare those processes with their own. Likewise, learners in EM instruction (M = 3.3) exhibited higher satisfaction than did those in SGM instruction (M = 2.1). Learners may have felt confident and satisfied when the instruction provided sufficient guidance; learners in SGM, on the other hand, may have experienced frustration because they received minimal guidance. Clark (1982) reported suggestive evidence that even experienced students indicate higher satisfaction with guided approaches such as EM instruction; advanced students who select the more guided versions of courses do so because they believe that they will achieve the required learning with a minimum of effort. Because experienced learners expect positive outcomes after guided learning, the experienced learners in the EM group exhibited more satisfaction than those in the SGM group. Regarding relevance, both EM (M = 3.9) and SGM (M = 3.7) learners rated the relevance of MCI highly. Keller (1983) noted that if learners perceive instruction to be helpful in accomplishing their goals, they will be more conscious of activating existing schemas and integrating new knowledge into them; therefore, the perception that instruction is relevant will eventually improve learning.

Regardless of the types of MCI employed, the inexperienced participants expressed significantly higher levels of attention and satisfaction than did the experienced participants. It seems that the inexperienced participants were energized by the opportunity to solve an ethical decision-making case.

Conclusion

The findings from this study contribute to our knowledge of designing instruction for ill-structured problem-solving domains. The study focused on ethical decision making in the context of program evaluation. The results cannot be generalized to other ill-structured domains, but they do suggest directions to explore in other domains, such as the counseling field, where ethical dilemmas are prevalent. This study also supported the expertise reversal effect and suggested that learner expertise should be carefully considered by instructional designers in the future design of MCI. Further research should consider other measures for assessing mental models. Although T-MITOCAR provides one type of mental model assessment, it does not explain the mental processes of learners in sufficient detail, and the present results do not provide concrete and specific answers as to how the different types of MCI affected learners’ processing of mental models. Therefore, qualitative assessment methods might help validate the mental model results of this study. Analyzing the learners’ answers and observing learners’ problem-solving processes would help us better understand the effects of MCI and would provide more refined directions for future research.

In addition, 24 of the 86 participants dropped out during the study, and only data from the 62 participants who completed the study were used in the analysis. Several factors may have contributed to the high drop-out rate. First, the nature of online instruction could be a primary reason: the drop-out rate for online learners is typically higher than that for on-campus learners (Kember 1995). Because instructors do not monitor their online learners, students can leave the instruction at any time because of other distractions (e.g., phone calls). Moreover, because this was not a required part of their coursework, those who experienced difficulty or challenges in solving ethical problems may have been more prone to drop out of the study. Even though this study satisfied the required sample size, having more subjects would have increased statistical power and the likelihood of detecting significant effects. Finally, meeting the educational needs of any type of student is a critical issue in today’s educational climate, and understanding how mental models and experience play a role in students’ motivation and learning may lead to improved instructional systems.