Introduction

Human working memory is defined as a component system responsible for the temporary storage and manipulation of information related to higher-level cognitive behaviors, such as understanding and reasoning (Baddeley 1992a; Becker and Morris 1999). Working memory, while able to manage a complex array of cognitive activities, presents a significant limitation in that only a few elements or chunks of information can be processed in working memory at a given time. Miller (1956) established that working memory can only maintain about seven elements of information at a time. In practical terms, human working memory is increasingly prone to error as the learning task becomes more complex, and under typical circumstances, can only hold elements active in working memory for a matter of seconds without rehearsal (Anderson et al. 1996; Baddeley 1992a; Miller 1956; Shiffrin and Nosofsky 1994).

In contrast, long-term memory effectively stores all of our knowledge (content, skills, and strategies) on a permanent basis with the ability to recall this information being somewhat more variable (Baddeley 1992b; Ericsson and Kintsch 1995). The robust nature of long-term memory is a function of schemata that allow an individual to treat multiple elements of information as a single element in terms of imposed working memory load. Given that schemata are managed in working memory as a single element, increased working memory is available to address the other elements of a problem state, especially if schemata are processed in an automated fashion (Cooper and Sweller 1987; Gobet and Simon 1996; Sweller et al. 1990; Sweller et al. 1983).

The mechanisms that underlie the cognitive task of learning and the factors that determine the difficulty of instructional materials have been the focus of much research over the past 30 years (Paas et al. 2003b, 2004; Sweller 1999; Sweller et al. 1998). Cognitive load theory (CLT), as conceptualized by Sweller (1988) and his colleagues in the late 1980s, is concerned with instructional and message design principles that seek to improve the learning of complex cognitive tasks by managing the limited processing capabilities of working memory while capitalizing on the extensive capabilities of long-term memory.

Dimensions of cognitive load

Cognitive load theory proposes that total cognitive load or the total amount of mental load that is imposed on working memory is composed of three components: (a) intrinsic cognitive load, (b) extraneous cognitive load, and (c) germane cognitive load (Sweller et al. 1998). Intrinsic cognitive load (ICL) is imposed on the learner by the nature of the material being processed and learned (Sweller et al. 1998). For example, instruction that requires a novice learner to simultaneously process a high number of information elements (e.g., a time–distance problem) in working memory would be expected to impose high intrinsic load. This aspect of cognitive load is not under the direct influence of the instructional designer, but there is evidence to support the indirect manipulation of ICL by incorporating sequencing and layering strategies into instructional design processes and learning tasks (Pollock et al. 2002).

Extraneous cognitive load (ECL) is imposed by factors such as instructional strategies, message design, interface design, and the quality of instructional materials and learning environments. ECL is readily influenced by instructional design decisions and has been the focus of much investigation (Sweller et al. 1998). In simple terms, high ECL equates to a reduction in working memory resources available for developing schema, while low ECL equates to an increase in working memory resources available for schema development. Research related to the physical integration of diagrams and text and the elimination of unnecessary information in order to reduce demands on working memory has been conducted with much success in the knowledge domains of biology, computer-aided design/computer-aided manufacturing, electrical engineering, computer programming, and mathematics (Bobis et al. 1993; Chandler and Sweller 1991, 1996; Kalyuga et al. 1998; Leung et al. 1997; Sweller et al. 1998; Tarmizi and Sweller 1988).

The third and final dimension of cognitive load is germane cognitive load (GCL) and is described as the “load imposed by cognitive processes directly relevant to learning” (van Merriënboer et al. 2002, p. 12). Germane cognitive load is the remaining working memory capacity the learner uses to form schema. If the nature of the content imposes a high intrinsic load and poor design imposes additional extraneous load on working memory, then the learner may lack the working memory capacity (i.e., germane cognitive load) to form the schema needed for understanding. Consider the following example. After reviewing a web-based unit on how to calculate the mean and standard deviation in a spreadsheet that has professional narration with animated pictures, the designer decides to scroll a text version of the narration across the lower part of the screen that is synced with the voice narration. While the intrinsic cognitive load is medium to high for this content, the extraneous cognitive load is increased due to the redundancy effect created by the narration and scrolling script. The result is a reduction in working memory capacity for germane cognitive load that can prevent the formation of an appropriate schema. Germane load is indirectly influenced by manipulating extraneous cognitive load and is directly linked to schema formation and automation (Sweller et al. 1998). Instructional processes that seek to foster germane load and schema formation have been shown to be effective under certain conditions and have typically employed the use of worked examples, completion problems, and the means by which to transition such processes (Paas and van Merriënboer 1994; van Gerven et al. 2000; van Merriënboer et al. 2002). Last, given that intrinsic + extraneous + germane cognitive load equals Total Cognitive Load, the combination of ECL and ICL must leave sufficient cognitive resources available if germane load is to be addressed (Kirschner 2002; van Merriënboer et al. 2002).
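The additive relationship described above can be written compactly. This is a minimal sketch of the bookkeeping only: the symbol C_WM, standing for available working memory capacity, is an assumed label introduced here for illustration and does not appear in the cited sources.

```latex
\[
\mathrm{TCL} = \mathrm{ICL} + \mathrm{ECL} + \mathrm{GCL},
\qquad
\mathrm{TCL} \le C_{\mathrm{WM}}
\;\Longrightarrow\;
\mathrm{GCL} \le C_{\mathrm{WM}} - \left(\mathrm{ICL} + \mathrm{ECL}\right)
\]
```

Read this way, any increase in ECL at a fixed ICL lowers the ceiling on the capacity that can be devoted to germane load, which is the situation illustrated by the scrolling-narration example.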

Implication for instructional design

Prior research has suggested design strategies for structuring instructional material in domains such as biology, computer programming, and mathematics. The research suggests that complex instruction (i.e., high element interactivity) that exceeds working memory capacity can impede learning. In simple terms, there is strong evidence suggesting that high levels of cognitive load can be reduced by matching instruction and instructional processes with the cognitive architecture of human working memory. The purpose of this study was to investigate whether the cognitive load design strategies that instructional designers can employ to reduce redundancy and split attention are applicable to the teaching of complex cognitive and psychomotor skills (Paas et al. 2004; van Gerven et al. 2000; van Merriënboer et al. 2002).

Literature review

In situations involving more complex cognitive tasks such as problem solving, demands placed on working memory that are not directly related to the problem can hinder learning by exceeding available cognitive resources. This problem is particularly salient in the context of a novice learner and new information (Sweller et al. 1998). In such situations, instructional principles that avoid overburdening working memory or direct the learner’s available cognitive resources are needed to design efficient and effective instruction. The following discussion provides examples of how working memory is affected by instructional difficulty, redundancy, and split attention.

Element complexity and interactivity

From the perspective of working memory, an element is described as any unit or chunk of knowledge to be learned, and interactivity describes how the information is processed by working memory. In situations with low-element interactivity, such as with serial processing tasks, little or no overlap exists between elements (e.g., learning the primary colors) and the learning task will typically not be difficult unless the number of independent elements is rather high. In contrast, in situations with high-element interactivity, where understanding requires that all elements be maintained in working memory and manipulated simultaneously (e.g., solving a time–distance problem), learning tasks can become exceptionally difficult. In such instances the cognitive load imposed by trying to keep all information elements in working memory may exceed the processing abilities of working memory (Sweller and Chandler 1994). It is for this reason that cognitive load design strategies such as reducing split attention and redundancy, using a goal-free as opposed to means-end approach to problem solving, and using worked examples have been shown to be effective in areas involving more complex and novel learning tasks, where both element complexity and element interactivity are typically high and memory resources are likely to be taxed (Sweller 1994; Sweller and Chandler 1994; Sweller et al. 1998).

Split attention and redundancy

Split attention and redundancy are closely linked concepts that are typically managed with similar message design strategies. To explain, split attention or a split-attention effect occurs when a learner must cognitively integrate two or more divergent sources of information that cannot be understood in isolation (Sweller 1999). A common example is the reference in a text to a diagram that may be two or three pages removed from the discussion. In such instances, the learner devotes unnecessary cognitive resources remembering physical locations on the page and diagram that have nothing to do with the problem state, learning, or schema acquisition (Sweller 1999). In contrast, redundancy or a redundancy effect occurs when a learner is presented with two or more sources of information that can be understood in isolation. A common example is the text narrative and a diagram that include the same or equivalent information such that the learner can gain understanding by reading just one source. Similarly, redundancy occurs in multimedia instruction when the spoken narration is also scrolled as text on the screen. In cases such as these, redundant information can also place increased and unnecessary demands on cognitive resources resulting in increased extraneous cognitive load (Sweller 1994; Sweller and Chandler 1994).

To avoid split-attention effects, it is necessary to have all essential and related material physically positioned together, such as in an illustration or integrated diagram that includes the relevant text as part of the illustration or diagram. To avoid redundancy effects, it is necessary to eliminate duplicate sources of information. In short, unnecessary constraints placed on the learner’s cognitive resources by redundancy and split attention may increase extraneous cognitive load, resulting in reduced germane cognitive load that limits both instructional effectiveness and learning (Sweller 1990; Sweller and Chandler 1994).

In an early study, Tarmizi and Sweller (1988) found that worked examples for a geometry task that did not require the learner to split attention were superior to conventional problems. Their results supported the conclusion that split-attention effect may interfere with design strategies intended to promote germane load and further supported instructional formats that limit divergent sources of information. In a later study, Sweller et al. (1990) found that requiring the learner to integrate different sources of mutually referring (i.e., text and illustration) information interfered with learning mathematics and engineering materials, despite the use of schema-driven strategies, such as worked examples. A similar study by Ward and Sweller (1990) also provided support for instructional formats that limit divergent sources of information, regardless of the schema driven instructional strategies employed in the study.

Evidence of split attention and redundancy effects

The study of split attention and redundancy effects across single-format media studies has established significant increases in achievement, faster content processing times, reduced completion times, and decreased levels of cognitive load when split attention and redundancy were reduced through appropriate design interventions (Bobis et al. 1993; Chandler and Sweller 1991, 1992; Purnell et al. 1991; Sweller et al. 1990; Tarmizi and Sweller 1988; Ward and Sweller 1990). In contrast, given the situated nature of cognitive load research, specific levels of redundancy, split attention, or levels of unintelligibility are expressed as general strategies and not as formal prescriptive principles that readily transfer to other learners or knowledge domains (Bobis et al. 1993; Chandler and Sweller 1991, 1992; Purnell et al. 1991; Sweller and Chandler 1991; Tarmizi and Sweller 1988; Ward and Sweller 1990).

For example, the optimal instructional format can be expected to change as a function of learner knowledge, interactions between the structure and characteristics of the material to be learned and previously acquired schema, problem solving strategies, and learner involvement. As such, proper application of CL design principles must incorporate an understanding of the knowledge being taught, classification of learning tasks, and learner analysis (Bannert 2002; Cooper and Sweller 1987; Kalyuga et al. 2003; Paas et al. 2005; Sweller 1999; Sweller and Cooper 1985; Tarmizi and Sweller 1988).

Collectively, prior studies have provided support indicating that instructional material should typically be presented without redundant features, and that materials that cannot be understood in isolation should be physically integrated. Second, self-explanatory, integrated diagrams are presumed superior when redundant and incidental materials are removed. Third, learning and transfer are both favored by strategies that eliminate split attention and redundancy in technical areas.

Purpose of the study, hypotheses, and research questions

The purpose of this study was to test the effectiveness of instructional materials designed to control redundancy and split attention in the teaching of complex orthopedic physical therapy skills. The types of modifications employed consisted of integrating text and an illustration to reduce split attention and the removal of redundancy in the text and graphic in the original materials. The following hypotheses were tested to identify effectiveness of cognitive load design principles to the knowledge domain of physical therapy and to the teaching of specific psychomotor skills:

  1. Participants who receive modified instructional formats will achieve higher written post-test scores as compared to control group participants who receive traditional instructional formats.

  2. Participants who receive modified instructional formats will report lower subjective ratings of cognitive load as compared to control group participants who receive traditional instructional formats for both post-instruction and post-psychomotor performance.

  3. Participants who receive modified instructional formats will be superior on the performance of manual physical therapy skills as compared to control group participants who receive traditional instructional formats.

  4. Participants who receive modified instructional formats will have lower task completion times (instructional unit and examination) as compared to control group participants who receive traditional instructional formats.

In addition, we posed one research question asking if instructional materials designed in accordance with cognitive load theory design principles positively influence learner attitudes towards instruction.

Method

Participants

Participants were 41 graduate program Physical Therapy (PT) students who were recruited on a voluntary basis from two universities and were randomly assigned to either the modified instruction or control group. Seventeen participants from a physical therapy program at a large midwestern university (modified instruction n = 9, control n = 8) were recruited from a total of 20 possible participants. Twenty-four participants from a physical therapy program at a second, but smaller midwestern university (modified instruction n = 12, control n = 12) were recruited from a total of 28 possible participants. Because prior studies have shown that optimal instructional formats are in part dependent on the experience of the learner, only first professional year students who had no formal exposure to the instructional content were selected for this study (Pollock et al. 2002; Yeung 1999; Yeung et al. 1997). Specific to this study, participants were first year physical therapy students; 18 were in the process of applying for bachelor’s degrees, and 23 had already received bachelor’s degrees. Last, given the homogenous nature of accredited physical therapy programs and physical therapy curricula, participants from both applicant pools had completed equivalent prerequisite courses (i.e., basic patient care skills, surface anatomy and palpation, kinesiology, and human anatomy) required to understand the instruction provided in the two treatments.

Participant expectations included an understanding of study objectives, institutional regulations (e.g., voluntary participation and protection of participant anonymity), the importance of study content, and the need for authentic classroom participation without rewards or remuneration. Participants were also advised that the efficacy of two or more instructional formats was being studied and that instruction and testing would be implemented in a manner consistent with a typical physical therapy class.

Materials

Two questionnaires, two instructional units of equivalent content (i.e., actual and modified), a written post-test, a clinical performance rubric (i.e., a mock patient assessment), and procedural protocols were developed for this study. All materials were developed using formal instructional design principles. Content was directly applicable to clinical practice and consistent with curricular objectives. The instructional materials used were an actual unit of instruction that was part of the required curriculum at both universities. The unit for this study was co-developed and utilized by instructors at both universities prior to this study (i.e., the actual instruction). An equivalent unit of instruction was then modified (i.e., the modified instruction) to eliminate the split-attention and redundancy effects present in the actual instructional unit. In this paper, we will refer to the group receiving the actual instruction as the control group.

Additionally, lesson content was a self-paced, paper-based required reading for proceeding integrative laboratory sessions. All of the above attributes were maintained in this study with one necessary modification. Specifically, participants scheduled an available study time slot and completed the instruction in a controlled classroom environment along with the data collection instruments discussed below.

Lesson content

The content covered in the instruction was localization testing. Localization testing describes a series of adaptive cognitive and psychomotor PT examination procedures (i.e., application of tests and measures) and respective PT evaluation (i.e., interpretation of tests and measures) used to increase or decrease patient symptoms (e.g., pain) to determine the regional, structural, and segmental origin of symptoms. The procedures presented in both variations of the instruction and tested during the clinical performance assessment required the demonstration of a series of appropriately sequenced steps that could be used to clinically localize the primary anatomical region of dysfunction for an orthopedic patient complaining of generalized low back, pelvic, and leg pain. Specifically, the participants were presented a patient with difficult to localize thigh, pelvic, and low back pain that only occurs with weight bearing onto the involved leg. This case is a typical clinical presentation in which it would be necessary to specifically reproduce and alleviate the problematic symptom in order to identify the specific anatomical region(s) of dysfunction. From a curricular perspective, these tasks are cognitively demanding because the novice student must maintain multiple pieces of data in working memory to solve the problem, apply prior knowledge (e.g., anatomy and kinesiology), utilize critical thinking skills, make clinical decisions, utilize specific psychomotor skills, and communicate with the patient.

Control group instruction

The control group instruction contained a brief introduction and two knowledge sections. Section 1 described the concepts and principles of orthopedic provocation and alleviation. Section 2 described a procedure for performing the provocation and alleviation procedure that was tested during the psychomotor assessment. For this study, the control group materials were modified by replacing rudimentary diagrams with color photographs of a therapist performing the technique and correcting typographical errors, as identified by a subject-matter expert review.

The control group instruction included redundant features (i.e., text passages and redundant diagrams and/or diagram captions) and split-attention features (i.e., photograph of a therapist performing the technique with referring body text and/or figure captions). The unit was consistent with laboratory manuals, texts, and course room instruction used to teach PT curriculum. Following subject-matter expert review and revisions, the length of the unit of instruction was maintained at five and one-half content pages.

Modified instruction

Control group instruction was modified using strategies to remove redundancy and split attention. Content that was unintelligible in isolation was physically integrated with an illustration to remove split-attention effects. For example, Fig. 1 depicts a screen capture from the modified instruction, which physically integrates the body text, figure, and the figure caption from the actual instruction into a single diagram with all elements placed in close proximity. Following the elimination of redundant features, the length of the modified instruction was five fully occupied content pages.

Fig. 1 Modified instruction page capture depicting the integration of body text, a figure, and the figure caption

Subject-matter expert review process

The control group instruction was reviewed for accuracy and appropriateness by five subject-matter experts, two of whom were American Physical Therapy Association Board Certified Orthopedic Specialists. The unit of instruction underwent minimal technical modifications and the final unit was based on full subject-matter expert consensus. Next, the modified unit of instruction was constructed as described above and reviewed by the same five subject-matter experts. The final modified instructional unit was based on full subject-matter expert consensus; the subject-matter experts judged the two units as equivalent units of instructional content. Last, both units were reviewed by three instructional designers familiar with cognitive load design principles for the correct application.

Instruments

The following paragraphs describe the instruments in the order the participants completed them.

Post-instruction questionnaire

The post-instructional questionnaire was used to collect participant reported educational and biographical data across seven questions (e.g., age, gender, GPA, prior academic degrees, and previous exposure to content knowledge). Participants were then asked to rate subjective mental workload (CL) associated with learning the instructional materials on a seven-point scale, as adopted from prior studies (Kalyuga et al. 1998, 1999, 2000). Specifically, the question asked, “How easy or difficult was the instruction to understand,” and offered the responses: “extremely easy,” “very easy,” “easy,” “neither easy nor difficult,” “difficult,” “very difficult,” and “extremely difficult.” This subjective measure of cognitive load has been shown to be valid, reliable, and sensitive to small differences in cognitive load, and to correlate highly with objective measures (Kalyuga et al. 2000; Paas et al. 2003a; Tuovinen and Paas 2004). The last 10 questions on the instrument asked the participants to rate the instruction and learning in terms of quality, difficulty, effectiveness, relevance, and confidence, using a standard 5-point Likert scale (i.e., “strongly agree,” “agree,” “neutral,” “disagree,” and “strongly disagree”). Cronbach’s alpha values for the cognitive load and attitudes towards instruction questionnaires in this study were .76 and .73, respectively.
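For readers unfamiliar with the reliability index reported for these instruments, the following is a minimal sketch of how Cronbach’s alpha is computed from a participants-by-items table of scores, using the standard formula with sample variances. The function name and the example ratings are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a participants-by-items table of scores.

    responses: list of rows, one row per participant, one score per item.
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across participants.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from four participants on three items (not study data).
example_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
]
example_alpha = cronbach_alpha(example_responses)
```

Alpha approaches 1 when items rise and fall together across participants, and drops when items measure heterogeneous content.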

Written post-test

The written post-test consisted of 18 questions adopted from previous classroom examinations and selected to assess specific content features of the actual and modified instruction. Specifically, the first six items assessed content that was identical in both instructional units, the second six items assessed content that contained redundant features in the actual instruction, and the final six items assessed content that contained split-attention features in the actual instruction. Furthermore, each block of six questions included a knowledge, comprehension, application, analysis, synthesis, and evaluation question. Cronbach’s alpha for the written post-test was .48 in this study.

The moderate alpha value was attributed to the adoption of actual examination questions, the heterogeneous nature of authentic questions (e.g., testing of prior learning), and the complexity of content. For example, an evaluation question requested the structural origin of symptoms for a patient with complaints of pain with (1) “active shoulder flexion”, (2) “passive shoulder extension with an extended elbow and supinated forearm,” and (3) “resisted elbow flexion and resisted supination.” This content did not represent a one-dimensional construct. That is, provocation and alleviation constructs minimally included prior learning (e.g., terminology, anatomy, kinesiology), which was necessary to select the correct response (i.e., a contractile lesion involving the biceps brachii muscle).

Psychomotor performance grading rubric

A post-test psychomotor or physical therapy performance rubric was developed to evaluate the clinical, mock patient performance of each participant. Cronbach’s alpha for the physical therapy performance post-test was .83.

The rubric was constructed in a manner that was consistent with classroom testing of physical therapy students and contained three distinct criterion referenced grading sections (i.e., recall of knowledge, physical therapy examination, and physical therapy evaluation). All criteria were items that could be overtly stated by the participant or overtly observed by the proctors. The rubric protocol and mock patient responses were integrated into the instrument. For example, the introductory instruction read by a proctor stated, “Please demonstrate the one technique that was presented in the unit of instruction that you studied earlier today using the mock patient (i.e., as identified by first name). After you complete a step, we will tell you if their pain improved or stayed the same. When you complete the entire process, please identify that you are finished and we will ask you which region you believe is the source of pain. You have up to ten minutes, please begin.” The mock patient and both mock patient examiners were blinded to instructional format.

Additionally, the psychomotor-assessment instrument contained two scoring rubrics, one for “verbal” criteria and one for “procedure/technique” criteria. The verbal rubric included the following scale from high (4) to low (0): “answer is concise, accurate, and complete,” “answer is accurate and complete but lacks clarity and conciseness,” “answer is in part accurate with additions or deletions,” “answer is incomplete and/or inaccurate,” and “answer is unacceptable.” For example, the question “what region is the source of the patient’s pain” would be scored using the verbal criteria rubric. The procedure/technique criteria scale consisted of process-related criteria that specifically matched the sub-steps for the procedure presented in both instructional units and included the following scale from high (3) to low (0): “observed,” “partially observed,” and “not observed,” respectively.

Post-psychomotor cognitive load questionnaire

The post-psychomotor task instructional performance questionnaire asked the participant to identify “how easy or difficult was it to perform the procedure you just completed” using the same scale as described for the measurement of CL in the post-instructional questionnaire. It was administered immediately after performing the task.

Procedure

Participants at each institution were randomly assigned to one of the two treatment groups. In all, the 41 participants successfully completed the study in six data collection sessions at one university (modified instruction n = 12, control n = 12) and in five data collection sessions at the second university (modified instruction n = 9, control n = 8). Multiple data collection sessions during a 5-day period at each university allowed for greater learner autonomy, as would be expected with self-study materials, and limited each data collection session to a maximum of six participants (M number of participants = 3). The latter constraint was implemented to prevent lengthy waiting periods between the written assessment and the psychomotor assessment, which was limited to one participant at a time. Additionally, written informed consent, written participant agreements for adherence to content concealment until study completion, and strict adherence to study protocols were used to minimize confounds between sessions.
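Random assignment within each institution can be sketched as follows. The randomization mechanism actually used is not described here, so this is an illustrative assumption only: the function name, the participant identifiers, the seeds, and the rule that the modified instruction group receives the extra participant when a site has an odd number of participants are all hypothetical. That rule was chosen because it reproduces the reported group sizes.

```python
import random


def assign_within_site(participant_ids, seed=None):
    """Shuffle one site's participants and split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Assumption: with an odd count, the modified group gets the extra person.
    n_modified = (len(shuffled) + 1) // 2
    return {"modified": shuffled[:n_modified], "control": shuffled[n_modified:]}


# Hypothetical identifiers for the 24 and 17 participants at the two sites.
site_one = assign_within_site([f"U1-{i:02d}" for i in range(1, 25)], seed=2024)
site_two = assign_within_site([f"U2-{i:02d}" for i in range(1, 18)], seed=2025)
```

Randomizing separately at each site keeps the two treatment groups balanced within each university, which matters when the data sets are later pooled.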

Psychomotor or mock patient assessment

After completing the instructional materials, post-instruction questionnaires, and written post-tests, individual participants were escorted in random order to a separate laboratory designated for the individual psychomotor assessment. During the psychomotor assessment, all participants were asked to perform the same localization procedure that was taught in the instruction using a trained mock patient. Two proctors observed and questioned the participant per the assessment protocol. For example, in a properly sequenced mock patient examination, participants positioned the mock patient just into symptom for hip joint alleviation, un-weighted the hip joint via specific manual contact to the pelvis combined with a cranial force, identified mock patient symptoms and then repositioned the patient and performed the test for hip joint alleviation. Hip localization was followed by examination of the sacrum and lumbar spine for both provocation and alleviation, and the mock patient continued to state if symptoms increased, decreased or remained the same, when questioned. In order to standardize the examination process and test all components of the examination sequence, the mock patient provided responses that would lead to a single acceptable conclusion to the mock patient problem (i.e., the lumbar spine was the source of pain).

Two faculty proctors, who were APTA Board Certified Orthopedic Specialists, rated each individual participant. After the participant completed the task and left the room, the two proctors compared their ratings and arrived at a consensus rating. The post-psychomotor cognitive load questionnaire was administered immediately after the psychomotor assessment. Last, classroom protocols were administered by a non-physical therapy proctor.

Results

Multivariate and univariate analysis of all educational and biographical data (i.e., gender, age, year in program, cumulative GPA, semester GPA, prior academic degrees earned, type of degree(s), and prior knowledge and formal and/or informal exposure to instructional content) identified no significant effects between participants from the two universities, so the data sets were pooled for analysis. The four hypotheses were evaluated using a multivariate analysis of variance, and the significance level was set at alpha = .05. Additionally, no consequential violations of normality and homogeneity of variance were observed.

A MANOVA on the data from the cognitive and psychomotor post-tests, and the rating of cognitive load yielded an overall significant difference between the control and modified instruction group, Pillai’s Trace: F(6, 34) = 6.213, p < .001, ES = +0.52. Support was provided for hypotheses predicting superior post-test performance, lower ratings of cognitive load on written and psychomotor tasks, and superior psychomotor performance by the modified instruction group. There were no differences in the time required to complete the instruction or written post-test. Descriptive statistics for both groups are presented in Table 1.

Table 1 Descriptive statistics for control and modified instruction groups for the written post-test, psychomotor assessment, and cognitive load ratings

Analysis of written post-test scores—hypothesis 1

A univariate ANOVA revealed a significant main effect for the written post-test scores, F(1,39) = 16.564, p < .001, MSe = 2.12, ES = +0.30. The modified instruction group achieved significantly higher written post-test scores (M = 16.00) than control group participants, who received the actual instructional format (M = 14.15), as predicted by Hypothesis 1.
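The text does not name the effect size (ES) statistic. As a reader's check, the reported values appear consistent with partial eta squared recovered from each F statistic and its degrees of freedom. The sketch below rests on that assumption, and the function name is illustrative. For a two-group comparison, the same formula applied to the multivariate F gives Pillai's trace.

```python
# Assumption: the reported ES is partial eta squared,
#   eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
# The source does not state this; the check only shows that the numbers agree.

def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (F, df_effect, df_error, ES as reported in the text)
reported = {
    "MANOVA (Pillai's Trace)": (6.213, 6, 34, 0.52),
    "written post-test": (16.564, 1, 39, 0.30),
    "cognitive load, post-instruction": (6.02, 1, 39, 0.13),
    "cognitive load, post-psychomotor": (7.76, 1, 39, 0.17),
    "psychomotor rubric, overall": (29.15, 1, 39, 0.43),
}

for label, (f_value, df_effect, df_error, es_reported) in reported.items():
    es_computed = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: computed {es_computed:.3f}, reported {es_reported:.2f}")
```

Under this reading, the ES values describe the proportion of variance attributable to the instructional format, not standardized mean differences.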

Follow-up univariate analysis of written post-test scores was conducted to assess the effectiveness of the instructional design strategies across three content types: identical content, content with or without redundancy, and content with or without split-attention features. Univariate analyses of post-test scores revealed that, on content presented with redundancy in the control instruction, the modified instruction group (M = 5.71) scored significantly higher than the control group (M = 5.20), F(1,39) = 6.82, p = .013, MSe = .34, ES = +0.15. Similarly, when split-attention features were corrected, the modified instruction group (M = 4.81) performed significantly better than the control group (M = 3.85), F(1,39) = 9.73, p = .003, MSe = .97, ES = +0.20. No differences were noted between the control and modified instruction groups on content with identical presentation formats (see descriptive statistics, Table 2).

Table 2 Descriptive statistics for identical, redundant, and split attention content post-test scores

Analysis of cognitive load ratings—hypothesis 2

A follow-up univariate ANOVA revealed a significant main effect for subjective ratings of cognitive load measured after the completion of the instruction, F(1,39) = 6.02, p = .019, MSe = .69, ES = +0.13, and after the completion of the psychomotor performance task, F(1,39) = 7.76, p = .008, MSe = 1.02, ES = +0.17. As predicted by Hypothesis 2, participants who received the modified instruction reported significantly lower post-instructional subjective ratings of cognitive load (M = 2.71), as compared to control group participants (M = 3.35). Additionally, participants who received the modified instruction reported significantly lower subjective ratings of cognitive load measured after the psychomotor performance (M = 2.62), as compared to control group participants (M = 3.50).

Analysis of PT performance—hypothesis 3

A follow-up univariate ANOVA revealed a significant main effect for overall psychomotor rubric scores (i.e., physical therapy performance), F(1,39) = 29.15, p < .001, MSe = 27.90, ES = +0.43. As predicted by Hypothesis 3, participants in the modified instruction group achieved significantly higher rubric scores on the performance of manual physical therapy skills (M = 39.76), as compared to control group participants (M = 30.85).
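For readers who wish to verify the arithmetic, the F ratio of a two-group, one-way ANOVA can be approximately reconstructed from the two group means, the group sizes, and the mean square error. The following is a minimal sketch, not the authors' analysis code. It assumes group sizes of 21 (modified instruction) and 20 (control), obtained by summing the per-university counts in the participant description, and the function name is illustrative.

```python
# Sketch: with two groups the between-groups sum of squares has 1 degree of
# freedom and equals n_a * n_b / (n_a + n_b) * (mean_a - mean_b)^2.
# F is that value divided by the mean square error (MSe).

def f_from_two_group_summary(mean_a: float, mean_b: float,
                             n_a: int, n_b: int, ms_error: float) -> float:
    """Approximate the one-way ANOVA F ratio for two groups from summary data."""
    ms_between = (n_a * n_b) / (n_a + n_b) * (mean_a - mean_b) ** 2
    return ms_between / ms_error

# Assumed group sizes: 12 + 9 = 21 (modified), 12 + 8 = 20 (control).
N_MODIFIED, N_CONTROL = 21, 20

# Overall psychomotor rubric scores: M = 39.76 vs. 30.85, MSe = 27.90.
f_psychomotor = f_from_two_group_summary(39.76, 30.85, N_MODIFIED, N_CONTROL, 27.90)

# Written post-test scores: M = 16.00 vs. 14.15, MSe = 2.12.
f_written = f_from_two_group_summary(16.00, 14.15, N_MODIFIED, N_CONTROL, 2.12)

print(f"psychomotor rubric: F(1,39) is approximately {f_psychomotor:.2f}")
print(f"written post-test:  F(1,39) is approximately {f_written:.2f}")
```

Small discrepancies from the published F values are expected because the means and MSe are rounded in the text.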

Follow-up univariate analysis of psychomotor rubric scores was conducted to assess the effectiveness of the instructional design strategies on physical therapy performance on the three distinct sections of the psychomotor rubric, which included: recall of knowledge, physical therapy examination (e.g., performing techniques and collecting data), and physical therapy evaluation (e.g., interpreting examination data).

Univariate analysis of physical therapy performance scores on the three sections of the performance rubric revealed that the modified instruction group scored significantly higher than the control group on the physical therapy evaluation section (M = 7.10 vs. M = 2.90), F(1,39) = 20.23, p < .001, MSe = 8.91, ES = +0.34, and on the physical therapy examination section (M = 28.67 vs. M = 24.40), F(1,39) = 13.95, p < .001, MSe = 13.37, ES = +0.26. No significant differences between groups were noted on the recall of knowledge section. Descriptive statistics for physical therapy evaluation, physical therapy examination, and recall of knowledge rubric scores are presented in Table 3.

Table 3 Descriptive statistics for psychomotor rubric scores: PT evaluation, PT examination, and recall of knowledge

Analysis of task completion times—hypothesis 4

Contrary to the prediction of Hypothesis 4, a follow-up univariate ANOVA did not reveal significant differences between the two groups on time needed to complete the instructional unit or time needed to complete the written examination. Descriptive statistics for both groups are presented in Table 1.

Analysis of attitudes

Data relating to one research question were analyzed to determine if the instructional modifications designed to reduce cognitive load would positively influence learner attitudes towards the instruction. As noted above, the last 10 items on the post-instruction questionnaire asked the participant to rate attitudes towards learning: quality (Q2 and Q3), difficulty (Q4), effectiveness (Q5, Q6, Q7, and Q8), relevance (Q9), and confidence (Q10 and Q11) using a standard 5-point Likert scale.

MANOVA and follow-up univariate analysis of attitudes towards instruction did not identify significant differences between the control group (M = 1.64) and modified instruction group (M = 1.74), with both groups reporting relatively high satisfaction with their respective instructional materials.

Discussion

The results of this study suggest that principles for controlling redundancy and split attention are applicable to the design of instruction that focuses on psychomotor skills, in addition to instruction focusing on knowledge and cognitive tasks.

Hypothesis 1: effectiveness of the modified instructional format

Hypothesis 1 predicted that the modified instruction group would achieve higher post-test scores as compared to the control group. The primary variables under assessment were written post-test scores, which entailed the further analysis of the scores on comparable content in the two treatment groups. As noted previously, this comparison allowed for analysis of content that was identical in both instructional units and content that presented with redundant features and with split-attention features in the control group instruction.

The results for cumulative scores and content structure scores indicated that there was a significant difference between the two instructional conditions in the expected direction, with the modified instruction group scoring significantly higher. These findings suggest that the instructional complexity, interactivity of elements, and novelty of the content were capable of placing an appreciable load on the learner’s available cognitive resources. These results further suggest that the modified instruction allowed for GCL by reducing ECL as a function of sound design practices. Conversely, these results also suggest that the control instruction increased ECL and limited GCL to a degree that prevented participants from developing the appropriate schema and understanding of the content. The reduction of total cognitive load via the management of ECL is perhaps the most prominent cognitive load management principle and is consistent with findings identified in prior research (Bobis et al. 1993; Chandler and Sweller 1991, 1992; Marcus et al. 1996; Purnell et al. 1991; Tarmizi and Sweller 1988).

Follow-up analysis was conducted to assess group differences between identical content, redundant content, and split-attention features. In conditions where the instruction required mental integration (i.e., split-attention effect) for understanding or in situations where instructional materials were presented with redundant features (i.e., redundancy effect), the modified instruction group scored significantly higher than the control group on the respective test questions. Because complex learning situations composed of several highly inter-related elements create the heaviest load on working memory, the differences in these scores between groups provide further support for the preliminary findings. That is, content containing redundant or split-attention features represented a discernible difference between groups in terms of the number of discrete elements that participants were required to maintain and simultaneously manipulate in working memory. Furthermore, the lower performance demonstrated by the control group suggests that the number of elements exceeded the processing abilities of working memory and sufficiently limited germane cognitive load. Finally, as would be expected, in situations where it was not necessary for the learner to integrate divergent sources of information or process redundant information, there was no difference between the two groups’ content scores (Sweller and Chandler 1994; Sweller et al. 1998).

In the context of the procedural nature of the treatment materials and in consideration of more recent contributions to cognitive load theory, the modified instruction group may have chosen to learn or memorize the individual steps in isolation (isolated elements approach or serial processing) before attempting to integrate the entire process (Pollock et al. 2002). While the claim that the modified instruction group utilized such strategies is speculative, future studies might query the participants to determine what type of metacognitive strategies they used for the different tasks. A second approach would be to test a multi-stage approach in order to manipulate intrinsic load.

Last, consideration was given to the possibility that the integrated diagrams were simply more effective at communicating the procedures. However, as technique figures, photographs, and the sequencing of procedures were identical in both treatments and not applicable to all content, study findings would appear to be attributable to the reduction of ECL and freeing up of GCL.

Hypothesis 2: effect of modified instruction on cognitive load

The modified instruction group reported significantly lower subjective ratings of cognitive load post-instruction and post-psychomotor assessment when compared to control group participants, as predicted by Hypothesis 2. These significantly lower subjective ratings are consistent with the significantly higher objective performance measures achieved by the modified instruction group. Additionally, while the use of subjective ratings of cognitive load was not identified in prior research in the context of psychomotor assessment or the performance of manual physical therapy skills, the present findings suggest that such measures can be extended to the performance of psychomotor tasks. Specifically, significantly lower subjective ratings of cognitive load reported by the modified instruction group were correlated with significantly higher psychomotor assessment scores, as discussed in the following paragraphs.

Hypothesis 3: effect of modified instruction on psychomotor performance

Hypothesis 3 predicted that participants who received the modified instructional format would achieve higher performance scores on manual physical therapy tasks. On this task, the modified instruction group scored significantly higher on total psychomotor performance in the expected direction. Follow-up analysis was conducted to assess group differences between the three sections of the scoring rubric and revealed that the modified instruction group scored significantly higher on the physical therapy evaluation section and the physical therapy examination section of the rubric, with no significant differences noted on the recall of knowledge section. These findings suggest that both groups understood the basic facts and concepts presented in their respective instructional treatments, though only the modified instruction group was able to demonstrate proficiency on task performance.

These findings could be attributed to both the content structure of the two treatments as previously discussed and to the level of complexity of the content. Specifically, the presentation of procedural tasks was very conducive to diagrammatic presentation and in fact, localization techniques were presented in diagrammatic formats in both treatments. In the control group, the participants needed to integrate the information to fully understand the procedures, a constraint that was not present in the modified instruction group. This reduction in ECL would have allowed for an increase in GCL for the modified instruction group and would have further allowed for development of appropriate schema.

In the framework of the above findings, the processes that underlie the acquisition of psychomotor learning and the type of psychomotor skills taught in the instruction are important considerations. To explain, Romiszowski (1993) suggested that psychomotor learning typically involves the acquisition of both skills and knowledge. He identifies knowledge as “information stored in the performer’s mind or available to the performer in some reference source” and skill as “actions (intellectual as well as physical) which the performer executes in a competent manner in order to achieve a goal” (pp. 130–131). Romiszowski identified a difference between “reproductive skills” that entail repetitive and automated actions and “productive skills” that entail the use of adaptive strategies and reasoning skills. This study employed psychomotor tasks that are consistent with Romiszowski’s definition of “productive skills” as the participants had little time to address repetition or automation and were required to problem solve and adapt strategies or make clinical decisions during the psychomotor assessment phase.

Productive or adaptive skills have been further studied by Anderson and Lebiere (1998) as presented in their Adaptive Control of Thought (ACT-R) model, which has been directed towards understanding procedural knowledge linked to cognitive skills relevant to decision making and problem solving. ACT-R states, “productions provide the connection between declarative knowledge and behavior” (Anderson 1983). Relative to this study, contributions by Anderson (1983) and Romiszowski (1993) help explain the link between declarative knowledge and behavior, and offer further explanation for the superior psychomotor performance demonstrated by the modified instruction group. It appears the modified instruction effectively reduced the cognitive load of the modified instruction group, allowing them to develop the appropriate schema linking declarative knowledge and behavior. In contrast, the control group’s working memory was overwhelmed by the intrinsic and extraneous cognitive load, leaving inadequate capacity for the germane cognitive load needed to develop the appropriate schema.

In practical terms, it would have been necessary for participants to transfer immediate content knowledge, as well as prior prerequisite knowledge and both immediate and prior connections between knowledge and behavior, to a likely expanded schema suitable for solving the patient problem. In this study, statistically superior performance on higher-level reasoning aspects of task performance (i.e., PT examination and PT evaluation) and statistically superior rubric scores by the modified instruction group suggest that the use of CL design principles supports both the transfer of declarative knowledge to behaviors and superior schema acquisition by reducing extraneous cognitive load and increasing the capacity for germane cognitive load.

Hypothesis 4: effect of modified instruction on task completion times

Hypothesis 4 predicted that participants who received the modified instruction would have lower task completion times for the instructional unit and the written examination. There were no significant differences between the groups on time needed to complete the instructional unit or the written examination. One plausible explanation is that the modified instruction group had to invest little mental effort, while the control group felt overwhelmed and did not invest the additional effort needed to overcome the limitations of the materials and learn with understanding. The use of performance incentives tied to course achievement (Morrison et al. 1995) for both groups might motivate participants in the control group to invest more time in understanding the content.

Research question

The research question asked if instructional materials designed in accordance with cognitive load theory design principles would positively influence learner attitudes towards instruction. In this study, statistically significant results in the expected direction indicated that attitudes as a function of subjective ratings of cognitive load reported by the modified instruction group were positively influenced as compared to the control group. However, general attitudes towards instructional formats as measured by the post-instruction questionnaire in the areas of quality, difficulty, effectiveness, relevance, and confidence did not differ significantly between the two groups. Specifically, both groups rated their instruction highly, which for the control group was in contrast to both objective measures and subjective ratings of cognitive load. As a possible explanation for this finding, the scheduled instructional times or single instructional time may have influenced subjective ratings, while longer or multiple instructional periods may have provided different findings. Last, the control group may have rated their instruction in a favorable manner simply because it was in a format to which they were accustomed.

Conclusions

This study used ecologically valid materials in a realistic classroom setting. The results suggest that designers can increase the germane cognitive load by reducing the extraneous cognitive load through effective instructional design and message design practices when teaching complex cognitive information and psychomotor skills. The prior research was extended by examining the effect of lowered extraneous cognitive load on the performance of a psychomotor task. The significant increase in performance by the modified instruction group suggests that psychomotor performance is also enhanced by an increase in germane cognitive load capacity.

These findings provide support for the use of instructional and message design strategies that can minimize extraneous cognitive load in instructional materials for teaching cognitive knowledge and skills, and are consistent with prior research (Bobis et al. 1993; Chandler and Sweller 1991, 1992; Purnell et al. 1991; Sweller 1999; Sweller and Chandler 1991; Tarmizi and Sweller 1988; Ward and Sweller 1990). Furthermore, these findings suggest that such strategies can minimize extraneous cognitive load in instructional materials for teaching adaptive psychomotor knowledge, a previously unexplored construct in the cognitive load literature.

Specifically, these findings support the use of integrated graphics and removal of redundancy in the instructional materials. Future research on cognitive load should consider four potential areas. First, does the structure of the instructional materials affect extraneous cognitive load? For example, does the manner in which the instructional narrative or test items are written affect extraneous cognitive load? Two earlier studies found mixed results for rewording of tests (Dorsey-Davis et al. 1991) and the rewording of narrative (Britton et al. 1989). Future research could employ strategies similar to these studies to investigate potential effects of narrative structure on cognitive load.

Second, instructional designers should investigate whether different instructional strategies such as generative strategies (Jonassen 1988; Wittrock 1974) impose varying levels of cognitive load on novice and expert designers, to extend cognitive load research beyond message design strategies. Third, future research should determine if strategies for reducing split attention and redundancy are also applicable to more complex psychomotor tasks that involve both complex psychomotor skills and highly interactive cognitive information. Fourth, the current method of assessing cognitive load is with a single subjective measure (Kalyuga et al. 1998, 1999, 2000). Future research should investigate the application of physiological measures (e.g., eye movement, electroencephalography, blood pressure, and galvanic skin response), which could be used both as measures in their own right and to validate the single subjective measure in current use.

Practical and clinical significance

Of practical and clinical significance for physical therapy programs is the observation that the modified instruction group achieved 89% on the examination and 90% on the psychomotor assessment (practical examination), while the control group achieved 79% and 70%, respectively. The latter grades would be considered failing by program standards. Additionally, practical examinations (formal psychomotor assessments) are often limited to a single “re-take” opportunity in many physical therapy programs. To this end, the differences in scores from a curricular perspective as a function of instructional format, as well as the direct applicability of the treatment materials to real-world clinical practice, are salient features of this study.