Introduction

Learning technologies are widespread in higher education, and students are often required to engage in some form of e-learning every day of their studies. For example, students may complete their course registration using campus websites, access digital library resources using an online library catalogue, and access electronic course materials through a course management system (CMS). Students who have difficulty using these learning technologies may therefore be disadvantaged in their studies. In many jurisdictions, legislation mandates that learning technologies be maximally accessible to all students to avoid scenarios where some students are disadvantaged. Such non-discriminatory legislation typically references web accessibility guidelines, such as the Web Content Accessibility Guidelines (WCAG; W3C 2008), as mandated or suggested accessibility benchmarks. The guiding principles of the WCAG state that electronic content and technologies must be perceivable, operable, understandable, and robust (W3C 2008). Essentially, this means that e-learning environments would be expected to be accessible if students can perceive and understand the information that is presented and can interact with the technology in ways that meet their needs. Ideally, meeting legislative requirements and/or adhering to accessibility guidelines such as these when implementing learning technologies should ensure that all students have equitable opportunities to utilize them.

One approach towards surveying e-learning accessibility in higher education is to evaluate the accessibility of institutional webpages. Several studies have been published on this topic, and these studies have found that there continue to be many accessibility barriers on post-secondary webpages in spite of non-discriminatory legislation (refer to Thompson et al. 2013 and references cited therein). In addition, recent legal suits in the United States have shown that post-secondary students with learning and sensory disabilities have felt disadvantaged when encountering barriers to e-learning accessibility (Goodall 2008; Inside Higher Ed. 2014; National Federation of the Blind 2012, 2014; U.S. Department of Justice 2013).

The goal of this study was to investigate methods of e-learning accessibility evaluation. There is emerging recognition of the need for such evaluation, with a growing body of literature discussing the experiences of post-secondary students with disabilities who use learning technologies (for examples, see Asuncion et al. 2010; Chen 2014; Habib et al. 2012; Roberts et al. 2011). However, the correlation between accessibility guideline conformance (and thus the effectiveness of non-discriminatory legislation) and the accessibility actually experienced by students has not been explored, nor have methods of generating e-learning accessibility data from students as research participants been examined. These are critical areas to examine in order to provide additional guidance to educational institutions, which may meet the obligation to demonstrate mandated conformance with accessibility guidelines without understanding whether there has in fact been a concomitant increase in the accessibility of e-learning from the perspective of students. Additionally, because accessibility is generally treated as an attribute relevant only to students with disabilities (and not to the student population as a whole), studies on e-learning accessibility conducted to date have focused on the experiences of students with certain types of disabilities. This approach may foster the perception that e-learning accessibility need be considered only in special cases, rather than being an essential proactive consideration for all students when e-learning environments are designed.

Theoretical framework

In this study, an accessible e-learning environment has been defined as one in which all students (regardless of whether or not they identify as persons with disabilities) have an equitable opportunity to succeed. That is, each learner can identify one or more means of meeting their learning goals without encountering barriers that prevent them from doing so. This definition also acknowledges the role that human factors may play in determining the degree of accessibility of a given learning scenario. Such human factors may include those related to learning style (e.g., instructional preferences, information processing style, and cognitive personality style, as reviewed by Cassidy 2004).

This conceptualization of accessibility is informed by the social and biopsychosocial models of disability. In contrast to the medical model of disability, which locates disability within the individual as the result of impairments (Marks 1997), the social model of disability posits that socially-constructed barriers lead to disability (Oliver 1996). According to the social model, then, any person can be “disabled” by societal barriers, and the burden is placed on society to change in order to remove those barriers. The biopsychosocial model of disability blends the perspectives of the social and medical models by acknowledging that individual and social variables may interact and lead to disability (Ustun et al. 2003). While individual impairments play a role in contributing to disability, so too do social factors. When impairment is also considered a contextual variable (e.g., a poorly-lit workstation may impair the vision of persons without visual impairments), the construct of disability may also be applied to persons who do not consider themselves disabled in the traditional medical model sense. Therefore, e-learning accessibility is essential to all students.

This conceptualization was applied to this study in two key ways. Firstly, both students who consider themselves persons with disabilities and students who do not were included in the study. The inclusion of non-disabled students in the study was in acknowledgment of the potential of the e-learning environment to lead to socially-constructed “disablism” (Goggin and Newell 2003), thus emphasizing the importance of accessibility for all students. Secondly, the data obtained from students with and without disabilities were not analyzed separately. Instead, attributes of the technologies (rather than the student participants) were examined for their ability to contribute to barriers to completion of the e-learning scenarios. This approach was taken to emphasize the role that the e-learning environment can play in contributing to disability, once again highlighting the importance of considering accessibility for all students.

Literature review

Relevant legislation and the need to increase e-learning accessibility

Non-discriminatory legislation relevant to e-learning is in effect in several countries, including the United States, the United Kingdom, and Canada. The United States was one of the first countries to develop such legislation, with the Americans with Disabilities Act (ADA) and the Rehabilitation Act (Americans with Disabilities Act 1990; Rehabilitation Act 1998). In addition to a 1996 ruling by the United States Department of Justice that the ADA applies to the Internet (Thatcher et al. 2006, p. 515), Section 508 of the Rehabilitation Act includes specific web accessibility standards, and the United States Department of Education has explicitly stated that Section 504 of the Rehabilitation Act applies to online courses, any online content, and emerging technologies (Joint Department of Justice and Department of Education 2011).

Similarly, in the United Kingdom, the Special Educational Needs and Disability Act 2001 (SENDA; an extension of the Disability Discrimination Act of 1995) mandates that institutions of higher education make reasonable adjustments where needed to ensure that students with disabilities are not disadvantaged (Her Majesty’s Stationery Office 2001). While SENDA does not itself explicitly mention online learning, the legislation has been interpreted to apply to e-learning infrastructure (Gerrard 2007; Seale 2003; Woodfine et al. 2008). The Equality Act 2010 (Her Majesty’s Stationery Office 2010) is relatively new United Kingdom legislation that also applies to higher education, and which “makes provisions for web accessibility and for reasonable accommodations to make information accessible” (Narasimhan 2012, p. 42).

In Canada, the accessibility of web content (except for Federal government websites) is under provincial jurisdiction. In the province of Ontario, for example, the Integrated Accessibility Standards of the Accessibility for Ontarians with Disabilities Act (AODA Integrated Accessibility Standards 2011) are applicable to colleges and universities in the province and are relevant to online learning materials. This legislation stipulates that new webpages (and content posted therein, including web-based applications) are required to adhere to web accessibility guidelines (the WCAG 2.0; W3C 2008) by phased conformance schedules with maximal conformance required by 2021.

While anti-discriminatory legislation relevant to e-learning in higher education is present in many jurisdictions, there is evidence to suggest that it has not effectively increased e-learning accessibility to acceptable levels for all students. To explicate this statement, it is helpful to look to United States case law, where the ADA and Rehabilitation Act have been in effect for a comparatively long time. For example, a Capella University student with a learning disability sued the university following the university's adoption of a new CMS (Carnevale 2005; Goodall 2008), claiming that he was unable to learn effectively due to the confusing software layout and navigation. However, the judge in this case ruled that the university offered reasonable accommodations, such as one-on-one interactions with instructors, that would have circumvented issues related to the software (Goodall 2008). The case was thus dismissed without any requirement on the part of the university to address the accessibility barriers to e-learning reported by this student. In other examples, the National Federation of the Blind has assisted in several suits in which blind students have sued universities for the use of inaccessible course software and/or hardware and the failure to provide materials in alternate accessible formats (National Federation of the Blind 2012; U.S. Department of Justice 2010a, b), and is currently assisting with a legal suit against Miami University (National Federation of the Blind 2014).

In order to understand why students with disabilities face barriers when engaging in e-learning in spite of the presence of non-discriminatory legislation, it is important to consider limitations associated with the accessibility guidelines referenced therein. The frequently referenced WCAG 2.0 (W3C 2008) are often described as lengthy and ambiguous with obscure jargon, making them difficult for even experts to understand and interpret (Clark 2006, May 23; Kapsi et al. 2009a, b; Ribera et al. 2009). Moreover, the minimal representation of learning disability experts within the WCAG 2.0 working group (Kennedy et al. 2011), and the comparative complexity of learning disabilities in relation to web accessibility have also been cited as important variables to consider (Friedman and Bryen 2007; Laff and Rissenberg 2007; McCarthy and Swierenga 2010; Nicolle and Paulson 2004; Seeman 2006). For example, while the WCAG 2.0 calls for clear and simple language to enhance understandability, measuring the level of understandability of text is not as straightforward as measuring other parameters such as text-to-background contrast. Therefore, a crucial step towards increasing the level of e-learning accessibility for all post-secondary students is to develop an understanding of how to effectively evaluate the accessibility of learning technologies in a manner that is most likely to be representative of the needs of a diverse student body.

Methods to evaluate e-learning accessibility

Two methods that may be used to evaluate the accessibility of learning technologies are conformance testing and user testing. Conformance testing involves determining whether technology meets pre-defined accessibility guidelines (e.g., the aforementioned WCAG), while user testing involves soliciting feedback from actual users of the technology (i.e., students).

Conformance testing

Conformance testing of e-learning technologies may be conducted manually (by accessibility experts who compare the technologies against a list of accessibility guidelines) or automatically (by automated tools that examine the underlying code of the technology). Major benefits of automated accessibility evaluation are that data can be generated quickly and easily (within seconds), several free tools are available for use, and it is relatively inexpensive compared to manual conformance testing or user testing, which require the recruitment of experts and/or anticipated users of the technology (Vigo and Brajnik 2011). The accuracy of conformance testing depends on the guidelines used and the interpretation of those guidelines.

User testing

User testing (also referred to as usability testing) is another methodology that can be applied towards e-learning accessibility evaluation. While engaging with the technology under study, users (in this context, the users are students) are typically asked to “think aloud” by verbalizing their thoughts (Nielsen 1993). Usability testing traditionally takes place in a testing laboratory and is moderated in that the user works with the technology in the presence of one or more researchers (Nielsen 1993; Rubin and Chisnell 2008). This method of testing may also include use of a video camera to record the user as they work (Rubin and Chisnell 2008). There is emerging interest in conducting unmoderated usability testing, such as remote testing which takes place over the Internet with the help of special testing software, including asynchronous testing where there is no real time user-researcher interaction (Bolt and Tulathimutte 2010). Interest in remote unmoderated testing stems from the increased convenience for users who are unable to attend a testing laboratory due to scheduling conflicts, geographical location, or disability (Baravalle and Lanfranchi 2003; Houck-Whitaker 2005; Power et al. 2009).

A few studies have compared data from traditional (synchronous) laboratory-based usability testing with that from synchronous or asynchronous remote usability testing. These studies have found that the number of critical incidents (accessibility barriers) was not affected by the testing format (Andreason et al. 2007; Hartson et al. 1996; McFadden et al. 2002; Selvaraj 2004), though there are conflicting data as to whether the time on task or the amount of qualitative data is affected by these methods of testing (Andreason et al. 2007; Hartson et al. 1996; McFadden et al. 2002; Petrie et al. 2006).

Different perspectives appear in the literature regarding the relationship between usability and accessibility (for examples, refer to Henry 2007; Kapsi et al. 2009b; Vanderheiden 2000). Kapsi et al. (2009b) have proposed that accessibility and usability have both unique and overlapping attributes, with usable accessibility representing the “grey area” in the center where they intersect. Usable accessibility—the ability to utilize a system effectively and efficiently—arises when users can easily learn and remember how to use a system (i.e., if it is usable) and can also perceive, operate, and understand the system (i.e., if it is accessible). This linkage between usability and accessibility is also demonstrated within the WCAG (Ellcessor 2010; Ribera et al. 2009), which are frequently referenced as accessibility benchmarks in non-discriminatory legislation. As such, it stands to reason that usability testing methods can be applied towards studies with the goal of accessibility evaluation. There is, however, a dearth of such studies in the literature at the intersection of higher education and e-learning, with most of the earlier studies focusing on the usability or accessibility of digital library materials (refer to Denton and Coysh 2011; Dermody and Majekodunmi 2010; Jung et al. 2008), and few reporting on the use of other e-learning technologies (refer to Foley 2011; Power et al. 2010; Pretorius and van Biljon 2010).

Research objectives

The goal of this study was to investigate methods of e-learning accessibility evaluation, with emphasis on comparing accessibility guideline conformance with actual accessibility for students and methods of conducting accessibility testing with student users. The research objectives for this study were to:

  1. Determine the extent to which data from automated tools used to measure the accessibility of sample online units is predictive of the subjective experiences of students; and

  2. Compare data obtained from moderated and unmoderated student-centered accessibility evaluation of sample online units.

Method

Figure 1 is a summary of the methodology, and more details are presented in the following text.

Fig. 1 Flowchart summarizing the methods (materials, data obtained, and data analysis) used towards meeting the research objectives of the study

Online course and units

An online course titled Introduction to Digital Literacy was created within Moodle 2.0, and this course included two units titled Scholarly Resources and Credible Resources that were each presented as a single page with step-by-step instructions and embedded media and links. The units were not intentionally seeded with accessibility barriers, and it was anticipated that their level of accessibility would be comparable, based on their similarity. Completion of each unit required viewing an embedded PowerPoint presentation prepared for this study (consisting of 5 slides), viewing an embedded YouTube video with related content (less than 5 min long), completion of a task involving websites external to the unit Moodle page, and posting a one to two sentence comment to a Moodle discussion forum regarding completion of the task. The primary difference between the units was in relation to the task that students completed while visiting webpages external to Moodle: The task to complete for the Scholarly Resources unit was to use a university library digital catalogue to locate a journal article and to then read the abstract of the article, while the task to complete for the Credible Resources unit was to compare the credibility of two webpages. The study participants had not previously worked with the version of Moodle or the digital library catalogue used in this study.

Automated accessibility evaluation

Each component of each unit was evaluated for accessibility using automated tools. Websites that students accessed were evaluated for accessibility by AChecker (http://achecker.ca), the Qompliance add-on for the Firefox web browser (https://addons.mozilla.org/en-US/firefox/addon/qompliance/), and the WAVE web accessibility evaluation tool (http://wave.webaim.org/). AChecker and WAVE were chosen because they are commonly used and well-regarded open-source tools that are recommended by organizations and post-secondary institutions, and Qompliance was selected as a third open-source tool with the expectation that the use of three different tools would generate a comprehensive set of automatically generated data. All of these tools, and other similar tools, examine the accessibility of websites against web accessibility guidelines such as the WCAG and report on known and likely/potential barriers (in the form of a standalone report and/or annotations on the evaluated website). The built-in accessibility evaluation feature of PowerPoint 2010 and PowerTalk (http://fullmeasure.co.uk/powertalk/; text-to-speech software for PowerPoint presentations) were used to examine the unit PowerPoint slides for possible accessibility barriers. The data obtained from AChecker, Qompliance, and WAVE were collated and tabulated to prepare a single list of unique potential accessibility problems for each webpage, and the data obtained from the PowerPoint accessibility evaluation tool and PowerTalk were collated and tabulated to prepare a single list of potential accessibility problems for each presentation.
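To illustrate the collation step described above, the following sketch merges findings from several automated tools into a single list of unique potential problems per webpage. This is an illustration only (the collation in this study was done by manual tabulation), and the tool reports, field names, and example findings are hypothetical placeholders, assuming each report has already been reduced to page/guideline/description records.

```python
from collections import defaultdict

# Hypothetical parsed findings from AChecker, Qompliance, and WAVE. In practice
# each tool produces its own report format; here we assume the reports have
# already been reduced to page/guideline/description records.
tool_reports = {
    "AChecker": [
        {"page": "unit1.html", "guideline": "WCAG 1.1.1",
         "description": "Image missing alternative text"},
    ],
    "Qompliance": [
        {"page": "unit1.html", "guideline": "WCAG 1.1.1",
         "description": "Image missing alternative text"},
        {"page": "unit1.html", "guideline": "WCAG 2.4.4",
         "description": "Link text not descriptive"},
    ],
    "WAVE": [
        {"page": "unit1.html", "guideline": "WCAG 1.4.3",
         "description": "Low text/background contrast"},
    ],
}

def collate_unique_problems(reports):
    """Merge per-tool findings into one list of unique potential problems per page."""
    unique = defaultdict(dict)  # page -> {(guideline, description): [tools]}
    for tool, findings in reports.items():
        for finding in findings:
            key = (finding["guideline"], finding["description"])
            unique[finding["page"]].setdefault(key, []).append(tool)
    return unique

for page, problems in collate_unique_problems(tool_reports).items():
    print(page)
    for (guideline, description), tools in problems.items():
        print(f"  {guideline}: {description} (flagged by {', '.join(tools)})")
```

Deduplicating on the guideline/description pair mirrors the "single list of unique potential accessibility problems" prepared for each webpage in this study.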

Student-centered accessibility evaluation

Participants

Twenty-four undergraduate students (N = 12 students who self-identified with one or more learning disabilities, and N = 12 students who did not self-identify as persons with disabilities) were recruited for the study. Students with learning disabilities were recruited as participants for several reasons: students with learning disabilities represent a large and growing proportion of post-secondary students with disabilities (Fichten et al. 2009); it has been shown that there is overlap between the e-learning accessibility needs of students with learning disabilities and those of students with vision loss and mental health disabilities (Evett and Brown 2005; Grabinger 2010), suggesting that data from students with learning disabilities may be relevant to students with other types of disabilities; and students with learning disabilities may experience barriers to accessibility related to various academically-relevant skills, including listening, understanding written and oral instruction, and written expression, and thus may be expected to provide comprehensive data on accessibility that is relevant to many students.

Participants were recruited by posting notices on bulletin boards in common areas across campus, which included the offer of a gift certificate for the campus bookstore as compensation for participation. The same call for participation was also distributed through the listserv of the Learning Disability Services unit of the campus Counselling and Disability Services office. Both the recruitment flyers posted across campus and the recruitment email indicated that desired participants included “students who identify themselves as persons with learning disabilities as well as students who do not identify as persons with disabilities”; however, only the listserv message led to the recruitment of students with learning disabilities.

All students who received the listserv message were registered with the Learning Disabilities unit, which includes students with documented learning disabilities, attention disorders, and autism spectrum disorders. While the call for participation requested participants with learning disabilities, students were not required to disclose the specific disability or disabilities that they identified with. This was to avoid discouraging students who may feel uncomfortable discussing their disability from participating, and because the data were not analyzed in relation to attributes of the students (i.e., the emphasis of this study was to examine how the technologies could themselves be disabling). Additionally, as all of the participants with learning disabilities were recruited upon their response to the listserv message, it was not deemed necessary to ask the students to verify their registration with the Learning Disabilities office.

Testing laboratory and software

Participants completed the online units at a workstation in a university classroom which was arranged according to a traditional usability testing laboratory format (Nielsen 1993; Rubin and Chisnell 2008). The workstation was a desk with a desktop computer with wired Internet connection, mouse, and external microphone. A video camera on a tripod was available in front of the workstation to capture the student’s facial expressions and interactions with the workstation, and the classroom was visible from a one-way mirror from an adjacent observation room. Each student completed one unit in a moderated session (in the presence of a researcher and video camera) and one unit in an unmoderated session (working alone in the classroom with the video camera turned off). During moderated sessions, the researcher reminded the student to think out loud if necessary following a speech communication form of the think aloud protocol (Boren and Ramey 2000).

The online tool OpenVULab (http://openvulab.org) was used to create screencasts of the students’ on-screen activities while they completed each unit, and the screencasts were also synced with the think aloud verbalizations. After the student completed a unit, the software administered a post-unit questionnaire, which included 13 Likert-scaled questions with a five-point scale. Of these, 10 questions asked students to rate the ease with which they were able to perceive presented information, understand presented information, and interact with e-learning technologies within the unit. These questions were based on the guiding principles of the Web Content Accessibility Guidelines (WCAG; W3C 2008), which also informed the definition of accessibility barriers applied in this study (discussed further as critical incidents in the following section). The final three questions asked students to rate the overall ease of completing the unit, the ease of participating in the (moderated or unmoderated) session, and their comfort level with participating in the (moderated or unmoderated) session.

Procedure

A 2 × 2 counterbalanced within-subject testing design was employed. The 2 × 2 designation refers to the two independent variables, the sample online unit completed (Scholarly Resources or Credible Resources) and the format of accessibility testing (moderated or unmoderated), each with two levels. It was important to consider the units themselves as an independent variable because different tasks were involved in the completion of the two units, and it was not known whether those tasks would have a significant impact on the dependent variables under study. Dependent variables examined as indicators of the accessibility of the online units were critical incident counts, verbal frustration counts, verbal pleasure counts, and efficiency of unit completion (min). Critical incidents were defined as challenges that students encountered with respect to perceiving, understanding, and/or interacting with the e-learning content/technologies; verbal frustration and verbal pleasure refer to expressions of frustration or pleasure, respectively, uttered by students as they completed an online unit; and the efficiency of unit completion was the length of time required for a student to complete a unit.

The design was within-subject in that each student participant completed both sample online units and experienced both testing formats, and counterbalanced in that the order of unit completion and format of testing varied amongst the participants, as shown in Table 1. For the first variable, sample online unit, this design was employed to allow for comparisons of a particular student across the online units, and to minimize the effect of learning that may be carried over from one unit to the next. For the second variable, the format of accessibility testing, this design was employed to allow for comparison of data related to the subjective experiences of each student across the two testing formats, and to minimize the effect of changes in the subjective experiences of the students as they completed the second unit.

Table 1 Counterbalanced within-subject accessibility evaluation design

Equal numbers of participants with and without learning disabilities were assigned to each testing condition. For example, 6 of the 24 participants completed the Scholarly Resources unit in an unmoderated accessibility testing session. Of these six participants, three were students with learning disabilities. This method of assigning student participants was intended to prevent potential unwanted effects of the presence or absence of disability on the data from a given testing condition.
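Table 1 is not reproduced here, but the assignment logic can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the four counterbalancing orderings (crossing which unit and which session format come first) and the participant labels are assumptions, and the sketch simply shows how 12 students with and 12 students without learning disabilities can be distributed so that each ordering contains three of each.

```python
# A minimal sketch of the counterbalanced assignment described above.
# The four orderings below are illustrative assumptions: they cross which
# unit is completed first with which session format is experienced first.
orderings = [
    "Scholarly first / moderated first",
    "Scholarly first / unmoderated first",
    "Credible first / moderated first",
    "Credible first / unmoderated first",
]

# 24 hypothetical participant labels: 12 who identify with a learning
# disability (LD) and 12 who do not (ND).
ld_participants = [f"LD-{i:02d}" for i in range(1, 13)]
nd_participants = [f"ND-{i:02d}" for i in range(1, 13)]

# Assign 3 LD and 3 ND participants to each ordering, so every testing
# condition includes equal numbers of students with and without
# learning disabilities.
assignment = {}
for idx, ordering in enumerate(orderings):
    assignment[ordering] = (
        ld_participants[idx * 3:(idx + 1) * 3]
        + nd_participants[idx * 3:(idx + 1) * 3]
    )

for ordering, members in assignment.items():
    print(f"{ordering}: {', '.join(members)}")
```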

Each session lasted approximately 1 h, and was organized as follows (approximate average time in brackets): a verbal introduction was read to the student and informed consent was obtained (5 min), the think aloud protocol was described and demonstrated and then practiced by the student (5 min), the student completed Condition 1 (20 min), and the student completed Condition 2 (20 min). This was followed by a semi-structured interview (10 min) that was conducted to learn more from students about what they found to be most and least difficult about completing each of the online units, what they liked most and least about each of the session formats (moderated and unmoderated), and other thoughts they had about completing the units online.

Data analysis

Digital files of OpenVULab screencasts and video camera recordings were analyzed using Transana video analysis software (http://transana.org/). An initial continuous viewing of each screencast and video file was conducted to gain an overall impression of the data and to identify areas to examine more closely upon repeated viewing. Next, an additional complete viewing of each individual file was conducted in order to tag relevant segments and to prepare a list of observations from each segment. This list was reviewed with a second researcher and agreement was reached regarding the identification of critical incidents and sentiment displays, which were coded as follows: The codes ci (critical incident), vf (verbal expression of frustration), and vp (verbal expression of pleasure) were applied to relevant screencast segments, and the efficiency of unit completion (min) was recorded from all screencasts in which the student was observed successfully completing the unit. Successful completion was defined as completion of all aspects of the unit, namely viewing the PowerPoint presentation, viewing the YouTube video, visiting the required websites outside of Moodle and gleaning the requested information, and posting a culminating comment on the Moodle discussion forum. The code nf (non-verbal expression of frustration) was applied to relevant video camera segments. This coding scheme is similar to that previously reported by Olmsted-Hawala and colleagues (Olmsted-Hawala et al. 2010a, b). Once all of the files were coded, the coded segments were organized according to the code(s) applied to facilitate another viewing of each collection of coded segments to ensure coding consistency.
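Although the coding itself was carried out in Transana, the tallying of code frequencies per evaluation condition can be sketched as follows. The coded segments below are hypothetical placeholders, assuming each tagged segment has been exported with the participant, unit, session format, and code applied.

```python
from collections import Counter

# Hypothetical coded segments, assuming each tagged segment has been exported
# from the video analysis software with the participant, unit, session format,
# and code applied (ci, vf, vp from screencasts; nf from camera recordings).
coded_segments = [
    {"participant": "P01", "unit": "Scholarly", "format": "moderated", "code": "ci"},
    {"participant": "P01", "unit": "Scholarly", "format": "moderated", "code": "vf"},
    {"participant": "P02", "unit": "Credible", "format": "unmoderated", "code": "ci"},
    {"participant": "P02", "unit": "Credible", "format": "unmoderated", "code": "vp"},
]

# Tally code frequencies per evaluation condition (unit by session format),
# which is the form in which the counts are reported in Table 3.
counts = Counter(
    (seg["unit"], seg["format"], seg["code"]) for seg in coded_segments
)
for (unit, session_format, code), n in sorted(counts.items()):
    print(f"{unit} / {session_format}: {code} = {n}")
```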

The critical incident (ci) count and efficiency data were subjected to statistical analyses. The sample size of N = 24 participants may be deemed relatively small for statistical analyses. However, because (a) it was expected to be challenging to recruit a large number of participants with learning disabilities, (b) there are examples in which statistical analyses have been conducted with usability testing data from small sample sizes (refer to studies by Andreason et al. 2007; Petrie et al. 2006 in which data from 24 and 8 participants, respectively, were analyzed), and (c) supporting qualitative data from interviews were also collected in this study, it was felt that N = 24 would be a reasonable number of participants to include.

The Mann–Whitney U test was performed with the post-unit questionnaire data for each unit to determine whether there was a statistically significant (α = 0.05) effect of the accessibility testing format on these data. The Bonferroni correction (Bland 2000) was applied to reduce the likelihood of Type I error while conducting these multiple univariate comparisons. All analyses were conducted using SPSS.
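The analyses in this study were run in SPSS; purely as an illustration, the sketch below shows an equivalent Mann–Whitney U procedure with a Bonferroni-corrected threshold in Python (SciPy), using simulated five-point Likert responses in place of the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated five-point Likert responses for one unit's 13 post-unit questions:
# 12 students completed the unit in a moderated session and 12 in an
# unmoderated session (hypothetical data for illustration only).
moderated = rng.integers(1, 6, size=(12, 13))
unmoderated = rng.integers(1, 6, size=(12, 13))

alpha = 0.05
n_tests = 13
bonferroni_alpha = alpha / n_tests  # corrected per-comparison threshold

for q in range(n_tests):
    u_stat, p_value = stats.mannwhitneyu(
        moderated[:, q], unmoderated[:, q], alternative="two-sided"
    )
    print(f"Q{q + 1:02d}: U = {u_stat:.1f}, p = {p_value:.3f}, "
          f"significant after Bonferroni = {p_value < bonferroni_alpha}")
```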

Digital audio recordings of interviews were transcribed and open coded according to the emergent themes of familiarity, simplicity, and engagement and learning style. Quotes related to the format of accessibility testing were also collated.

Comparison of automated and student-centered data

Each online unit component was examined separately. First, the list of potential accessibility problems identified by automated accessibility evaluation was compared against the critical incident (ci)-coded screencast segments to determine the predictive ability of the automated testing. Next, additional forms of data obtained from the sessions with student participants were examined to consider the extent to which they aided in providing a thorough understanding of the accessibility of the online units. These included sentiment displays from screencasts and videos, post-unit questionnaire data, and interview transcripts.
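The comparison of potential and observed barriers was conducted qualitatively in this study; purely as an illustration of the matching logic, the sketch below treats each list as a set of barrier labels (hypothetical examples) and reports the overlap.

```python
# Hypothetical barrier labels for illustration; in the study, ci-coded
# screencast segments were reviewed against the tabulated tool output.
potential_barriers = {
    "image missing alternative text",
    "low text/background contrast",
}
observed_barriers = {
    "purpose of hyperlink unclear",
    "task instructions misunderstood",
}

common = potential_barriers & observed_barriers
print("Barriers identified by both methods:", common or "none")
print("Flagged by automated tools only:", potential_barriers - observed_barriers)
print("Observed with students only:", observed_barriers - potential_barriers)
```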

Comparison of data from moderated and unmoderated accessibility testing conditions

This comparison included examination of data from videos and screencasts, as well as post-unit questionnaires and interviews. Helpful data from the videos consisted of the non-verbal frustration (nf) counts described above, and the nf-coded video segments were reviewed to determine whether they lent novel insight that was not evident from other sources of data. From the screencasts, the critical incident (ci) count data were analyzed to determine whether they were significantly affected (α = 0.05) by either independent variable (sample online unit or format of accessibility testing) alone or in combination. Based on the data distribution, the ci count data were analyzed by Poisson regression and the efficiency data by two-way ANOVA. All analyses were conducted using SPSS. The verbal pleasure (vp), verbal frustration (vf), and non-verbal frustration (nf) counts were infrequent and were not subjected to statistical analyses.
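These analyses were conducted in SPSS; for readers who prefer a scripted workflow, the sketch below shows how comparable models could be fitted in Python with statsmodels and SciPy. The data frame is simulated (the study data are not reproduced here), and the counterbalancing pattern and column names are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from scipy import stats

rng = np.random.default_rng(2)

# Simulated long-format data: one row per participant per completed unit,
# mirroring the 2 x 2 within-subject design (unit by session format).
pairings = [("Scholarly", "moderated"), ("Scholarly", "unmoderated"),
            ("Credible", "moderated"), ("Credible", "unmoderated")]
rows = []
for p in range(1, 25):
    first_unit, first_format = pairings[(p - 1) % 4]
    second_unit = "Credible" if first_unit == "Scholarly" else "Scholarly"
    second_format = "unmoderated" if first_format == "moderated" else "moderated"
    for unit, session_format in [(first_unit, first_format),
                                 (second_unit, second_format)]:
        rows.append({"participant": p, "unit": unit,
                     "session_format": session_format,
                     "ci_count": int(rng.poisson(2)),
                     "efficiency_min": float(rng.normal(20, 4))})
df = pd.DataFrame(rows)

# Poisson regression of critical incident counts on unit, format, and interaction.
poisson_fit = smf.glm("ci_count ~ C(unit) * C(session_format)", data=df,
                      family=sm.families.Poisson()).fit()
print(poisson_fit.summary())

# Two-way ANOVA of efficiency (time to complete a unit, in minutes).
ols_fit = smf.ols("efficiency_min ~ C(unit) * C(session_format)", data=df).fit()
print(anova_lm(ols_fit, typ=2))

# Assumption checks for the ANOVA: normality of residuals, equality of variances.
print(stats.shapiro(ols_fit.resid))
cells = [g["efficiency_min"].to_numpy()
         for _, g in df.groupby(["unit", "session_format"])]
print(stats.levene(*cells))
```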

Data from Mann–Whitney U tests performed on questionnaire data were examined to determine if student perceptions of accessibility of the unit components, ease of participation, and/or comfort level associated with participation were significantly affected by the format of the accessibility testing. Interview quotes related to the accessibility testing format were also reviewed.

Results

Automated versus student-generated data on online unit accessibility

The use of automated accessibility evaluation tools generated a list of potential accessibility barriers associated with the online units, while the completion of the online units by student participants allowed for the generation of a list of observed accessibility barriers (critical incidents). These data are presented for comparison in Table 2.

Table 2 Potential and observed accessibility barriers of sample online units

There were no common accessibility barriers identified by these two methods of evaluation. The potential barriers identified by the automated tools were related to the user interface, and the majority were not relevant to the students in this study because they concerned potential barriers to perceiving, understanding, or interacting with the e-learning content/technologies while using assistive technologies (e.g., text-to-speech software). The students in this study reported not normally using assistive technologies with webpages, and did not use any when completing the online units in this study. Frequency counts of critical incidents (ci; i.e., accessibility barriers), verbal frustration (vf), and verbal pleasure (vp) are presented in Table 3. Twenty-three of the participants encountered between one and nine accessibility barriers while completing the two online units, and only one student did not encounter any barriers. The observed accessibility barriers were related to the subject matter (understanding of content that the students encountered while completing the online units) and the user interface (most commonly, understanding the purpose of hyperlinks). The sentiment display data from the screencasts and the post-unit questionnaire data correlated with the critical incident findings from the screencasts.

Table 3 Frequency counts of critical incidents and sentiment displays according to e-learning accessibility evaluation condition

Data that were largely unique to the interview transcripts concerned aspects of the units that students felt contributed positively to their accessibility (only two verbal expressions of pleasure were identified from review of the screencasts). Additional qualitative data obtained from the interviews have been previously reported (Kumar and Owston 2012), and the emergent themes of familiarity, simplicity, learner control, and engagement as related to these e-learning scenarios are briefly summarized next for completeness.

Prior familiarity with the e-learning technologies encountered generally enabled students to use the same technology without encountering accessibility barriers and/or to remove accessibility barriers by trial-and-error. An exception was instances where students recalled prior difficulty with a given technology and, as a result, did not attempt to remove the barrier by trial-and-error (e.g., enlarging YouTube videos). The single-page structure of the units with the embedded media was viewed as enabling by most students, as this presented the material in a simple organizational structure and negated the need to locate and/or download other materials. A sense of learner control arose from the ability to review the learning materials as needed, and students reported different levels of engagement (and concomitant comprehension) with different instructional materials, highlighting the role that human factors may play in determining e-learning accessibility for individual students.

Moderated versus unmoderated accessibility evaluation sessions

The videos and interviews associated with the moderated format of evaluation generated very little data regarding unit accessibility that was not already gleaned from the screencasts. From the videos, there were N = 4 instances of non-verbal frustration, which correlated with screencast critical incidents, and N = 7 possible indications of text being difficult to read in places (perhaps because it was too small). Interviews highlighted aspects of the units that positively contributed to accessibility, and were the primary source of this type of data.

Statistical analyses of the data obtained from the screencasts allowed for further comparisons. Poisson regression analysis of the critical incident (ci) count data and two-way ANOVA of the efficiency of unit completion data revealed that the format of accessibility testing did not significantly affect the number of critical incidents that were recorded or how long it took students to complete the units (Wald chi-square = 0.134 and p = 0.714 for the ci counts; F = 0.058 and p = 0.810 for the efficiency data). The sample online unit completed had a significant effect on the ci counts (Wald chi-square = 9.928 and p = 0.002; higher counts from completion of the Scholarly Resources unit) but not on the efficiency of completion (F = 1.575 and p = 0.217). There were no significant interactive effects of the format and unit on the ci counts or efficiency data (Wald chi-square = 1.002 and p = 0.317 for the ci counts; F = 2.666 and p = 0.111 for the efficiency data). Prior to conducting the ANOVA, the normal distribution and equality of variances of the efficiency data were confirmed by the Shapiro–Wilk test for normality (W = 0.968; p = 0.288) and Levene’s test for equality of variances (F = 0.207; p = 0.891).

Note that learning disability was not included as an independent variable in the analyses described above, as to do so would be contrary to the theoretical framework that informed the study design. However, readers may be interested to know that such analyses were carried out, and the presence or absence of learning disability did not have a statistically significant effect on the dependent variables of ci counts and efficiency (data not shown).

The impact of the independent variables on the verbal frustration (vf) counts was less clear, due to the low frequency of this dependent variable. Half of the participants (N = 12) did not verbally express frustration during completion of either online unit. Most of those who did (N = 9) did so only once during completion of a unit, even when several critical incidents were observed. In contrast, three students verbally expressed frustration four or more times during completion of a unit. It is possible that some students were not prone to expressing frustration verbally under the study conditions unless a certain threshold of frustration occurred, though additional investigation is required to understand these data.

Results of Mann–Whitney U tests on the post-unit questionnaire data were different for the two online units. Following completion of the Scholarly Resources unit, which the students reported as being more challenging, responses regarding the ease of participating in the session and comfort level with participating in the session were not significantly affected by the format of accessibility testing (U = 50.5 and p = 0.219; and U = 71 and p = 0.997, respectively). However, following completion of the Credible Resources unit, ease of participation and comfort level with participating were both rated higher following unmoderated sessions (U = 36 and p = 0.039; and U = 29.5 and p = 0.012, respectively).

Data for several questions regarding the accessibility of individual unit components were noteworthy but did not reach statistical significance when the Bonferroni correction was applied. The unit components that were observed to generate the highest number of critical incidents in the Scholarly Resources unit were deemed more difficult when completed in an unmoderated session, while students rated several components of the Credible Resources unit easier when it was completed in an unmoderated session. These contrasting data may suggest that students found difficulties easier to handle in the presence of the researcher when completing the more challenging unit, but otherwise found unit completion easier when working alone.

This speculation is supported by interview data, such as the finding that most participants (N = 15) preferred the unmoderated testing format and felt more comfortable working alone. For example, some students felt self-conscious in the presence of a researcher and video camera, and felt that they could behave more “naturally” on the computer during the unmoderated session, where they could “let their guard down” and “click around more” when working alone. When reflecting on their experiences in both testing formats, students did concede that a benefit of the moderated format was that they could ask the researcher for assistance (even though the response to requests for help was to ask students to proceed as they would if working independently). For example, one participant indicated,

I guess just knowing that you were in the room [was helpful] so that if there were any problems then I know that you would be in here and you could kind of guide me through any problems. But, um, you know, luckily there were no problems.

Another participant indicated that she would appreciate a form of orientation prior to taking part in an unmoderated session, in which she could ask questions about what to expect.

Discussion

Automated versus student-centered e-learning accessibility evaluation

Data from the automated tools used to measure the accessibility of the components of the units in this study were not predictive of the subjective accessibility experiences of the student participants. The automated tools were particularly good at identifying potential barriers to accessibility that would arise if assistive technologies were used (related to perception, understanding, and interaction with the unit content/technologies), while the observed barriers were primarily related to understanding the unit content and how to interact with the technologies in the absence of assistive technologies. Because automated accessibility evaluation tools check for potential problems against pre-determined accessibility guidelines, these findings demonstrate that the guidelines themselves and/or the ways in which the automated tools interpreted the guidelines were not effective at identifying the actual accessibility barriers that students in this study encountered.

These findings are in agreement with Bohman and Anderson (2005), who have indicated that adherence to accessibility guidelines may not prevent certain types of accessibility barriers, with the acknowledgement that it can be particularly difficult (if not impossible) to develop computer algorithms to evaluate understandability. It is also important to note that this study included students with and without learning disabilities, and barriers related to the understandability component of accessibility were especially troublesome for most of the students. Therefore in this example most students (and not just students with disabilities) would have benefitted from a means of detecting (and addressing) barriers to understanding.

Moderated versus unmoderated student-centered e-learning accessibility evaluation

Data from moderated and unmoderated student-centered accessibility evaluation of the sample online units were compared. The number of observed accessibility barriers associated with the units was not significantly affected by whether the evaluation took place in a moderated or unmoderated session, nor was the time taken by the students to complete the units. It is unclear whether verbal or non-verbal sentiment displays would be affected by the format of accessibility evaluation; however, these were low in frequency and did not provide additional information that was not also available from other data sources.

When reflecting on the unit that was found to be comparatively easy, students indicated in questionnaires and interviews that it was easier to participate in, and that they felt more comfortable participating in, unmoderated sessions. However, there was not a statistically significant difference in how students rated their ease and comfort level with participating in moderated versus unmoderated sessions when completing the more challenging unit, and some students indicated in interviews that they took comfort in the researcher’s presence when difficulties were encountered. These findings indicate that it may be useful to conduct moderated accessibility evaluation sessions when a large number of accessibility barriers and/or highly troublesome barriers are anticipated (e.g., during the early stages of development of a new e-learning technology), and that the student-preferred unmoderated format for e-learning accessibility evaluation warrants further exploration.

There are several additional research questions that can be examined with respect to unmoderated accessibility evaluation. For example, does the ecological validity of the data increase concomitant with increased comfort level of participants? What variables associated with the evaluation conditions may positively or negatively influence the comfort level of participants? How do the data and student perceptions of participation differ from different types of unmoderated sessions (e.g., synchronous or asynchronous; taking place in a laboratory or at home)? What data collection tools are available to support the different types of unmoderated sessions?

Additionally, the lack of correlation between data from automated and student-centered accessibility evaluation raises the question of how relevant it is to evaluate the accessibility of e-learning technologies in isolation (i.e., devoid of learning context and human factors). To this end, the findings of this study and future work can be applied towards the development of methods to evaluate the accessibility of learning outcomes rather than focusing solely on learning technologies. That is, a more holistic approach to evaluating e-learning accessibility would acknowledge the value of a variety of possible learning pathways within a given e-learning scenario, which would accommodate learner diversity and allow students to meet the learning outcomes in the manner that is most accessible to them. This idea is supported by previous work by Kelly, Phipps, and Swift (2004) and Sloan and Kelly (2008), and is an important area for continued exploration.

Limitations

It is important to consider limitations of this study when interpreting results. To this end, the number and characteristics of participants, and the nature of the e-learning scenarios themselves, warrant consideration.

The student-centered e-learning accessibility evaluation sessions were conducted with 24 students. This is a relatively small sample size for statistical analyses of quantitative data, and more data may have been particularly helpful in determining whether the counts of verbal frustration were significantly affected by the online unit and/or session format (moderated or unmoderated). As such, including a larger sample size would be advisable for future studies which further investigate these or related research objectives.

We sought to recruit students with a wide array of learning needs and preferences in order to acquire a comprehensive set of data on the accessibility barriers within the online units. However, we did not obtain information about the nature of the learning disabilities that the students identified with, and our only means of confirming that the students did indeed have documentation to support having a learning disability was the fact that they responded to an email they received as students registered with a learning disabilities services office. Moreover, as we have asserted that the learning environment, learner characteristics, and environment-learner interactions may all contribute to the presence of accessibility barriers, the results of this study are reflective of both the nature of the online units and participant characteristics. For example, it is possible that outcomes may have differed had the online units been intentionally seeded with accessibility barriers. In addition, as we did not know how similar or different the disabilities that half of the participants identified with were, we cannot be certain that we did in fact recruit a population of participants with a wide array of learning needs and preferences and thereby obtain comprehensive data on the accessibility of the online units.

Conclusions

This study has demonstrated that student-centered methods are an essential component of e-learning accessibility evaluation, and that both moderated and unmoderated forms of student-centered evaluation sessions can generate useful data. Automated evaluation tools, and the accessibility guidelines on which they are based, are not effective at identifying all potential barriers to accessibility and can omit barriers that have a significant impact on the success of students. The results of this study indicate that this may be particularly true with respect to preventing barriers to understanding, which relates to one of the guiding principles of the WCAG 2.0. This highlights a critical point: the importance of directing efforts towards ensuring that students can understand both the e-learning content and how to use the e-learning interface, which is best achieved by conducting student-centered accessibility evaluation. Furthermore, this study has shown that it is not only students with disabilities who can be disadvantaged by inaccessible e-learning environments, and thus accessibility is a variable that is important to all students. Continued work on developing methods to evaluate e-learning accessibility is therefore urgently needed, and can benefit from the findings of this study.