1 Introduction

In recent years, e-government has grown widely around the world, as it is expected to reduce costs and improve service delivery in the public sector. The Saudi government is no exception. In a press release, the Ministry of Communication and Information Technology reported that Saudi Arabia “ranked 8th globally in terms of the efficiency of ICTs use by the government departments in the performance of its works and improvement in government ICT services quality, while ranking 9th in government’s promotion of ICTs use, according to The Global Information Technology Report 2016 issued by the World Economic Forum” (Footnote 1).

Given this rise in Saudi Arabia’s international ICT ranking, has the accessibility of its e-government also improved? In 2010, the authors conducted an exploratory study of 34 Saudi government websites to evaluate their accessibility [4]. The results were disappointing, indicating that many government websites suffered from predictable accessibility mistakes. In this study, we revisit the same set of websites and reassess their accessibility. This may give an idea of the level of awareness of accessible e-government services in Saudi Arabia and of whether the new policies that the Saudi government has enforced on governmental websites have been effective.

Technically speaking, the contribution of this paper compared to the previous evaluation is clear in two aspects:

  (1) The use of a method that employs thorough evaluation techniques;

  (2) The report of the current state of Saudi e-government accessibility, especially after the adoption of new regulations provided by the Yasser e-government program (Footnote 2). The Yasser regulation requires e-government websites to follow a set of guidelines for the e-readiness of their services, which include guidelines for web accessibility.

The rest of this paper is organized as follows: Sect. 2 provides a brief overview of accessibility guidelines and technical standards as well as accessibility evaluation methods. Section 3 presents selected studies that evaluated the accessibility of e-government websites from diverse regions, ranging from domestic websites in the same country, continent and cross-continents. Section 4 describes the methodology pursued to re-evaluate the accessibility of Saudi e-government websites. Section 5 discusses the obtained evaluation results and Sect. 6 concludes the paper with some recommendations for improving Saudi e-government websites, also discussing future implications.

2 Background

2.1 Accessibility guidelines and technical standards

In order to move toward a more accessible web, many governments and public organizations worldwide are creating policies and regulations regarding web accessibility and the accessibility standards that need to be followed [24]. The W3C’s Web Content Accessibility Guidelines (WCAG), for example, encourage this evolution by standardizing and explaining the technical guidelines that web designers and developers should follow during the web development life cycle. The latest version, WCAG 2.0, was released in 2008 and includes more advanced features, such as broader coverage of web content technologies and disability types [20]. It has three conformance levels: Level A, the lowest conformance; Level AA, intermediate conformance; and Level AAA, the highest conformance. These reflect the extent to which a website follows the accessibility guidelines.

WCAG 2.0 consists of four principles that represent the basics of an accessible website. Under these four principles, 12 non-testable guidelines represent a broad description of some basic requirement(s) for achieving accessible web content [4]. Each of these guidelines is composed of testable success criteria (SC), which determine the level of accessibility conformance. In order to be easily tested for accessibility conformance evaluation, each SC is written as a true or false statement. However, while some of these criteria could be tested using automated testing tools, some of them should be tested by experts. For each SC, a list of informative techniques provides practical guidance for web developers and testers on how this criterion could be met based on the different contexts and web technologies used [4, 20]. The following figure (Fig. 1) summarizes WCAG 2.0 components [20].

Fig. 1 Components of WCAG 2.0
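The relationship between success criteria and conformance levels can be illustrated with a short Python sketch. It assumes that every applicable SC has already been judged true or false, and it applies the WCAG 2.0 rule that a level is reached only when all criteria at that level and at every lower level are satisfied. The names `SuccessCriterion` and `conformance_level` are hypothetical and serve illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

LEVELS = ("A", "AA", "AAA")  # ordered from lowest to highest conformance


@dataclass(frozen=True)
class SuccessCriterion:
    """One testable WCAG 2.0 success criterion and its evaluation outcome."""
    number: str   # e.g. "1.1.1"
    level: str    # "A", "AA", or "AAA"
    passed: bool  # outcome of the true/false test


def conformance_level(results: Iterable[SuccessCriterion]) -> Optional[str]:
    """Return the highest level whose criteria, and all lower ones, pass.

    Returns None when at least one Level A criterion fails. A level with
    no evaluated criteria is treated as satisfied (an assumption of this
    sketch, so the input should cover every applicable criterion).
    """
    results = list(results)
    achieved = None
    for level in LEVELS:
        at_level = [sc for sc in results if sc.level == level]
        if not all(sc.passed for sc in at_level):
            break
        achieved = level
    return achieved


# Hypothetical outcome for one page: A and AA pass, AAA fails.
page = [
    SuccessCriterion("1.1.1", "A", True),
    SuccessCriterion("1.4.3", "AA", True),
    SuccessCriterion("1.4.6", "AAA", False),
]
print(conformance_level(page))
```

A page that fails any Level A criterion yields `None`, which corresponds to not reaching even the lowest conformance level.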

In addition to WCAG, the W3C has initiated other international benchmarks and policies that help validate website accessibility. The User Agent Accessibility Guidelines (UAAG), for example, are a set of guidelines that web developers can use to evaluate how accessible their websites are to user agents and assistive technologies. UAAG was recommended by W3C in 2008, and its draft version 2.0, published in 2013, has five principles (perceivable, operable, understandable, programmatic access, and specifications and conventions) and 28 guidelines [27].

Another important, though not commonly used, W3C accessibility standard is the Accessible Rich Internet Applications (ARIA) technical specification, which allows developers to evaluate the accessibility of their websites and, consequently, permits people with disabilities to conveniently access and interact with the web, especially with dynamic content [24, 25].

Some countries have expended considerable effort to create their own local accessibility guidelines and policies. For example, an accessible website in the USA is defined as one that follows U.S. Section 508 regulations. Section 508 requires that technologies (websites, operating systems, hardware, and telecommunications devices) be completely accessible to people with disabilities [26].

2.2 Accessibility evaluation methods

Website Accessibility Conformance Evaluation Methodology (WCAG-EM) is “an approach for determining how well a website conforms to Web Content Accessibility Guidelines” [30]. This conformance evaluation is usually conducted using one or a combination of the following methods:

  1. Web Accessibility Evaluation Tools (WAET). These are “software programs or online services that are used to check a website’s accessibility level under web accessibility guidelines” [19]. The tools differ in many characteristics, such as efficiency, the conformance guidelines supported (e.g., WCAG 1.0), and the conformance levels supported (A, AA, AAA). Automated evaluation tools can be considered the simplest and fastest accessibility testing approach. However, they have some limitations, such as the inability to check how screen readers, used by people with impairments, will interpret website content, or to ensure the correct reading order of web page content [9]. Numerous automatic accessibility tools are widely used today, such as AChecker, APrompt, Cynthia Says, and EvalAccess 2.0, to name but a few [19, 24].

  2. Expert evaluation involves a manual analysis of websites to check their conformance to certain guidelines. To obtain acceptable results, a sufficient number of evaluators who are familiar with the target guidelines must be recruited [1, 18]. Although this approach is more accurate than automated evaluation in recognizing broader accessibility violations, it requires more time and labor [17]. On the other hand, it is more cost effective than user testing [17].

  3. User testing involves “a group of users to systematically work through the website, testing usability and accessibility from their point of view” [17]. This method requires some resources to be set up before the evaluation, such as a test environment with the required assistive technologies, the selection of participants, and training [22, 23].

  4. Surveys for web administrators and developers, which aim to investigate the reasons behind website inaccessibility issues [12].

3 Literature review

In order to improve the effectiveness of e-government websites and guarantee that all citizens benefit from online governmental services regardless of their situations, the accessibility of e-government websites has been extensively researched. This section illustrates selected studies from diverse regions, ranging from domestic websites in the same country (e.g., United Arab Emirates and the USA), continent (e.g., within Asia and South America), and internationally (e.g., United Nations member states’ portals). These studies evaluated the accessibility of e-government websites based on different international accessibility standards, such as WCAG 2.0, and domestic accessibility standards, such as Section 508 in the USA.

3.1 Domestic accessibility evaluation of e-government websites

Many federal and state portals in the USA still lack many accessibility features, even though numerous accessibility regulations exist, and several e-government accessibility evaluations have been conducted there. Olalere and Lazar [26] evaluated the accessibility level of 100 federal homepages, using the number of US Section 508 guidelines violated on a web page as a significant factor in determining its accessibility. They also analyzed accessibility policy statements and examined the association between these statements and Section 508 compliance. They found that over 90 % of the websites tested violated some accessibility guidelines. Potential reasons for the lack of accessibility on federal websites include the absence of governmental effort in accessibility testing since 2001 and the lack of clear technical standards on how to develop accessible websites and how often they should be tested. Additionally, they found that more than half of the websites had accessibility statements. However, some of these statements were misleading, particularly when claiming full compliance with Section 508.

Wentz et al. [31] conducted 156 accessibility evaluations of the sign-up process for online emergency alert services in 26 regions in the USA (six evaluations per region), based on Section 508 guidelines. The choice of this service was driven by its significance to US citizens. The results showed that the services of 21 of the 26 regions had at least one accessibility violation. However, all of the issues discovered were technically easy to solve, although solving them required adequate awareness of these accessibility problems, and the authors recommended some points to increase this awareness. Youngblood [32] re-examined the accessibility evaluation of a number of Alabama State website homepages conducted in 2002 by Potter [28]. The main goal was to find out whether there had been a considerable change in the accessibility conformance level since that time, especially after the adoption of the Universal Accessibility Standard, which aims to help with Section 508 compliance. Youngblood tested the level of conformance to Alabama’s ITS 1210C00S2, Section 508, and WCAG 2.0 guidelines. Overall, he found that there was no considerable improvement in the websites’ accessibility, which may imply that even though a standard that promotes accessibility exists, compliance with it is inadequate. The author suggested some points to improve the accessibility of Alabama State’s portals. These included increasing organizational support by providing the training required to meet the accessibility requirements of a website and increasing governmental enforcement of website accessibility conformance within these organizations.

A number of accessibility evaluations of governmental websites have also been conducted in the United Arab Emirates, owing to the continued inequality between disabled and non-disabled people in benefitting from the government’s e-services. Al Mourad and Kamoun [1] examined the degree to which 21 Dubai e-government websites could be considered accessible based on WCAG 1.0 standards. The main goal was to discover the major difficulties facing disabled people when visiting these portals and to recommend some solutions to overcome them. According to the authors, this was the first time such a study had been conducted for Dubai e-government websites. The results revealed that only two Dubai e-government portals provided the basic W3C accessibility conformance level (WCAG 1.0 Level A), which implies that disabled people could not carry out fundamental operations on most of these websites. Moreover, none of the tested portals fulfilled the WCAG AA or WCAG AAA accessibility conformance levels. In light of these outcomes, the authors made some recommendations to motivate conformance to accessibility standards. These included collaboration between governmental and public sector organizations to define best-practice design standards for accessibility that comply with WCAG guidelines, to publicize them among web developers and administrators, and to evaluate these portals continuously.

On the other hand, Kamoun and Almourad [20] evaluated the accessibility of 21 Dubai e-government website homepages in order to assess their level of accessibility and relate it to their position in the Dubai e-government website ranking. They found that none of the tested websites passed the basic WCAG 2.0 accessibility conformance level (WCAG 2.0 Level A). This implies that disabled people could not carry out fundamental operations on these websites. Moreover, website accessibility was not considered a significant factor when assessing a site’s quality and position in the Dubai e-government website ranking. Thus, no positive association between website accessibility and quality was discovered. Additionally, the size and simplicity of a website seemed to have a noticeable impact on achieving a higher accessibility score, because a smaller, simpler site gives a developer fewer opportunities to introduce violations. Based on these results, the authors recommended that WCAG 2.0 standards be considered a fundamental factor when evaluating e-government websites.

As WCAG 2.0 guidelines have become an International Organization for Standardization/International Electrotechnical Commission international accessibility standard, Kamoun et al. [19] investigated the need to re-evaluate the accessibility of websites against WCAG 2.0 guidelines in order to find whether there were any accessibility violations left undiscovered by the WCAG 1.0 conformance test. A number of WCAG 1.0 and 2.0 conformance evaluations were conducted for all Dubai e-government website homepages. The study showed that WCAG 2.0 guidelines reported additional accessibility violations compared with WCAG 1.0 guidelines and that all WCAG 2.0-specific accessibility barriers, except for two criteria, required significant effort to fix. Thus, websites that already meet WCAG 1.0 guidelines need to be re-evaluated against WCAG 2.0 guidelines. This result contradicts the previous assumptions in [1], according to which websites conforming to WCAG 1.0 standards would not need much effort to adapt to WCAG 2.0, and some websites might not require any modifications to conform to WCAG 2.0. The authors suggested that there should be a transition from WCAG 1.0 to WCAG 2.0 when evaluating e-government websites.

Another study [18] examined the usability and accessibility of 155 Malaysian e-government homepages. The results found a high percentage of accessibility issues. Some checkpoints had an extremely high number of violations, and federal government websites had a much higher percentage of usability and accessibility guidelines violations than state governmental websites did. This indicated the need for an urgent fix. Based on the results, the authors suggested that website developers need to be concerned with both usability and accessibility. This would contribute to making governmental websites usable and accessible by a wide range of people.

In Bangladesh, Baowaly and Bhuiyan [12] assessed the degree to which 10 e-government portal homepages were accessible by evaluating them according to WCAG 1.0 guidelines and identified reasons, if any, for accessibility or its absence. The authors found a significant lack of accessibility on the tested websites and suggested several points to improve the accessibility of e-government portals.

Bakhsh and Mehmood [11] conducted an accessibility evaluation of 45 Pakistani central government portals. By doing so, they intended to shed light on the neglect of disabled people, particularly visually impaired people, during website development life cycles, which prevents them from benefiting from e-government services. They found that the vast majority of the tested websites did not conform to accessibility guidelines. This implied that, even though standards for building accessible websites exist, website developers are unaware of the importance of accessibility during website construction. The authors suggested some points for improving accessibility, such as increasing awareness of the importance of accessible government websites among all governmental organizations.

Grantham et al. [17] assessed the top 20 private and top 20 government Australian websites in relation to web disability discrimination regulations in Australia. The results showed that government websites generally tend to be more accessible than private ones. The authors stressed the significance of complying with W3C automated checker as a standard that all websites should follow.

Al-Radaideh et al. [5] examined the accessibility of 25 homepages of Jordanian e-government websites. Generally, accessibility was not considered in all tested e-government websites, which prevents disabled people from taking advantage of their services. Based on the results, the authors suggested that website developers should be concerned with accessibility. This would contribute to making their governmental websites accessible to a wide range of people.

3.2 Cross-countries and cross-continents e-government accessibility evaluation

Patr et al. [27] analyzed the accessibility of 15 government websites of some Asian countries. The goal of this investigation was to assess the accessibility level of these countries’ government websites, which may promote more efforts toward accessibility. Although some countries had made limited efforts to adopt accessibility guidelines, the vast majority of the websites in all three categories had accessibility violations, regardless of the country’s population. This implies a lack of awareness in these countries of the consequences of ignoring accessibility. Some recommendations were made for improving website accessibility with respect to layout and design.

Goodwin et al. [16] investigated and compared the accessibility of public websites of 192 United Nations (UN) member States to discover the most common features that countries with accessible and inaccessible websites share, which might help in identifying the factors that affect website compliance with accessibility standards. An examination of web accessibility literature, such as the association between website quality and accessibility, was analyzed as well. According to the authors, this was the first time such results of a global study conducted for UN member states’ government websites became available to the public. In general, there was a noticeable difference between tested websites across continents. Europe had the lowest percentage of tests that found accessibility barriers (24.9 %), followed by Oceania and America, with 32.9 and 34.8 % of the tests, respectively, finding some accessibility barriers. On the other hand, Asia and Africa had the highest percentage of detected barriers, with 39.5 and 42.4 %, respectively. The results also confirmed the majority of the tested web accessibility hypotheses while disproving others. Based on the outcomes, the authors recommended that governments should put more effort into initiating anti-discrimination regulations and policies.

Another example is the study conducted by Lujan-Mora et al. [24], which aimed to measure the extent to which governmental websites have adopted accessibility standards, thus reflecting how much disabled people could benefit from these websites. Twelve e-government websites from South America and Spain were tested. Spain was included in the study because it is a developed country and has enforced web accessibility for more than 10 years. Consequently, it could be used as the gold standard in the comparison. Overall, compliance was insufficient, and only minimum accessibility standards were met in the vast majority of the websites. However, the number of accessibility violations differed greatly among the websites. The authors concluded that the existence of accessibility legislation is not enough. E-government website developers need to consider accessibility much more when constructing e-government websites, which may help increase the number of disabled people using them.

3.3 Accessibility evaluation of Saudi e-government websites

AlJarallah [3] investigated the potential accessibility issues blind users may face while utilizing and interacting with online services such as e-government services with an in-depth exploration of how blind users perform this kind of interaction mentally. She focused on the national e-government portal as a representative Saudi e-government portal. The results showed that blind users were not satisfied while using the website and revealed some of the reasons for the difficulties blind users might encounter during the interaction, such as insufficient usability and accessibility levels, even though WCAG recommendations were followed. Based on the outcomes, the author proposed a cognitive and mental model-based navigational landmark that aims to enhance some of the accessibility aspects that would benefit blind users by helping them create mental maps of websites and consequently improve their navigation in these websites. Ultimately, the proposed solution was evaluated and a noticeable improvement in blind users’ interaction was observed.

Another study by Al-Faries et al. [2] evaluated the accessibility, based on WCAG 2.0 guidelines, and the usability, based on expert reviews, of the top 20 Saudi e-government services. The findings showed that all websites tested had at least one accessibility issue and none of them were fully conforming to WCAG 2.0 standards. In contrast, the majority of the tested websites were user friendly. The three most frequently occurring accessibility issues identified were: Text Alternative for non-text elements (Level A), Accessibility of Keyboard (Level A), and Compatibility (Level AAA). Based on these results, the authors suggested the adoption of international accessibility and usability standards while developing Saudi e-government websites. They also stressed the importance of increasing awareness on the significance of web accessibility in order to ensure equal opportunities for disabled and non-disabled people in benefiting from online services.

The results from previous studies emphasize how important it is for governments to put more effort into website accessibility, from which people with disabilities can benefit significantly. However, despite the global interest in evaluating e-government accessibility, especially domestically, insufficient efforts have been made to evaluate the accessibility of Saudi governmental websites. The majority of research has focused on reviewing the current state of Saudi e-government in general, citizen adoption, and the challenges facing it. This motivates us to revisit the same e-government websites evaluated in 2010 to observe how their accessibility has evolved over the last 5 years.

4 Research methodology

The objective of this study was to re-evaluate the accessibility of a number of Saudi governmental websites to see whether any improvements have occurred since 2010. To achieve this objective, we followed the steps illustrated in Fig. 2.

Fig. 2 Research methodology

Next, we present in further detail the procedure followed in each step.

4.1 Sampling

4.1.1 Sample websites

Since the aim was to examine the improvement in Saudi e-government web accessibility, we evaluated the same sample used in the previous study conducted in 2010. This sample consisted of 36 fully functional government websites that represent a wide range of governmental sectors, selected from the Saudi national e-government portal (http://www.saudi.gov.sa) [4]. The selection was based on each website’s high profile, importance, and delivery of key services to Saudi citizens, residents, businesses, and visitors at that time. However, because some of the ministries had been merged by the time the present survey was carried out, this number decreased to 34 websites. Before starting any evaluation, all website URLs were checked to see whether they were still valid and were updated where necessary.

4.1.2 Exploration of websites

In this step, we investigated the sample websites in order to get an overview of their usage, services, and functionalities, as recommended by W3C [30]. According to W3C “the initial exploration carried out in this step is typically refined in the later steps [and] helps identify web pages that are relevant for more detailed evaluation later on.” The outcomes from this step were a list of functionalities (e.g., create account) and a list of page types (e.g., page with forms, tables, and multimedia) for each website.

4.1.3 Sample web pages

The previous study [4] assumed that evaluating website homepages gives a better indication of the organization and content of a website than evaluating any other page. A number of other accessibility evaluations have followed this approach to web page sampling, including Kamoun and Al Mourad [20] and Calle-Jimenez et al. [13]. However, following this approach risks that “some aspects of accessibility will be underrepresented or not represented at all” (Gilbertson and Machin [15]).

Consequently, since evaluating entire websites is not practical, especially if qualitative evaluation methods are used, representative web pages or functionalities from each website were selected during the analysis, instead of focusing on the homepages only. These representative sample pages or functionalities have different purposes and designs, and are used by the majority of users, as suggested by another study [21]. Moreover, the number of these representative samples will vary based on the evaluation method used, as will be seen in the following section. Doing so may help ensure that the majority of WCAG 2.0 SC are covered, and we are able to “catch as much SC violations as possible” [29].

Furthermore, because the majority of Saudi e-government websites have Arabic and English versions, we have checked whether both versions of the same representative sample pages for each website are similar, to ensure that they have equivalent content (Table 1).

Table 1 Number of government websites by sector

4.2 Accessibility evaluation

4.2.1 HTML and CSS validity

Hypertext markup language (HTML) code validation is a primary step in evaluating web accessibility [12]. In order for assistive technologies to work properly, HTML and cascading style sheets (CSS) documents need to follow international technical specifications [12, 17]. HTML and CSS code validation include detailed rules regarding syntax or grammar in relation to specific elements within a document. Additionally, the CSS of each website should be compared against CSS specifications [17].

In order to determine whether the selected websites follow international standards, we evaluated them using HTML Markup Validation Service (https://validator.w3.org) and CSS Validator Service (http://jigsaw.w3.org/css-validator/), which are both available online for free from W3C.
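Validation of this kind can also be scripted. The Python sketch below is only an illustration and not the procedure used in this study: it assumes the JSON output of the W3C Nu HTML Checker (the `doc` and `out=json` query parameters and the `messages`, `type`, and `subType` fields), and the helper names `checker_url` and `summarize` as well as the sample report are hypothetical.

```python
import json
from collections import Counter
from urllib.parse import urlencode

# Assumed endpoint of the W3C Nu HTML Checker web service.
NU_CHECKER = "https://validator.w3.org/nu/"


def checker_url(page_url: str) -> str:
    """Build a request URL asking the checker for a JSON report."""
    return NU_CHECKER + "?" + urlencode({"doc": page_url, "out": "json"})


def summarize(report_json: str) -> Counter:
    """Count checker messages by kind (error, warning, info)."""
    counts = Counter()
    for message in json.loads(report_json).get("messages", []):
        kind = message.get("type", "unknown")
        # Warnings are reported as type "info" with subType "warning".
        if kind == "info" and message.get("subType") == "warning":
            kind = "warning"
        counts[kind] += 1
    return counts


# Hypothetical report, standing in for a real checker response.
sample_report = json.dumps({
    "messages": [
        {"type": "error", "message": "Unclosed element."},
        {"type": "info", "subType": "warning", "message": "Missing lang."},
        {"type": "info", "message": "Using the HTML parser."},
    ]
})
print(checker_url("http://www.saudi.gov.sa"))
print(summarize(sample_report))
```

Fetching the report itself is left out so that the sketch runs without network access; in practice the URL would be requested once per sample page and the counts recorded per website.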

4.2.2 Presence of validation icons and conformance icons

As noted in another study, there was no correlation between the presence of accessibility statements on websites and the level of their accessibility conformance. Similarly, validation icons did not necessarily reveal the actual, up-to-date accessibility conformance [15]. In the present study, we examined the presence of validation icons and conformance icons as an indication of the awareness of web accessibility in Saudi governmental websites. Moreover, if a conformance icon was present on a website, we observed whether a WCAG 1.0 or WCAG 2.0 conformance icon was used and whether it reflected the actual conformance level. The same measures were applied to HTML and CSS validation icons: we checked for their presence and, if they existed, whether they reflected the up-to-date version of the respective website.
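As a rough illustration of how the presence of such icons could be detected automatically, the following Python sketch scans the links and images of a page for URL fragments that commonly appear in W3C validation and conformance badges. The fragments in `BADGE_HINTS` and the names `BadgeFinder` and `find_badges` are assumptions made for this example; the heuristic does not establish whether an icon reflects the actual conformance level, which still requires the checks described above.

```python
from html.parser import HTMLParser

# Assumed URL fragments that typically appear in W3C badges (heuristic only).
BADGE_HINTS = {
    "html_validation": "validator.w3.org",
    "css_validation": "jigsaw.w3.org/css-validator",
    "wcag_conformance": "w3.org/wai/wcag",
}


class BadgeFinder(HTMLParser):
    """Collect the badge kinds hinted at by link targets and image sources."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "img"):
            return
        for name, value in attrs:
            if name in ("href", "src") and value:
                for badge, hint in BADGE_HINTS.items():
                    if hint in value.lower():
                        self.found.add(badge)


def find_badges(html: str) -> set:
    """Return the set of badge kinds detected in an HTML document."""
    finder = BadgeFinder()
    finder.feed(html)
    return finder.found


# Hypothetical page footer used only to exercise the sketch.
footer = (
    '<a href="http://validator.w3.org/check?uri=referer">'
    '<img src="http://www.w3.org/Icons/valid-xhtml10" alt="Valid XHTML 1.0"></a>'
    '<a href="http://www.w3.org/WAI/WCAG2AA-Conformance">WCAG 2.0 AA</a>'
)
print(sorted(find_badges(footer)))
```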

4.2.3 Guidelines and technical standards

WCAG 2.0 was used for conformance testing for a number of reasons. First, similar to the reasoning in our previous study, “WCAG 2.0 success criteria gave clearer guidance over WCAG 1.0 checkpoints. Besides, each success criterion is more easily testable by a human expert” [4]. WCAG 2.0 can also be applied to a wider range of web technologies than WCAG 1.0 [11]. Second, in an investigation of 23 heterogeneous accessibility evaluation studies conducted to observe the current state of accessibility evaluation methods [10], we found that half of the studies followed WCAG 2.0 to assess accessibility levels. Finally and most importantly, WCAG 2.0 has been an international accessibility standard (ISO/IEC 40500) since October 2012, which implies that it has wider acceptance worldwide, with many countries updating their national accessibility guidelines to adopt it [24].

The outcomes of the 2010 study showed that none of the tested websites passed the basic WCAG 2.0 accessibility conformance Level A, which indicates a serious problem in reaching the minimum level of conformance required by any accessible website. This might hinder many disabled people from benefiting from the services provided by a website. In light of this, in this study, we evaluated the conformance to all three WCAG 2.0 levels (A, AA, AAA). Additionally, each accessibility guideline and technical specification covers different accessibility features [32], and WCAG 2.0 consists of guidelines that appear to be more technology neutral, so they could apply to more situations (Australia n.d.). The current usage of ARIA standards is also assessed to make sure that the Saudi e-government websites are following the latest W3C recommendations. Assessing the current usage of these standards will show how assistive technologies can easily adapt to dynamic web content.

Despite the importance of the User Agent Accessibility Guidelines (UAAG) in checking the accessibility conformance of media players, only a very limited number of Saudi e-government websites had embedded media players. Therefore, we did not consider UAAG in the accessibility evaluation.

4.2.4 Accessibility evaluation methods

In contrast to the 2010 study that followed a manual evaluation procedure with the help of some automatic tools, the current study used a combination of automated and manual techniques.

The automated evaluation varies in terms of features and choice of automated tools, which depends on the required criteria [20]. In the present study, we based our choice of selecting the automated tools on their availability, the ability to evaluate WCAG 2.0 (A, AA, and AAA) conformance levels, and the support of ARIA technical specifications.

A previous study analyzing the effectiveness of six accessibility evaluation tools (AChecker, SortSite, Total Validator, TAW, Deque, and AMP) in terms of coverage, completeness, and correctness found that the tools tested show different behaviors based on website types and accessibility principles [29]. For example, while the TAW tool showed high coverage and completeness, it showed low correctness. On the other hand, AChecker showed higher correctness and lower coverage and completeness. Hence, because automated testing tools have different benefits and drawbacks and are unable to detect all accessibility problems, one suggested solution to improve the low effectiveness is to combine the results from multiple tools [13, 29].
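The idea of combining the results from multiple tools can be sketched in a few lines of Python. The example assumes that each tool's report has already been reduced to the set of WCAG 2.0 success criteria it flagged as violated; the function name `combine_reports` and the sample data are hypothetical.

```python
from typing import Dict, Set


def combine_reports(reports: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each violated success criterion to the tools that reported it.

    The keys of the result form the union of all reported violations, so a
    criterion missed by one tool is still kept if another tool detects it.
    """
    combined: Dict[str, Set[str]] = {}
    for tool, criteria in reports.items():
        for criterion in criteria:
            combined.setdefault(criterion, set()).add(tool)
    return combined


# Hypothetical per-tool findings for a single page.
reports = {
    "AChecker": {"1.1.1", "1.3.1"},
    "TAW": {"1.1.1", "2.4.4"},
}
combined = combine_reports(reports)
for criterion in sorted(combined):
    print(criterion, sorted(combined[criterion]))
```

Keeping track of which tools agree on a violation also gives a simple indication of certainty: a criterion flagged by every tool is less likely to be a false positive than one flagged by a single tool.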

Furthermore, the authors’ investigation of 23 heterogeneous accessibility evaluation studies, conducted to observe the current state of accessibility evaluation methods [10], identified a preference for Web Accessibility Checker (AChecker), followed by Test de Accesibilidad Web (TAW), for automated evaluations. However, because the TAW version for WCAG 2.0 conformance testing is limited [24] and still in beta, TAW is usually used for WCAG 1.0 checking. It therefore did not seem appropriate for our analysis, and the Total Validator tool was used instead.

Due to the reasons previously outlined, each sample web page was validated for WCAG 2.0 A, AA and AAA conformance levels using AChecker and Total Validator.

AChecker is an open-source tool that checks HTML pages for conformance with W3C accessibility standards [6]. It can be used for conformance testing against different guidelines, such as WCAG 1.0 and 2.0 and Section 508. Furthermore, it is able to identify code parts that require human decisions and has an option for HTML and CSS validations [32]. The problems identified by AChecker are classified into three levels: known problems (problems that definitely create accessibility barriers, as determined by the tool), likely problems (problems that require human judgment, as determined by the tool), and potential problems (problems that require human judgment that the tool is unable to identify). In our study, and for the sake of achieving a higher level of certainty, we documented the number of known problems recognized for WCAG 2.0 A and AA conformance levels.

Total Validator is an HTML accessibility validator [7]. It can validate accessibility against WCAG (1.0 and 2.0) and the US Section 508 standards. Total Validator is available in two versions: a basic free downloadable version with core features sufficient for our study and a pro version, which is commercial and has more advanced features [7].

In addition, to assess the usage of ARIA standards on the tested websites, the WAVE online automatic evaluation tool was used to detect features such as header, footer, ARIA landmarks, and roles (Anon n.d.). After analyzing a web page, this tool shows the number of detected ARIA features in its generated summary.
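To make the kind of features being counted more concrete, the sketch below scans an HTML string for structural elements, role attributes, and aria-* attributes using only the Python standard library. It is not WAVE's detection logic, which is not described here; the class name, the grouping of counts, and the sample markup are assumptions for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Landmark roles defined by WAI-ARIA.
LANDMARK_ROLES = {
    "banner", "complementary", "contentinfo", "form",
    "main", "navigation", "region", "search",
}
# HTML5 structural elements counted here alongside ARIA features (an assumption).
STRUCTURAL_TAGS = {"header", "footer", "nav", "main"}


class AriaFeatureCounter(HTMLParser):
    """Count simple ARIA-related features in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag in STRUCTURAL_TAGS:
            self.counts["structural_elements"] += 1
        for name, value in attrs:
            if name == "role" and value:
                for role in value.split():
                    is_landmark = role.lower() in LANDMARK_ROLES
                    self.counts["landmark_roles" if is_landmark else "other_roles"] += 1
            elif name.startswith("aria-"):
                self.counts["aria_attributes"] += 1


def count_aria_features(html: str) -> Counter:
    parser = AriaFeatureCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical markup, not taken from any evaluated website.
page = (
    '<header role="banner"><nav aria-label="Main"></nav></header>'
    '<div role="button"></div>'
)
counts = count_aria_features(page)
print(dict(counts))
```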

Based on the list of page types for each website generated in the sampling step, we selected three representative web pages from each website. From the diagnostic reports generated from the above tools, we identified the number and type of violated SC for each conformance level (A, AA, or AAA). For ARIA usage, the number of detected ARIA features for each web page was noted. Table 2 summarizes the automatic tools used in the accessibility evaluation.

Table 2 Automatic tools used in the accessibility evaluation
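The tallying step described above, in which violated SC are grouped by conformance level across the three sampled pages, can be sketched as follows. The input format (a list of (level, SC) pairs per page) and the sample values are hypothetical. The sketch counts each criterion once per page; whether repeated violations of the same criterion are counted once or per occurrence is a choice the text does not specify.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple

LEVELS = ("A", "AA", "AAA")


def average_failed_sc(pages: List[Iterable[Tuple[str, str]]]) -> Dict[str, float]:
    """Average number of distinct violated SC per page, for each WCAG level.

    Each page is an iterable of (level, sc_id) pairs taken from a tool report.
    """
    if not pages:
        raise ValueError("at least one page is required")
    per_level: Dict[str, List[int]] = {level: [] for level in LEVELS}
    for violations in pages:
        distinct = set(violations)  # count each criterion once per page
        for level in LEVELS:
            per_level[level].append(sum(1 for lvl, _ in distinct if lvl == level))
    return {level: mean(per_level[level]) for level in LEVELS}


# Illustrative values only, not results from the study.
sampled_pages = [
    [("A", "1.1.1"), ("A", "1.3.1"), ("AA", "1.4.4")],
    [("A", "1.1.1"), ("AAA", "1.4.6")],
    [("A", "1.1.1"), ("A", "3.2.2"), ("A", "1.3.1"), ("AA", "1.4.4")],
]
averages = average_failed_sc(sampled_pages)
print(averages)
```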

Although the results of two accessibility tools were combined in our analysis, many studies have found that following an automated evaluation with a human evaluation is a necessity, not an option. This is particularly true for WCAG 2.0 due to its highly interpretive nature [29]. Human evaluation may help eliminate errors caused by automated testing tools, such as false positives and false negatives, and ensure that SC requiring subjective human judgment are correctly assessed [15, 29].

To verify the results of the automatic evaluation phase and to compare the outcomes of automated and manual evaluations, three accessibility evaluators were recruited to independently inspect exactly the same representative web pages from each website. The evaluators determined compliance with WCAG 2.0 Levels A, AA, and AAA by checking the HTML code of each web page using the View Source operation in a web browser and testing whether the page met each inspected WCAG 2.0 success criterion. For each success criterion, evaluators entered one of the following judgments: the web page failed the success criterion, the web page passed the success criterion, the web page partially passed the success criterion, or N/A, indicating that the success criterion could not be tested (the relevant content does not exist). To verify the results, the manual inspection results were rechecked at least once.
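The four judgments and their aggregation across evaluators could be recorded along the following lines. This is a minimal sketch, not the evaluators' actual checklist: the names are hypothetical, the sample sheets are invented, and the decision to count only outright failures (treating partial passes separately) is an assumption, since the text does not state how partial passes enter the averages.

```python
from enum import Enum
from statistics import mean
from typing import Dict, List


class Judgment(Enum):
    FAIL = "fail"
    PASS = "pass"
    PARTIAL = "partial"
    NOT_APPLICABLE = "n/a"


def average_failed(evaluations: List[Dict[str, Judgment]]) -> float:
    """Average, across evaluators, of the number of SC judged as failed.

    Each element maps an SC identifier to one evaluator's judgment.
    Only Judgment.FAIL is counted (an assumption of this sketch).
    """
    if not evaluations:
        raise ValueError("at least one evaluator is required")
    return mean(
        sum(1 for judgment in sheet.values() if judgment is Judgment.FAIL)
        for sheet in evaluations
    )


# Invented judgments for one page from three evaluators.
sheets = [
    {"1.1.1": Judgment.FAIL, "1.3.1": Judgment.PASS, "1.2.1": Judgment.NOT_APPLICABLE},
    {"1.1.1": Judgment.FAIL, "1.3.1": Judgment.PARTIAL, "1.2.1": Judgment.NOT_APPLICABLE},
    {"1.1.1": Judgment.FAIL, "1.3.1": Judgment.FAIL, "1.2.1": Judgment.NOT_APPLICABLE},
]
avg_failed = average_failed(sheets)
print(avg_failed)
```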

5 Results and discussion

In this section, we illustrate the findings of the accessibility evaluation for the sample of 34 Saudi government websites using the aforementioned evaluation techniques.

5.1 HTML and CSS evaluation

Following the evaluation of three web pages from each website using HTML and CSS validation services, we calculated the average validation errors. Figure 3 shows the average number of HTML and CSS validation errors in Saudi government websites.

Fig. 3 Average number of HTML and CSS validation errors for the Saudi government websites

As can be seen, Departments’ websites have the highest number of HTML validation errors, with an average of 160 errors. On the other hand, Directorates’ websites have the lowest number, with an average of only 3.5 errors. Funds and Ministry websites have almost the same number of errors, with an average of about 100 each. Commissions’ websites have an average of 65 errors. Other governmental websites included in the study have an average of 148 errors.

For the CSS validation results, Departments’ and Directorates’ websites are the only categories with fewer than 50 errors on average, with 25 and 26.5 errors, respectively. Funds’ websites have the highest number, with 200 errors on average. Ministry websites have 121 errors on average, while the websites of Commissions and Authorities have 70 errors on average each.

In order to identify if the Saudi government websites have improved since 2010, a comparison of the results obtained from the two studies (2010 and 2016) was made. Table 3 shows the average HTML validation errors for the previous and the current results.

Table 3 Comparison between the number of HTML validation errors for the previous and current study

Interestingly, the number of HTML validation errors decreased in all Saudi government websites in 2016. Directorates’ websites show the highest reduction, at 97.45 %, whereas Departments’ websites show the lowest, at only 8.57 %.
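The reduction percentages are presumably computed as the relative decrease from the 2010 average to the 2016 average; a minimal sketch of that calculation follows. The baseline of 175 errors is a hypothetical value, chosen only because, together with the 2016 average of 160 errors reported for Departments’ websites, it yields the stated 8.57 %; the actual values are those given in Table 3.

```python
def reduction_percent(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before * 100.0


# 175 is a hypothetical 2010 baseline; 160 is the 2016 average reported
# for Departments' websites.
print(round(reduction_percent(175, 160), 2))  # 8.57
```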

5.2 Presence of validation icons and conformance icons

In addition, we examined the presence of WCAG, HTML, and CSS validation icons and found that only four websites, most of them Ministries, display conformance icons; in all four cases, the icons reflect the actual validation level.

The WAVE tool was also used to detect the usage of ARIA standards in the Saudi government websites. We found that most of the websites use ARIA standards. Table 4 shows the percentage of ARIA usage in Saudi government websites.

Table 4 Percentage of the ARIA usage in Saudi governmental websites

A manual evaluation was also conducted to identify whether the Saudi governmental websites are mobile friendly. We browsed each website on smartphones and assessed how well its design adapted to the device. We found that only 26 % of the websites were mobile friendly. Next, we looked at the English version of each website: approximately 79 % of the websites offer an English version with content equivalent to the Arabic version.

5.3 Guidelines and technical standards evaluation

5.3.1 Automatic evaluation results

Next, we present the accessibility evaluation results for the automatic tools (AChecker and Total Validator). For WCAG 2.0 conformance testing, we calculated the average number of failed success criteria at each level (A, AA, AAA) for both tools. Figure 4 illustrates the average failed success criteria for Saudi governmental websites using both automatic tools.

Fig. 4 Average failed success criteria for Saudi governmental websites using automatic tools

Ministries, Funds and other governmental websites ranked the highest in failing Level A criteria. The most frequently failed success criteria (SC) include SC 1.1.1 (Non-text Content), SC 1.3.1 (Info and Relationships), and SC 3.2.2 (On Input). Furthermore, Directorates and Commissions websites have almost the same number of failed success criteria, whereas Departments websites have the fewest failed Level A success criteria.

With regard to Level AA, Funds websites have the highest number of failed success criteria, with an average of 129 SC. The rest of the governmental websites have a low number of failed success criteria, with an average lower than 40 SC. The most frequently failed Level AA success criterion according to the AChecker and Total Validator tools was SC 1.4.4 (Resize text).

For Level AAA, Funds websites again have the highest number of failed success criteria, followed by other governmental websites. Directorates, Ministries and Commissions websites have almost the same average number of failed success criteria, ranging between 22 and 27 SC. The smallest number of failed Level AAA success criteria was found in Departments websites. The most common failed success criterion in Level AAA was SC 1.4.6 (Contrast (Enhanced)).

5.3.2 Manual evaluation results

After evaluating each of the representative web pages using automatic tools, we evaluated the same pages manually against the WCAG 2.0 success criteria checklist. We calculated the average of the results provided by the three evaluators. This step was done to complement and corroborate the results of the automated tools. Figure 5 shows the average failed success criteria for Saudi governmental websites using manual evaluation.

Fig. 5 Average failed success criteria for Saudi governmental websites using manual evaluation

For Level A SC, Directorates websites have the highest number of failed success criteria. Funds and other governmental websites have the same number of failed success criteria. Furthermore, Departments websites have, on average, 2.5 failed SC. However, Ministries and Commission websites have the smallest number of Level A failed success criteria.

Moving to Level AA SC, Directorates websites again have the highest number of failed success criteria, followed by Ministries, Commissions, and other governmental websites. Departments and Funds websites have the smallest number, with only one failed success criterion.

As for Level AAA, Directorates websites have the highest number of failed success criteria. Funds, Ministries, and other governmental websites have the same number of failed SC. The smallest number of failed Level AAA SC was found in Departments and Commissions websites.

While Directorates websites were the least problematic category in the automated evaluation, they were the category with the most failed SC in the manual evaluation. This discrepancy may be attributed to the maturity and coverage of the automated tools, to the evaluation checklist used by the manual evaluators, and to the evaluators’ subjectivity.

On the other hand, the evaluated e-government websites did successfully apply some WCAG 2.0 success criteria. Table 5 lists the success criteria at each level (A, AA and AAA) that were commonly met in the manual evaluation.

Table 5 Common success criteria for level (A, AA and AAA) in manual evaluation

In summary, Table 6 compares the manual evaluation results of our previous study (2010) with those of the present study (2016). As can be seen, the average number of WCAG 2.0 violations has decreased dramatically. This can be attributed to several factors, one of them being the application of the Yasser e-government program guidelines. Moreover, the increased use of mobile devices to access government services has pushed web developers to create mobile-friendly and accessible websites.

Table 6 Difference between the results of manual evaluation (2010 Study vs 2016 study)

6 Conclusion and recommendations

This paper presented the results of a re-evaluation of the accessibility of Saudi e-government websites after a lapse of 5 years. Conducting such an evaluation periodically shows how things can change over time, especially in a dynamic field such as web development.

The evaluation method employed in this study was thorough and took into account many technical dimensions that could not be covered in the previous study. These included the evaluation of ARIA usage and mobile friendliness, as well as an increase in the number of tested pages per website from one (the home page) to three representative pages.

The evaluation results are promising and indicate that web accessibility standards are applied more widely than in the 2010 study. This might be attributed to increased awareness of the need to develop accessible e-government services, as enforced by the Yasser e-government program.

Overall, a considerable improvement in the government websites’ accessibility has been found, which implies the existence of standards promoting web accessibility in Saudi e-government websites. However, the authors suggest some points to improve the accessibility of Saudi e-government websites. These include increasing government support by providing required training to meet accessibility requirements of a website and increasing governmental enforcement of website accessibility conformance.

Finally, this study has certain limitations. The manual evaluation might be subjective and prone to error, so further evaluation is required. A recommended evaluation method is user testing, which is usually considered the most precise evaluation method, as it boosts the reliability of results by revealing the various difficulties that actual users may encounter [14]. On the other hand, it requires resources such as time, experienced testers, and a carefully chosen testing environment, and there is a risk that the accessibility barriers found relate only to the needs of the disabled participants involved [17].