The World Wide Web is one of the most common methods for obtaining health-related information (Fox and Jones 2009; Fox and Rainne 2002; Khoo et al. 2008; Provost et al. 2006; Wainstein et al. 2006). Most consumers locate online health information using popular search engines such as Google and Yahoo (Eysenbach and Köhler 2002; Khoo et al. 2008). There are conflicting reports regarding consumers’ confidence in the quality of the information located through search engines (Khoo et al. 2008). Concern over website content is likely justified. A recent review of research on the quality of websites providing health-related information showed that quality was problematic in a majority of the studies reviewed (55 of 79, 70%) and that up to 90% of websites on a topic (e.g., weight management) had inaccurate information (Eysenbach et al. 2002). More recently, Scullard et al. (2010) found that using Google to seek information on autism and the mumps, measles and rubella vaccine produced inaccurate results on 45 of the first 100 websites located.

For parents of children with autism spectrum disorders (ASDs), the World Wide Web is the method most frequently used to obtain information (Chowdhury et al. 2002; Mackintosh et al. 2005; Mansell and Morris 2004). In 1999, Tony Charman found 104,950 results from a search of the term “autism” using the Altavista search engine (Charman 1999). In June of 2009 and 2010, our search of “autism” using the Google search engine returned 19,900,000 and 12,200,000 results, respectively. This large increase in the number of autism websites mirrors the general increases in awareness and interest in ASDs.

Although interest in ASDs continues to increase, little information has been published about autism websites. In 2002, Chowdhury et al. reviewed 145 online sites and found that 80% of the websites had information that could not be verified as accurate. They concluded there was an immediate need for a system to evaluate autism specific sites for public use. This is the only published evaluation of the content of autism websites. What has been more common are lists of recommended websites (e.g., Abbey 2009; Bloomquist 2005; Charman 1999; Coates 2009; D’Auria 2010; Polirstok and Lesser 2003; Sabo and Lorenzen 2008; Seeman 2005) and recommendations on how the World Wide Web can be used to support individuals and families (Bradford 2010; Ferdig et al. 2009; Jordan 2010).

To date, little guidance has been available on how to locate high-quality websites containing information on ASDs. Multiple quality assessment tools have been created to aid consumers in appraising general health websites (Bernstam et al. 2005; Burkell 2004; Charnock 1998; Fallis and Fricke 2001; Kim et al. 1999) and standards for online health-related information have been proposed (e.g., eEurope 2002; Health on the Net Foundation 2010; MedlinePlus 2010; Silberg et al. 1997). However, the utility of the assessment tools and quality standards has been questioned (Bernstam et al. 2005; Gagliardi and Jadad 2002; Jadad and Gagliardi 1998; Kunst et al. 2002) and neither has been systematically applied to websites containing information on ASDs.

Given the widespread availability and frequent use of the World Wide Web by parents of children with ASDs, evaluation of the quality of autism websites is needed. We sought to evaluate the quality of autism websites in two recent studies. In the first study, we evaluated nine characteristics of the most highly ranked websites when a keyword search of the term “autism” was conducted using three popular search engines. Although the first study provided a snapshot of website characteristics, it did not assess quality. We therefore conducted a follow-up study in which we selected 30 websites using the Google search engine and conducted an online survey of autism experts that evaluated the websites’ characteristics and quality.

Study 1

Method

Sample

The sample for Study 1 consisted of the top 100 websites located when “autism” was entered into the Google (http://www.google.com), Yahoo (http://www.yahoo.com), and Bing (http://www.bing.com) online search engines on July 4, 2009. We chose these three search engines because they had the largest US market share, at 64.7, 19.3, and 8.9%, respectively (comScore 18 August 2009), when our search was conducted. We included only the ranked sites; i.e., sponsored links or sites were not included. We found overlap between search engines (e.g., Wikipedia was ranked on all three search engines) and some common website domains appeared multiple times within a search engine’s top 100 query (e.g., Wikipedia page on autism, Wikipedia page on the causes of autism). To resolve the overlap, we randomly selected one uniform resource locator (URL) from the website to serve as a representative sample. Eliminating overlap reduced the final sample from 300 websites to 164 websites, which are shown in Appendix 1.
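The overlap-resolution step can be illustrated with a short sketch. This is not the procedure we ran (the text above specifies only that one URL per website was chosen at random); the grouping by host name, the function name `one_url_per_site`, and the example URLs are assumptions made for illustration.

```python
import random
from urllib.parse import urlparse


def one_url_per_site(urls, seed=0):
    """Group URLs by host name and randomly keep one URL per host.

    Assumption: a "website" is identified by its host name with any
    leading "www." removed. The source does not state how sites were
    matched, so this rule is illustrative only.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_host = {}
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        by_host.setdefault(host, []).append(url)
    # One randomly chosen representative URL for each distinct host.
    return {host: rng.choice(candidates) for host, candidates in sorted(by_host.items())}


# Hypothetical search results pooled across engines.
pooled = [
    "http://en.wikipedia.org/wiki/Autism",
    "http://en.wikipedia.org/wiki/Causes_of_autism",
    "http://www.example.org/autism",
]
sample = one_url_per_site(pooled)
print(len(sample), "unique websites")
```

Applied to the pooled top-100 lists from three engines, a rule of this kind would reduce the 300 records to the set of unique websites.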

Data Collection

We selected, defined, and coded nine website characteristics. We selected these characteristics from a review of other published assessments of health-related websites. Operational definitions for each characteristic and the reference for the assessment in which they were used are shown in Table 1. The nine characteristics included three of the standards of accountability proposed by the editors of the Journal of the American Medical Association (attribution, authorship, currency, and disclaimer; Silberg et al. 1997) and six other characteristics (contact information, promotion of a non-evidence-based treatment, purpose, commercial product or service, reading level, and top-level domain). One trained coder (second author) rated all 164 websites and 40 randomly selected websites (24%) were rated independently by a second trained rater (first author); the average agreement across characteristics was 94%.
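As an illustration of how an average agreement figure of this kind can be computed, the sketch below assumes simple percent agreement (matching codes divided by websites coded) averaged over characteristics. The source does not state the agreement formula, so this choice, the function names, and the example codes are assumptions.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


def mean_agreement(by_characteristic):
    """Average percent agreement across characteristics.

    by_characteristic maps a characteristic name to a pair of code lists,
    one list per coder, in the same website order.
    """
    scores = [percent_agreement(a, b) for a, b in by_characteristic.values()]
    return sum(scores) / len(scores)


# Hypothetical codes for four websites on two characteristics.
example = {
    "authorship": (["yes", "no", "yes", "yes"], ["yes", "no", "yes", "no"]),
    "top_level_domain": ([".org", ".com", ".gov", ".edu"], [".org", ".com", ".gov", ".edu"]),
}
print(mean_agreement(example))  # 75% and 100% average to 87.5
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would be a different calculation.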

Table 1 Operational definitions of website characteristics

Results and Discussion

The characteristics of the 164 websites included in Study 1 are shown in Table 2. Almost 85% of the websites were registered using a .com, .net, or .org top-level domain; only 6% of websites were registered with .gov or .us, and only 5% of websites were registered with .edu. The most frequently coded website purpose was freestanding clinic or organization (38%), followed by individual’s site, forum, or blog (17%), and health information site (15%). Collectively, these top three categories accounted for the purpose of 70% of all websites. Overall, two quality indicators were present on nearly all websites: all information was available without providing personal information and a method of contacting the website was provided. About one-half of the websites were current, provided author information, and/or contained a medical disclaimer, and nearly one-third of the websites contained references. More than one in five websites (21%) offered a product or service for purchase and 17% of the websites promoted a treatment that is not evidence-based. The overall reading level of the websites, as measured by the Flesch-Kincaid grade equivalence, was high (mean = 13, median = 12.8), suggesting 50% of the websites were written at a collegiate level, which is higher than the average reading level of parents in the United States (i.e., 7th–8th grade; Davis et al. 1994), of other health-related websites (e.g., Eysenbach et al. 2002; Ghidella et al. 2005), and of pediatric patient education materials (Davis et al. 1994). While this advanced reading level might be due to the technical nature of information presented on the websites, it might preclude some consumers’ full comprehension of the information.
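For readers unfamiliar with the Flesch-Kincaid grade equivalence, the sketch below applies the standard published formula to word, sentence, and syllable counts. How those counts were obtained for each website is not described above, so the function takes them as inputs; the function name and the example counts are hypothetical.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level from raw text counts.

    Standard formula:
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
print(round(flesch_kincaid_grade(100, 5, 150), 2))  # 9.91
```

Longer sentences and more syllables per word both raise the grade estimate, which is consistent with technical health content scoring at or above a 12th-grade level.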

Table 2 Website characteristics of the 164 Websites Evaluated in Study 1

Study 1 had limitations. First, we only examined the top 100 websites across three search engines. It is not known if this strategy provided a representative sample of all websites, making it impossible to generalize the results of the analysis to the overall population of websites containing information on ASDs. Second, we did not measure the quality of the content on the websites. We acknowledge this was a major limitation that ultimately prevented us from being able to draw conclusions regarding which characteristics were associated with high quality websites. Although previous evaluations of health-related websites have suggested some of the characteristics we measured might be a proxy for quality, we did not feel we could draw conclusions about the quality of the websites from the data we collected, which led us to design and conduct Study 2.

Study 2

Method

Participants

The sample for the online survey comprised the 1,448 individuals whose email addresses were included in the 2009 or 2010 International Meeting for Autism Research (IMFAR) Annual Conference Programs. Ninety-one email addresses were returned as undeliverable and 13 individuals asked to be removed from the survey. Of the remaining 1,344 potential respondents, 299 (22%) participated by evaluating at least one website. We asked eight demographic questions, which are shown in Table 3.

Table 3 Characteristics of survey respondents (N = 299)

Procedures

Website selection. We selected 30 websites in June 2010 that contained information on one or more of the following topics: general characteristics of autism, signs of autism, symptoms of autism, causes of autism, and treatments of autism. First, we printed the top ten results for Google keyword searches of the terms “autism,” “autism spectrum disorder,” “autism vaccine,” “autism causes,” “autism symptoms,” “autism cure,” and “autism treatment.” No university websites were in any of these searches, so we oversampled for university-affiliated sites by selecting the first five university websites containing general autism information (i.e., not a university site describing only research or clinical services) that appeared when searching the term “autism.” Finally, we ensured that our sample contained both ranked sites and sponsored links. The sample size of 30 was chosen to help ensure we would receive at least 15 ratings per website with a 10% response rate. The 30 websites included in Study 2 are shown in Appendix 2.
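The sample-size reasoning can be checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not the planning calculation itself; it assumes that every respondent rates all three assigned websites and that ratings are spread evenly across websites.

```python
def expected_ratings_per_website(invited, response_rate, websites_per_respondent, n_websites):
    """Expected number of ratings each website receives under even allocation."""
    total_ratings = invited * response_rate * websites_per_respondent
    return total_ratings / n_websites


# Figures reported in this study: 1,448 invitations, three websites per
# survey, 30 websites, and a planning assumption of a 10% response rate.
planned = expected_ratings_per_website(1448, 0.10, 3, 30)
print(round(planned, 1))  # about 14.5 ratings per website
```

Under these assumptions a 10% response rate yields roughly 14 to 15 ratings per website, so the target is met only approximately; any higher response rate raises the figure proportionally.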

Survey. Participants were recruited for the online survey through an email solicitation. The online survey was designed and conducted using the Survey Gizmo software. Each participant accessed the survey using a unique hyperlink sent to their personal email address. Each survey contained five pages and queried the participants about three randomly selected websites. The survey remained active for 12 days (Monday August 16 through Friday August 27), during which we sent two reminder emails (one on the 7th day and one on the 10th day).

Each survey contained three pages evaluating three predetermined randomized websites (each website was randomly assigned to one of 30 groups where each website was first, second, and third in different groups). On each website page, there was a text box with information from one website and three questions. The text box contained information on the general characteristics, signs, symptoms, causes, and treatments of autism, which was copied from each website, pasted into a word processor, and stripped of all identifying information (e.g., the name of the website). The de-identified text was then imported into the survey program for each website and is the information about the website that the respondents rated. Thus, respondents were likely blind to the website’s identity.
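One way to construct 30 groups with the stated property (each website appears once in the first, once in the second, and once in the third position) is a cyclic rotation of a shuffled website list. The sketch below illustrates that construction; the text above does not say how the groups were actually built, so the rotation scheme and the function name `build_groups` are assumptions.

```python
import random


def build_groups(websites, per_group=3, seed=0):
    """Return one group per website using a cyclic rotation.

    Group g holds the websites at shuffled positions g, g+1, ..., wrapping
    around, so every website occupies each slot in exactly one group.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    order = list(websites)
    rng.shuffle(order)
    n = len(order)
    return [[order[(g + k) % n] for k in range(per_group)] for g in range(n)]


groups = build_groups(range(1, 31))
print(groups[0])  # the three websites shown, in order, to respondents in group 1
```

Balancing position in this way guards against order effects, because no website is always seen first or always seen last.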

We asked three questions about each website. Participants were required to complete all three questions about each website before moving to the next page. The first question asked the participants to rate the accuracy of the information on a 5-point Likert scale (very inaccurate, mostly inaccurate, neither inaccurate or accurate, mostly accurate, very accurate). The second question asked the participants to rate the currency of the information using a similar 5-point scale. The third question asked participants to whom (i.e., parent, physician, academic, clinician, no one, other) they felt a website containing the information that was provided would be helpful (e.g., to whom they would recommend the website). For this question, participants were able to select multiple choices. The final page thanked the participants for their participation and provided a box in which the respondent could provide written comments.

Website characteristics. Two trained coders (first and third authors) evaluated the nine website characteristics used in Study 1 and five additional characteristics that we created and defined for Study 2 (all characteristics are shown in Table 1). The two coders independently evaluated all 30 websites, and disagreements were resolved through mediation and re-evaluation of the specific website characteristic.

Faculty ratings. Finally, we had two faculty members of the study team (first and last authors) independently rate all 30 websites. The faculty answered the same questions as the online survey respondents. There was one key difference between the faculty ratings and the online survey. The faculty members were not blind to the website’s identity and were not limited to the blinded information presented in the surveys; i.e., the faculty members could view all pages within a website domain.

Survey Data Analysis

The main dependent variables for this study were the survey respondent ratings of website accuracy and website currency. Most participants rated all three websites presented to them; 274 (92%) respondents evaluated all three websites they were presented, 15 (5%) respondents evaluated the first two websites they were presented, and 10 (3%) respondents only completed the first website that was presented. Because each rater only rated three websites, and each rater may have interpreted the rating scale differently, it was necessary to estimate characteristics of the raters. Individual raters may be more harsh or lenient in their ratings of the same website, but the overlap of websites by raters (i.e., the same website was rated by multiple raters) allowed us to place all websites on the same scale. Website accuracy and currency were estimated by using a modified Graded Response Model (Samejima 1997). In this analysis it is easiest to think of websites as students taking a test, and raters as items on the test, with some items being more or less difficult. While not all websites were rated by all raters, the overlap of raters and websites linked all of the websites, allowing us to place them on a common scale. However, because each rater only rated three websites, the data set was very sparse; thus, we used a Bayesian Markov Chain Monte Carlo model in WinBUGS (Spiegelhalter et al. 1999) to estimate the parameters of the data (website quality and rater harshness). The model was based on code from Curtis (2010) and placed the estimates of website accuracy and currency on a continuous scale with a mean of zero rather than on an ordinal 1–5 scale like the original ratings. Because we felt accuracy and currency are both characteristics of high quality and the accuracy estimates and currency estimates were highly correlated, we combined the scores for each website to create a website quality estimate, which was used for all subsequent analyses.
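To make the students-and-items analogy concrete, the sketch below computes rating-category probabilities under the standard graded response model: a website has a latent quality value, and each rater has a discrimination parameter and ordered thresholds, with higher thresholds corresponding to a harsher rater. It shows only the textbook likelihood; the modifications we made and the Bayesian estimation in WinBUGS are not reproduced, and all parameter values are hypothetical.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Probability of each rating category under a graded response model.

    theta          -- latent quality of the website (the "student")
    discrimination -- slope parameter of the rater (the "item")
    thresholds     -- ascending cut points; K - 1 values for K categories
    """
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # P(rating >= k) for k = 1..K, padded with 1 at the bottom and 0 at the top.
    cumulative = [1.0]
    cumulative += [logistic(discrimination * (theta - b)) for b in thresholds]
    cumulative.append(0.0)
    # P(rating == k) is the difference between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical lenient and harsh raters scoring the same website on a 5-point scale.
lenient = grm_category_probs(theta=0.5, discrimination=1.2, thresholds=[-2.5, -1.5, -0.5, 0.5])
harsh = grm_category_probs(theta=0.5, discrimination=1.2, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 2) for p in lenient])
print([round(p, 2) for p in harsh])
```

For the same website quality, the harsh rater places less probability on the top category, which is why rater parameters must be estimated before websites can be compared on a common scale.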

Results and Discussion

The characteristics of the 30 websites were analyzed using descriptive statistics and are shown in Table 4. The 299 respondents provided 858 distinct website ratings; the median number of ratings per website was 29 (range 16–42). We used the website quality estimate to examine relations between website quality and website characteristics. As shown in Table 4, we did not find a statistically significant relation for most characteristics and website quality. Two characteristics, whether the website offered a product or service for sale and the promotion of a non-evidence-based practice, had statistically significant relations to website quality. Websites containing a product or service for sale and websites promoting a non-evidence-based practice were each more likely to have a lower website quality estimate, r_s = −.45, p = .014 and r_s = −.73, p < .001, respectively. For this sample, these two characteristics were highly correlated (r_s = .68). Scatterplots of these relations are shown in Figs. 1 and 2.
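The rank correlations reported above are Spearman coefficients. The sketch below shows how such a coefficient can be computed between a binary website characteristic and a continuous quality estimate, using average ranks for ties. The data are hypothetical and the code is an illustration of the statistic, not the software used for our analyses.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: 1 = sells a product or service, 0 = does not.
sells_product = [0, 0, 1, 1]
quality_estimate = [0.5, 0.9, -0.3, -0.8]
print(round(spearman(sells_product, quality_estimate), 3))  # -0.894
```

A negative coefficient indicates that websites with the characteristic tend to rank lower on the quality estimate.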

Table 4 Website characteristics of the 30 Websites Evaluated in Study 2 and the relation of each characteristic to website quality

Fig. 1 Top. Scatterplot for website quality estimate by whether the website contained a product or service for sale

Fig. 2 Top. Scatterplot for website quality estimate by whether the website promoted a non-evidence-based practice

Although there was not a statistically significant association between top-level domain and website quality (r_s = .32, p = .09), analyses of scatterplot data, which are shown in Fig. 3, suggested the possibility of relations. Further sub-analyses revealed one statistically significant finding. Websites that had a .gov top-level domain were significantly more likely to have a higher website quality estimate than websites with a commercially oriented top-level domain (e.g., .com, .org; t = 2.8, p = .02). Although this same comparison was not statistically significant for websites with a .edu top-level domain (t = .68, p = .52), analysis of the scatterplot (shown in Fig. 3) shows 4 of 5 websites in this sample with a .edu top-level domain had positive website quality estimates, suggesting further evaluation of this relation using a larger sample size is warranted.

Fig. 3 Top. Scatterplot for website quality estimate by the website’s top-level domain

Visual analysis revealed emerging associations for two additional characteristics: seals and purpose. The scatterplots of website quality estimates for seals and purpose are shown in Figs. 4 and 5. Five of six websites with a seal had website quality estimates greater than the mean (see Fig. 4). If this trend were found significant using a larger sample, the result would replicate the finding of Fallis and Fricke (2001), who found websites with a HONcode (Health on the Net Foundation 2010) seal had significantly higher quality than those without. Six of seven websites we coded as a health information site had website quality estimates greater than the mean (see Fig. 5). Although these relations were not statistically significant, they were measured in a study with a very small sample and further evaluation of these relations is warranted.

Fig. 4 Top. Scatterplot for website quality estimate by whether the website contained a seal

Fig. 5 Top. Scatterplot for website quality estimate by website purpose. Key: 1 General informational site, 2 government, 3 university, 4 individual’s site/forum/blog, 5 clinic or organization, 6 health information site

Respondents also indicated the type of individual (i.e., parent, clinician, teacher, therapist, physician, academic, no one) to whom they would recommend the blinded website information they rated (i.e., participants indicated to whom they would recommend the information they were presented, not a specific website per se). The top three website recommendations for each type of individual are shown in Table 5. The two most frequently recommended websites were The Association for Science in Autism Treatment and Wikipedia, which were recommended to four and three groups of individuals, respectively. Websites of government agencies were also frequently recommended to parents; two of the top three recommended websites for parents were government agencies. Although the information contained in Wikipedia was rated highly in our study, we note that the accuracy and utility of Wikipedia have been shown to be mixed in other health research (e.g., Clauson et al. 2008; Laurent and Vickers 2009) and we urge caution when using Wikipedia and other user-edited sites as resources.

Table 5 Most recommended websites shown as the percentage of respondents recommending the blinded website information presented in the online survey by the type of individual to whom survey respondents would recommend the information

Most analyses of health-related website content and quality are conducted by one or two individuals (see Eysenbach et al. 2002). We included faculty ratings to compare this more common methodology (i.e., having one or two individuals assess website quality) to our online survey. The mean faculty ratings were highly correlated with the online survey ratings; r_s = .79, p < .001. Although conducting a large survey might provide better estimates, it is a time- and labor-intensive process. The high correlation between the faculty ratings and online expert ratings suggests that having each website rated by a smaller number of individuals can provide similar results to mass surveys and might be a more efficient method for future research.

Although we addressed the major limitation of our first study (i.e., not measuring website quality), Study 2 had limitations. First, although the 22% response rate of the survey is lower than that of many surveys, it did provide us with the amount of data we needed to conduct our analyses. Second, although we opted to have a small sample of websites (30) to help ensure obtaining multiple ratings for each website, it nonetheless limits our ability to draw conclusions and generalizations to the larger population of websites containing information on ASDs. Finally, we designed the survey such that each respondent only rated up to three websites to increase the response rate. If participants had rated a greater number of websites, we could have drawn more definitive conclusions about respondent behavior, which would have increased our confidence in our website quality score estimates.

Conclusions

Having informed consumers is a cornerstone of evidence-based practice (Reichow and Volkmar 2011; Straus et al. 2005). The World Wide Web has the potential to be a great resource for increasing consumer knowledge, especially given the finding suggesting that searching for health-related information is one of the most common uses of the World Wide Web (Fox and Fallows 2003). Our initial hope was that our research would help us identify website characteristics that could lead consumers to high-quality websites with information on ASDs. However, finding high quality information may not be an easy task, which is evidenced by previous work suggesting consumers have limited trust in search engines (Khoo et al. 2008), difficulty finding trustworthy sites (Khoo et al. 2008; Wainstein et al. 2006) and difficulty understanding website content (Wainstein et al. 2006). Based on our collective experiences evaluating autism websites over the past 2 years, we feel consumers must exercise great caution when using the World Wide Web to obtain information on ASDs and strongly recommend that consumers use the World Wide Web as a supplement to, not a replacement for, information they obtain from professionals (e.g., pediatrician, psychologist, psychiatrist, special educator).

In Study 2, we found three positive associations between website characteristics and website quality. First, websites from universities and government agencies (which in the United States have a top-level domain of .edu or .gov, respectively) appear more likely to contain higher quality information (see Fig. 3). This is not surprising to us since many government and university websites are likely to have standards and/or procedures for placing information online that must be followed (e.g., institutional review board, publication policies). Although this association might seem obvious, our data are the first to ascertain such a relation for autism websites. Second, websites with a seal (e.g., HONcode, Utilization Review Accreditation Commission [URAC]) appear more likely to contain high quality information (see Fig. 4). However, consumers must still use caution since many seals can be gained without an assessment of the quality of information contained on the website and might be misleading (Burkell 2004; Gagliardi and Jadad 2002; Jadad and Gagliardi 1998). Finally, websites that we coded as being a health information site appear to have higher quality than websites with more general purposes (e.g., individual’s site/forum/blog, news site; see Fig. 5). While this last association is encouraging given that these websites are often created for consumers, it is a category that we created and there is not always an obvious indication that a site is what we consider a “health information site.” Thus, the utility of this finding and characteristic might be less than that of the top-level domain and seals, which are more obvious characteristics. Although we are very encouraged by the findings from our research, one must remember that these recommendations are based on data from a small sample of websites and should be considered tenuous until further analysis of these characteristics can be carried out.