Introduction

The validity of a distinction between various autism spectrum disorders or pervasive developmental disorders (PDD) has been questioned (Beglinger & Smith, 2001; Buitelaar, Van der Gaag, Klin, & Volkmar, 1999; Fein et al., 1999; Mahoney et al., 1998; Pomeroy, 1998; Prior et al., 1998; Schopler, 1996; Volkmar et al., 1994). The Diagnostic and Statistical Manual of Mental Disorders, 4th edition, Text Revision [DSM-IV-TR] (American Psychiatric Association [APA], 2000) includes several subtypes of PDD. Three subtypes can be associated with normal intelligence and intact formal language skills: Asperger syndrome (AS), the so-called high-functioning subgroup of autism (HFA), and pervasive developmental disorder not otherwise specified (PDDNOS). HFA is characterized by a triad of symptoms: (1) qualitative impairment in social interactions, (2) qualitative impairment in communication, and (3) restricted, repetitive and stereotypic patterns of behaviors, interests and activities. Children with AS show no clinically significant general delay in cognitive and language development. Children with PDDNOS fit within the autistic spectrum, but do not meet full criteria for the three domains (Filipek et al., 1999).

Several studies have compared subjects with HFA and AS in their behavioral characteristics, symptom patterns, and cognitive or neuropsychological profiles. Children with HFA often show higher rates of speech delay and deviant language (i.e., delayed echolalia, pronoun reversal, unusual intonation, little or no reciprocal verbal exchange, and use of neologisms) than children with AS (Eisenmajer et al., 1996; Gilchrist et al., 2001; Kugler, 1998; Twachtman-Cullen, 1998). However, subjects with AS are reported to have a more pedantic style of speech (i.e., marked verbosity with lengthy speech or incessant monologues) than subjects with HFA (Ghaziuddin & Gerstein, 1996; Kugler, 1998). Children with AS exhibit fewer repetitive behaviors (e.g., insistence on sameness, stereotypies, rituals), but more abnormal preoccupations, all-absorbing interests, and higher rates of motor problems (e.g., clumsiness, manual speed, dexterity, coordination and balance problems) and anxiety (e.g., Gillberg, 1989; Klin & Volkmar, 1997; McLaughlin-Cheng, 1998; Szatmari, Bartolucci, & Bremner, 1989). Szatmari, Archer, Fisman, Streiner, and Wilson (1995) found that subjects with HFA scored worse than subjects with AS on many PDD symptoms (e.g., reciprocal social interaction, rituals, and resistance to change), adaptive behaviors, and cognitive measures of language competence, but not on aspects of nonverbal communication and cognition, or motor development. However, other researchers found little or no difference between children with HFA and AS (Ghaziuddin, Butler, Tsai, & Ghaziuddin, 1994; Manjiviona & Prior, 1995, 1999; Ozonoff, Rogers, & Pennington, 1991). Ozonoff, South, and Miller (2000) argued that HFA and AS involve the same fundamental symptomatology, but differ primarily in degree or severity of impairment (see also Prior et al., 1998).

Cognitive studies comparing children with HFA and AS have mainly focused on three approaches. First, there is Theory of Mind (ToM), which refers to the ability to mentalize or to attribute mental states to others and to oneself (Baron-Cohen, 1995). The ToM explanation of autism suggests that children with autism lack the ability to think about thoughts, and so are specifically impaired in social, communicative, and imaginative skills (Happé, 1994). Of interest for present purposes is how ToM tasks differentiate HFA and AS. Individuals with AS passed first- and second-order ToM tasks better than those with HFA (Ozonoff et al., 1991; Ozonoff & McEvoy, 1994; Ziatas, Durkin, & Pratt, 1998). This may represent a true, distinguishing characteristic between HFA and AS (Twachtman-Cullen, 1998). However, Frith (1991) argued that, although individuals with AS may be superior to those with HFA on ToM tasks, they nevertheless show ToM deficits at higher levels of complexity because they may apply alternative strategies in solving ToM tasks (see also Bailey, Phillips, & Rutter, 1996; Bowler, 1992). Dahlgren and Trillingsgaard (1996), who failed to find differences in ToM performance between HFA and AS, suggested that verbal intelligence, rather than diagnosis, is a crucial predictor for solving ToM tasks. It has been suggested that both groups may show equal ToM problems in real-life social situations (Frith, 1991; Happé, 1994).

A second approach that has been used is executive functioning (EF; Ozonoff, 1997; Russell, 1997). Executive functions (EFs) are mental control processes that enable self-control necessary for the attainment of a future goal (Denckla, 1996; Lezak, 1995; Pennington & Ozonoff, 1996; Welsh & Pennington, 1988). EF refers to cognitive functions mediated by the prefrontal cortex (Becker, Isaac, & Hynd, 1987; Cabeza & Nyberg, 2000; Fuster, 1997; Reitan & Wolfson, 1994; Rezai et al., 1993; Tranel, Anderson, & Benton, 1994), such as inhibition, working memory, cognitive flexibility or set-shifting, planning, and verbal fluency (Ozonoff, 1997; Pennington & Ozonoff, 1996; Reader, Harris, Schuerholz, & Denckla, 1994; Weynandt & Willis, 1994). Multiple studies have identified EF deficits in preschoolers, children, adolescents as well as adults with autism (Geurts, Verté, Oosterlaan, Roeyers, & Sergeant, 2004; Hill, 2004; Ozonoff, 1997; Pascualvaca, Fantie, Papageorgiou, & Mirsky, 1998; Russell, 1997; Shu, Lung, Tien, & Chen, 2001; Turner, 1999a; but see Griffith, Pennington, Wehner, & Rogers, 1999; Ozonoff & Strayer, 1997, 2001; Russell, Jarrold, & Hood, 1999). Most studies have not made a distinction between the different PDD subtypes. Children with autism often have problems with planning, cognitive flexibility, working memory, and verbal fluency. In contrast, it is often found that inhibition is a relatively spared domain in children with autism (Ozonoff, 1997; but see Geurts et al., 2004). Most studies that compared the EF profiles of children with HFA and AS concluded that both groups show relatively equivalent EF profiles. Like children with HFA, many children with AS have problems in planning, cognitive flexibility, and working memory (e.g., Klin, Volkmar, Sparrow, Cicchetti, & Rourke, 1995; Manjiviona & Prior, 1999; Miller & Ozonoff, 2000; Ozonoff et al., 1991, 2000; Szatmari, Tuff, Finlayson, & Bartolucci, 1990). In contrast, Rinehart, Bradshaw, Moss, Brereton, and Tonge (2001) showed that children with HFA have more problems than children with AS in shifting attention.

A third cognitive approach for comparing subjects with HFA and AS has been to use neuropsychological tasks with a nonspecific character. Children with AS have more problems than children with HFA with visual-motor integration, visual memory and visual–spatial perception, nonverbal concept formation, and emotion perception (Klin et al., 1995; Morris et al., 1999). Furthermore, findings generally suggest higher verbal IQ and lower performance IQ in AS, while the opposite trend is often observed in HFA (Ehlers et al., 1997; Klin et al., 1995; Ozonoff et al., 1991; Volkmar et al., 1994). Klin et al. (1995) concluded that there is a marked convergence of the neurocognitive profile of children with AS with the Nonverbal Learning Disability (NLD) profile of deficits and assets. NLD characteristics include: lower performance IQ relative to verbal IQ, poorer performance on visual-spatial tasks than on auditory linguistic tasks, better memory for verbal than nonverbal information, poor and clumsy handwriting, impairments in mathematics, and interpersonal difficulties (Rourke, 1989). Klin et al. (1995) suggested that this pattern of deficits is similar to that seen in AS, but not in HFA, and that AS may reflect a dysfunction in the right hemisphere, whereas a left hemisphere dysfunction was proposed for HFA. However, a study by our research group concluded that the neuropsychological NLD profile resembled the neuropsychological profile of children with AS as well as that of children with HFA (Verté et al., 2001). Still others have not found differences between HFA and AS on some of these non-specific domains (Ozonoff et al., 1991; Szatmari et al., 1990) or found the opposite pattern (Miller & Ozonoff, 2000).

While there are a growing number of studies comparing children with HFA and AS, PDDNOS is a neglected group. Only a few studies compared both subtypes with the PDDNOS subtype. To some degree, a subjective impression of less severe social deficits, better joint attention, and less ritualistic qualities seems to define the PDDNOS group (McArthur & Adamson, 1996; Pomeroy, 1998). Some have argued that high-functioning children with PDDNOS are not distinct from those with AS (Szatmari, 1998; van Engeland, 1996). Mahoney et al. (1998) found that the differentiation of PDDNOS from the other PDD subtypes was no better than chance. Mayes, Volkmar, Hooks, and Cicchetti (1993) reported that items distinguishing children with PDDNOS from those with autism were related to the degree of social relatedness (e.g., abnormal comfort seeking), with children with PDDNOS showing less severe disturbance. Prior et al. (1998) argued that the most salient discriminating features within autistic samples are those related to the ability level of the individual rather than to particular behavioral patterns. There are no studies that have analyzed the neuropsychological profiles of children with PDDNOS. Most researchers do not focus on this subgroup, probably because of a lack of specific diagnostic criteria and the heterogeneity of the diagnosis (Mayes et al., 1993; Walker et al., 2004).

Overall, across studies, findings for a differentiation between the three subtypes are inconclusive. More arguments are needed in favor of a distinction at the behavioral level as well as at the cognitive or neuropsychological level. Differences between studies are to a large extent due to different and often vague or unspecified criteria used to differentiate between the disorders. Furthermore, inconsistencies across studies may be the result of circularity. In some studies children with HFA and AS are compared on measures that are dependent on the diagnostic features used to distinguish both disorders, such as measures of intelligence (Szatmari et al., 1995; Volkmar & Lord, 1998). Although, currently, there is little indication of differential EF profiles within the autistic spectrum, EF measures may provide external discriminant validation, independent of the clinical diagnostic process (Ozonoff et al., 2000).

This study, based on Pennington and Ozonoff’s EF classification (1996, p. 53), used five domains: inhibition, visual working memory, planning, cognitive flexibility, and verbal fluency. Non-specific neuropsychological tasks were also used to investigate other cognitive domains. Some innovations in the current study were made in order to deal with inconsistent findings of previous studies. First, children with HFA, AS, and PDDNOS were compared on a battery of EF tasks covering the five EF domains, while most studies covered only two or three domains. For most tasks, there is evidence that they selectively activate prefrontal functions, both from studies with brain-damaged subjects and from reports using functional magnetic resonance imaging and positron emission tomography (e.g., Gaillard et al., 2000; Riehemann et al., 2001; Rowe, Owen, Johnsrude, & Passingham, 2001).

Second, in this study an extensive selection procedure with explicitly-defined criteria was used to establish specific groups (see Method section). The attempt to distinguish disorders with respect to their EF profile can only be established with thoroughly defined clinical groups (Sergeant, Geurts, & Oosterlaan, 2002).

Third, comorbidity was specifically taken into account, since deficiencies in EF have been linked to other developmental disorders, including attention deficit hyperactivity disorder (ADHD), obsessive-compulsive disorder (OCD), and Tourette syndrome (TS; see for reviews Ozonoff, 1997; Pennington & Ozonoff, 1996; Sergeant et al., 2002). All PDD subtypes can co-occur with these disorders (Barnhill & Horrigan, 2002; Caron & Rutter, 1991; Eaves, Ho, & Eaves, 1994; Ehlers & Gillberg, 1993; Fombonne, 1998; McDougle et al., 1995; Ringman & Jankovic, 2000; Volkmar, 1999). Hence, how far earlier findings reflect deficits specific to PDD subtypes or are due to comorbidity with disorders such as ADHD, OCD or TS is unclear. On the one hand, a diagnosis of autism excludes a comorbid diagnosis of ADHD or OCD according to the DSM-IV-TR. On the other hand, comorbidity of autism and TS is possible (APA, 2000). Hence, it is important to control for the contribution of ADHD and OCD characteristics, and to exclude TS.

This study had two major aims. The first aim was to determine the EF profile of children with HFA, AS, and PDDNOS compared to each other and compared to normal controls. It was investigated if and to what extent the three subtypes could be distinguished on the EF domains. Because of inconsistent findings, it is still unclear whether children with HFA and AS can be distinguished in their EF profile. However, both groups are expected to show more problems than the NC group for most EF domains. Since children with PDDNOS have subthreshold PDD characteristics, we speculated that these children have more EF difficulties than the NC group, but less dysfunction than the HFA and AS groups. If differential EF profiles within the autistic spectrum are found, this will provide external validation for a differentiation between higher-functioning subtypes. The second aim was to investigate the contribution of comorbid symptoms of ADHD and OCD in HFA, AS, and PDDNOS, with respect to EF.

Method

Participants

Four groups of children participated in this study: 50 children with HFA, 37 children with AS, 25 children with PDDNOS, and 47 normal control children (NC). All children were in the age range of 6–13 years.

Before participation, parents were informed about the aims of the study and received a full description of it, and written consent was obtained. Only children with a clinical diagnosis based on a multidisciplinary assessment and children who did not use medication (or used medication that could be discontinued, e.g., methylphenidate) participated in the study. Children were excluded if parents reported a history of epileptic seizures or if children fulfilled the diagnostic criteria for TS (APA, 2000). A three-stage selection procedure was used. At the first stage, parents and teachers were asked to complete questionnaires in order to obtain a broad view on the overall functioning of the child. Parents completed the Children’s Communication Checklist (CCC; Bishop, 1998; Dutch translation: Hartman et al., 1998), the Disruptive Behaviour Disorder rating scale (DBD; Pelham, Gnagy, Greenslade, & Milich, 1992; Dutch translation: Oosterlaan, Scheres, Antrop, Roeyers, & Sergeant, 2000), the Leyton Obsessional Inventory-Parent Version (LOI-PV; Berg, Whitaker, Davies, Flament, & Rapoport, 1988; Dutch translation: Scholing & Veenstra, 1997), and the Tourette Syndrome Symptom List (TSSL; Cohen, Leckman, & Shaywitz, 1985; Dutch translation: Buitelaar & Van de Wetering, 1996). Teachers completed the same questionnaires, except the LOI-PV. The questionnaires were used as selection instruments for the NC group only (see below). In the clinical groups, they were used to obtain a description of possible comorbid disorders. Intellectual functioning was assessed at the second stage. Four subtests (Vocabulary, Arithmetic, Picture Arrangement and Block Design) of the Wechsler Intelligence Scale for Children Revised (WISC-R; Van Haasen et al., 1986) were administered. The IQ estimated on the basis of these subtests correlates strongly (r = .93–.95) with full scale IQ (FSIQ; Groth-Marnat, 1997). Children were excluded from the study if their estimated FSIQ was below 80. At the third stage, diagnoses of the children in the clinical groups were validated using the Dutch translation of the Autism Diagnostic Interview-Revised (ADI-R; Lord, Rutter, & Le Couteur, 1994) and the parent version of the Diagnostic Interview Schedule for Children for DSM-IV (DISC-IV; Shaffer, Fisher, Lucas, Dulcan, & Schwab-Stone, 2000; Dutch translation: Ferdinand, Van der Ende, & Mesman, 1998). Group assignment to the clinical samples was based on the assessment of the children in these interviews.

PDD

One hundred and sixty children with a clinical diagnosis of PDD were recruited for participation through rehabilitation centers, special school services, and other agencies specialized in the care of children with autism. Two children were excluded because of a history of epileptic seizures. Eleven children were excluded because of the use of medication that could not be discontinued. Twelve children with an FSIQ below 80 were also excluded. Twenty-three children were excluded because they fulfilled the DSM-IV-TR criteria for TS. Group assignment to the clinical groups was based on the ADI-R scores. The DSM-IV-TR states that a child with AS who also meets criteria for autism should be given a diagnosis of autism (APA, 2000). Often, it is suggested that if this hierarchy rule is applied, a diagnosis of AS becomes unworkable (Eisenmajer et al., 1996; Ghaziuddin, Tsai, & Ghaziuddin, 1992; Manjiviona & Prior, 1995, 1999; Mayes, Calhoun, & Crites, 2001; Miller & Ozonoff, 1997, 2000; Ozonoff et al., 2000; Szatmari, Bryson, Boyle, Streiner, & Duku, 2003). In the present study, following the majority of studies, children with HFA and AS had to meet the criteria for impairment in communication, reciprocal social interaction, and repetitive and stereotyped behaviors. The two groups were differentiated as follows. Children were assigned to the HFA group if they showed definite delays in language development (i.e., no single word speech by 24 months and/or no phrase speech by 36 months). Children were assigned to the AS group if they had no delays in language development, either in their use of words or phrases. These modified criteria for AS appear to be clinically useful (Howlin, 2003; Mahoney et al., 1998; Manjiviona & Prior, 1999). Children were assigned to the PDDNOS group if they met the criteria for only two of the three domains on the ADI-R (i.e., the social and/or the communication, and/or the activities/interests domain). Hence, based on the selection criteria, 50 of the children were assigned to the HFA group, 37 were assigned to the AS group, and 25 were assigned to the PDDNOS group. Most of the children with a clinical diagnosis of PDDNOS also met the criteria for PDDNOS on the ADI-R. All the children with PDDNOS met the cutoff criteria for the social domain and approximately one half also met the criteria for the communication domain, but not for the restricted interests domain, while the other half showed the opposite pattern. These diagnostic profiles are consistent with DSM-IV-TR criteria (APA, 2000).
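The assignment rule and the participant flow described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the procedure actually run on the ADI-R data; the function name `assign_pdd_group` and its boolean arguments are hypothetical simplifications of the ADI-R domain cutoffs and language-milestone items.

```python
from typing import Optional


def assign_pdd_group(social: bool, communication: bool, repetitive: bool,
                     language_delay: bool) -> Optional[str]:
    """Map ADI-R domain outcomes to a study group (illustrative only).

    Each of the first three flags says whether the ADI-R cutoff for that
    domain was met. `language_delay` stands for a definite delay (no single
    words by 24 months and/or no phrase speech by 36 months).
    """
    domains_met = sum([social, communication, repetitive])
    if domains_met == 3:
        # All three domains met: language history separates HFA from AS.
        return "HFA" if language_delay else "AS"
    if domains_met == 2:
        # Criteria met for only two of the three domains.
        return "PDDNOS"
    return None  # Not covered by the rule as stated in the text.


# Participant flow reported in the text.
recruited = 160
excluded = {
    "epileptic seizures": 2,
    "medication not discontinued": 11,
    "FSIQ below 80": 12,
    "Tourette syndrome": 23,
}
retained = recruited - sum(excluded.values())
group_sizes = {"HFA": 50, "AS": 37, "PDDNOS": 25}
```

With the numbers reported in the text, `retained` equals 112, which matches the sum of the three clinical group sizes.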

In the HFA group, 14 met DSM-IV-TR criteria for OCD, 5 for ADHD (combined type), and 4 for both OCD and ADHD (combined type) on the basis of the DISC-IV. In the AS group, 10 met criteria for OCD, 10 for ADHD, and 2 for both OCD and ADHD on the basis of the DISC-IV. In the PDDNOS group, 1 met criteria for OCD, 11 for ADHD, and none for both OCD and ADHD on the basis of the DISC-IV. Besides ADHD as such, we also searched for comorbidity with inattention and hyperactivity/impulsivity separately, because a different distribution may exist for both subtypes of ADHD. The children in the HFA group showed the following profiles: 32 met criteria for inattention and 12 for hyperactivity/impulsivity. The following profiles were obtained for the AS group: 27 met criteria for inattention and 15 for hyperactivity/impulsivity. The children in the PDDNOS group showed the following profiles: 22 met criteria for inattention and 13 for hyperactivity/impulsivity. Chi-square tests were performed to analyze if the distribution of comorbidity (i.e., ADHD, inattention, hyperactivity/impulsivity, and OCD) was equal within the different groups. The three PDD subgroups did not differ in terms of the distribution of cases with comorbid ADHD (χ²(2) = 1.97, ns) and inattention (χ²(2) = 1.10, ns), but did differ for comorbid hyperactivity/impulsivity (χ²(2) = 5.92, p = .05) and OCD (χ²(2) = 9.15, p = .01).
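As an illustration of the kind of chi-square test described here, the following Python sketch computes a Pearson chi-square statistic for a 3 × 2 table of OCD cases by group. It rests on an assumption that is not stated in the text: children meeting criteria for both OCD and ADHD are counted as OCD cases, alongside those meeting criteria for OCD alone. The helper `pearson_chi_square` is written for this example and uses only the standard library.

```python
import math


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


group_sizes = {"HFA": 50, "AS": 37, "PDDNOS": 25}
# Assumption: "OCD only" plus "both OCD and ADHD" counts from the text.
ocd_cases = {"HFA": 14 + 4, "AS": 10 + 2, "PDDNOS": 1 + 0}

# Rows are groups; columns are (OCD, no OCD).
table = [[ocd_cases[g], group_sizes[g] - ocd_cases[g]] for g in group_sizes]
chi2, df = pearson_chi_square(table)

# For exactly 2 degrees of freedom the chi-square upper-tail probability
# has the closed form exp(-x / 2); this shortcut is not valid for other df.
p_value = math.exp(-chi2 / 2)
```

Under that counting assumption the statistic comes out at roughly 9.15 with 2 degrees of freedom and a p value of about .01, in line with the OCD result reported above. The other comparisons depend on how overlapping cases were counted, which the text does not specify, so they are not reproduced here.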

NC

Parents of 63 children from three regular schools consented to their children's participation in the study. Children were excluded from the study if (a) the parent or the teacher stated that the child had ever had a clinical diagnosis or used medication that could not be discontinued, (b) their FSIQ estimate was below 80, (c) the score on one of the four scales of the DBD exceeded the 80th percentile, (d) the pragmatic composite score on the CCC fell within two standard deviations of the mean score of the HFA group, (e) tic symptoms were reported on the TSSL, or (f) the teacher refused to complete the questionnaires. Forty-seven children fulfilled the inclusion criteria for the NC group.

Neuropsychological Measures

Both EF and non-specific tasks were administered in this study (see Table I for an overview). The EF tasks were selected to measure the domains of EF as suggested by Pennington and Ozonoff (1996, p. 53). The commonly used dependent variables were selected for each task. Since EF tasks are never pure measures of a single EF domain (e.g., Ozonoff, 1997), more than one task was included for some domains (e.g., inhibition, cognitive flexibility, and verbal fluency) to ensure that the domain was adequately covered.

Table I. Overview of Tasks and their Dependent Variables

EF Tasks and Dependent Measures

Change Task (De Jong, Coles, & Logan, 1995; Logan & Burkell, 1986; Oosterlaan & Sergeant, 1998). The change task was included to measure: (1) inhibition of a prepotent response, and (2) cognitive flexibility. Several studies have found that the right prefrontal cortex is involved while performing the stop signal task (Logan, 1994), a variant of the change task (e.g., Rubia et al., 1999). The task consisted of two types of trials (go trials and stop trials) that were presented in blocks of 64 trials. Go trials required children to locate the position of an aircraft that was displayed to the left or right of a fixation point on a computer screen by pressing a left or right button. Stop trials were identical to go trials, but in addition an auditory stop signal was presented, which directed children to (a) inhibit their response, and (b) immediately perform a different response, the change response (i.e., pressing a third button). Oosterlaan and Sergeant (1998) provided a detailed description of the change task used in this study. The following EF measures were derived from this task: (1) Stop Signal Reaction Time (SSRT), a measure of the latency of the inhibitory process, (2) Change Mean Reaction Time (MRT) as a measure of the latency of the set-shifting process, and (3) accuracy of cognitive flexibility (set-shifting) as measured by the number of change response errors.

Circle Drawing Task (Bachorowski & Newman, 1985, 1990). The Circle Drawing Task was used as a measure of inhibition of an ongoing response. The task consisted of a large circle with the words “start” and “stop” indicating the starting and the finishing point of the tracing. The task was administered under two conditions: first with neutral instructions (“trace the circle”) followed by inhibition instructions (“trace the circle again, but this time as slowly as you can”). The dependent variable in this task was the time used to trace the circle in the inhibition condition minus the tracing time in the neutral condition. The greater the inhibition time, the better a participant was able to inhibit (slow down) a continuous tracing response.
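The dependent variable is a simple difference score. A minimal Python sketch, with a hypothetical function name and made-up timings in seconds, might look like this:

```python
def inhibition_time(neutral_seconds: float, slow_seconds: float) -> float:
    """Tracing time under inhibition instructions minus neutral tracing time."""
    return slow_seconds - neutral_seconds


# Made-up example: 12.0 s in the neutral condition, 47.5 s when tracing slowly.
example_inhibition_time = inhibition_time(12.0, 47.5)
```

A larger value indicates that the child slowed the tracing response more in the inhibition condition.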

Test of Everyday Attention for Children, Subtest Opposite Worlds (TEA-Ch; Manly et al., 2001). The TEA-Ch was used as a measure of inhibition (interference control). In this test, the child was required to inhibit an automatic or prepotent verbal response. In the neutral condition, the child had to name the digits 1 and 2 that were scattered along a path. In the suppression condition, the child was required to say “one” on seeing a “2” and “two” on seeing a “1”. The dependent variable was the difference between the mean time required to complete two neutral conditions and two suppression conditions.
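The difference score can be sketched as follows. The text does not state the direction of the subtraction, so this example assumes suppression minus neutral, which makes larger values indicate a greater interference cost; the function name and timings are hypothetical.

```python
from statistics import mean


def opposite_worlds_cost(neutral_times, suppression_times):
    """Mean suppression time minus mean neutral time (assumed direction)."""
    return mean(suppression_times) - mean(neutral_times)


# Made-up completion times in seconds for two runs of each condition.
example_cost = opposite_worlds_cost([20.0, 22.0], [27.0, 29.0])
```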

Self-Ordered Pointing Task, Abstract Designs (SoP; Petrides & Milner, 1982). The SoP was included to measure visual working memory. The SoP is one of the rare tests that have been validated as a relatively selective frontal cortex measure, especially activating the mid-dorsolateral frontal cortex (Petrides, Alivisatos, Evans, & Meyer, 1993). In this task, the children were presented with four series of cards containing respectively 6, 8, 10, and 12 abstract designs. For each series, the children were shown one card at a time (the positions of the designs varied randomly) and were instructed to point to a different design on each of the cards. The dependent variable in this task was the number of errors (i.e., the number of times a design was responded to more than once). Furthermore, the difficulty level (6, 8, 10, or 12 items) was taken into account. It was expected that there would be a linear relationship between the difficulty level and the number of errors. Therefore, the number of errors was measured for each level of difficulty. It was expected that, if children have a deficit in visual working memory, the number of errors would increase with greater difficulty compared to children without a visual working memory problem.
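One way to read the error definition is that every response to a design already chosen earlier in the same series counts as one error. The Python sketch below implements that reading; it is an assumption about the scoring rule, and the function name and the example responses are made up for illustration.

```python
def sop_errors(choices):
    """Count responses to a design that was already chosen in this series."""
    seen = set()
    errors = 0
    for design in choices:
        if design in seen:
            errors += 1
        else:
            seen.add(design)
    return errors


# Made-up responses, keyed by the number of designs in the series.
responses_by_level = {
    6: [0, 3, 1, 3, 5, 0],
    8: [2, 7, 4, 1, 0, 6, 3, 5],
}
errors_by_level = {level: sop_errors(choices)
                   for level, choices in responses_by_level.items()}
```

Keeping the error count per difficulty level, as in `errors_by_level`, preserves the within-subject structure needed to examine how errors change with series length.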

Tower of London (ToL; Krikorian, Bartok, & Gay, 1994). The ToL was selected to tap planning (Shallice, 1982). Materials and procedures for administration and scoring were derived from Krikorian et al. (1994). Several studies suggest that ToL performance activates the frontal cortex, especially the left frontal cortex (e.g., Baker et al., 1996; Dagher, Owen, Boecker, & Brooks, 1999; Levin, Mendelsohn, Lilly, & Fletcher, 1994; Rowe et al., 2001). Starting from a fixed arrangement of three colored balls (red, blue, and yellow) on two of three pegs, the child was required to copy a series of depicted end-states by rearranging the balls. Twelve problems of graded difficulty were presented, with a maximum of three trials allowed to solve each problem. Three measures were derived. The main dependent variable was the ToL score, which was calculated by assigning points based on the number of trials required to solve a problem. There were three difficulty levels. The maximum ToL score for each level of difficulty was 12 points. Total item scores were calculated for the whole test as well as for each of the three difficulty levels. Two temporal measures were derived for the whole test as well as for each level of difficulty: (1) decision time, which is the time between the presentation of a problem and the initiation of the first move of a trial (ball leaves peg), and (2) execution time, which is the time from the initiation of the first move to the completion of the final move of a trial. These measures were derived for the first attempt on each problem. It was expected that there would be a linear relationship between the difficulty level and the dependent variables.
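The point values are not given in the text. The sketch below assumes 3, 2, and 1 points for a solution on the first, second, and third trial, and 0 points for an unsolved problem, with four problems per difficulty level. That scheme is an assumption, chosen because it is consistent with twelve problems, three levels, and a maximum of 12 points per level; the function name is hypothetical.

```python
# Assumed points for the trial on which a problem was solved.
POINTS_BY_SOLVING_TRIAL = {1: 3, 2: 2, 3: 1}


def tol_score(solving_trials):
    """Sum points over problems; None marks a problem left unsolved."""
    return sum(POINTS_BY_SOLVING_TRIAL.get(trial, 0) for trial in solving_trials)


# Made-up outcomes for the four problems of one difficulty level.
level_score = tol_score([1, 1, 2, None])
max_level_score = tol_score([1, 1, 1, 1])
```

Under these assumptions a level solved entirely on first attempts yields 12 points, the per-level maximum stated in the text.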

Wisconsin Card Sorting Test (WCST; Grant & Berg, 1948; Heaton, Chelune, Talley, Kay, & Curtiss, 1993; Heaton, 1981). The WCST is a widely used measure to tap cognitive flexibility or set-shifting. In this study, the paper and pencil version of Grant and Berg (1948) was used (see Heaton et al., 1993 and Heaton, 1981). Several studies have found that WCST performance activates the right dorsolateral frontal cortex (e.g., Berman et al., 1995; Lombardi et al., 1999; Riehemann et al., 2001). The dependent variable was the percentage of perseverative responses. This percentage was calculated from the number of trials in which the child continued sorting by a previously correct category despite negative feedback, and the total number of cards the child needed to complete the task. A computer based scoring program was used to calculate the dependent variables (Harris, 1990).

Verbal Fluency (Benton & Hamsher, 1978). An adaptation of the Controlled Word Association Task was used to measure the capacity to generate novel responses. Several studies have shown that verbal fluency tends to activate the left prefrontal cortex (e.g., Frith, Friston, Liddle, & Frackowiak, 1991; Gaillard et al., 2000; Phelps, Hyder, Blamire, & Shulman, 1997; Schlosser, Aoyagi, Fulbright, Gore, & McCarthy, 1998). The children were required to name as many examples of a particular category as possible within one minute. The dependent measures in this task were the total number of admissible words across the semantic categories “animals” and “food”, as well as across the letter categories K and M.

Non-specific Neuropsychological Tasks and Dependent Measures

Change Task, Primary Part (De Jong et al., 1995; Logan & Burkell, 1986; Oosterlaan & Sergeant, 1998). The change task was also included to measure the response execution process. These measures were calculated from the primary part of the task (i.e., a two-choice reaction time task that consisted of go trials only). The following response execution measures were derived: (1) MRT, a measure of the latency of the response execution process, (2) variability in the latency of the response execution process (response variability), and (3) accuracy of responding as measured by the number of errors on the go trials (including both omission errors and commission errors).

Benton Visual Retention Test (BVRT; Sivan, 1992). The BVRT measures visual–spatial abilities and visual short-term memory. This task was included to control for visual short-term memory in the SoP. The BVRT consists of ten designs, each containing one or more figures. The child was required to reproduce the designs immediately after they were presented for 10 seconds. The number of correct designs was the dependent measure (Lezak, 1995; Sivan, 1992).

Corsi Block Tapping Test (Corsi; Lezak, 1995; Milner, 1971; Schellig, 1997). The Corsi measures visual–spatial memory span (Berch, Krikorian, & Huha, 1998; Della Sala, Gray, Baddeley, Allamano, & Wilson, 1999; Lezak, 1995). The Corsi was included to control for visual short-term memory in the SoP. In this task, the child began by copying a 3-block item. The number of blocks per item was increased by one after a particular difficulty level was successfully completed. There were three trials for each difficulty level. Schellig (1997) provided a detailed description of this task. The dependent variable was the visual memory span of the child, which was defined as the difficulty level for which the child was able to finish at least two trials successfully.
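The span definition can be sketched as follows. The example assumes that testing stops at the first level on which fewer than two of the three trials are correct and that the span is the last level passed; the stopping rule is an assumption (Schellig, 1997, gives the actual procedure), and the function name and data are hypothetical.

```python
def corsi_span(results_by_level):
    """Return the last level, in ascending order, with at least two correct trials.

    `results_by_level` maps the number of blocks in an item to a list of
    three booleans, one per trial. Returns 0 if the first level is failed.
    """
    span = 0
    for level in sorted(results_by_level):
        if sum(results_by_level[level]) >= 2:
            span = level
        else:
            break  # Assumed discontinuation rule.
    return span


# Made-up results for one child.
example_span = corsi_span({
    3: [True, True, True],
    4: [True, False, True],
    5: [False, True, False],
})
```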

Categories of the Snijders–Oomen Nonverbal Intelligence Test Revised (SON-R; Snijders, Tellegen, & Laros, 1989; Tellegen & Laros, 1993). The subtest Categories is one of the subtests of the SON-R and measures semantic memory and the ability to categorize. This task was included to control for semantic memory capacity in verbal fluency. In previous research the fluency task has not only been used for tapping EF, but also as a semantic memory task (e.g., Elwood, 1997; Rosen, 1980). The task was also included to control for the ability to categorize, which is required in the WCST (Grant & Berg, 1948; Heaton, 1981). In Categories, the child was first shown three pictures and had to decide what they have in common. Next, five pictures were presented to the child and the child was required to choose those two pictures that depict the same semantic concept. The dependent variable was the number of correct items.

Beery Visual Motor Integration (Beery-VMI; Beery, 1997). The Beery-VMI was designed to assess visual-motor integration or the degree to which visual perception and finger-hand movements are co-ordinated. The task consists of 27 geometric forms of increasing complexity presented on paper. The child was required to copy these forms. The Beery standard score was used as the dependent variable.

Procedure

Once written consent had been obtained from the parents, families were contacted by phone and appointments were made for the diagnostic interviews and the neuropsychological testing of the children. The screening questionnaires for the parents and teachers were sent by mail. All the children were tested individually on three different occasions. During the first session, the WISC-R was administered. During the second and third sessions, the neuropsychological measures were administered. The tests were administered in a fixed order.

For the clinical groups, testing took place at the university or in the setting where the children were treated. Controls were tested at school during class hours. Twenty-three children from the clinical groups were on methylphenidate, but discontinued medication at least 20 h before testing (Barkley, DuPaul, & Connor, 1999) allowing for a complete wash-out (Greenhill, 1998). The children discontinued the use of methylphenidate after their morning dose on the day before testing. Each session ended with a small reward for the child. For practical reasons, for some children the WISC-R was administered during the third instead of the first session or testing took place before the diagnostic interviews.

Statistical Analyses

Six group contrasts were analyzed: (1) HFA versus NC, (2) AS versus NC, (3) PDDNOS versus NC, (4) HFA versus AS, (5) HFA versus PDDNOS, and (6) AS versus PDDNOS. The alpha level was adjusted to compensate for the number of comparisons made. For each contrast, alpha was set at .01.
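The six contrasts are simply all unordered pairs of the four groups. The sketch below enumerates them and, for comparison only, shows what a strict Bonferroni division of a family-wise alpha of .05 would give. The text does not name the adjustment procedure, so the Bonferroni figure and the variable names are assumptions rather than the authors' method.

```python
from itertools import combinations

groups = ["HFA", "AS", "PDDNOS", "NC"]
contrasts = list(combinations(groups, 2))  # every unordered pair of groups

family_alpha = 0.05                               # conventional family-wise level (assumption)
bonferroni_alpha = family_alpha / len(contrasts)  # strict Bonferroni, for comparison only
study_alpha = 0.01                                # per-contrast level stated in the text

print(len(contrasts))
print(round(bonferroni_alpha, 4))
```

A strict Bonferroni correction would give roughly .008 per contrast, so the .01 level used here is slightly more lenient than that benchmark.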

First, the EF and non-specific measures were analyzed using ANOVAs with group (4 levels) as the between-subject factor. MANOVAs were conducted instead of ANOVAs when a task had more than one dependent variable (i.e., the cognitive flexibility and response execution measures of the change task, and the fluency measures). Repeated measures analyses were performed for the SoP and the ToL with group (4 levels) as the between-subject factor. The within-subject factor for the SoP was the level of difficulty (4 levels: 6, 8, 10, or 12 items), with the number of errors as the dependent measure. The within-subject factor for the ToL was the level of difficulty (3 levels: 2/3, 4, or 5 moves), analyzed for three dependent measures: total score, decision time at the first attempt, and execution time at the first attempt.

Second, groups were compared, while covarying for FSIQ and age. FSIQ was covaried because there were significant group differences for FSIQ. Age was covaried because EFs are still developing during the age range in this study, and this might influence the outcome despite the fact that there were no group differences for age.

Third, ADHD and OCD characteristics were controlled for because these characteristics are frequently associated with PDD. Furthermore, in this study, the two characteristics (i.e., hyperactivity and OCD) were not always equally distributed across the groups. Therefore, a canonical correlation analysis was performed. This analysis makes it possible to investigate the relationship between two sets of dependent variables (Stevens, 1996; Tabachnick & Fidell, 1996). On the one hand, for each neuropsychological measure, a z-score was calculated and for each domain (i.e., inhibition, visual working memory, cognitive flexibility, planning, verbal fluency, and the non-specific domain) an aggregate score was obtained by averaging the z-scores of the dependent variables of that domain. Six neuropsychological measures (i.e., one aggregate measure for each domain) were entered into the analysis. On the other hand, 3 behavioral measures were used (i.e., PDD, ADHD, and OCD). A composite of the parent CCC pragmatic score and the subscales Social Relations and Interests was computed to measure PDD characteristics. Hence, the triad of characteristic symptoms of PDD was covered. The mean score of the parent DBD (combination of the attention and the hyperactivity/impulsivity subscales) measured ADHD characteristics. The LOI-PV composite score assessed OCD characteristics. The psychometric qualities of these behavioral measures have been well established (Bishop & Baird, 2001; King, Inglis, Jenkins, Myerson, & Ollendick, 1995; Oosterlaan et al., 2000).
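The aggregation step described above (a z-score per measure, then the average of the z-scores within a domain) can be sketched as follows. The data, the measure names, and the use of the population standard deviation are assumptions made for illustration. The text also does not say whether measures scored in opposite directions were reversed before averaging, so the sketch leaves that step out.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 (population SD, an assumption)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


def domain_aggregate(measures):
    """Average the z-scores of several measures, child by child."""
    standardized = [z_scores(scores) for scores in measures.values()]
    return [mean(child) for child in zip(*standardized)]


# Hypothetical raw scores for four children on two measures of one domain.
planning = {
    "measure_a": [10, 12, 14, 16],
    "measure_b": [2.0, 3.0, 4.0, 5.0],
}
print(domain_aggregate(planning))
```

Each child ends up with one aggregate score per domain, and these six aggregates are what would enter the canonical correlation analysis as the neuropsychological set.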

Missing Data and Outliers

Data were missing for some children due to technical reasons. For each group and for each dependent measure, children with extreme scores (i.e., values more than three box lengths from the upper or lower edge of the box in a boxplot) were identified and removed from the analyses. For the MANOVAs and MANCOVAs, only those children who had extreme scores for more than one of the dependent measures were excluded. The number of missing and/or extreme scores ranged from zero to seven. The distribution of missing data over the groups was as follows: (1) one missing case, and 0 to 2 extreme cases for each dependent variable for the NC group, (2) one missing case, and 0 to 3 extreme cases for each dependent variable for the HFA group, (3) one missing case, and 0 to 3 extreme cases for each dependent variable for the AS group, and (4) no missing cases, and 0 to 2 extreme cases for each dependent variable for the PDDNOS group. Furthermore, missing and extreme cases were evenly distributed across the age groups (range from 0 to 3 in the younger age groups and from 0 to 2 in the older age groups). Younger children achieved an adequate baseline score on the tasks. Overall, missing data and extreme outliers were not more common among younger children or children in a particular diagnostic group.
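The extreme-score rule can be sketched as a filter on the interquartile range, taking the "box" to span the first to the third quartile. The quartile method (`"inclusive"`) and the sample data are assumptions. Statistical packages differ in how they compute the box edges, so borderline cases could be classified differently in the software used for the study.

```python
from statistics import quantiles


def extreme_scores(values, k=3.0):
    """Return values lying more than k box lengths (IQRs) outside the box."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1
    lower, upper = q1 - k * box_length, q3 + k * box_length
    return [v for v in values if v < lower or v > upper]


scores = [10, 11, 12, 12, 13, 13, 14, 15, 40]  # hypothetical data
print(extreme_scores(scores))
```

In this toy data set only the score of 40 would be flagged, since the box runs from 12 to 14 and the upper limit is therefore 20.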

Results

Group Contrasts

Table II provides the ages, gender composition, estimated FSIQs, rating scale, and interview scores for the groups. A chi-square test was performed for gender. Group differences for the other measures were studied with ANOVAs, using an overall alpha level of .05.

Table II. Group Means and Standard Deviations for Gender, Age, IQ, and Rating Scale Scores

The groups did not differ with respect to gender or age. The groups differed with respect to FSIQ and verbal IQ. Children with HFA and PDDNOS had lower IQs than the NC group. The groups did not differ with respect to performance IQ.

In general, the NC group showed significantly fewer difficulties on all rating scale scores in comparison to the three clinical groups. Overall, the clinical groups were clearly distinguishable from the NC group on the basis of the questionnaires.

Parents and teachers rated the HFA group significantly lower, indicating more impairment, on the pragmatic composite score and the Interests scale of the CCC than the PDDNOS group. Furthermore, teachers rated the AS group as more impaired on the Interests scale of the CCC than the PDDNOS group. In line with expectations, the HFA and AS groups showed significantly more symptoms of autism than the PDDNOS group on the ADI-R. The HFA group was rated as having more social problems and more repetitive or stereotyped behavior than the AS group.

The clinical groups could not be differentiated on any of the parent or teacher DBD subscales. The clinical groups did not differ on the behavior disorders section of the DISC-IV. Disruptive behavior problems were approximately the same for all clinical groups.

The clinical groups could not be differentiated on the LOI-PV and on the OCD-scale of the DISC-IV. This indicates that the clinical groups did not differ in their ratings of obsessions and compulsions.

The clinical groups could not be differentiated on the parent or teacher TSSL and on the TS-scale of the DISC-IV. This indicates that the clinical groups did not differ in the number of tic symptoms.

Correlations between Dependent Measures

Correlations between the dependent measures were interpreted as follows: r = .10 is low, r = .30 is moderate, and r = .50 is high (Cohen, 1988). The mean correlation between the dependent variables of the EF tasks was moderate (r = .31, range r = |.01|–r = |.62|). This implies that tasks within the EF domain share some variance. The mean correlation between the dependent variables of the non-specific tasks was moderate (r = .45, range r = |.28|–r = |.72|). This indicates common variance between the non-specific variables. The mean correlation between the EF variables and non-specific variables was low (r = .27, range r = |.15|–r = |.60|), indicating that some distinction between the EF domain and the non-specific domain was possible. However, the pattern of correlations did not show unambiguously that the EF and non-specific domains were independent, because the mean correlation between the EF and non-specific measures was not significantly different from the mean correlation among the EF measures themselves. This comparison was performed with the formula provided by Hays (1981).
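Cohen (1988) offers benchmark values rather than intervals, so any labeling rule is a convention. The sketch below treats each benchmark as a lower bound, which is an assumption, although it does reproduce the labels used in this section. The function name and the "negligible" label are hypothetical.

```python
def cohen_label(r):
    """Label a correlation by the largest Cohen (1988) benchmark it reaches."""
    size = abs(r)
    if size >= 0.50:
        return "high"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "low"
    return "negligible"


for name, r in [("EF", 0.31), ("non-specific", 0.45), ("EF vs non-specific", 0.27)]:
    print(name, cohen_label(r))
```

With this rule the mean correlation of .27 between the EF and non-specific variables falls just below the moderate benchmark, which is worth keeping in mind when reading it as "low".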

HFA, AS, PDDNOS, and NC Group Comparisons

Table III presents the results of the EF measures. Table IV provides the results of the non-specific measures. Table V gives the results of the repeated measures for both the SoP and the ToL. As mentioned earlier, for each measure, six group contrasts were analyzed: (1) HFA versus NC, (2) AS versus NC, (3) PDDNOS versus NC, (4) HFA versus AS, (5) HFA versus PDDNOS, and (6) AS versus PDDNOS. Below, only significant contrasts will be reported.

Table III. Group Means and Standard Deviations for Executive Function Tasks
Table IV. Group Means and Standard Deviations for Non-specific Neuropsychological Tasks
Table V. Group Means and Standard Deviations for Visual Working Memory and Planning (Repeated Measures)

EF Domains

Inhibition

There was a main effect of group for SSRT, F(3,147) = 6.81, p < .001, η² = 0.12. The three clinical groups had more difficulty in inhibiting a prepotent response than the NC group (p = .001 for the HFA group, contrast 1; p < .001 for the AS group, contrast 2; and p = .01 for the PDDNOS group, contrast 3). The three clinical groups could not be differentiated from one another. The HFA and AS groups remained slower than the NC group after covarying for age and FSIQ (contrasts 1 and 2).
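For readers who want to check the effect sizes against the test statistics, eta squared for a one-way between-subjects effect can be recovered from F and its degrees of freedom. The sketch below assumes the reported values are eta squared from exactly these univariate tests. It reproduces the SSRT value, but other values in this section may differ in the second decimal because the published statistics are rounded.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Recover eta squared from an F ratio in a one-way between-subjects design.

    Uses the identity eta^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    explained = f_value * df_effect
    return explained / (explained + df_error)


# SSRT group effect reported in the text: F(3,147) = 6.81
print(round(eta_squared_from_f(6.81, 3, 147), 2))  # 0.12
```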

There was a significant effect of group for the time used on the Circle Drawing Task, F(3,153) = 3.46, p = .02, η² = 0.06. The HFA group used less time than the NC group (p = .002, contrast 1), and thus had more difficulty in inhibiting an ongoing response. The other contrasts were not significant. Contrast 1 was no longer significant after controlling for age and FSIQ.

There was a significant group effect for the time difference on the TEA-Ch, F(3,155) = 4.79, p = .003, η² = 0.09. The AS group had a larger time difference than the NC group (p < .001, contrast 2), and thus had more problems in interference control. None of the other five contrasts were statistically significant. Contrast 2 remained significant after controlling for age and FSIQ. When age and FSIQ were taken into account, the AS group was also significantly different from the PDDNOS group (contrast 6).

Visual Working Memory

There was a significant group effect for the number of errors committed in the SoP, F(3,154) = 6.96, p < .001, η² = 0.12. The NC group made significantly fewer errors than the HFA (p < .001, contrast 1) and AS groups (p = .01, contrast 2). These contrasts remained significant after controlling for age and FSIQ. Furthermore, when age and FSIQ were taken into account, the HFA group was also significantly different from the PDDNOS group (contrast 5).

A significant interaction between group and increasing difficulty was found, F(9,373) = 4.17, p < .001, η² = 0.08. The contrast analyses showed that the increase in the number of errors with the four levels of difficulty was greater for the HFA and AS than for the NC group (p < .001, contrast 1, and p = .002, contrast 2). The AS group committed more errors with increasing difficulty than the PDDNOS group (p = .003, contrast 6). These contrasts remained significant after controlling for age and FSIQ.

Planning

The groups differed significantly on a combination of the three planning measures (ToL score, decision time, and execution time), Wilks’ Λ = .77, F(9,365) = 4.51, p < .001, η² = 0.08. A significant group effect was found for the ToL Score, F(3,152) = 5.00, p = .002, η² = 0.09. The NC group scored higher than the HFA (p < .001, contrast 1) and AS groups (p = .005, contrast 2). After controlling for age and FSIQ, these contrasts were no longer significant. Significant group differences were obtained for the ToL decision time, F(3,152) = 5.12, p = .002, η² = 0.09. The three clinical groups had faster decision times than the NC group (p = .002 for the HFA group, contrast 1; p = .01 for the AS group, contrast 2; and p = .001 for the PDDNOS group, contrast 3). After covarying for age and FSIQ, the HFA and PDDNOS groups remained faster than the NC group (contrasts 1 and 3). Significant group differences were also obtained for the ToL execution time, F(3,152) = 7.06, p < .001, η² = 0.12. The HFA group required more time to complete the task than the NC (p < .001, contrast 1) and PDDNOS groups (p = .002, contrast 5). The AS group needed significantly more time to complete the task than the NC group (p = .006, contrast 2). Only the differences between the HFA group and the NC and PDDNOS groups remained after controlling for age and FSIQ (contrasts 1 and 5).

When the difficulty level was taken into account, there was no significant interaction between group and ToL score (F(6,306) = 2.02, ns, η² = 0.04), ToL decision time (F(6,298) = 2.16, ns, η² = 0.04), or ToL execution time (F(6,300) = 2.00, ns, η² = 0.04). This indicates that the problems with planning in the clinical groups were independent of the increasing planning load of the task.

Cognitive Flexibility

The groups differed significantly on a combination of the two cognitive flexibility measures of the change task (MRT and errors), Wilks’ Λ = .91, F(6,286) = 2.45, p = .02, η² = 0.05. There was a significant effect for change MRT, F(3,144) = 4.35, p = .01, η² = 0.08, but not for the number of errors in the change task, F(3,144) = 1.56, ns, η² = 0.03. The three clinical groups were significantly slower than the NC group (p = .01 for the HFA group, contrast 1; p = .005 for the AS group, contrast 2; and p = .005 for the PDDNOS group, contrast 3). After controlling for age and FSIQ, none of the contrasts survived.

On the second measure of cognitive flexibility, the WCST, there was a significant effect of group for the percentage of perseverative responses, F(3,149) = 7.56, p < .001, η² = 0.13. The three clinical groups made more perseverative responses than the NC group (p < .001 for the HFA group, contrast 1; p = .002 for the AS group, contrast 2; and p < .001 for the PDDNOS group, contrast 3). After covarying for age and FSIQ, the HFA and PDDNOS groups remained significantly more perseverative than the NC group (contrasts 1 and 3).

Verbal Fluency

There was a significant group effect for a combination of the two verbal fluency measures (semantic and letter categories), Wilks’ Λ = .82, F(6,306) = 5.36, p < .001, η² = 0.10. The groups differed significantly for the semantic category (F(3,154) = 9.62, p < .001, η² = 0.16). The three clinical groups had fewer correct responses than the NC group (p < .001 for contrasts 1, 2, and 3). The results did not change after controlling for age and FSIQ. The groups also differed significantly for the letter category (F(3,154) = 6.50, p < .001, η² = 0.11). The HFA and PDDNOS groups had fewer correct responses than the NC group (p < .001, contrast 1, and p = .002, contrast 3). The significant difference between the HFA and NC groups remained after controlling for age and FSIQ (contrast 1).

Non-specific Domains

Response Execution

There was a significant group effect for a combination of the three response execution measures of the primary part of the change task (MRT, response variability, and errors), Wilks’ Λ = .85, F(9,346) = 2.65, p = .01, η² = 0.05. The groups did not differ on response execution MRT (F(3,144) = 1.11, ns, η² = 0.02). Furthermore, all the groups achieved an acceptable response level in the baseline response condition (above 80% correct according to Logan’s guidelines). The following response levels were obtained: 93.9% in the NC group, 88.8% in the HFA group, 91.8% in the AS group, and 92.4% in the PDDNOS group. There was a significant effect for the standard deviation of reaction times (response variability), F(3,144) = 5.19, p = .002, η² = 0.10. The three clinical groups demonstrated greater variability in speed of responding than the NC group (p = .01 for the HFA group, contrast 1; p = .001 for the AS group, contrast 2; and p = .003 for the PDDNOS group, contrast 3). None of these contrasts survived after controlling for age and FSIQ. There was also a significant effect for the number of errors, F(3,144) = 6.10, p = .001, η² = 0.11. The HFA and AS groups committed more errors than the NC group (p = .001, contrast 1, and p < .001, contrast 2). After controlling for age and FSIQ, the same contrasts remained significant.

Short-term Memory

There were significant group differences on both measures of visual short-term memory: the Corsi, F(3,153) = 5.67, p = .001, η² = 0.10, and the BVRT, F(3,155) = 8.95, p < .001, η² = 0.15. The HFA and AS groups had lower scores for the Corsi than the NC group (p = .001, contrast 1, and p < .001, contrast 2). These contrasts were not significant after controlling for age and FSIQ. The three clinical groups had lower scores for the BVRT than the NC group (p < .001 for the HFA group, contrast 1; p < .001 for the AS group, contrast 2; and p = .001 for the PDDNOS group, contrast 3). The HFA and AS groups had poorer performance than the NC group, after controlling for age and FSIQ (contrasts 1 and 2).

Categorization

A significant effect of group was found for the number of correct responses on the categorization task of the SON-R, F(3,154) = 4.54, p = .004, η² = 0.08. The three clinical groups gave fewer correct responses than the NC group (p = .002 for the HFA group, contrast 1; p = .01 for the AS group, contrast 2; and p = .005 for the PDDNOS group, contrast 3). None of the contrasts were significant after controlling for age and FSIQ.

Visual-motor Integration

A significant effect of group was found for the Beery standard score, F(3,154) = 2.99, p = .03, η² = 0.06. The PDDNOS group had a lower score than the NC group (p = .005, contrast 3). This contrast disappeared after controlling for age and FSIQ.

Canonical Analysis

A canonical correlation analysis was performed to analyze the relationship between the 6 neuropsychological measures (i.e., inhibition, visual working memory, planning, cognitive flexibility, verbal fluency, and non-specific), and the 3 behavioral measures (i.e., PDD, ADHD, and OCD). The standardized weights of the canonical correlation analysis can be found in Table VI. Because there are no clear cut-off scores for deciding whether a measure contributes to a canonical correlation, and to guard against overinterpretation, we interpreted only canonical weights of .40 or more as meaningful.

Table VI. Standardized Weights of Canonical Correlation Analysis with Aggregate EF and Non-specific Measures, and PDD, ADHD and OCD Characteristics (n = 159)

The three canonical correlations were .50, .40, and .26, respectively. Taken together, the three correlations were significantly different from zero, Wilks’ Λ = .59, F(18,391) = 4.43, p < .001. The latter two correlations together were also significantly different from zero, Wilks’ Λ = .79, F(10,278) = 3.53, p < .001. However, the third correlation on its own was not, Wilks’ Λ = .93, F(4,140) = 2.48, ns. Therefore, only the first two canonical correlations were interpreted. The standardized canonical weights of the first canonical correlation showed that planning (−.64) and, to a lesser extent, verbal fluency (−.45) were related to ADHD (.88). The second canonical correlation showed that visual working memory (−.61), planning (−.57), and, to a lesser extent, verbal fluency (.47) were related to PDD (1.36) and ADHD (1.27).
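The three Wilks' Λ values follow from the canonical correlations themselves: at each step of the sequential test, Λ is the product of (1 − r²) over the correlations still being tested. The sketch below applies this standard identity to the rounded correlations given in the text. The function name is hypothetical, and small discrepancies from the published Λ values are to be expected because the inputs are rounded to two decimals.

```python
def wilks_lambdas(canonical_correlations):
    """Return Wilks' lambda for each step of the sequential test.

    Element k is the product of (1 - r**2) over correlations k, k+1, ..., last.
    """
    lambdas = []
    for start in range(len(canonical_correlations)):
        product = 1.0
        for r in canonical_correlations[start:]:
            product *= 1.0 - r ** 2
        lambdas.append(product)
    return lambdas


reported = [0.50, 0.40, 0.26]  # canonical correlations from the text
print([round(value, 3) for value in wilks_lambdas(reported)])
```

The results come out at roughly .59, .78, and .93, which agrees with the reported .59, .79, and .93 to within rounding.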

Discussion

This study investigated whether children with different PDD subtypes (HFA, AS, and PDDNOS) can be differentiated from each other and from a normal control group on their neuropsychological EF profile.

Compared to normal control children, the children with HFA had a deviant EF profile for cognitive flexibility, visual working memory, planning, verbal fluency, and inhibition of a prepotent and ongoing response. Children with AS had difficulties with cognitive flexibility, visual working memory, planning, verbal fluency (semantic), inhibition of a prepotent response, and interference control. The PDDNOS group had difficulties with cognitive flexibility, verbal fluency, and inhibition of a prepotent response. When the three PDD subtypes were compared to each other, only three significant differences were found. No significant differences emerged between the HFA and AS groups. These results partially confirm earlier studies, suggesting that the EF profiles of children with HFA and AS are relatively equivalent (e.g., Klin et al., 1995; Miller & Ozonoff, 2000). We may conclude that, overall, the EF profile of the PDDNOS group lies between that of the NC group and that of the HFA and AS groups. In contrast to most other studies (e.g., Ozonoff, 1997), we also found inhibition problems in the PDD groups. A possible reason for these results is that the children in the clinical groups who also met the diagnostic criteria for ADHD were not excluded. Barkley (1997a, 1997b) considers an inhibition dysfunction as the core deficit in ADHD. Hence, in this study, the inhibition deficits in the PDD groups may be partly due to ADHD. However, Geurts et al. (2004) could not distinguish children with HFA (without ADHD) and children with ADHD (without autism) in the inhibition domain. Furthermore, poor verbal fluency was not restricted to the HFA group here.

Poor performance in children with HFA, AS, and PDDNOS was also found for many of the non-specific neuropsychological measures compared to the NC group. However, no significant differences emerged within the autistic spectrum. The non-specific profiles of children with HFA, AS and PDDNOS were relatively equivalent. Hence, the finding that children with HFA and AS can be differentiated in the domain of visual-motor integration, visual memory and visual-spatial perception, and non-verbal concept formation (Klin et al., 1995) was not replicated in this study. Those aspects of the NLD profile that were measured in the present study seem to be equally disturbed in children with HFA as in children with AS. The three PDD subtypes did not differ with respect to IQ. However, as reported earlier (e.g., Ehlers et al., 1997; Klin et al., 1995), and in this study, the AS group had a slightly higher verbal IQ than performance IQ, and the HFA group had a higher performance IQ than verbal IQ. This difference was significant only for the HFA group (t(49) = −3.83, p < .001). Overall, it may be concluded that some of the EF deficits in the PDD groups may be partly due to these non-specific cognitive processes. However, more research is needed before the conclusion can be drawn that autism is characterized by a variety of cognitive impairments.

The second aim of this study was to investigate the contribution of ADHD and OCD characteristics in PDD with respect to EF. The canonical correlations showed that visual working memory was related to PDD. There was no unique contribution for OCD characteristics. Planning and verbal fluency were related to ADHD symptoms. Future research needs to take into account that ADHD symptoms in children with PDD can influence performance on EF tasks, especially in the domains of verbal fluency and planning. This is an important issue, because in the present study, the PDD subgroups did differ in terms of the distribution of cases with comorbid hyperactivity/impulsivity. Hence, comorbidity may play a different role in the EF-profiles of different autism spectrum disorders. In order to come to grips with the exact role of comorbidity in EF studies of children with PDD, it is important that future studies outline the contribution of comorbidity and how this differs across subgroups employed.

This study adds further support to the view that executive dysfunctioning plays an important role in autism. However, this is the case for the whole autistic spectrum. Hence, the usefulness of a distinction between different PDD subtypes has not been demonstrated here (see also Howlin, 2003; Kugler, 1998; Schopler, Mesibov, & Kunce, 1998). We also conducted a discriminant analysis to determine whether a combination of the six aggregate neuropsychological measures (i.e., inhibition, visual working memory, planning, cognitive flexibility, verbal fluency, and non-specific domain) could predict group assignment. When we tried to predict the NC group and PDD group as a whole, 79.9% of the cases were correctly classified: 55.3% of the NC group and 90.2% of the PDD children. This means that the neuropsychological measures could accurately differentiate PDD from non-PDD children. However, when we tried to predict the specific PDD groups, only 46.5% of the cases were correctly classified: 68.1% of the NC group, 52.0% of the HFA group, 18.9% of the AS group, and 56.0% of the PDDNOS group. Hence, a battery of EF tasks is not a powerful predictor of PDD subtypes.
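When reading the classification percentages above, it helps to remember that the overall hit rate is a case-weighted average of the per-group hit rates, so it is dominated by the larger group. The sketch below shows the arithmetic with purely hypothetical counts. They are not the study's group sizes, and the function name is an assumption.

```python
def classification_rates(correct, totals):
    """Return per-group hit rates and the overall (case-weighted) hit rate."""
    per_group = {group: correct[group] / totals[group] for group in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return per_group, overall


# Hypothetical counts, chosen only to illustrate the arithmetic.
totals = {"NC": 40, "PDD": 120}
correct = {"NC": 20, "PDD": 108}
per_group, overall = classification_rates(correct, totals)
print(per_group, round(overall, 2))
```

In this toy case an overall rate of 80% coexists with only 50% correct in the smaller group, which is why the per-group figures deserve as much attention as the overall one.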

In order to conclude that EF deficits are primary core deficits of children with PDD, it must be demonstrated that the level of EF deficits is related to the level of behavioral symptoms that characterize PDD (e.g., Liss et al., 2001). Therefore, the twelve EF measures were correlated with a composite score of the CCC pragmatic score, as well as the subscales Social Relations and Interests. The mean correlation for parents was r = .31 (range r = |.11|–r = |.38|), and for teachers r = .34 (range r = |.17|–r = |.44|). All the correlations were significant, with the exception of the correlations between the parent and teacher pragmatic score with the number of errors in the Change task. Contrary to Liss et al. (2001), most of the correlations remained significant when verbal IQ was partialled out. We conclude that, overall, EF is moderately related to behavioral symptoms of PDD. However, the present results indicate that EF deficits in the PDD groups may be partly due to non-specific cognitive problems or to comorbidity with ADHD. Hence, we cannot completely rule out that EF problems in children with PDD are secondary problems rather than a primary deficit (e.g., Zelazo & Müller, 2002). Liss et al. (2001) argued that impaired EF is not universal in PDD and is unlikely to cause the syndrome. This same remark may also apply to ToM research. Usually, little association is found between ToM problems and symptoms of PDD (e.g., Frith, 2003; Turner, 1999b).
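Partialling verbal IQ out of the correlation between an EF measure and the PDD symptom composite corresponds to a first-order partial correlation. A minimal sketch of that formula is given below. The input correlations are hypothetical, since the text does not report the correlations with verbal IQ.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: EF measure (x), PDD symptom composite (y), verbal IQ (z).
print(round(partial_correlation(0.31, 0.30, 0.20), 3))
```

If verbal IQ is only weakly related to both variables, as in this made-up example, the partial correlation stays close to the zero-order value, which is the pattern the paragraph above describes.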

Some limitations of this study should be noted. First, a compelling model or framework of EF is lacking (Denckla, 1996; Eslinger, 1996). What has emerged in the field is a broad and poorly defined construct of EF (Rabbitt, 1997). Hence, various theoretical perspectives claim different relationships between the five EF domains (e.g., Barkley, 1997a, 1997b; Fuster, 1997; Miyake et al., 2000; Pennington, Bennetto, McAleer, & Roberts, 1996). Future research should focus on the development of an EF framework that is useful in children and adults as well as in normal and abnormal development. If one can start from a general EF framework, one can analyze more clearly whether specific disorders are associated with different profiles of EF strengths and EF weaknesses.

Second, this study used a categorical distinction between children with HFA and AS based on language development. Although many researchers have used these same selection criteria, there is some doubt regarding the use of early language delay as a differential criterion for AS and HFA (Eisenmajer et al., 1998; Prior et al., 1998). There is a need for clear criteria in order to make a more stringent distinction between subgroups within the autistic spectrum. In the literature, different suggestions can be found. Gillberg and Gillberg (1989) developed their own criteria for AS on the basis of clumsiness, impaired EF, or pedantic speech. However, these authors concluded that it was impossible to distinguish AS from HFA on the basis of these criteria. Pomeroy (1998) suggested defining three PDD subtypes without mental retardation: (1) a classical autistic disorder without mental retardation, (2) an autistic-like disorder with speech and language deviation, but less evidence of motor abnormalities or rigid behavior, and (3) an autistic-like disorder in which speech is well preserved, but severe social difficulties and rigid behaviors predominate. Others have argued that it might be more useful to assess children in terms of their IQ or current language level (Beglinger & Smith, 2001; Fein et al., 1999; Manjiviona & Prior, 1999; Rutter & Schopler, 1992). Several authors argued that a dimensional view of the autistic spectrum is more appropriate than a categorical approach (Leekam, Libby, Wing, Gould, & Gillberg, 2000). Prior et al. (1998) suggested that a promising method is to use statistical approaches to look at factors or clusters of symptoms that characterize empirically derived subgroups. These authors proposed the use of a spectrum of autistic disorders on which children differ primarily in terms of degrees of social and cognitive impairments (see also Waterhouse et al., 1996). However, Kugler (1998) argued that the classification of groups derived from the data itself, rather than relying on group assignment by diagnostic criteria, needs to be considered carefully. He argued that despite the apparent advantages of the method, the derived classification could only be as valid as the population targeted and the variables selected for study. In conclusion, there is a need to assess other dimensions to evaluate the validity of subtypes. However, we must be careful to avoid circularity: subtypes defined on the basis of particular measures should not then be compared using the same or similar measures, such as the measures of intelligence or language that are frequently used to define the syndromes in the first place.

Third, although the tasks included in this study were assigned to a specific domain, we are aware that EF tasks cannot be considered pure measures of a single EF domain (e.g., Denckla, 1996; Ozonoff, 1997). To deal with this measurement problem, tasks that overlap in their EF demands were included to ensure that the domains were adequately covered, and that possible deficits were not due to the task chosen. This study improved upon previous studies by the inclusion of non-specific measures to control for various cognitive processes. Besides classical neuropsychological measures, information-processing tasks were also applied (i.e., the change task, the SoP, and the ToL). In contrast to the classical tasks, these tasks measure specific component processes and make experimental manipulations possible (Ozonoff, 1997; Rapport, Chung, Shore, Denney, & Isaacs, 2000). Future research needs to consider the development of valid information-processing EF measures for children (e.g., Anderson, 2002; Archibald & Kerns, 1999; Beveridge, Jarrold, & Pettit, 2002). Furthermore, current EF tasks often lack ecological validity. The use of a structured testing environment removes the executive demands normally placed on children in free situations (e.g., Jarrold, Boucher, & Smith, 1994). It is important that new EF measures be developed that reflect the behavioral difficulties of children in natural daily life situations (Hill, 2004; Jarrold et al., 1994; Liss et al., 2001). With well-validated tasks, we may be better able to measure possible differences between autism spectrum disorders reliably.

In conclusion, the present findings suggest an autistic spectrum (see also Eisenmajer et al., 1996; Leekam et al., 2000; Ozonoff et al., 2000). The results suggest that for group comparisons it is important to take comorbidity into account, especially comorbidity with ADHD symptoms. The differentiation of PDD subtypes is important only if these distinctions carry implications with respect to neurologic pathophysiology, outcome, and response to specific remedial approaches (Beglinger & Smith, 2001; Kugler, 1998; Mahoney et al., 1998; Schopler, 1996). More studies are required concerning these issues, especially within a systematic long-term developmental follow-up (Ozonoff & McEvoy, 1994). Also, brain imaging studies or research of possible genetic markers can be important for this purpose (Szatmari, 1998). A better understanding of the behavioral and neuropsychological heterogeneity of disorders within the autistic spectrum is crucial to improve research and clinical practice.