Introduction

Although mentioned as secondary features, behavior problems such as temper tantrums, overactivity, aggression, and self-injurious behaviors are part of the clinical descriptions of autism in current psychiatric diagnostic systems (APA, 1994; WHO, 1992). There is ample evidence indicating that young people with pervasive developmental disorders (PDDs) present with a wide range of behavior and emotional problems, with symptoms of anxiety, depression, and attention deficit hyperactivity disorder (ADHD) being the most frequently reported (e.g., see Gillberg & Billstedt, 2000; Lainhart, 1999; Sverd, 2003).

An important barrier to the study of behavior and emotional problems is the heterogeneity of symptoms presented by individuals with PDDs. Such individuals often differ significantly in cognitive and adaptive functioning, and the nature and severity of autistic behaviors vary and change with development. It is not well understood how these individual differences impact the occurrence and presentation of behavior and emotional problems beyond the core symptoms that define the PDD population.

Comparisons across studies and generalization of findings have been hampered by the fact that many studies contained small or clinically referred samples or did not use standardized instruments. Noteworthy exceptions include the recent studies by Gadow and colleagues (Gadow, DeVincent, & Azizian, 2004; Gadow, DeVincent, Pomeroy, & Azizian, 2005) and Tonge and Einfeld (2003). Gadow et al. (2004) and Gadow et al. (2005) compared the severity and prevalence of DSM-IV symptoms in preschoolers and elementary school-aged children to clinic controls and community-based samples. In both studies, children with PDDs were recruited from a developmental disabilities specialty clinic and were rated by parents and teachers on a rating scale containing all symptoms relevant to childhood disorders from the DSM-IV. Overall, preschoolers with PDDs (n = 182) presented with more severe DSM-IV psychiatric symptoms than their peers in regular and special education and, to some extent, than non-PDD psychiatric referrals. School-aged children with PDDs (n = 301) exhibited a pattern of psychiatric symptoms highly similar to non-PDD clinic referrals. The highest screening prevalence rates were for ADHD, oppositional defiant disorder, and generalized anxiety disorder. The results also suggested that children with the most severe PDD symptoms had fewer psychiatric symptoms.

In their epidemiological study of children with mental retardation (MR), Tonge and Einfeld (2003) had a sub-sample of 118 children with autism (mean age of 8.5 years). All participating children were rated on the Developmental Behaviour Checklist at three different intervals across an eight-year span. Results indicated that 73.5% of children with autism were above the cutoff considered clinically significant for caseness. Scores were fairly stable across time and the researchers commented that young people with autism are at high risk of suffering ongoing serious behavioral and emotional disturbance, over and beyond those defining the disorder.

In the current study, parents and teachers completed the Nisonger Child Behavior Rating Form (NCBRF; Aman, Tassé, Rojahn, & Hammer, 1996; Tassé, Aman, Hammer, & Rojahn, 1996) on a large non-clinically referred sample of children and adolescents with PDDs. The NCBRF is an empirically derived standardized instrument designed to measure behavior and emotional problems in young people with developmental disabilities. In addition to measuring behavior problems commonly seen in young people with PDDs, the NCBRF also contains two prosocial behavior subscales. Normative data were based on a clinically referred sample of children with developmental disabilities. Recently, the tool was shown to have good construct validity in a PDD population (Lecavalier, Aman, Hammer, Stoica, & Matthews, 2004).

The current study had three interrelated objectives. First, it was designed to examine the relative prevalence of specific behavior problems. Despite the accepted fact that people with autism experience high rates of behavior problems, few studies, if any, have reported such information. Categorical classifications and subscale scores make it difficult to know the proportion of individuals who present with a variety of behavior problems. The second objective was to examine the effects of subject characteristics for both parent and teacher versions of the instrument. Both English and French versions of the instrument have normative data for an MR population, but not for individuals with PDDs (Tassé et al., 1996; Tassé, Girouard, & Morin, 1999). Examining the effects of subject characteristics on problem behavior is important for several reasons. A number of disorders are more prevalent in one gender (e.g., conduct disorder or anxiety disorders), others change over the lifespan (e.g., ADHD), and level of functioning impacts the presentation and occurrence of psychopathology (e.g., Brown, Aman, & Havercamp, 2002; Rojahn, Matson, Lott, Esbensen, & Smalls, 2001; Tassé et al., 1996). The final goal was to derive an empirical classification of behavioral and emotional problems. Cluster analysis was used as a multivariate technique to isolate groups of subjects with similar behavioral profiles. It has been used successfully to examine the dimensions of psychopathology in young people with MR as well as their typically developing peers (e.g., Brown, Aman, & Lecavalier, 2004; Kamphaus, Huberty, DiStefano, & Petoskey, 1997; Kamphaus, Petoskey, Cody, Rowe, & Huberty, 1999). Most cluster analysis studies of psychopathology have not included prosocial behaviors, and none have been done assessing behavior problems in young people with PDDs. Prosocial behaviors are important to include in studies of psychopathology because they have been identified as significant protective factors against the development of childhood disorders (e.g., Kamphaus et al., 1999). This is the first study to address these objectives in a large non-clinically referred sample of young people with PDDs.

Method

Measures

Nisonger Child Behavior Rating Form

The content of the parent and teacher versions of the NCBRF is identical, but the subscale scoring methods differ slightly to reflect small variations that occurred in factor structures when the scales were developed (Aman et al., 1996). Ten social competence items are distributed on two subscales: Compliant/Calm and Adaptive/Social. Items are rated on a four-point Likert scale ranging from not true (0) to completely or always true (3). Sixty-six problem behavior items are also rated on a four-point Likert scale ranging from did not occur or was not a problem (0) to occurred a lot or was a serious problem (3). Raters are instructed to consider both the rate of occurrence and degree to which the behavior was a problem. Sixty items from the parent version and 62 items from the teacher version are distributed on six subscales: Conduct Problem, Insecure/Anxious, Hyperactive, Self-Injury/Stereotypic, Self-Isolated/Ritualistic, and Overly Sensitive (parent version) or Irritable (teacher version). With the exception of the Overly Sensitive/Irritable subscales, both versions share very similar subscale content. As assessed in several studies, the NCBRF has good psychometric properties (see Aman et al., 1996; Lecavalier et al., 2004; Tassé, Morin, & Girouard, 2000).

Scales of Independent Behavior-Revised

The SIB-R (Bruininks, Woodcock, Weatherman, & Hill, 1996) is a comprehensive and standardized measure of adaptive behavior. It contains 14 subscales distributed into four areas: (a) Motor Skills, (b) Social and Communication Skills, (c) Personal Living Skills, and (d) Community Living Skills. The Broad Independence score is a measure of overall adaptive behavior or functional independence and is based on the average of the four different areas of adaptive behavior. The test provides norms from early infancy to adulthood and has good psychometric properties (Bruininks et al., 1996).

Procedure

Data were collected in 37 school districts across Ohio over a two-year period as part of a larger state evaluation project (Hammer & Lecavalier, 2003). Elsewhere, we assessed the psychometric properties of the NCBRF and Gilliam Autism Rating Scale on about half of the participants included in the current study (Lecavalier, 2005; Lecavalier et al., 2004). Inclusion criteria for the current study were that students be aged between 3 and 21 years and receiving educational services for PDDs. Students were not chosen on the basis of any demographic variables such as level of functioning, behavior, or academic functioning. Project coordinators from each district were asked to select between 5 and 20 students from their rosters, depending on the size of the school district. In all, 611 children were identified as potential participants.

Data were collected via questionnaires from parents and teachers on classroom environments, home and school resources, and on several areas of student functioning. Investigators held eight regional meetings over the two-year period with representatives of school districts, parents, and teachers to explain the purpose of the project and the completion of the different instruments. The SIB-R was completed as a checklist by parents only. It is designed to be administered in an interview format, but checklist administration is considered acceptable under certain circumstances (Bruininks et al., 1996). In addition to the regional meetings and technical support offered throughout, a summary sheet with instructions and examples was provided for accurate completion of the SIB-R. A doctoral level graduate student in psychology verified every completed SIB-R for anomalies in responding (e.g., not giving credit for skills that were obviously mastered such as crawling for an individual who walks, runs, and rides a bicycle). Parents also provided information on psychotropic medicine use, and teachers provided information on educational variables.

In all, 487 students were rated on the NCBRF; 353 were rated by parents and 437 were rated by teachers. Three hundred three participants were rated by both informants. Thus, 50 participants were rated only by parents and 184 were rated only by teachers. There were no differences in chronological age, gender, adaptive behavior, or on any of the eight subscale scores between the participants rated only by one informant and those rated by both.

Participants

Three hundred eighty-nine students were male (82.6%), 82 were female (17.4%), and 16 had missing data (3.3%). Most of the students (89.3%) were Caucasian and the average age was 9.6 years (SD = 3.8). The disability categories identified by the Individualized Education Plan (IEP) team and reported by teachers were as follows: 326 (67%) autism; 53 (11%) Preschooler with a Disability; 67 (14%) a variety of other disability categories such as Speech and Language Impairment, Mental Retardation, or Other Health Impaired; and 40 (8%) with this information missing.

Parental reports indicated that 46.7% of the sample had taken at least one type of psychotropic medicine in the last 12 months. Three hundred forty-five parents (97.7%) confirmed that their child had been diagnosed by a physician or psychologist with a PDD. Students without a parental confirmation of a PDD diagnosis were classified as Autistic, Preschooler with a Disability, Speech and Language Impaired, or Other Health Impaired by their school districts.

Three hundred twenty parents provided complete adaptive behavior ratings. Average SIB-R age equivalents in months for the sample were as follows: 65 (SD = 43) for Broad Independence, 76 for Motor Skills (SD = 53), 58 for Social and Communication Skills (SD = 48), 66 for Personal Living Skills (SD = 50), and 66 for Community Living Skills (SD = 47). Average SIB-R standard scores for the sample were as follows: 52 (SD = 33) for Broad Independence, 69 for Motor Skills (SD = 30), 54 for Social and Communication Skills (SD = 35), 60 for Personal Living Skills (SD = 30), and 54 for Community Living Skills (SD = 33). Sixty-six percent of students obtained functional independence standard scores in the range of MR (i.e., 70 or lower).

Raters

Most ratings were provided by mothers (n = 295; 83.6%), but 23 were provided by fathers (6.5%), 24 by both parents (6.8%), and 11 by guardians or grandparents (3.1%). The average age of parents was 40.5 years (SD = 7.0; Range 21–59), and 52.6% graduated from college.

Two hundred sixty teachers rated the 437 students. Female respondents completed 396 of the 437 ratings (90.6%). Average teacher age was 37.9 years (SD = 10.2; Range 22–62) and average teaching experience was 11.2 years (SD = 9.1; Range 0–36). Level of teacher education was as follows: three had an Associate’s degree (0.7%), 229 had Baccalaureates (52.4%), 200 had Master’s degrees (45.8%), and five did not report this information (1.1%). Most of the teacher ratings were provided by primary instructors (n = 311; 71.2%) or instructors (n = 88; 20.1%), but 39 (8.9%) were provided by other types of raters such as classroom assistants or “shadows.” Three hundred ninety-three of the 437 ratings (89.9%) were completed by raters who had known the students they were rating for more than 6 months.

Data Analysis

Prevalence and Effects of Subject Characteristics

The prevalence of problem behaviors was determined by simple frequency counts of items endorsed as occurring often/being a moderate problem or occurring a lot/being a severe problem (i.e., scores of 2 or 3). Subject variables were analyzed with a three-way analysis of variance in which each subscale was analyzed as a function of gender, age, and level of adaptive behavior. Ages were roughly divided into quartiles and adaptive behavior composite scores were roughly divided into three groups (severe/profound MR range; mild/moderate MR range; no MR range). The least squares means option was used (type III analysis) to correct for different sample sizes. To correct for multiple comparisons, a Bonferroni correction was used, setting the alpha level at .006. Tukey’s HSD tests were used for post hoc comparisons.
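
The frequency count and the corrected alpha level can be illustrated with a minimal sketch. The item ratings below are hypothetical, and the sketch assumes that the reported alpha of .006 reflects a family-wise level of .05 divided across the eight NCBRF subscales (.05/8 = .00625); the text does not state this derivation explicitly.

```python
# Hypothetical item ratings on the 0-3 scale (one value per rated child).
ratings = {
    "Easily distracted": [3, 2, 0, 2, 1],
    "Stealing": [0, 0, 0, 1, 0],
}

def prevalence(scores, threshold=2):
    """Percentage of rated children scoring at or above the threshold (2 or 3)."""
    valid = [s for s in scores if s is not None]  # ignore missing ratings
    endorsed = sum(1 for s in valid if s >= threshold)
    return 100.0 * endorsed / len(valid)

def bonferroni_alpha(family_alpha, n_tests):
    """Per-test alpha under a Bonferroni correction."""
    return family_alpha / n_tests

item_prevalence = {item: prevalence(s) for item, s in ratings.items()}

# Assumption: family-wise alpha of .05 spread over eight subscale analyses.
alpha = bonferroni_alpha(0.05, 8)

for item, pct in item_prevalence.items():
    print(f"{item}: {pct:.1f}% rated 2 or 3")
print(f"Per-test alpha: {alpha:.5f}")
```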

Cluster Analysis

Cluster analysis refers to the process of applying multivariate heuristics in order to partition a set of entities into relatively homogeneous subgroups based on similarities between them (i.e., the goal is to derive clusters with minimal within-group variance and maximal between-group variance). Unlike factor analysis, the subject (rather than the variable) is the basic unit of analysis.

In the current study, a two-step procedure was followed: Ward’s hierarchical method was followed by a K-Means analysis. Ward’s analysis is a non-overlapping and agglomerative method. It begins with each case considered as a separate cluster. At each successive step in the clustering, the two most similar clusters are merged so that variance within clusters is minimized. This method has good cluster recovery ability and has performed well with behavioral data (Milligan, 1996; Milligan & Cooper, 1987). The pseudo F statistic, pseudo t² statistic, and cubic clustering criterion were used as statistical indexes to determine the optimal number of clusters. In addition, clinical meaningfulness of solutions was taken into account. A drawback of Ward’s method is that once a child is assigned to a cluster, that child’s membership cannot change. The intent behind the K-Means analysis is to make possible some shifts in cluster membership. It is an iterative cluster partitioning method that starts with a set number of clusters (in this case, determined by the results of Ward’s analysis). At each step of the clustering process, each case is moved to the cluster with the nearest centroid, and the centroids are then recalculated based on the new assignments. The procedure is repeated until no reassignments are made.
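
The two-step logic can be sketched as follows. This is an illustrative pure-Python version run on hypothetical standardized subscale profiles, not the software or settings used in the study. It assumes squared Euclidean distance, fixes the number of clusters in advance rather than deriving it from the pseudo F, pseudo t², or cubic clustering criterion, and seeds K-Means with the centroids of the Ward clusters, which is one common way of linking the two steps.

```python
def centroid(points):
    """Mean profile of a list of equal-length score vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_clusters(points, k):
    """Start with one cluster per case; repeatedly merge the pair whose union
    gives the smallest increase in within-cluster sum of squares."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                weight = len(a) * len(b) / (len(a) + len(b))
                cost = weight * sq_dist(centroid(a), centroid(b))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merged cluster keeps index i
        del clusters[j]
    return clusters

def kmeans(points, seeds, max_iter=100):
    """Move each case to its nearest centroid, recompute the centroids,
    and stop when no case changes cluster."""
    centroids = [list(c) for c in seeds]  # copy so the seeds are not mutated
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = centroid(members)
    return labels, centroids

# Hypothetical standardized profiles on two subscales for six children.
profiles = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0],
            [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
seeds = [centroid(c) for c in ward_clusters(profiles, k=2)]
labels, centroids = kmeans(profiles, seeds)
print(labels)
```

In this toy example the K-Means step leaves the Ward assignments unchanged; with real data some cases would be expected to shift clusters, which is the purpose of the second step.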

The reliability of cluster solutions was examined by randomly dividing the sample into two halves on three separate occasions. The six resulting half samples were cluster analyzed based on the centroids of the total sample. A measure of agreement (coefficient Kappa) was then calculated between cluster assignments obtained for the total and half samples. The average Kappa across the six half samples was .84 for the parent solution and .91 for the teacher solution.
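
For readers unfamiliar with coefficient Kappa, the following minimal sketch shows how agreement between two sets of cluster assignments is corrected for chance. The assignments are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two sets of cluster assignments."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Agreement expected by chance from the marginal cluster proportions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical cluster assignments for the same eight children.
full_sample = [1, 1, 2, 2, 3, 3, 1, 2]  # from the total-sample solution
half_sample = [1, 1, 2, 2, 3, 3, 1, 3]  # from a half-sample solution
kappa = cohen_kappa(full_sample, half_sample)
print(round(kappa, 2))
```

Here seven of eight assignments agree (87.5%), but because roughly a third of that agreement would be expected by chance, Kappa is lower than the raw agreement rate.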

Findings were also replicated in the subsample of students with an autism classification in their IEP for both parent (n = 230) and teacher (n = 314) ratings. Cluster solutions were replicated by using the centroids obtained from the entire sample and by conducting separate cluster analyses (Ward’s followed by K-Means). The second procedure is a strict test of the robustness of the solutions, as cluster analysis is heavily dependent on the sample used. Cluster centroids and percentages of individuals belonging to different clusters were very similar to those obtained for the entire PDD sample with both procedures. Therefore, only data on the broader PDD phenotype are presented.

The validity of the cluster solutions was appraised with data external to the cluster analysis. Groups were contrasted on demographic variables, psychotropic medication use, and the presence of a behavior plan in the student’s IEP.
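
The dichotomous validity variables (psychotropic medication use and presence of a behavior plan) are cross-tabulated with cluster membership and tested with a chi-square statistic in the Results. A minimal sketch of that statistic is given here; the counts are hypothetical and are not the study data.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical counts: rows = medication use (no, yes), columns = six clusters.
medication_by_cluster = [
    [70, 50, 20, 15, 25, 8],
    [30, 25, 25, 35, 20, 24],
]
stat, df = chi_square(medication_by_cluster)
print(f"chi-square({df}) = {stat:.1f}")
```

With two rows and six clusters the test has 5 degrees of freedom, so the statistic must exceed 11.07 to be significant at the .05 level.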

Results

Relative Prevalence

Table 1 shows the percentages of items rated as occurring often/being a moderate problem and occurring a lot/being a severe problem, separately for parents and teachers. Overall, the rates reported by parents and teachers were very similar, but varied significantly across items. For instance, parents and teachers reported moderate or severe problems with Stealing for 3.4% and 3.6% of the sample, respectively. On the other hand, the item Easily frustrated was endorsed by parents and teachers as a moderate or severe problem for 62.0% and 53.6% of the sample, respectively.

Table 1 Percentage of behavior problems endorsed as moderate or severe in severity according to parents and teachers

The most frequently endorsed items were those relating to symptoms of ADHD. For instance, in percentages, the following items were reported to be moderate or severe problems (parents/teachers): Difficulty concentrating (49/50), Easily distracted (60/60), and Fidgeting, wiggling, squirming (42/44).

The item Physically harms self was endorsed as a moderate or severe problem by parents and teachers for 11% and 10% of the sample, respectively. Rates for specific self-injurious behaviors rated as moderate or severe problems ranged from 5% (teacher ratings on Bites self) to 16% (parent ratings on Hits/slaps self). Along the same lines, Destruction of property was reported to be a moderate or severe problem for 11% (parents) and 12% (teachers) of the sample. Both types of informants reported moderate or severe problems with Physical fights for 5% of the sample.

In percentages, the following symptoms of anxiety were reported to be moderate or severe problems (parents/teachers): Nervous/tense (21/18), Too fearful/anxious (17/11), and Worrying (14/14). In percentages, the following symptoms of mood problems were reported to be moderate or severe problems (parents/teachers): Crying (23/23), Irritable (19/23), Temper tantrums (29/30), and Unhappy/sad (6/9).

Subject Characteristics

No gender or interaction effects were found. Therefore, both sexes were combined for subsequent analyses. Table 2 shows average subscale scores and standard deviations based on the four age groups and by type of informant. Parent ratings yielded a significant main effect of age for the Insecure/Anxious subscale [(F (3, 340) = 11.76)], with the youngest participants obtaining the lowest scores. Teacher ratings revealed significant main effects of age for the Insecure/Anxious [(F (3, 422) = 12.07)], Self-Injury/Stereotypic [(F (3, 422) = 4.44)], and Self-Isolated/Ritualistic subscales [(F (3, 422) = 4.99)]. As with parent ratings, younger children had significantly lower scores on the Insecure/Anxious subscale. The 10- to 12-year-old age group had the highest scores on the Self-Injury/Stereotypic and Self-Isolated/Ritualistic subscales.

Table 2 Summary of the main effect of age for parent and teacher versions of the NCBRF

Table 3 shows average subscale scores and standard deviations based on the three levels of adaptive behavior. Parent ratings indicated significant differences across groups for five of the eight subscales. Participants with lower adaptive skills obtained lower scores (worse) on the Adaptive/Social [(F (2, 317) = 30.09)] and Insecure/Anxious subscales [(F (2, 317) = 16.42)]. They also had higher scores (worse) on the Hyperactive [(F (2, 317) = 7.03)], Self-Injury/Stereotypic [(F (2, 317) = 21.63)], and Self-Isolated/Ritualistic [(F (2, 317) = 10.92)] subscales. Teacher ratings were quite similar to parent ratings, with seven of the eight subscales reaching statistical significance. Participants with lower adaptive skills had lower scores on the Compliant/Calm [(F (2, 281) = 19.71)], Adaptive/Social [(F (2, 281) = 26.48)], Insecure/Anxious [(F (2, 281) = 8.45)], and Irritable [(F (2, 281) = 13.51)] subscales. Conversely, they obtained higher scores on the Conduct Problem [(F (2, 281) = 8.93)], Hyperactive [(F (2, 281) = 22.97)], and Self-Injury/Stereotypic [(F (2, 281) = 10.84)] subscales.

Table 3 Summary of the main effect of adaptive behavior level for parent and teacher versions of the NCBRF

Average subscale scores and standard deviations presented by age and level of adaptive functioning combined for both parent and teacher ratings can be obtained by contacting the author.

Cluster Solutions

Parent Ratings

Parent ratings suggested that a six-cluster solution fit the data best. Table 4 shows mean subscale scores and standard deviations for this cluster solution. Clusters were characterized as follows: (a) Cluster 1, Problem Free: Members of this cluster (31% of the sample) had scores within ±½ SD from the sample average on all eight subscales. The average age was 9.2 years (SD = 3.9) and average SIB-R composite score was 55.7 (SD = 30.9). (b) Cluster 2, Well Adapted: Compared to other clusters, members of this cluster (21% of the sample) had the highest scores on both prosocial subscales and the lowest scores on all six problem behavior subscales. The average age was 9.5 years (SD = 4.3) and the average SIB-R composite score was 57.2 (SD = 37.5). (c) Cluster 3, Ritualistic and Hyperactive: Members of this cluster comprised 13% of the sample. Compared to other clusters, they had the lowest scores on the Adaptive/Social subscale. They also obtained scores roughly one SD above the sample average on both the Self-Isolated/Ritualistic and Hyperactive subscales. The average age was 8.8 years (SD = 4.2) and the average SIB-R composite score was 27.4 (SD = 26.8). (d) Cluster 4, Hyperactive with Conduct Problems: Members of this cluster (14% of the sample) had scores one SD above the sample average on the Hyperactive subscale. They also had elevated scores on the Conduct Problem subscale. Their average age was 9.1 years (SD = 3.5) and average SIB-R composite score was 41.3 (SD = 27.8). (e) Cluster 5, Anxious: Members of this cluster comprised 13% of the sample. They were characterized by scores one SD above sample average on the Insecure/Anxious subscale. With the exception of scores on the Overly Sensitive subscale, all other subscale scores were within ±½ SD of the sample average. Average age was 11.0 years (SD = 3.3) and average SIB-R composite score was 71.6 (SD = 23.7). 
(f) Cluster 6, Undifferentiated Behavior Disturbance: Compared to other clusters, members of this cluster (9% of the sample) had the lowest scores on the Compliant/Calm subscale. With the exception of the Self-Isolated/Ritualistic subscale score, all other problem behavior subscale scores were above one SD from the sample average. Average age was 10.2 years (SD = 3.3) and average SIB-R composite score was 52.5 (SD = 33.2).

Table 4 Means and standard deviations by NCBRF subscale for the six-cluster solution based on parent ratings (N = 353)

In sum, parent ratings suggested that (a) 52% of the sample was free of significant behavior and emotional problems (clusters 1 and 2), (b) 35% of the sample were in a cluster having an average score above one SD of the sample average on the Hyperactive subscale (clusters 3, 4, and 6), (c) 22% were in a cluster having an average score above one SD of the sample average on the Insecure/Anxious subscale (clusters 5 and 6), and (d) 9% of the sample had several subscale scores above one SD of the average (cluster 6).

The following comparisons involved an attempt to examine the validity of the clusters. First, cluster membership was examined as a function of age, gender, and level of functioning. No significant differences were found across clusters for age or gender. As in previous analyses, adaptive behavior was treated as a categorical variable. There were significant differences at the p < .001 level between clusters as a function of adaptive behavior [(F (5, 314) = 9.96)]. Members of cluster 3 (Ritualistic and Hyperactive) had significantly lower adaptive behavior scores than members from all other clusters, except those of cluster 4 (Hyperactive with Conduct Problems). Members of cluster 5 (Anxious) had significantly higher adaptive behavior scores than members from all other five clusters. Psychotropic medicine use was compared across clusters. This variable was treated dichotomously and cross-tabulated with the six clusters. The resulting chi-square was significant [χ² (5, N = 353) = 34.6, p < .0001]. Follow-up analyses indicated that members from clusters 1 and 2 (Problem Free and Well Adapted) were taking significantly less psychotropic medicine than members from clusters 4 and 6 (Hyperactive with Conduct Problems, and Undifferentiated Behavior Disturbance). Finally, the presence of a behavior plan in the students’ IEP was compared across clusters. This variable was also treated dichotomously and cross-tabulated with the six empirical clusters. The resulting chi-square was significant [χ² (5, N = 299) = 11.6, p < .05] and follow-up analyses indicated that members from cluster 6 (Undifferentiated Behavior Disturbance) had significantly more behavior plans than members from cluster 1 (Problem Free).

Teacher Ratings

Teacher ratings suggested that an eight-cluster solution fit the data best. Table 5 shows mean subscale scores and standard deviations for this cluster solution. Clusters were characterized as follows: (a) Cluster 1, Problem Free: Members of this cluster (20% of the sample) had scores slightly below the sample average on both prosocial behavior subscales. They also had scores below the sample average on all six problem behavior subscales. The average age was 8.1 years (SD = 3.7) and average SIB-R composite score was 37.0 (SD = 29.6; n = 31). (b) Cluster 2, Well Adapted: Compared to other clusters, members of this cluster (23% of the sample) were characterized by the highest prosocial subscale scores. They also had the lowest scores for five of the six problem behavior subscales. Average age was 9.8 years (SD = 3.8) and average SIB-R composite score was 74.2 (SD = 27.9; n = 36). (c) Cluster 3, Ritualistic: Members of this cluster comprised 13% of the sample. They were characterized by scores that fell more than one SD above the sample average on the Self-Isolated/Ritualistic subscale. With the exception of scores on the Self-Injury/Stereotypic subscale, all other scores fell within ±½ SD of the sample average. Average age was 9.6 years (SD = 4.1) and average SIB-R composite score was 39.8 (SD = 25.5; n = 71). (d) Cluster 4, Conduct Problem: Members of this cluster comprised 8% of the sample. They were characterized by scores more than one SD above the sample average on the Conduct Problem subscale. With the exception of the scores on the Self-Injury/Stereotypic subscale, all other subscales fell within ±½ SD of the sample average. Average age was 9.1 years (SD = 2.8) and average SIB-R composite score was 50.2 (SD = 28.4; n = 63). (e) Cluster 5, Hyperactive: Members of this cluster comprised 12% of the sample. They obtained scores more than one SD above sample average on the Hyperactive subscale and scores within ±½ SD of the sample average on all other subscales. The average age was 8.3 years (SD = 3.0) and the average SIB-R composite score was 44.6 (SD = 34.6; n = 31). (f) Cluster 6, Anxious: Members of this cluster (11% of the sample) were characterized by scores that were more than one SD above sample average on the Insecure/Anxious subscale. With the exception of the Self-Isolated/Ritualistic subscale, all other subscales fell within ±½ SD of the sample average. The average age was 11.7 years (SD = 3.5) and the average SIB-R composite score was 72.0 (SD = 28.8; n = 16). (g) Cluster 7, Undifferentiated Behavior Disturbance with Stereotypy: Members of this cluster comprised 8% of the sample. They obtained the lowest scores on the Adaptive/Social subscale and scores roughly one SD below sample average on the Compliant/Calm subscale. They also obtained scores significantly above one SD from the sample average on the Conduct Problem, Hyperactive, Self-Injury/Stereotypic, and Irritable subscales. Average age was 10.6 years (SD = 3.3) and average SIB-R composite score was 27.0 (SD = 29.2; n = 23). (h) Cluster 8, Undifferentiated Behavior Disturbance with Anxiety: Compared to other clusters, members of this cluster (5% of the sample) had the lowest scores on the Compliant/Calm subscale. They were also characterized by scores that fell above one SD of the sample average on the Conduct Problem, Insecure/Anxious, Hyperactive, and Irritable subscales. Average age was 10.1 years (SD = 3.1) and average SIB-R composite score was 74.1 (SD = 34.1; n = 13).

Table 5 Means and standard deviations by NCBRF subscale for the eight-cluster solution based on teacher ratings (N = 437)

In sum, when including members of the Ritualistic cluster, teacher ratings suggested that 55% of the sample was relatively free of behavior/emotional problems (clusters 1, 2, and 3). They also indicated that (a) 25% of the sample were in clusters having an average score above one SD from the sample average on the Hyperactive scale (clusters 5, 7, 8), (b) 16% of the sample were in clusters having an average score above one SD from the sample average on the Insecure/Anxious subscale (clusters 6 and 8), and (c) 13% of the sample had several subscale scores above one SD of the average (clusters 7 and 8).

The external validity of the clusters derived from teacher ratings was examined in the same fashion as with the parent ratings. There were no significant differences across clusters as a function of gender. There were significant differences across clusters as a function of age [F(7, 418) = 5.92, p < .001]. Members of cluster 6 (Anxious) were significantly older than members of clusters 1 (Problem Free), 4 (Conduct Problem), and 5 (Hyperactive). Members of cluster 1 (Problem Free) were younger than members of clusters 2 (Well Adapted) and 7 (Undifferentiated Behavioral Disturbance with Stereotypy). There were also significant differences across clusters as a function of adaptive behavior level [F(7, 276) = 13.79, p < .001]. Members of cluster 6 (Anxious) had higher adaptive behavior scores than members of clusters 1 (Problem Free), 3 (Ritualistic), 5 (Hyperactive), and 7 (Undifferentiated Behavioral Disturbance with Stereotypy). Members of cluster 2 (Well Adapted) had more adaptive skills than members of clusters 1 (Problem Free), 3 (Ritualistic), 5 (Hyperactive), and 7 (Undifferentiated Behavioral Disturbance with Stereotypy). Finally, members of cluster 8 (Undifferentiated Behavioral Disturbance with Anxiety) had more adaptive skills than members of clusters 1 (Problem Free) and 7 (Undifferentiated Behavioral Disturbance with Stereotypy). Psychotropic medicine use was compared across the eight clusters. The resulting chi-square was significant [χ²(7, N = 303) = 18.1, p < .05]. Follow-up analyses indicated that members of clusters 1 (Problem Free) and 2 (Well Adapted) were taking less psychotropic medicine than members of cluster 8 (Undifferentiated Behavioral Disturbance with Anxiety). Finally, the presence of a behavior plan in the students’ IEP was compared across clusters. The resulting chi-square was significant [χ²(7, N = 419) = 22.8, p < .001], and follow-up analyses indicated that members of clusters 2 (Well Adapted) and 5 (Hyperactive) had significantly fewer behavior plans than members of clusters 7 and 8 (both Undifferentiated Behavioral Disturbance groups).
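The chi-square comparisons reported above can be sketched as a Pearson test of independence on a 2 × 8 table (for example, behavior plan present or absent, by cluster), which has (2 − 1)(8 − 1) = 7 degrees of freedom, in line with the df reported. The Python sketch below computes only the statistic and its degrees of freedom; the function name and the counts are invented placeholders, not the study's data, and no p value is computed.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts.

    Returns (statistic, degrees_of_freedom, total_n).
    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, n


# Hypothetical counts: rows = behavior plan (yes, no), columns = clusters 1-8.
toy_table = [
    [12, 8, 10, 9, 6, 11, 14, 15],
    [40, 45, 30, 20, 35, 25, 12, 10],
]
stat, df, n = pearson_chi_square(toy_table)
print(f"chi-square({df}, N = {n}) = {stat:.1f}")
```

The statistic would then be referred to a chi-square distribution with the stated df, and a significant omnibus result followed up with pairwise cluster comparisons, as was done here.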

Discussion

Prevalence

This was the first study to report on prevalence rates of specific behavior problems in a large non-clinically referred sample of young people with PDDs and to use a quantitative method for typing psychopathology. The sample was probably representative of the population of young people with PDDs attending public schools in the American Midwest. Participants were not selected on the basis of any behavioral or academic criteria. They were recruited from a broad range of urban and rural communities and had an ethnic composition similar to the state’s. The project was done in close collaboration with the Ohio Department of Education and response rates were very high, with 79.7% of the students initially targeted being rated by at least one informant (i.e., 487 out of 611). Further supporting the representativeness of the sample, the distributions of gender and levels of adaptive functioning were consistent with previous epidemiological studies of PDDs (e.g., see Bryson & Smith, 1998), and patterns of psychotropic medicine use were very similar to other reports for this population (e.g., Aman, Lam, & Collier-Crespin, 2003).

We recognize that some behaviors are an inherent part of PDDs (e.g., social withdrawal or repetitive behaviors), while others are mediated by cognitive and language abilities (e.g., threatening or arguing). The purpose of this study was not to determine if behavior and emotional problems are manifestations of the PDD diathesis, separate clinical entities, or non-specific symptoms secondary to the core feature of PDDs. Regardless of one’s position in this debate, results indicated that young people with PDDs experience high rates of behavior and emotional problems. The nature of reported problems was consistent with other reports in the literature (e.g., see Gadow et al., 2004; Gadow et al., 2005; Lainhart, 1999). However, the NCBRF does not provide categorical classifications, which renders direct comparisons with many studies difficult.

Subject Characteristics

With respect to subject characteristics, the only age effect observed for both informants was on the Insecure/Anxious subscale. This effect was also found in a sample of clinically referred children with MR on both parent and teacher versions of the NCBRF (Tassé et al., 1996). The absence of age effects on the Hyperactive subscale was somewhat surprising, as such effects have been observed in children with MR and in their typically developing peers (e.g., Arnold, 2000; Brown et al., 2002; Rojahn & Helsel, 1991; Tassé et al., 1996). The absence of gender effects has been reported in several studies with developmentally disabled populations, including children with PDDs (Brown et al., 2002; Gadow et al., 2004; Gadow et al., 2005; Tassé et al., 1996). This seems to be a consistent difference between populations with and without developmental disabilities. Level of adaptive functioning was the subject characteristic that impacted subscale scores the most. Overall, lower adaptive skills were associated with fewer prosocial behaviors and fewer symptoms of anxiety, and with higher scores on the other subscales, indicating more problems. These findings are consistent with the MR literature (e.g., Brown et al., 2002; Einfeld & Tonge, 2002; Marshburn & Aman, 1992; Rojahn, Borthwick-Duffy, & Jacobson, 1993). Taken as a whole, these data underscore the importance of having appropriate comparison groups in clinical and research contexts. Subject characteristics impacted subscale scores differently than in MR and typically developing populations.

Empirical Classification

Cluster analyses complemented other descriptive analyses in at least two ways. First, they allowed us to examine the severity and overlapping nature of behavioral dimensions simultaneously. Second, by using the person as the basic unit of analysis, they enabled us to study all cases on behavioral dimensions (including prosocial behaviors). Overall, parent and teacher ratings yielded similar solutions that were clinically meaningful and were supported with data external to the analyses. They indicated that slightly more than half of the students were relatively free of behavior and emotional problems (52% for parent ratings and 55% for teacher ratings). For comparison purposes, Brown et al. (2004) reported that 64% of their sample of non-clinically referred children with MR were free of problem behavior. Kamphaus and colleagues reported rates of 53% and 63% in typically developing children based on teacher and parent ratings, respectively. Results of the cluster analysis also indicated that a substantial minority of young people presented with severe and undifferentiated behavior disturbances (8% and 13% for parent and teacher ratings, respectively). These rates were much higher than those reported in Brown et al. (4%) as well as in the Kamphaus et al. studies (3% and 4% for parents and teachers, respectively). Interestingly, participants with low and with high adaptive skills were found in both the relatively problem-free and the severe psychopathology clusters. The cluster analyses also showed that different patterns of prosocial behaviors were associated with different cluster memberships. Finally, the cluster solutions obtained here were similar to that obtained by Brown et al. (2004) in a non-clinically referred sample of children with MR. Brown and colleagues reported an eight-cluster solution with two problem-free clusters, as well as Hyperactive, Conduct Problem, Shy/Inactive, and Undifferentiated Behavior Disturbance clusters (the other two clusters were Social Withdrawal with Agitation and Autistic-like).

The solutions obtained for parent and teacher ratings differed in three ways. First, teachers reported two types of undifferentiated behavior disturbances (as compared with one for parents), based on level of functioning. Second, teacher ratings produced two separate clusters for hyperactive and conduct problems, instead of the single cluster found with the parent data. Finally, teacher ratings produced two separate clusters for Ritualistic and Hyperactive behaviors, whereas parent ratings produced only one. These differences could be an artifact of the different samples, factor structures, raters, and patterns of medication use. In the only other studies with overlapping samples and different raters using the same instrument, Kamphaus and colleagues reported nine- and seven-cluster solutions for parent and teacher ratings, respectively.

Study Limitations

Results need to be interpreted within the context of the methodology. One caveat relates to the precision of the diagnostic information. Without evaluating every child individually, it is impossible to confirm who met criteria for Autistic Disorder, Asperger’s Disorder, or PDDNOS. Although a third of the sample was not labeled autistic by their IEP team, the actual presence of PDDs is likely not a significant concern for several reasons. School districts in Ohio have been apprehensive about offering educational services without a psychiatric diagnosis. Furthermore, Ohio does not have a primary and secondary disability data system. Many children who present with co-morbid disorders will be labeled “multiply handicapped” rather than autistic. Many students in the current study were preschoolers and classified as “preschoolers with a disability.” Finally, there are financial pressures for districts not to label students with autism, as there are assumptions that the autism disability category will lead to expensive services and litigation.

Given the context of the data collection, we feel confident that the 2% of participants for whom no parental confirmation of a diagnosis was available also had some type of PDD. They were treated as having PDDs by their schools and, at the very least, would have to present with sufficient autistic features to make them eligible to receive educational services.

Two recent epidemiological studies are relevant to the issue of educational classification. Yeargin-Allsopp et al. (2003) reported 100% agreement between CDC expert case review and previous PDD diagnosis used in concert with previous autism special education eligibility. This is not to say that all children with PDDs were treated as such in their IEP. In that study, only 41% of the students were classified as autistic, while 59% had other special education eligibility categories such as MR and speech and language impairments. A study by Bertrand et al. (2001) found that 50% of PDD students received services under various classifications such as language/communication impairments, multiple disabilities, or preschoolers with a disability.

Another limitation pertains to the instrument used. Although there were significant advantages to using a standardized behavior rating scale, the subgroups derived were necessarily limited by the range of items. An instrument with only 76 items will clearly not identify all possible behavioral and emotional problems. Furthermore, because only a moderately large community sample was used, it is likely that significant but uncommon subgroups were not detected.

Along the same lines, results were dependent on the accuracy of information obtained through informants. The investigator had no control over the accuracy of the information provided or over school policies (e.g., absence of behavior plans as part of the IEP does not necessarily mean that there were no behavior problems).

A final comment relates to the cluster analysis. Results of cluster analysis are often treated skeptically, and we recognize that different clustering methods could have yielded different results (e.g., see Milligan, 1996; Stevenson, 1989). Having said that, the clusters were clinically meaningful, showed some internal and interrater reliability, and they were partially supported by variables external to the analyses. There is always a threat of artifactual findings with cluster analysis, and the number of subgroups identified is a function of the subgroups present in the sample. No direct assessment of accuracy is possible; if the number of groups were known, there would be no need for cluster analysis.
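The dependence of cluster solutions on the clustering method can be illustrated with a small, generic Python sketch. It applies agglomerative clustering with two common linkage rules (single and complete) to the same one-dimensional toy scores and obtains two different two-cluster partitions. The data, the function name, and the choice of linkage rules are assumptions made for illustration; this is not the procedure used in the present study.

```python
def agglomerate(points, k, linkage):
    """Agglomerative clustering of 1-D points down to k clusters.

    `linkage` reduces all pairwise distances between two clusters to a
    single number: `min` gives single linkage, `max` complete linkage.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical scores forming a loose "chain" of cases.
toy_scores = [0, 10, 21, 33, 46, 65]
print(agglomerate(toy_scores, 2, min))  # [[0, 10, 21, 33, 46], [65]]
print(agglomerate(toy_scores, 2, max))  # [[0, 10, 21, 33], [46, 65]]
```

Single linkage chains neighboring cases into one large group and isolates the extreme case, whereas complete linkage favors compact groups and splits the sample differently. Neither partition is "correct" in the absence of known group membership, which is the point made above.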

Conclusions

Despite the limitations to the findings, the results have important implications. Although a substantial proportion of young people with PDDs present with relatively low rates of behavior and emotional problems, many of them have serious behavior problems. Furthermore, psychiatric symptomatology does not present in a single fashion. For instance, symptoms of ADHD may or may not be accompanied by other behavior and emotional problems. The data also suggested that there are similarities and differences in the dimensions of psychopathology as compared to populations of children with and without MR.

Dimensional approaches provide an opportunity to study behavioral phenomena that are not captured with categorical systems. They promote a greater understanding of the full range of child behavior by virtue of their sensitivity to sub-clinical symptoms. Some studies have shown that they have better predictive validity than categorical approaches in typically developing children (e.g., Fergusson & Horwood, 1995), which could lead to a better use of existing prevention and intervention strategies. The relative merit of dimensional approaches requires future research, especially with populations in which there is limited evidence that the traditional categorical systems used are even valid.

Although behavior and emotional problems have received more empirical attention as of late, they remain significantly understudied in this field. It would be helpful if future studies replicated these findings and examined the stability and predictive validity of empirically derived clusters. Additional research is also needed on the role of prosocial behaviors in the development of behavioral and emotional problems in this population. These individual differences may have important clinical implications in terms of etiology, responsiveness to treatment, and outcome. In all likelihood, a better comprehension of psychopathology in the PDD population will come by bridging the knowledge from the typically developing and MR literatures and by using multiple empirical approaches in concert such as family approaches, pharmacological probes, and neuroimaging techniques.