Translations of instruments developed in the United States (U.S.) are used in children’s mental health research in many countries. Because the Child Behavior Checklist for ages 6–18 (CBCL; Achenbach and Rescorla 2001) has been translated into more than 85 languages, it has been used in many international studies to assess children’s behavioral and emotional problems (Achenbach and Rescorla 2007a).

To effectively assess people from other countries with an instrument developed elsewhere, that instrument’s reliability and validity must be demonstrated in those countries. Because equivalence in translation cannot be assumed, Geisinger (1994) argues that cross-cultural normative assessment requires “substantial evidence of the comparability” of a translated/adapted instrument and the original instrument. He further notes that adaptations may be needed even when translation is not necessary, because people who live in different countries but speak the same language may differ in their “culture or life experiences.”

Much theorizing about cross-cultural research has emerged from international comparisons of adult personality using the MMPI-2 (Butcher and Han 1996) and the Five Factor Model (McCrae and Allik 2002). Butcher and Han (1996) proposed establishing cross-cultural equivalence across different societies by testing whether items perform similarly, by using confirmatory factor analysis (CFA) to test the fit of data to the instrument’s factor model, and by comparing levels and correlates of scale scores.

Rescorla et al. (2007) proposed the term multicultural robustness to convey that an instrument’s reliability and validity have been supported through international research. Multicultural robustness is demonstrated when an instrument performs similarly across societies with respect to features such as internal consistency, factor structure, scale scores, and associations with other variables. In a comparison of parents’ CBCL ratings of behavioral and emotional problems in 31 societies (N = 55,508), Rescorla et al. (2007) found that age and gender effects, scale alphas, and the rank order of mean item ratings were quite consistent across the 31 societies. Furthermore, Ivanova et al. (2007) reported good fit of the CBCL’s 8-syndrome factor model when CFAs were applied to data from the 30 non-U.S. societies compared in the Rescorla study. Rescorla et al. (2007) also found small effect sizes (ESs) for societal differences in scores on most CBCL scales. However, societal variations in scale score levels led Achenbach and Rescorla (2007a, b) to provide clinical and borderline range cutpoints based on percentiles calculated for low-, medium-, and high-scoring societies.

Although the Rescorla et al. (2007) study supported the CBCL’s multicultural robustness in societies from many regions of the world (Europe, Asia, Middle East, Africa, Caribbean, plus Australia), no South American samples were included. With their rapidly expanding populations, burgeoning economies, and growing international visibility, South American countries are playing increasingly important roles in the 21st century.

Economic development is typically accompanied by investment in public health programs. Such programs should be based on epidemiological research examining the prevalence and distribution of physical and mental disorders. Although epidemiological studies of children’s mental health problems have been conducted in a few South American countries, few of these studies are recent and most are limited in scope. Consequently, there is a need for contemporary epidemiological studies of children’s mental health in South America. Many of these epidemiological studies may employ the CBCL, due to its availability in Latin American Spanish and Brazilian Portuguese, its prior use in several South American countries, its ease of administration and scoring, its research base over four decades, and its demonstrated multicultural robustness in many societies. Because the CBCL is likely to be used in future South American epidemiological studies, it is important to test the CBCL’s multicultural robustness in South America. We address this goal by using data from an epidemiological survey in Uruguay to test the multicultural robustness of the current CBCL (Achenbach and Rescorla 2001) in a South American country.

Epidemiological Studies Using the CBCL in South America

The first published South American epidemiological study of child mental health problems using the CBCL was conducted in Chile (Montenegro et al. 1983). Using their own Spanish translation of the 1978 CBCL (Achenbach 1978), Montenegro et al. assessed 409 6- to 11-year-olds attending schools in the Santiago Metropolitan region. The sample comprised 33% high, 35% medium, and 33% low SES families (based on education and occupation), which overrepresented high SES families. A clinical sample of 933 6- to 11-year-olds was also assessed (5% high, 41% medium, and 54% low SES). Montenegro et al. reported a variety of findings that supported the robustness of the CBCL in Chile. One-week test-retest Pearson correlations (rs) for Total Problems scores were very high (0.98 for 57 non-referred children and 0.99 for 176 referred children). Internal consistency measured by Kuder-Richardson-20 was also very high (0.97 for Total Problems). Using principal components analysis with a varimax rotation, Montenegro et al. obtained nine factors that were similar to many of the factors reported by Achenbach (1978) for U.S. data. Analysis of variance (ANOVA) indicated that Total Problems scores were significantly associated with SES in the non-referred sample, with the low SES group obtaining the highest mean score. Criterion-related validity was supported by a significantly higher mean Total Problems score in the referred sample (65.4) than in the non-referred sample (28.7).

Building on the Montenegro et al. (1983) study, Bralio et al. (1987) reported CBCL data for 517 Chilean 6- to 11-year-olds (51% boys) attending 18 schools in the province of Santiago. SES distribution was 9% high, 44% middle, and 48% low, determined by a governmental classification of schools based on parental education, occupation, residence quality, and neighborhood. Using a clinical cutpoint of T = 70 based on Montenegro et al. (1983), Bralio et al. found that 2% of high, 6% of middle, and 10% of low SES children obtained Total Problems scores ≥70 (14% of boys and 17% of girls).

In the first Argentinian epidemiological study of child mental health problems, Samaniego (2008) obtained data in 1997 for 240 6- to 11-year-olds residing in Buenos Aires. Households were selected by random sampling within SES levels. A referred sample of 240 children was recruited from mental health clinics. Samaniego used an existing Spanish translation of the 1991 CBCL (Achenbach 1991), which she modified slightly for use in Argentina. Over 7 to 10 days, the test-retest r for Total Problems was 0.91, based on a 15% subset of the non-referred sample. Cronbach’s alphas were similar to those reported by Achenbach (1991). ANOVA indicated a significant effect of SES on Total Problems scores, with children from lower SES families obtaining the highest mean Total Problems scores in both the referred and non-referred samples. The mean Total Problems score for the non-referred Argentine sample was much higher than the mean Total Problems score of 22.5 reported by Rescorla et al. (2007), which was obtained by averaging mean Total Problems scores for 31 societies. However, Argentina’s mean was quite similar to the highest mean Total Problems score among the 31 societies, namely 34.7 for Puerto Rico. Total Problems scores were significantly higher for referred than non-referred children (58.2 vs. 34.5), supporting the CBCL’s criterion-related validity.

Samaniego (2004) collected additional CBCL data for 453 6- to 11-year-olds recruited from 22 public schools in San Isidro, an area with a wide SES distribution in the Buenos Aires metropolitan region. The mean CBCL Total Problems score was higher than Samaniego (2008) reported for children in the Buenos Aires standardization sample (40.5 vs. 34.5). Children from two-parent families had significantly lower Total Problems scores than children from separated, divorced, or widowed families. Problem scores were inversely associated with maternal education. Boys obtained significantly higher scores than girls on Total Problems, Externalizing, Social Problems, Thought Problems, Attention Problems, Delinquent Behavior, and Aggressive Behavior.

Findings from Chile and Argentina have been similar to U.S. findings with respect to test-retest reliability, internal consistency, factor structure, gender and SES effects, and criterion-related validity, although mean scale scores have tended to be higher than U.S. scores. However, these South American findings have not been published in English and were based on early versions of the CBCL. Moreover, samples were regional rather than national and not always representative of the target populations, and some studies omitted statistical details. Finally, only the Montenegro et al. (1983) study assessed children’s adaptive competencies, but few results were reported.

To ascertain the prevalence and distribution of behavioral and emotional problems among 6- to 11-year-olds in Uruguay, the Uruguayan government commissioned the first child mental health epidemiological survey of a nationally representative sample of children. The Uruguayan research team conducting this epidemiological survey selected the CBCL (Achenbach and Rescorla 2001) for assessing competencies and problems in Uruguayan children. They chose the CBCL because of its previous use in epidemiological studies in many other countries. The epidemiological data collected by the Uruguayan research team were used in the present study.

The present study was conducted to evaluate whether the multicultural robustness found for the CBCL in 31 societies was replicated in a South American society. To achieve this goal, we compared Uruguayan and U.S. CBCL data using the same analytical techniques employed by Rescorla et al. (2007). Uruguay differs from the U.S. in many ways. About the size of the State of Washington, Uruguay has a population of only 3.5 million people and only one major metropolitan area (Montevideo), which is the capital. It suffered a violent Marxist revolution in the 1960s, followed by more than a decade of military rule. It has a very homogeneous white Hispanic population, with very few people of native American or African origin. Uruguay was relatively prosperous until about 1960, but then underwent significant socioeconomic and educational deterioration (Calvo 2000). Thus, Uruguay differs from the U.S. in size, ethnic composition, economic and political history, and culture, thereby providing a strong test of multicultural robustness.

Purpose of the Present Study

The purpose of the present study was to test the multicultural robustness of the CBCL in a South American society by comparing Uruguayan epidemiological data with data from the U.S. (Achenbach and Rescorla 2001). As summarized below, six different aspects of multicultural robustness were examined in this study.

The first aspect of multicultural robustness was comparability in item performance. To determine whether the same problem items received high, medium, or low ratings in Uruguay and the U.S., we computed the correlation between mean item ratings (0 = not true, 1 = somewhat or sometimes true, 2 = very true or often true) from the two societies. When Rescorla et al. (2007) computed rs between mean CBCL item ratings in every pair of 31 societies, the mean r between all pairs of societies was 0.74, indicating considerable similarity with respect to which items received high, medium, or low ratings.
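
The item-level comparison described above can be illustrated with a minimal sketch. It assumes each society's data are arranged as a children-by-items matrix of 0/1/2 ratings; the function names and the toy ratings are hypothetical and are not CBCL data.

```python
from math import sqrt


def mean_item_ratings(ratings):
    """Return the mean rating of each item (column) across children (rows)."""
    n_children = len(ratings)
    n_items = len(ratings[0])
    return [sum(row[j] for row in ratings) / n_children for j in range(n_items)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: 4 children x 5 items per society (NOT CBCL data).
society_a = [
    [0, 1, 2, 0, 1],
    [0, 2, 2, 0, 0],
    [1, 1, 2, 0, 1],
    [0, 1, 1, 0, 0],
]
society_b = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 0, 1],
    [0, 2, 1, 0, 1],
    [1, 1, 2, 0, 0],
]

means_a = mean_item_ratings(society_a)
means_b = mean_item_ratings(society_b)
r = pearson_r(means_a, means_b)
print(f"r between mean item ratings = {r:.2f}")
```

A high r here means only that the two societies agree on which items tend to receive high or low ratings; it says nothing about whether overall score levels are equal.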

The second aspect of multicultural robustness was comparability in the factor structure of the syndromes derived from the CBCL. We used CFA to test the fit of the Uruguayan data to the 8-syndrome model derived from the U.S. sample. The syndromes are designated as Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Rule-Breaking Behavior, and Aggressive Behavior. When Ivanova et al. (2007) used CFA to test the fit of the CBCL 8-syndrome model in the 30 non-U.S. samples analyzed by Rescorla et al. (2007), good fit was found in all societies.

The third aspect of multicultural robustness was comparability between Uruguay and the U.S. in internal consistency of CBCL scales. A correlation was computed between U.S. and Uruguayan Cronbach’s alpha coefficients for the CBCL’s eight syndromes, its three broad-band scales (Internalizing, Externalizing, and Total Problems), and its six DSM-oriented scales (Affective Problems, Anxiety Problems, Somatic Problems, Attention Deficit Hyperactivity Problems, Oppositional Defiant Problems, and Conduct Problems). The DSM-oriented scales reflect diagnostic categories of the American Psychiatric Association’s Diagnostic and Statistical Manual – 4th Edition (American Psychiatric Association 1994). When Rescorla et al. (2007) computed correlations among Cronbach’s alphas for these 17 CBCL scales across 31 societies, the mean bi-society correlation was r = 0.88 (range 0.77–0.92), demonstrating strong comparability in alphas across societies.
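
As a brief illustration of the internal-consistency statistic referred to above, the following sketch computes Cronbach’s alpha from the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy ratings are hypothetical; the analyses in this study were run in SPSS, not with this code.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a respondents x items matrix of numeric ratings."""
    k = len(ratings[0])                      # number of items on the scale
    item_columns = list(zip(*ratings))       # one tuple of ratings per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy scale: 5 children rated on 3 items (NOT CBCL data).
toy_scale = [
    [0, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
    [2, 1, 2],
    [2, 2, 2],
]

alpha = cronbach_alpha(toy_scale)
print(f"alpha = {alpha:.3f}")
```

Computing alpha for each scale in each society yields two lists of alphas, which can then be correlated across scales to index how similarly the scales rank in internal consistency.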

The fourth aspect of multicultural robustness concerned associations between CBCL scores and demographic variables. Gender, age, and SES were tested as correlates of CBCL problem and competence scale scores in Uruguay. Because SES was measured differently in Uruguay than in the U.S., these analyses used only the Uruguayan data. Rescorla et al. (2007) reported significant gender and age differences for many CBCL scales. For example, in many of the 31 societies, girls scored significantly higher than boys on the Internalizing scale, whereas boys scored significantly higher than girls on the Externalizing scale. Because SES scores were not available for all societies, Rescorla et al. (2007) could not test SES effects. However, Achenbach and Rescorla (2007b) summarized findings from 15 societies in which problem scores were elevated among children from lower SES families (as indexed by parental education, occupation, and/or income).

The fifth aspect of multicultural robustness was comparability between Uruguay and the U.S. on scale scores. To examine this aspect of multicultural robustness, we tested the effects of society (Uruguay vs. U.S.), referral status (referred vs. non-referred children), age (6–8, 9–11), and gender on CBCL scale scores. Because Uruguay differs from the U.S. in distributions of race/ethnicity and SES, as well as in many other ways, some differences in score levels were anticipated. Rescorla et al. (2007) reported that 19 of 31 societies scored within 5.7 points of the omnicultural mean of 22.5 on CBCL Total Problems, a scale that could range from 0 to 224. We therefore considered it likely that Uruguay’s mean problem scores would be close to U.S. scores but perhaps somewhat higher, given the higher scores found previously for Chile and Argentina.

The sixth and final aspect of multicultural robustness was the ability of the Uruguayan CBCL to discriminate non-referred children in the general population from children referred for mental health or special education services or diagnosed with significant developmental disabilities. Although Rescorla et al. (2007) were not able to test CBCL score differences between referred and non-referred children in their 31-society comparison, Achenbach and Rescorla (2007b) reported that the CBCL significantly discriminated referred from non-referred children in Denmark, Finland, France, Germany, the Netherlands, and the U.S.

In summary, the multicultural robustness of the CBCL in a South American country was tested in six ways: (a) we computed correlations between mean problem item ratings for Uruguay and the U.S.; (b) we used CFA to test the fit of the Uruguayan problem item ratings to the CBCL 8-syndrome model; (c) after computing alphas for problem and competence scales in each country, we computed correlations between Uruguayan and U.S. problem scale alphas; (d) we tested the effects of SES, gender, and age on CBCL problem and competence scale scores in Uruguay; (e) we tested effects of society (Uruguay vs. U.S.), age, gender, and referral status on CBCL problem scores; and (f) we tested the ability of the Uruguayan CBCL to discriminate referred from non-referred children.

Method

The national epidemiological survey was carried out by the Department of Child and Adolescent Psychiatry of the University of Uruguay’s School of Medicine. The research protocol was approved by the Ethics Commission of the School of Medicine.

Participants

Since 97% of children between the ages of 6 and 11 in Uruguay attend either their neighborhood public school or private school (ANEP 2006), recruitment was done through schools. Twenty-three schools from Montevideo and 42 schools from the rest of the country were selected at random from the complete roster of public and private schools in the country, stratified by region and school size. Next, 30 children in each of the 23 schools in Montevideo and 25 children in each of the 42 schools from the rest of the country were randomly selected from grades 1 to 6 (4 children per class), for a total of 1,740 children.

Notes were sent from the school to parents whose children had been selected inviting them to come to the school to participate in the survey. Graduate student assistants met with parents in small groups and monitored their self-administered completion of the CBCL, after obtaining the parents’ written informed consent. From the target sample of 1,740 children, 1,374 completed CBCLs were obtained, for a response rate of 79%. This was close to the mean response rate of 82% (range 45% to 100%) reported by Rescorla et al. (2007) for 31 societies. Parents also completed a questionnaire regarding socioeconomic and cultural variables.

Parents of 276 children (20% of the sample of 1,374) reported that their child had received mental health treatment, had ≥2 years of educational delay, and/or had a significant developmental or medical diagnosis; 24 children met >1 of these criteria. We classified these 276 children into three mutually exclusive subgroups: 196 children (mental health subgroup) had received mental health services during the preceding year, but no diagnoses or educational delay were reported; 34 children (educational delay subgroup) had been retained in grade at least twice (10 of whom had also received mental health services and 3 of whom met all three criteria); and 46 children (diagnosed subgroup) had received a developmental, psychiatric, or medical diagnosis from a physician (11 of whom had also received mental health services). Among the diagnoses reported were Down Syndrome, other forms of intellectual disability, hemiparesis, deafness, visual impairment, cerebral infarct, and severe hypothyroidism, but parents of many of the 46 children did not report a specific diagnosis.

In the Achenbach and Rescorla (2001) U.S. national survey of 6- to 18-year-olds, 12.5% were reported to have received mental health services or special education services for major behavioral, emotional, or developmental problems in the preceding year. Achenbach and Rescorla (2001) excluded these children from their normative sample in order to provide what epidemiologists term a “healthy” sample. Using the same methodology, we excluded the 276 Uruguayan children in the national sample with documented special needs in order to obtain a “healthy” Uruguayan non-referred sample of 1,098 children. The 276 excluded children were retained as a comparison group, denoted as the referred group. However, it should be noted that this referred group was more heterogeneous than if it had been recruited from inpatient or outpatient mental health services.
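
The sample sizes reported in this section can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text; the variable names are introduced here for illustration.

```python
# Target sample: children drawn per school in each region (from the text).
target_sample = 23 * 30 + 42 * 25          # Montevideo + rest of the country

completed_cbcls = 1374
response_rate = completed_cbcls / target_sample

# Mutually exclusive subgroups of children with documented special needs.
subgroups = {"mental_health": 196, "educational_delay": 34, "diagnosed": 46}
referred = sum(subgroups.values())
non_referred = completed_cbcls - referred

print(f"target = {target_sample}, response rate = {response_rate:.0%}")
print(f"referred = {referred} ({referred / completed_cbcls:.0%}), "
      f"non-referred = {non_referred}")
```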

As can be seen in Table 1, 502 of the 1,098 children in the non-referred sample were boys (46%) and 596 were girls (54%), with a mean age of 8.6 years. In contrast, the referred sample consisted of 61% boys and 39% girls, with a mean age of 8.8, a significant difference in gender distribution relative to the non-referred sample, χ² (1) = 20.3, p < 0.001. In the non-referred sample, 36% came from Montevideo and 64% came from the rest of the country. In the referred sample, 41% came from Montevideo and 59% came from the rest of the country.
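
The gender comparison above can be approximately reproduced with a Pearson chi-square test on a 2 × 2 table. This is a sketch under a stated assumption: the text gives counts only for the non-referred sample, so the referred counts (168 boys, 108 girls) are reconstructed here from the reported 61%/39% split of 276 children and may differ slightly from Table 1. No continuity correction is applied.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


# Rows: non-referred, referred. Columns: boys, girls.
# The referred row is an ASSUMPTION reconstructed from 61% of 276.
gender_table = [
    [502, 596],
    [168, 108],
]

chi2 = chi_square_2x2(gender_table)
p = p_value_df1(chi2)
print(f"chi-square(1) = {chi2:.1f}, p = {p:.2g}")
```

With these reconstructed counts the statistic comes out close to the reported value, which suggests the reported test was an uncorrected Pearson chi-square, although the text does not say so explicitly.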

Table 1 Demographic characteristics of Uruguayan non-referred and referred samples

SES was scored using a 5-level Uruguayan government index based on the neighborhoods and facilities of each school. As Table 1 shows, the SES distribution in the non-referred sample was 24%, 22%, 28%, 14%, and 13%, from lowest to highest levels. The SES distribution in the referred sample was comparable. Both samples were representative of the overall SES distribution for Uruguay. Parental education was dichotomized as ≤6 years versus >6 years (6 years of primary schooling are required in Uruguay). Thirty percent of mothers in the non-referred sample versus 41% of mothers in the referred sample had completed ≤6 years of education, a significant difference, χ² (1) = 10.8, p < 0.001. Paternal education information was missing for 7% of the non-referred and 14% of the referred children; 29% and 34% of fathers in the non-referred and referred samples, respectively, had completed ≤6 years of education, also a significant difference, χ² (1) = 6.1, p < 0.05.

For several analyses, Uruguayan CBCL scores were compared with those obtained for non-referred and referred children in the U.S. (Achenbach and Rescorla 2001). The U.S. non-referred group consisted of 733 6- to 11-year-olds drawn from the U.S. normative sample, which was recruited from 40 states; children who had received mental health or special education services in the preceding year were excluded. The U.S. demographically-matched referred group consisted of 733 6- to 11-year-olds assessed in 20 inpatient and outpatient mental health services. The U.S. non-referred and referred samples were identical in gender distribution (52% boys and 48% girls) and closely matched on age, ethnicity, and SES (based on parental occupation).

Measure

The present study used the 2001 version of the CBCL (Achenbach and Rescorla 2001), which assesses behavioral and emotional problems and adaptive competencies over the past six months for children ages 6 to 18, as rated by their parents. The problem portion of the CBCL contains 120 problem items that parents rate as 0 = not true, 1 = somewhat or sometimes true, or 2 = very true or often true. The CBCL is written at a fifth grade reading level. Two open-ended problem items often omitted by parents were excluded from analysis, leaving 118 items for this study. The competence portion of the CBCL assesses the child’s participation in sports, hobbies, jobs, and organizations; academic functioning; and relations with peers and family members.

Before beginning data collection, the official Spanish version of the CBCL (Achenbach and Rescorla 2001), which was developed for use in the U.S. with Hispanic parents, was examined to determine if any modifications were needed for Uruguay. Like the English CBCL, the official Spanish version is written at about a fifth grade reading level and has been used successfully with low SES parents. Because the official Spanish CBCL was the product of an extensive translation and revision process carried out over many years involving numerous Spanish speakers from different countries, only minor modifications were needed for Uruguay. Minor wording changes were made in 20 of the 120 problem items, typically involving replacing or deleting one or two words or changing the grammatical form of an item. Additionally, minor changes were made in the sample occupations listed for parents, as well as in the lists of sports, hobbies, and jobs on the competence portion of the CBCL. Prior to the epidemiological survey, 40 parents of low SES participated in a pilot test to ensure that the CBCL would be understood. The results indicated that the Uruguayan CBCL was easily understood by parents with limited education.

We applied the Rescorla et al. (2007) methods for analyzing the 2001 CBCL scales (Achenbach and Rescorla 2001). The scales included eight syndromes, three broad-band scales (Internalizing, Externalizing, and Total Problems), six DSM-oriented scales, three specific competence scales (Activities, Social, and School), and broad-band Total Competence.

Data Analysis

SPSS 15 was used for all data analyses except the CFA, which was conducted using MPlus 5.0 (Muthén and Muthén 2007). First, mean item ratings for the non-referred Uruguayan and U.S. samples were correlated. Second, the CBCL 8-syndrome model was tested for the Uruguayan item ratings using CFA. Third, Cronbach’s alphas for each CBCL problem and competence scale were computed and compared to U.S. alphas. Fourth, effects of school SES (using the 5-level Uruguayan government index), gender, and age group (6–8, 9–11) on CBCL problem and competence scale scores were tested. Fifth, the effects of country, age, gender, and referral status were tested using ANOVAs and multivariate analyses of variance (MANOVAs). Sixth, decision statistics tested the ability of CBCL scales to differentiate between referred and non-referred children in the Uruguayan sample. For this last analysis, 276 children were chosen from the non-referred sample to demographically match the referred sample in age, gender and SES, following the Achenbach and Rescorla (2001) procedure. For all analyses, we set p < 0.001 as the criterion for statistical significance to take account of the large sample sizes and numerous statistical tests. Effect sizes (ESs) for ANOVAs and MANOVAs are represented by η².

Results

Results for Mean Item Ratings

The correlation between the 118 mean item ratings for the 1,098 Uruguayan non-referred children and the 733 U.S. non-referred children was 0.82, higher than the mean correlation of 0.79 for the U.S. and 30 other societies in Rescorla et al. (2007). This large correlation indicates strong comparability between Uruguay and the U.S. with respect to the items receiving high, medium, or low mean ratings.

CFA Results

Following Ivanova et al.’s (2007) procedures, we tested the fit of the Uruguayan data to the 2001 CBCL 8-syndrome model using the WLSMV estimator on tetrachoric correlations (ratings of 0 vs. 1 and 2) for the 102 items comprising those syndromes. We used the Root Mean Squared Error of Approximation (RMSEA) as the primary index of the model’s fit (values ≤0.06 indicate good fit). The Tucker-Lewis Index (TLI) and the Comparative Fit Index (CFI) were used as additional measures of model fit (values ≥0.90 indicate good fit). All three indices indicated that the Uruguayan data fit the U.S. 8-syndrome model (RMSEA = 0.037, TLI = 0.930, CFI = 0.902), with the RMSEA being within the range reported by Ivanova et al. (2007) for 30 societies (0.026 to 0.055). All Uruguayan items loaded significantly on their predicted factor, with the following mean item loadings: Anxious/Depressed = 0.54, Withdrawn/Depressed = 0.61, Somatic Complaints = 0.59, Social Problems = 0.53, Thought Problems = 0.53, Attention Problems = 0.65, Rule-Breaking Behavior = 0.58, and Aggressive Behavior = 0.67. The differences between the Uruguayan mean factor loadings and the mean loadings reported by Ivanova for 30 societies ranged from 0.00 to 0.06 (mean of 0.03), indicating great consistency between the Uruguayan loadings and the average of the loadings for 30 other societies.

Internal Consistency of CBCL Scales

Table 2 displays Cronbach’s alphas for problem scales for the non-referred and referred Uruguayan samples, as well as for the 6- to 11-year-olds in the demographically-matched non-referred and referred U.S. samples (Achenbach and Rescorla 2001). In both countries, alphas tended to be higher for the referred sample than for the non-referred sample, reflecting the greater variance in item ratings in the referred samples. The highest alphas in both countries were found for the three broad-band scales (Internalizing, Externalizing, and Total Problems), with all alphas ≥0.80. In both countries, alphas for the 17 problem scales in referred samples were correlated 0.89 with alphas in the non-referred samples. The Uruguay-U.S. correlation was 0.92 for non-referred samples and 0.93 for referred samples, both higher than the mean bi-society r of 0.88 for alphas reported by Rescorla et al. (2007). Thus, alphas obtained in Uruguay were very similar to those obtained in the U.S. in terms of the rank ordering of alphas across the 17 problem scales, the absolute levels of the various alphas, and the finding that all alphas were higher in referred than in non-referred samples.

Table 2 Cronbach’s alpha for CBCL scales in Uruguay and U.S. samples

Table 2 also displays alphas for the four competence scales. In the Uruguayan non-referred sample, all parents responded “No” to the School scale item (“Has your child had academic or any other problems in school?”), thereby reducing the alpha for that scale. Except for the School scale, Uruguay and U.S. competence scale alphas were quite similar. Correlations between Uruguayan and U.S. alphas could not be computed with only four competence scales.

Effects of SES, Gender, and Age on CBCL Scale Scores

Prior to our main analyses of demographic factors, we calculated correlations between SES (scored for each child’s school), maternal and paternal education, and CBCL scores. Maternal and paternal education (dichotomized as ≤6 years or >6 years) had a phi correlation of 0.44 (p < 0.001), and both had point-biserial rs of 0.34 with school SES (p < 0.001). The percentage of mothers and fathers with ≤6 years of education steadily decreased across the five levels of school SES (from 57% to 10% for mothers and from 56% to 6% for fathers, in the lowest to highest SES groups, respectively). Maternal and paternal education had small but significant negative correlations with Total Problems (−0.12 and −0.16, respectively) and much larger positive correlations with Total Competence (0.31 and 0.30, respectively). The correlation between school SES and Total Problems was −0.19 (p < 0.001), whereas the correlation with Total Competence was 0.29 (p < 0.001).

To evaluate the effects of demographic factors on Total Problems scores, we conducted a 5 (school SES) × 2 (gender) × 2 (age group 6–8, 9–11) × 2 (referral status) ANOVA. The effect of school SES was significant and medium in size, F (4, 1334) = 21.36, p < 0.001, η² = 0.06, with higher scores for children from lower SES schools. Effects of gender and age were not significant. As expected, the referral status effect was significant and large, F (1, 1334) = 207.9, p < 0.001, η² = 0.14. Whereas mean Total Problems score was 29.4 for non-referred children, it was 51.9 for referred children (mental health subgroup = 46.2, educational delay subgroup = 58.5, and diagnosed subgroup = 71.4). Furthermore, the SES × referral status interaction, which is depicted in Fig. 1, was also significant, F (4, 1334) = 5.93, p < 0.001, η² = 0.02. When the two samples were examined separately, the ES for SES was 0.03 in the non-referred sample and 0.13 in the referred sample, indicating that school SES had a bigger impact on problem scores in the referred than in the non-referred sample.

Fig. 1 Effects of school SES on Total Problems scores in referred and non-referred Uruguayan children

A parallel 5 × 2 × 2 × 2 ANOVA was conducted on Total Competence scores, which were available for 1,206 children. Results indicated significant effects for SES, F (4, 1166) = 20.05, p < 0.001, η² = 0.06, and referral status, F (1, 1166) = 24.04, p < 0.001, η² = 0.02.

Effects of Country, Referral Status, Age, and Gender on Scale Scores

Table 3 displays mean scores on CBCL problem and competence scales for referred and non-referred samples in Uruguay and the U.S. To examine the effects of country, age, gender, and referral status, 2 × 2 × 2 × 2 ANOVAs were conducted for Total Problems and Total Competence, whereas 2 × 2 × 2 × 2 MANOVAs were conducted for Internalizing and Externalizing, the eight syndrome scales, the six DSM-oriented scales, and the three competence scales. Because the percentage of referred children was so different in the U.S. and Uruguayan samples (50% vs. 20%), the effect of country was tested in simple effects analyses for referred and non-referred samples separately. Significant country effects are noted in Table 3. The only interactions described are those between country and the other variables, as these were of primary interest.

Table 3 Mean CBCL scale scores (SDs) in Uruguay and U.S. samples

Results for Total Problems indicated main effects for referral status (ES = 0.25), age (ES < 0.01, older > younger), and gender (ES < 0.01, boys > girls), as well as significant interactions for country × referral status (ES = 0.03) and country × gender (ES < 0.01). Referred children scored significantly higher than non-referred children in both countries, with a larger effect in the U.S. than in Uruguay, as shown in Fig. 2. Simple effects analysis indicated that the mean Total Problems score was significantly lower in the Uruguayan referred sample than in the U.S. referred sample (51.9 vs. 63.2), F (1, 1007) = 26.14, p < 0.001, η² = 0.03, and had significantly less variance by Levene’s Test (SDs of 27.95 vs. 32.55), F (1, 1007) = 7.95, p < 0.01. In contrast, the mean Total Problems score was significantly higher in the Uruguayan non-referred sample than in the U.S. non-referred sample (29.4 vs. 23.1), F (1, 1829) = 58.31, p < 0.001, η² = 0.03, but Levene’s Test did not indicate a significant difference in variance (SDs of 17.4 vs. 16.6), F (1, 1007) = 3.67, p = 0.06.

Fig. 2 Total Problems scores in Uruguay and the U.S. by referral status

In the MANOVA for Internalizing and Externalizing scores (see Table 3), referred children in both countries had higher Internalizing scores than non-referred children (ES = 0.16). The country × referral status interaction was significant (ES < 0.01), with a larger referral status effect in the U.S. than in Uruguay. Simple effects analysis indicated that the mean Internalizing score was significantly higher in Uruguay than in the U.S. only for the non-referred sample. Gender did not have a significant effect on Internalizing, nor was the country × gender interaction significant. Older children had slightly higher Internalizing scores than younger children (ES = 0.02). The referral status effect for Externalizing (ES = 0.23) was larger than that for Internalizing, and the country × referral status interaction was also larger (ES = 0.05), with the referred/non-referred difference much larger in the U.S. (21.6 vs. 6.3) than in Uruguay (15.1 vs. 8.2). Uruguayan mean Externalizing scores were significantly higher for non-referred children but significantly lower for referred children than U.S. mean Externalizing scores. Age did not have a significant effect for Externalizing, but boys had significantly higher Externalizing scores than girls in both countries (ES < 0.01).

In the MANOVAs for the eight syndromes and six DSM-oriented scales, the significant ESs for referral status ranged from 0.07 (Somatic Complaints) to 0.23 (Aggressive Behavior) for syndromes and from 0.05 (Somatic Problems) to 0.20 (Conduct Problems) for DSM-oriented scales. Boys obtained significantly higher scores than girls on Thought Problems, Attention Problems, Rule-Breaking Behavior, Aggressive Behavior, DSM-Attention Deficit Hyperactivity Problems, DSM-Oppositional Defiant Problems, and DSM-Conduct Problems. Older children obtained significantly higher scores than younger children on Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, DSM-Affective Problems, and DSM-Somatic Problems. The country × referral status interaction was significant for all syndromes except Somatic Complaints and all DSM-oriented scales except DSM-Somatic Problems, with a larger referred/non-referred difference in the U.S. than in Uruguay (ESs ≤ 0.01 for all syndromes, except 0.02 for Thought Problems, 0.03 for Rule-Breaking Behavior, and 0.05 for Aggressive Behavior and DSM-Conduct Problems).

For Total Competence, the ES for referral status was 0.135, just under Cohen’s (1988) threshold for a large effect. The country × referral status interaction was also significant (ES = 0.06), reflecting a larger difference between referred and non-referred children in the U.S. than in Uruguay. ESs for referral status were 0.04 for Activity, 0.11 for Social, and 0.19 for School, but the significant country × referral status interactions (ESs of 0.03, 0.05, and 0.03, respectively) indicate much larger differences in competence scores between referred and non-referred children in the U.S. than in Uruguay. For non-referred samples, all competence scale scores except for the School scale were significantly higher in the U.S. than in Uruguay. For referred samples, Uruguayan mean scores were significantly lower on the Activity scale and significantly higher on the School scale than U.S. mean scores.

Decision Statistics Analysis

For the final set of analyses, we tested the ability of the CBCL to discriminate between the non-referred and referred samples in Uruguay. Children were sampled randomly from the non-referred group to achieve the same gender ratio and approximately the same sample size as the referred group. This non-referred subsample of 279 children was identical to the referred sample of 276 in gender distribution (61% boys) and was quite similar in age distribution (ages 6 to 8, 43% in the non-referred and 40% in the referred).
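One way to draw such a comparison subsample is stratified random sampling within gender. The sketch below is illustrative only: the pool of records, the field names, and the function `draw_matched_subsample` are hypothetical, and the exact procedure used in the study is not described beyond the summary above. Only the target size (279) and the share of boys (61%) come from the text.

```python
import random


def draw_matched_subsample(pool, n_total, prop_boys, seed=0):
    """Randomly draw n_total records with a fixed proportion of boys."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    boys = [r for r in pool if r["gender"] == "boy"]
    girls = [r for r in pool if r["gender"] == "girl"]
    n_boys = round(n_total * prop_boys)
    n_girls = n_total - n_boys
    # Sample without replacement within each gender stratum.
    sample = rng.sample(boys, n_boys) + rng.sample(girls, n_girls)
    rng.shuffle(sample)
    return sample


# Hypothetical pool of 1,000 placeholder records (not study data).
pool = [{"id": i, "gender": "boy" if i % 2 == 0 else "girl"} for i in range(1000)]

subsample = draw_matched_subsample(pool, n_total=279, prop_boys=0.61)
n_boys = sum(r["gender"] == "boy" for r in subsample)
print(len(subsample), n_boys)
```

Fixing the number of boys in advance, rather than sampling each child independently, guarantees that the gender ratio of the subsample matches the target.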

To define deviance, we used a Total Problems score of ≥49, which is the clinical range cutpoint (90th percentile) reported by Achenbach and Rescorla (2007) for the middle-scoring group in their multicultural norms (where Uruguay’s mean score falls). Because the mean Total Problems score for non-referred children was somewhat higher in Uruguay than in the U.S., 16% of Uruguayan non-referred children but only 9% of U.S. non-referred children were identified as deviant using this cutpoint in the current study.

When referral status and deviance thus defined were cross-tabulated, the resulting odds ratio was 4.43 (CI 2.98 to 6.59), indicating that the odds of being in the referred group were more than four times as great for children who scored in the deviant range on the CBCL Total Problems scale as for children who scored below it. Sensitivity was only 46%, indicating that 54% of the referred group scored below the cutpoint. However, specificity was 84%, indicating that most of the children in the non-referred group scored below the cutpoint. Of children scoring in the deviant range, 74% were from the referred group (positive predictive value). However, only 61% of children scoring below the cutpoint were from the non-referred group, as many children in the referred group also had scores below the cutpoint (negative predictive value). Results from receiver operating characteristic (ROC) analysis (Swets 1996) were consistent with these cross-tabulation findings. The area under the curve (AUC) was 72%, indicating only moderately good prediction. Discrimination between the referred and non-referred groups was much stronger (AUC = 87%) in the U.S. sample, where only 35% of the referred group scored below the 90th percentile cutpoint, compared to 54% in Uruguay.
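The cross-tabulation statistics above all follow from a single 2 × 2 table. The sketch below shows how they relate. The cell counts are not reported in the text: they are reconstructed here from the group sizes (276 referred, 279 non-referred) and the reported percentages, and so should be treated as approximate. The confidence interval uses the standard log (Woolf) method, which is an assumption about how the reported interval was computed.

```python
import math


def decision_statistics(tp, fn, fp, tn, z=1.96):
    """Decision statistics for a 2 x 2 table of deviance by referral status.

    tp: referred, at or above cutpoint    fn: referred, below cutpoint
    fp: non-referred, at or above cutpoint    tn: non-referred, below cutpoint
    """
    odds_ratio = (tp * tn) / (fn * fp)
    # Woolf method: standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(math.log(odds_ratio) - z * se_log_or),
        "ci_high": math.exp(math.log(odds_ratio) + z * se_log_or),
    }


# Reconstructed (approximate) counts, not taken from the source:
# about 46% of 276 referred and 16% of 279 non-referred children scored >= 49.
stats = decision_statistics(tp=127, fn=149, fp=45, tn=234)

for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these reconstructed counts, the computed values agree with the reported sensitivity, specificity, predictive values, odds ratio, and confidence interval after rounding.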

Discussion

Although Uruguay differs from the U.S. in population size, land mass, history, language, ethnicity, political system, economic development, and educational system, results from six different types of analysis support the multicultural robustness of the CBCL in Uruguay. The r of 0.82 between mean item ratings in the Uruguayan and U.S. samples indicated substantial consistency between the two countries in the items that received high, medium, or low ratings. Additionally, CFA results indicated that the Uruguayan item data manifested good fit with the CBCL 8-syndrome model derived in the U.S. Furthermore, Cronbach’s alphas for the 17 problem scales in Uruguay were very comparable to those obtained in the U.S. in terms of the rank ordering of alphas across the scales, the absolute levels of the various alphas, and the finding that alphas were higher in referred than in non-referred samples. Results for mean item ratings, CFA, and internal consistency were also very comparable to those reported by Rescorla et al. (2007) for 31 societies and by Ivanova et al. (2007) for 30 non-U.S. societies.

Uruguayan non-referred children obtained significantly higher mean scores than U.S. non-referred children on Total Problems, as well as on Internalizing and Externalizing, five CBCL syndromes and four DSM-oriented scales. This indicates that Uruguay’s tendency toward higher CBCL scores was evident for many kinds of problems. Despite the fact that Uruguayan scores were higher than U.S. scores, Uruguay still placed in the middle-scoring group derived from Rescorla et al.’s (2007) 31-society comparison and in Achenbach and Rescorla’s (2007a) multicultural norms for the CBCL.

Although we cannot be certain why Uruguayan scores were higher than U.S. scores, we predicted this finding because CBCL scores in Chile and Argentina were higher than U.S. scores in previous studies. We do not know what cultural or demographic factors are most responsible for this pattern. However, higher rates of problems in South American countries do not appear to be specific to the CBCL. For example, Fleitlich-Bilyk and Goodman (2004) reported 12.7% prevalence for child psychiatric disorders in Brazil versus 9.7% prevalence in the United Kingdom, using the same diagnostic interview procedure. Belfer and Rohde (2005) state that Latin American and Caribbean countries “have some of the most intractable child mental health problems seen anywhere on the globe,” which they attribute to factors such as severe income inequality, child homelessness, war and violence, and elevated school dropout rates.

Schools attended by the children in this sample spanned the full SES range, as measured by a 5-level school SES index. Consistent with findings in many other societies, lower SES was significantly associated with higher problem scores. SES had an ES of 0.06 on Total Problems scores in Uruguay, compared to a 0.01 ES in the U.S. This may reflect a greater degree of SES inequality in Uruguay than in the U.S.

Another important finding of the study was the significant SES × referral status interaction, depicted in Fig. 1. Referred children showed a much larger decrease in mean Total Problems scores with increasing school SES than did non-referred children (SES ESs of 0.13 vs. 0.03). This suggests that a higher SES school served as an important protective factor for Uruguayan children at risk due to developmental, physical, behavioral, or emotional problems. We speculate that the more favorable environments found in higher SES schools relative to lower SES schools (e.g., more educational stimulation, greater financial resources, and safer environments) may significantly reduce morbidity. An important public health strategy for reducing mental health burden in South American countries such as Uruguay might therefore be to provide more school-based support services for troubled children, particularly in lower SES communities.

The Uruguayan ES for SES on Total Competence (0.06) was somewhat larger than the U.S. ES of 0.04 reported by Achenbach and Rescorla (2001) but in the same range. Referral status had a negligible effect on Total Competence and the referral status x SES interaction was not significant. This indicates that Uruguayan children in higher SES schools had higher competence scores than children in lower SES schools regardless of their referral status.

Referred children scored significantly higher on all CBCL problem scales and lower on all competence scales than non-referred children in both Uruguay and the U.S. However, Uruguayan referred children had lower problem scores than U.S. referred children. We speculate that this is because the U.S. referred sample was recruited from inpatient and outpatient mental health services whereas the Uruguayan referred sample was recruited in regular classrooms. Samples selected from mental health service settings would be expected to include many children with high rates of problems, as this is the reason they are typically referred for services. Because the Uruguayan school-based referred sample was composed of children with educational delay, intellectual disability, medical disorders, and miscellaneous mental health issues, we would expect their problem scores to be higher than those of non-referred children but not as high as those of a true clinical sample. For this same reason, discrimination between referred and non-referred groups was weaker for the Uruguayan sample than for the U.S. sample (AUCs of 72% vs. 87%), as well as weaker than discrimination reported by Achenbach and Rescorla (2001).

Whereas 12.5% of children in the U.S. national survey sample had received mental health or special education services for behavioral, emotional, or developmental problems in the preceding year, 20% of the Uruguayan school-based sample recruited from regular classrooms had been referred for mental health services, retained at least 2 years in school, or diagnosed with a significant medical or developmental condition. These referred children were not receiving any special education services in school, as such services are not provided in Uruguayan schools. The 3% of Uruguayan children ages 6 to 11 who do not attend regular schools attend special schools, which were not sampled in this survey.

In both countries, boys had significantly higher scores than girls on Externalizing, Thought Problems, Attention Problems, Rule-Breaking Behavior, Aggressive Behavior, DSM-Attention Deficit Hyperactivity Problems, DSM-Oppositional Defiant Problems, and DSM-Conduct Problems. Similarly, in both countries, older children had significantly higher scores than younger children on Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, DSM-Affective Problems and DSM-Somatic Problems. Thus, effects of gender and age were very similar in Uruguay and the U.S. and consistent with those reported by Rescorla et al. (2007) in the 31-society comparison.

Limitations

Although this is the first epidemiological survey of child mental health in a representative sample of Uruguayan children, it was limited to ages 6–11 and only parents’ reports were obtained. Additionally, although our SES measure had the advantage of being an official government index, it classified schools rather than families. Finally, only 196 of the 276 children in the referred sample were reported to have received mental health services, which may help explain why the Uruguayan referred sample had lower CBCL problem scores than the U.S. referred sample. On the other hand, because this referred sample had lower problem scores than might have been found in a sample recruited from mental health services, these children may have been more likely to benefit from secondary-level interventions provided in schools than a more typical mental health service sample.

Conclusions and Future Directions

Our CBCL findings for mean item ratings, factor structure, internal consistency of scales, and age, gender, referral status, and SES effects for Uruguayan children were very consistent with those reported for U.S. children. Furthermore, results were very consistent with those reported by Rescorla et al. (2007) for 31 societies. Findings from this study thus provide strong support for the multicultural robustness of the CBCL in Uruguay. However, CBCL data from other South American countries are needed to test the generalizability of our findings and to determine if those countries, like Uruguay, fall in the middle-scoring group for the CBCL’s multicultural norms (Achenbach and Rescorla 2007a).