Introduction

The global incidence rates for anxiety and depression disorders in children and adolescents are between 6.50% and 18% (Bronsard et al., 2016; Polanczyk et al. 2015). In Chile, the prevalence rate is similar (Vicente et al. 2016). Studies indicate that the comorbidity between anxiety disorders and depression in youth is extremely high (Kohn, 2018; Vicente et al., 2016).

Emotional disorders in young people are a public health problem not only because of the consequences and negative impact they have on youth quality of life, well-being and development, but also because of the socioeconomic and health costs they entail (Brännlund et al. 2017; Conway et al. 2016; Chisholm et al., 2016; Red Proem, 2018; Richmond-Rakerd, et al., 2020).

Hence, it is important that clinicians have access to reliable and valid assessment tools to facilitate early detection and treatment design (Spence, 2018). However, empirical studies conducted in Chile on assessment tools for anxiety and depression in children and adolescents are scarce (e.g., Martínez-González et al. 2015; Román et al. 2016), so reliable and valid measures are lacking in the country. In addition, the RCADS has the considerable advantage of allowing the assessment of general anxiety-depressive symptoms while also allowing the analysis of each specific symptom. It is, therefore, a multidimensional and global measure that improves the tools available to Chilean mental health providers and allows for international comparisons and cross-cultural studies.

Revised Child Anxiety and Depression Scale (RCADS)

Internationally, there are different questionnaires with excellent psychometric properties for assessing the severity of depression and anxiety symptoms. However, because of the comorbidity of anxiety and depression disorders, a growing number of studies over the past few years have used one of the few measures that include symptoms of both anxiety and depression disorders according to the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria: the Revised Child Anxiety and Depression Scale (RCADS; Chorpita et al. 2000). The RCADS is a self-report measure developed to screen and identify clinical symptomatology of anxiety and depression among children and adolescents. The RCADS has proven to be a reliable and valid measure for the assessment of anxiety and depression in clinical and general populations of children and adolescents (Piqueras et al. 2017a). The RCADS was developed from the Spence Children’s Anxiety Scale (SCAS; Spence, 1998). The main changes introduced in the RCADS are the inclusion of a depression scale and the reformulation of the generalized anxiety scale so that it better represents the DSM criteria for the diagnosis of generalized anxiety disorder. Other modifications were the removal of the fear of physical harm scale and the refinement of the panic disorder scale (Chorpita et al., 2000).

The initial version was composed of 47 items (RCADS-47) and 6 subscales: separation anxiety disorder (SAD), social phobia (SP), generalized anxiety disorder (GAD), panic/agoraphobia (PD), obsessive-compulsive disorder (OCD) and major depression (MD) (Chorpita et al., 2000). Later, a shorter 30-item version of the RCADS, which retains the 6 original subscales, was validated (RCADS-30) (Sandín et al. 2010). Numerous international studies have confirmed the excellent psychometric properties of the RCADS-47 and RCADS-30 in different contexts, cultures and languages (Bouvard et al. 2015; Donnelly et al. 2019; Pineda et al. 2018; Piqueras et al., 2017a, b; Stevanovic et al., 2017).

Most of the studies have confirmed a 6-factor model consisting of the 6 specific symptom subscales previously described (e.g., Bouvard et al., 2015; Piqueras et al., 2017a; Sandín et al., 2010). Estimates of internal consistency for the RCADS-47 and RCADS-30 were similar for each subscale: MD (α = .79 in the RCADS-47 and α = .72 in the RCADS-30), PD (α = .80 in the RCADS-47 and α = .74 in the RCADS-30), SP (α = .81 in the RCADS-47 and α = .75 in the RCADS-30), SAD (α = .71 in the RCADS-47 and α = .71 in the RCADS-30), GAD (α = .79 in the RCADS-47 and α = .78 in the RCADS-30), and OCD (α = .68 in the RCADS-47 and α = .68 in the RCADS-30) (Sandín et al., 2009, 2010). Recently, a meta-analysis of 25 studies concluded that all versions of the RCADS are reliable for assessing anxiety and depression symptoms in different cultural environments. Specifically, RCADS scores, in all versions, have an excellent mean reliability (α = .93). Similarly, the mean Cronbach’s alpha for the anxiety scale was excellent (α = .93), and the other subscales had mean Cronbach’s alphas between .74 and .85, showing good reliability (Piqueras et al., 2017a). A later study on the RCADS-47 showed internal consistency values between .69 and .88 in an Irish population (Donnelly et al. 2019). Similarly, a study with a Spanish sample found a Cronbach’s alpha of .96 for the total scale and a range between .80 and .89 for the subscales of the RCADS-30 (Piqueras et al., 2017b).

The RCADS has been used in studies on Chilean populations; however, these studies did not report psychometric results beyond internal consistency values (Araya et al., 2013; Gaete et al., 2016; Martínez-González et al., 2015; Martínez-González et al., 2020; Stapinski et al., 2013). For example, Araya et al. (2013) used only the panic, social phobia, and generalized anxiety subscales of the RCADS-30. The authors found that the total internal consistency was .84 (.81 for men and .84 for women). Martínez-González et al. (2015) reported adequate internal consistency values for all subscales of the RCADS-47 in a Chilean sample: PD (α = .85), SP (α = .82), SAD (α = .73), GAD (α = .74), OCD (α = .71), MD (α = .85), and total RCADS (α = .94).

In recent years, important research has begun to apply Item Response Theory (IRT) methods to validate clinical measures (Christensen et al., 2019). In fact, some authors (Van der Linden, 2017) have claimed that IRT analyses, which focus on the quality of items in measuring the underlying construct, are a valuable complement to classical test theory approaches (e.g., factor analysis and internal consistency analysis). To our knowledge, no studies have been conducted on the RCADS using IRT. In this sense, IRT models (Wright & Masters, 1982) will give researchers more confidence in applying the scale in wider contexts. Unlike classical test theory, which considers that an observed test score is composed of a true score and a random error component, IRT considers that “the probability of a person’s expected response to an item is a mathematical function of that person’s ability and one or more parameters characterizing the item” (Bond & Fox, 2015, p. 363).

The Rasch model (Rasch, 1960) is the best known among IRT models, providing a method based on the calibration of ordinal data from a shared measurement scale and enabling one to test conditions such as dimensionality, linearity and local independence. This model establishes that the difficulty of the items and the ability of the participants can be measured on the same logarithmic scale, and that the likelihood that a subject responds correctly to an item is based on the difference between the ability of the subject and the difficulty of the item (Reckase, 1997). Calibration is the procedure used to estimate the person latent trait or item difficulty by converting raw score odds to logits on an IRT measurement scale (Bond & Fox, 2015). Moreover, as the RCADS-30 has different factors, a multidimensional Rasch model is a more suitable technique to simultaneously calibrate all subscales and increase the measurement precision by taking into account the correlations between subscales (Adams et al., 1997). In addition, previous studies found significant differences between males and females using the RCADS-47 and RCADS-30, with females presenting higher scores in the MD, SP, SAD, and GAD subscales and in the total score (Donnelly et al., 2019; Pineda et al., 2018; Stevanovic et al., 2017). For this reason, the analysis of Differential Item Functioning (DIF) is crucial to verify construct equivalence among groups.
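For reference, the dichotomous Rasch relation described in this section is conventionally written as shown below. The symbols are notation introduced here for illustration and do not appear in the original text: θ_n denotes the ability of person n and δ_i the difficulty of item i, both expressed in logits.

```latex
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)},
\qquad
\ln \frac{P(X_{ni} = 1)}{P(X_{ni} = 0)} = \theta_n - \delta_i .
```

The second expression makes explicit why persons and items share one scale: the log-odds of endorsing the item depend only on the difference between the two parameters.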

Regarding age differences, there are few studies with the RCADS-30. However, scores on all the RCADS-30 subscales appear to be higher in mid- and late adolescence than in early adolescence in Spain (Pineda et al., 2018). Other studies have found that levels of depression and anxiety are higher in middle and late adolescence, and that self-esteem in early adolescence could predict depressive symptoms in late adolescence as well as in early adulthood (Huesmann et al., 2019; Masselink et al., 2018).

In summary, the RCADS-30 is a widely used international scale with well-documented psychometric properties. Therefore, its validation in the Chilean adolescent population has great relevance both for the educational community and for mental health care professionals.

The present study aimed to analyze the psychometric properties of the RCADS-30 in a non-clinical sample of Chilean children and adolescents using confirmatory factor analysis and multidimensional Rasch analysis. The specific objectives were to analyze (1) the internal factor structure of the scale, (2) internal consistency and temporal stability, (3) model-data fit of items and score category structure, and (4) invariance properties according to sex.

Method

Participants

An incidental sample was employed for the present study. A total of 1,034 Chilean students from two centers (one primary and one secondary school) in the metropolitan area of Santiago de Chile and two centers (secondary schools) located in the south of the country (sixth and ninth regions) participated in the study. Regarding inclusion and exclusion criteria, children and adolescents enrolled in secondary schools from both rural and urban areas were included, whereas students with reading comprehension problems related to learning difficulties or cognitive deficits were excluded. Eighteen participants were excluded, 11 of them for difficulties in reading comprehension or possible intellectual functioning difficulties and seven for incomplete responses to questionnaires. Cases with missing values were removed from the analysis. The final sample comprised 1,016 participants, of whom 39.60% were female. The mean age was 14.82 years (standard deviation = 2.21, range = 10–18). Interdependence between sex and age was found in the sample, χ²(8) = 59.58, p = .001, with more males than females between 15 and 17 years old.

Measures

The 30-Item Version of the Revised Child Anxiety and Depression Scale (RCADS-30; Sandín et al., 2010)

RCADS-30 items are scored 0 to 3 points (never – almost always) for SAD (5 items; e.g., “I feel scared if I have to sleep on my own”), SP (5 items; e.g., “I feel afraid that I will make a fool of myself in front of people”), GAD (5 items; e.g., “I worry that bad things will happen to me”), PD (5 items; e.g., “My heart suddenly starts to beat too quickly for no reason”), OCD (5 items; e.g., “I get bothered by bad or silly thoughts or pictures in my mind”) and MDD (5 items; e.g., “I feel sad or empty”). The RCADS-30 has previously demonstrated high reliability, good internal consistency and convergent validity (Piqueras et al., 2017a).

Procedure

The European Spanish version of the RCADS-30 was reviewed by two expert Chilean psychologists and one Spanish psychologist, who corroborated the cultural equivalence of the items. Subsequently, the clarity and ease of comprehension of the items were verified in a pilot trial with 30 participants: 12 children (6 boys and 6 girls) and 18 adolescents (8 boys and 10 girls) from the metropolitan area of Santiago de Chile. No comprehension difficulties were found, and therefore the European Spanish vocabulary could be kept. As a result of this exploration, it was concluded that it was not necessary to modify the original wording of the instrument, and a back-translation of the RCADS was therefore not required. Finally, the European Spanish version of the RCADS-30 was used in this research.

The participants completed the RCADS-30 in their classrooms. Appropriate instructions were provided for each scale. The researcher remained in the classroom during the administration in order to assist students who experienced difficulty. The tests were administered by experienced psychologists, who gave instructions and provided individual assistance to students who needed it. The total time required for all the scales was approximately 15 min. After 1 month, the RCADS-30 was administered a second time to a random sample of 208 Chilean students drawn from the same total sample. There were no age or sex differences between the test-retest subsample and the participants who completed the test only once, nor were there differences in RCADS scores at the first time point. The present study was approved by the ethics committee of the Miguel Hernández University (Alicante, Spain). The consent process for this study followed the same procedure that had received ethics approval for similar research implemented in schools in Spain, with reference number DPS-JPR-001-12. First, eligible schools were provided with information about the study, and interested schools signed written confirmation that they wanted to participate. Second, schools provided a parental consent letter explaining the minimal risk and potential benefits associated with participation in this study and advising parents that they could withdraw their child from the study at any time. Third, all eligible children and adolescents were provided with information about the study, and they (only those between 12 and 18 years old) signed a written consent form to participate. The field work was carried out following all the requirements established by the USACH for the research. Additionally, approval had been obtained from the ethics committee of the visiting professor’s home university for equivalent research in Spain.

Data Analysis

First, sex differences were calculated with t tests for each subscale. To determine whether there were age differences, three groups were formed: the first group consisted of students between 10 and 13 years of age (early adolescence group), the second group consisted of students between 14 and 16 years of age (mid-adolescence group), and the third group consisted of students between 17 and 18 years of age (late adolescence group). A one-way analysis of variance was conducted, followed by the post-hoc Games-Howell test, which is appropriate when groups have different numbers of participants and equal variances are not assumed. IBM SPSS Statistics version 24 (IBM Corp., 2016) was used for this analysis. Effect sizes were calculated for the comparisons that were significant and interpreted according to Cohen’s criteria, considering 0.20 as a small, 0.50 as a medium and 0.80 as a large effect size (Cohen, 1988).
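As an illustration of the effect size criteria just described, the following minimal sketch computes Cohen's d with a pooled standard deviation and labels it using the 0.20, 0.50 and 0.80 benchmarks. The function names and the toy scores are hypothetical; the study itself reports using SPSS, and the pooled-variance formula is an assumption about how d was obtained.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def label_effect(d):
    """Label the magnitude of d with Cohen's (1988) benchmarks."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"


# Hypothetical subscale scores for two groups (not data from the study).
girls = [2, 4, 6]
boys = [1, 3, 5]
d = cohens_d(girls, boys)
print(round(d, 2), label_effect(d))
```

The benchmarks are conventions rather than strict cut-offs, so values close to a boundary are best reported with the numeric d alongside the label.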

A confirmatory factor analysis was conducted with Mplus version 8 software (Muthén & Muthén, 2017). Missing data were treated with Full Information Maximum Likelihood (FIML). The weighted least squares with mean and variance adjusted estimator was used for the polytomous indicators in the RCADS-30, as the data distribution showed multivariate kurtosis. Current guidelines for good model fit suggest standardized root mean square residual (SRMR) and root mean square error of approximation (RMSEA) values less than .05, and comparative fit index (CFI) and Tucker-Lewis index (TLI) values greater than .95 to indicate a well-fitting model. Acceptable fit values are commonly described as less than .08 for RMSEA and greater than .90 for CFI and TLI (Brown, 2015).
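The cut-offs above can be expressed as a simple decision rule. The sketch below is only an illustration of those guidelines: the function name, the three-level labels and the example values are hypothetical, and in practice fit indices are usually weighed jointly rather than applied mechanically.

```python
def classify_fit(cfi, tli, rmsea, srmr=None):
    """Classify model fit using the cut-offs quoted in the text (Brown, 2015)."""
    # "Good" fit: CFI and TLI above .95, RMSEA (and SRMR, if supplied) below .05.
    good = cfi > 0.95 and tli > 0.95 and rmsea < 0.05 and (srmr is None or srmr < 0.05)
    if good:
        return "good"
    # "Acceptable" fit: CFI and TLI above .90 and RMSEA below .08.
    acceptable = cfi > 0.90 and tli > 0.90 and rmsea < 0.08
    return "acceptable" if acceptable else "poor"


# Hypothetical index values, not results from the study.
print(classify_fit(cfi=0.97, tli=0.96, rmsea=0.04, srmr=0.04))
print(classify_fit(cfi=0.92, tli=0.91, rmsea=0.07))
```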

Temporal stability and correlations between RCADS-30 subscales were analyzed with IBM SPSS Statistics version 24 (IBM Corp., 2016). For temporal stability, correlations exceeding 0.70 have been suggested as acceptable for group comparisons (Nunnally & Bernstein, 1994), whereas Cohen’s criteria (Cohen, 1988) were used to assess the magnitudes of the relations between different variables: correlations greater than .50 were considered high, between .30 and .49 moderate, and between .10 and .29 low.

Reliability was assessed with multiple indicators. Given the multidimensional nature of the instrument, the omega coefficient was used first; it is based on a decomposition of the variance of a test within a factor analytic model (Revelle, 2018). Accepted values are similar to those for Cronbach’s alpha, between .80 and .90 (Salvia et al., 2010). Composite reliability (CR) was used to assess the internal consistency of the scale and how well the indicators measure the construct (Peterson & Kim, 2013). Finally, the average variance extracted (AVE) assesses the extent to which the items of a specific factor converge or share a high proportion of variance. AVE values greater than 0.5 are considered adequate (Hair et al., 2010).
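To make the last two indicators concrete, the sketch below computes CR and AVE from standardized factor loadings using the usual textbook formulas. It assumes standardized loadings and uncorrelated error terms, so each item's error variance is one minus its squared loading; the function names and the example loadings are hypothetical and are not the study's estimates.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    # With standardized loadings, each item's error variance is 1 - loading^2.
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_variance)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical loadings for a five-item subscale.
example_loadings = [0.7, 0.7, 0.7, 0.7, 0.7]
print(round(composite_reliability(example_loadings), 3))
print(round(average_variance_extracted(example_loadings), 3))
```

Under these same assumptions (one factor, standardized loadings, uncorrelated errors), the omega coefficient reduces to the CR expression, which would be consistent with the two indicators taking very similar values.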

A multidimensional Rasch rating scale model was fit to the data. ConQuest version 2.0 software (Wu et al., 2007) was used to conduct the multidimensional Rasch analysis. The maximum likelihood estimation method was employed for the parameters of the model. The RCADS-30 was treated as a multidimensional scale containing six unidimensional subscales, and the six subscales were calibrated simultaneously in ConQuest using the Monte Carlo method. First, local independence was assessed independently for each factor; this refers to the assumption that the response to an item should have no influence on the response to any other item within the same test (Wright, 1996). Local dependency between item pairs is reflected in the residual correlation matrix. A residual correlation more than 0.20 above the mean of all residual correlations is considered a sign of local dependency (Christensen et al., 2017; Marais & Andrich, 2008). Next, Rasch reliability measures and infit and outfit statistics were used to check the quality of the scale. These indexes are mean values of the squared residuals; therefore, the larger the squared residual, the larger the misfit between data and model. The infit statistic is an information-weighted sum, in which the variance is larger for well-targeted observations and smaller for extreme observations (Bond & Fox, 2015). Values of outfit and infit mean squares can range from 0 to positive infinity. Values below 1 indicate a higher than expected fit of the model, whereas values greater than 1 indicate poor fit of the model. The category functioning of the rating scale was also examined, based on the monotonic increase of the thresholds and the count of responses in each category (Linacre, 2002). Finally, the DIF analysis estimated the difficulty parameters separately for the samples of males and females. As suggested by previous studies (Wang et al., 2006), a difference equal to or larger than 0.5 logits was regarded as evidence of substantial DIF. The mean item parameters were set to be equal over groups so that the differences in the parameter estimates can be directly compared.
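The quantities named in this paragraph can be illustrated with a short, unidimensional sketch. It is not the ConQuest estimation procedure: it only evaluates the rating scale model category probabilities for given parameters and computes infit and outfit mean squares from residuals. All names (theta, delta, thresholds and the functions themselves) are hypothetical notation chosen for this example.

```python
import math


def rsm_category_probs(theta, delta, thresholds):
    """Category probabilities under a rating scale model for one person and one item."""
    # The exponent for category k is the sum over j <= k of (theta - delta - tau_j);
    # category 0 corresponds to an empty sum, i.e. 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def expected_and_variance(probs):
    """Model-expected score and its variance for one response."""
    expected = sum(k * p for k, p in enumerate(probs))
    var = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, var


def fit_mean_squares(observed, expected, variances):
    """Infit (information-weighted) and outfit (unweighted) mean squares for one item."""
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: plain average of the standardized squared residuals.
    outfit = sum(s / v for s, v in zip(squared_residuals, variances)) / len(squared_residuals)
    # Infit: squared residuals summed and divided by the summed model variances.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical person and item parameters in logits, four response categories (0 to 3).
probs = rsm_category_probs(theta=0.0, delta=0.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
print(expected_and_variance(probs))
```

Because infit weights each residual by its model variance, it is less sensitive than outfit to unexpected responses from persons located far from the item.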

Results

Descriptive Statistics and Inter-Scales Correlations

Table 1 shows the means, standard deviations and correlations among the subscales and the total scale. The correlations between each subscale and the total score of the RCADS-30 were high, especially for the SAD, PD, and OCD subscales. The inter-correlations among the subscales reached moderate values. None of the correlations exceeded .80, suggesting no problems with multicollinearity (Tabachnick & Fidell, 2007). The collinearity statistics, including the tolerance and variance inflation factor estimates, were within normal limits and ranged from .90 to 1.00 and from 1.00 to 1.11, respectively.

Table 1 Means (M), standard deviations (SD), reliabilities (Cronbach’s α) and intercorrelations between the RCADS-30 subscales and correlations with the RCADS-30 total score

First, all of the subscales showed significant sex differences, with girls scoring higher (all t values significant at p = .001). Effect sizes ranged from small to medium (SAD: d = .39; GAD: d = .39; SP: d = .36; PD: d = .55; OCD: d = .28; and MDD: d = .56). Second, a one-way ANOVA was performed to determine differences between the three age groups for each RCADS-30 subscale (Table 2). There were statistically significant differences between the groups for all of the variables with the exception of the PD subscale. The Games-Howell test for mean differences showed a similar pattern across subscales: group 1 (early adolescence group) had significantly higher scores than groups 2 and 3 in the majority of subscales, with the exception of MD, for which group 2 had higher scores than group 1. Groups 2 (mid-adolescence group) and 3 (late adolescence group) did not differ significantly on any variable, and Cohen’s d values revealed that the differences were small in all significant comparisons.

Table 2 Results of ANOVA for subscales of RCADS-30

Confirmatory Factor Analysis

The results revealed a good model fit for the 6-factor model including the 6 specific symptom subscales previously described (e.g., Bouvard et al., 2015; Piqueras et al., 2017a; Sandín et al., 2010): χ²/df = 1497/390, CFI = 0.94, TLI = 0.98, RMSEA = 0.05 (90% CI = 0.050–0.056), SRMR = 0.05. Intercorrelations among the subscales were moderate and significant (p < .001), ranging from 0.14 to 0.41. Factor loadings were examined for each item on its corresponding factor. Table 3 shows the results obtained. The lowest values were found for the GAD (item 5), OCD (item 30), and MDD (item 7) subscales. Values were above .40 in all cases.

Table 3 Loadings of the confirmatory factor analysis for the correlated 6-factor model of the RCADS-30

Reliability Measures

Cronbach’s alpha values, shown in Table 1, ranged from .65 (OCD subscale) to .90 (total score).

Pearson correlations were calculated to test the temporal stability of the scale with a random sample of 188 participants. Results showed that the RCADS-30 had a high test-retest reliability (r = .79). The other values of the RCADS-30 subscales were also high (SAD: r = .70; GAD: r = .69; SP: r = .70; PD: r = .71 and MDD: r = .74), with all p < .01. However, a moderate positive correlation was found on the OCD subscale (OCD: r = .60).

Composite reliability (CR), the omega coefficient, and the average variance extracted (AVE) were calculated for each factor. Values of CR were .73 (SAD), .72 (GAD), .74 (SP), .65 (OCD), and .71 (MDD). Omega coefficient values were .73 (SAD), .71 (GAD), .74 (SP), .65 (OCD), and .70 (MDD). AVE values were .51 (SAD), .50 (GAD), .52 (SP), .39 (OCD), and .47 (MDD).

Multidimensional Rasch Analysis

Residual correlations were calculated by applying the Q3 statistic to each item pair. Mean residual correlation values were −0.24 for MDD, −0.27 for PD, −0.24 for SP, −0.23 for SAD, and −0.26 for both GAD and OCD. Residual correlations ranged from −0.28 to −0.12 for MDD, −0.36 to −0.03 for PD, −0.39 to −0.10 for SP, −0.41 to −0.09 for SAD, −0.43 to −0.06 for GAD, and −0.46 to −0.10 for OCD. None of the residual correlations between item pairs was above the critical value of 0.20. Therefore, no evidence of violation of the assumption of local independence was found (Table 4).

The six dimensions of the RCADS-30 were calibrated simultaneously in ConQuest. Table 5 shows the difficulty estimates and fit statistics for each item. The majority of items had fit values greater than 0.7 and less than or equal to 1.3, with the exception of item 4, with an infit of 1.49 and outfit of 1.41, and items 3, 10, 11, 21, and 28, with infit values higher than 1.39. A plausible explanation is that these items target individuals with the highest levels on the construct continuums (see Fig. 1). Given the clinical nature of the instrument and its purpose of detecting possible mental disorders, these items were retained. With respect to the category functioning of the rating scale, the step calibrations increased monotonically, ensuring that higher measures on the items represent higher levels of the trait being measured, with threshold values of −1.393 between categories 0 and 1; 0.689 between categories 1 and 2; and 0.705 between categories 2 and 3. Total observed counts were 1969 for category 0, 2261 for category 1, 587 for category 2, and 263 for category 3. The DIF analysis was also conducted to assess the model-data fit across gender. As indicated in Table 4, none of the items showed a difference larger than or equal to 0.5 logits between the difficulty parameters for males and females.
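The two checks reported in this paragraph, ordered thresholds and the 0.5-logit DIF criterion, can be sketched as follows. The threshold values and category counts are those given in the text; the function names and the male and female difficulty estimates in the last lines are hypothetical and serve only to show how the criterion is applied.

```python
def thresholds_are_ordered(thresholds):
    """True when every step calibration is strictly larger than the previous one."""
    return all(lower < upper for lower, upper in zip(thresholds, thresholds[1:]))


def shows_substantial_dif(difficulty_group_a, difficulty_group_b, cutoff=0.5):
    """Flag an item whose difficulty differs between groups by at least `cutoff` logits."""
    return abs(difficulty_group_a - difficulty_group_b) >= cutoff


# Threshold values and observed category counts as reported in the text.
reported_thresholds = [-1.393, 0.689, 0.705]
observed_counts = {0: 1969, 1: 2261, 2: 587, 3: 263}

print(thresholds_are_ordered(reported_thresholds))

# Share of responses falling in each category.
total_responses = sum(observed_counts.values())
for category, count in observed_counts.items():
    print(category, round(count / total_responses, 3))

# Hypothetical difficulty estimates for one item in two groups (not from the study).
print(shows_substantial_dif(0.20, 0.80))
```

Note that the upper two thresholds, although ordered, are separated by only 0.016 logits, which the ordering check alone does not reveal.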

Fig. 1 Item person-map for the RCADS-30. Note: Each ‘X’ represents 11.7 cases. MDD Major Depression Disorder, PD Panic/Agoraphobia, SP Social Phobia, SAD Separation Anxiety Disorder, GAD Generalized Anxiety Disorder, OCD Obsessive-Compulsive Disorder

Table 4 Item difficulty measures, standard errors, infit and outfit mean squares
Table 5 Item difficulty measures, standard errors, infit and outfit mean squares, differential item functioning between males and females

A person-item map is provided in Fig. 1, as it is possible to calibrate a person’s measure from low to high, and item difficulty from easy to hard along the same latent trait scale. The six continuums on the left side indicate the participants’ measures in the six dimensions of the RCADS-30. Individuals who had high scores were placed at the top of the continuum and those who had lower scores were placed at the bottom of the continuum. Moreover, the items that fell into each of the six dimensions were clustered on the right side. It can be seen that all the items are distributed in the highest section of the map, which indicates that they appropriately discriminate individuals who have higher levels of anxiety or depression symptoms. Therefore, the measurement precision of the items is excellent.

Discussion

The aim of the present study was to analyze the psychometric properties of the short version of the RCADS-30 in a Chilean sample of children and adolescents. From the obtained results, we can conclude that the RCADS-30 presents adequate psychometric properties in Chilean adolescents between 10 and 18 years old, although some measurement issues should be addressed to improve construct validity.

First of all, the results indicated a high internal consistency of the RCADS-30 in Chilean adolescents, similar to that found in other studies (Martínez-González et al., 2015; Piqueras et al., 2017a, b). As in previous studies, the internal consistency of the OCD subscale was lower than .70 and lower than that of the other subscales (Sandín et al., 2009, 2010). Moreover, this factor also had a low AVE value, and its item content should be further revised to reduce possible measurement error. The correlations between each subscale and the RCADS-30 total score ranged from .41 to .84. In addition, the correlations among the subscales showed moderate to high values, and the temporal stability was near or above .70, which is acceptable according to Nunnally and Bernstein (1994) and consistent with estimates reported by previous studies (Bouvard et al., 2015; Sandín et al., 2010).

With regard to sex differences, the results of this study are in line with previous studies in which girls had slightly to moderately higher scores on all subscales of the RCADS-30 (Donnelly et al., 2019; Pineda et al., 2018; Sandín et al., 2010; Stevanovic et al., 2017). With regard to age differences, the results coincided with prior studies that found higher scores for all the subscales of the RCADS-30 in middle and late adolescence (Pineda et al., 2018).

In the CFA for the total sample, our data fit adequately to a 6-factor model consisting of 6 subscales: separation anxiety disorder, social phobia, generalized anxiety disorder, panic/agoraphobia, obsessive-compulsive disorder, and major depression, identical to the original model proposed by Chorpita et al. (2000) and congruent with previous studies (Bouvard et al., 2015; Piqueras et al., 2017a; Sandín et al., 2010). Moreover, in order to gain a deeper understanding of the measurement precision, a multidimensional Rasch analysis was implemented. In this sense, the instrument showed good fit to the Rasch model, although there were six items (3, 4, 10, 11, 21, and 28) with high infit values due to a lack of measurement precision, especially those loading on the SAD factor. In general, item difficulty is reasonably spread at the top of the map, as the main objective of the scale is to detect high levels of emotional disorder symptoms. On the other hand, evidence of good psychometric properties also comes from the gender DIF analysis, which was performed to check construct equivalence across gender. No substantial gender DIF was found for any item of the scale.

Given the need to have valid and reliable tools for the assessment of emotional disorders in children and adolescents in Chile, the validation of the RCADS-30 has a strong role to play in the assessment of youth anxiety and depression, in providing data to identify youth with elevated anxiety and/or depression symptoms for whom a more in-depth assessment is justified, to inform the case formulation, and to provide a baseline against which to monitor and assess change in response to treatment. In addition, it provides valuable information not only about anxiety and depression levels in general, but also about the type of anxiety and depression symptoms experienced by the young person.

Finally, it is important to mention some limitations of the present study. First, the results obtained cannot be generalized to other populations of children and adolescents, although they can be extrapolated to the rest of the Chilean youth population. Future studies should compare these results with those in children of other age levels. Second, in the present study, convergent validity was not measured using another test of depression, and therefore future studies should address this limitation. Third, this study did not include a clinical sample; future research should include clinical samples and cross-cultural studies in order to explore concurrent validity in depth. IRT analyses across different samples (community vs. clinical population) can also be used to explore item fit values, as a deeper analysis of item measurement precision (e.g., sample invariance and item anchoring) is needed to improve the quality of the scale.

As a final conclusion, the results of this work reinforce two assertions: (1) the Chilean version of the RCADS-30 is an adequate instrument for measuring emotional disorders in Chilean adolescents, and it can be used for the early detection of emotional symptoms; and (2) analyses based on the multidimensional Rasch model are relevant to the construction and/or adaptation of tests because they provide useful information about item parameters and scale measurement precision for a better adaptation of the instrument.