Introduction

It is considered best practice for clinicians and researchers to utilize information from multiple informants when assessing symptoms of pediatric psychopathology [1]. While this generates a more comprehensive understanding of symptomology, systematic differences occur in the ways that respondents report symptoms [2]. Issues related to discrepant reporting have begun to be explored for the Screen for Child Anxiety Related Emotional Disorders, Parent and Child versions (SCARED-P/C) [3, 4], a dual-informant, gold-standard measure of pediatric anxiety symptoms. While the SCARED has been established as a valid, reliable, and sensitive measure of anxiety, prior studies have yielded inconsistent findings regarding informant (parent–child) agreement/discrepancy, with estimates of agreement ranging from r ~ 0.2 to 0.6 on both the SCARED total score and its subscales [3, 5, 6]. The present study assesses informant discrepancy on the SCARED in the largest sample to date and probes potential clinical, demographic, and familial correlates of discrepant reporting as well as potential psychometric contributors to informant discrepancy.

Previous literature posits that informant discrepancy regarding pediatric anxiety may vary systematically with a variety of demographic and clinical variables. A child’s age is one of the most consistent predictors of discrepant reporting. Reports from younger children are more discrepant from parent- or teacher-reports compared to those of older children [6,7,8,9,10]. Research investigating informant discrepancy as a function of child’s sex has yielded mixed results. Several studies have found no significant differences in informant agreement between girls and boys [6, 8, 11, 12], while others report moderate differences [7, 10, 13, 14]. Ethnicity [8], socioeconomic status (SES [8, 14]), parental psychopathology, and a variety of other family-related factors [7, 9, 14] have also been investigated as potential predictors of discrepancy, though this research has been sparser. Recently, Rappaport et al. explored informant discrepancy and discriminant validity on the SCARED [5]. While psychiatrically healthy children over-reported symptoms compared to their parents, children with a clinically diagnosed anxiety disorder under- or equally-reported symptoms relative to their parent. Further, informant discrepancy was significantly larger among youth with social anxiety disorder (SAD) relative to those with generalized anxiety disorder (GAD) or comorbid GAD and SAD. Further work is needed to characterize patterns of informant discrepancy and specific demographic, clinical, and psychometric factors that relate to discrepant reporting on the SCARED.

Measurement invariance and test–retest reliability are two such psychometric factors that may contribute to informant discrepancy. Tests of measurement invariance assess the extent to which different groups (e.g., informants or diagnostic groups) interpret questionnaire items similarly [15]. Establishing strict measurement invariance across informants and groups, a critical step in examining a questionnaire’s interpretability, suggests that the same underlying constructs contribute to the interpretation of items across groups [16]. The only prior study examining measurement invariance on the SCARED across parent–child dyads (N = 408; children were seeking treatment at an outpatient mental health facility) tested four standard levels of measurement invariance and found evidence for partial threshold invariance. This is an important step that warrants replication in a large sample including both patients with anxiety disorders and healthy volunteers [8]. Establishing strict measurement invariance across informants on the SCARED is beneficial for the continued use of the questionnaire.

In addition, issues with questionnaire reliability could potentially contribute to informant discrepancy. In their relatively small initial validation study (N = 88 children, N = 86 parents), Birmaher et al. [3] established strong test–retest reliability of the SCARED over a 5-day to 15-week window (intraclass correlation coefficient [ICC] = 0.70–0.90). Since then, Boyd et al. [17] found moderate test–retest reliability (r = 0.47) of SCARED child-report over a 6-month period. Additionally, studies examining translated versions of the revised 41-item SCARED have found mostly strong test–retest reliability, though generally over a much shorter time window of 7–14 days [18,19,20]. However, given the use of the SCARED in studies of the treatment of pediatric anxiety, it is important to establish test–retest reliability of the revised SCARED questionnaire over a longer window of time closer to the length of standard treatment. Additionally, no study to our knowledge has assessed the stability of informant discrepancy on the SCARED over time.

Building on recommendations and limitations from previous work, the aims of the current study were to assess informant agreement/discrepancy on the SCARED in the largest sample to date (N = 1092 parent–child dyads) and to elucidate clinical, demographic, familial, and psychometric factors contributing to informant discrepancy. Specifically, we quantified informant agreement and examined associations between four metrics of informant discrepancy and demographic factors as well as differences as a function of diagnosis (healthy vs. anxious youth). Further, we examined measurement invariance of the SCARED across informants and across ages for parent- and child-report in a sample missing no item-level data. Additionally, we investigated test–retest reliability of SCARED scores and of informant discrepancy. Finally, we examined associations with clinician-rated anxiety to explore external validity.

Methods

Sample and Setting

Participants included 1092 youth (7–18 years old) and their parents (parent–child dyads) who completed the SCARED questionnaire. Individuals were enrolled in IRB-approved research protocols examining pediatric anxiety disorders at the National Institute of Mental Health (NIMH). Child participants and their parents provided written assent and consent, respectively. All participants were assessed using a structured diagnostic interview (Kiddie Schedule for Affective Disorders and Schizophrenia, Present and Lifetime version; K-SADS [21]). Of this sample, 457 youths were seeking treatment for an anxiety disorder and received a primary diagnosis of either generalized anxiety disorder, separation anxiety disorder, or social anxiety disorder. Healthy youth had no current or past psychiatric diagnoses. All child participants had an IQ > 70 and were medication-free; the presence of current major depressive disorder, obsessive compulsive disorder, or post-traumatic stress disorder was exclusionary. Dyads were selected for the current analyses if both parent and child completed the SCARED with no missing item responses, within 2 months of one another, and prior to the child beginning treatment at the NIMH.

A subset of these participants was included in a test–retest analysis if they completed a second administration of the SCARED (5 days–15 weeks after the first administration), prior to the start of treatment with no more than four items missing on the second administration. This subsample included 339 parent-report forms and 359 child-report forms (n = 298 complete dyads). The timeframe mirrored Birmaher and colleagues’ initial SCARED reliability study [3].

Measures

Child anxiety symptoms were assessed using the SCARED parent and child versions. The SCARED-P and SCARED-C each consist of 41 items that assess a child’s recent anxiety symptoms. Participants respond on a 3-point Likert scale of 0 (Not True or Hardly Ever True), 1 (Somewhat or Sometimes True), or 2 (Very True or Often True). Prior confirmatory factor analyses suggest that the instrument measures five distinct domains of anxiety [4, 8, 22]. Thus, in addition to total scores, five subscales were examined: generalized anxiety symptoms (nine items), separation anxiety symptoms (eight items), social anxiety symptoms (seven items), panic or somatic symptoms (13 items), and school avoidance (four items). A total score of 25 or above has been suggested to indicate the presence of clinically significant anxiety [3, 5].

Clinicians rated children’s anxiety severity during the previous week using the Pediatric Anxiety Rating Scale (PARS), a 50-item checklist examining symptoms of social, separation, and generalized anxiety, specific phobias, and physical symptoms [23]. Recent studies have found the PARS to be psychometrically reliable and valid, and it has been used as an outcome measure for several treatment studies [24,25,26]. Clinicians integrated parent and child report during interview assessment to rate seven areas of anxiety severity (number of symptoms, frequency, severity of distress associated with anxiety symptoms, severity of physical symptoms, avoidance, interference at home, and interference outside of home). A score of 3 on each of these 5-point scales reflects a clinically significant level of anxiety. Composite PARS scores were calculated by summing 5 of the 7 items (number of symptoms and severity of physical symptoms were excluded as they are likely less related to overall anxiety severity and tend to be highly skewed). PARS scores were only available for a subset of youth with a diagnosed anxiety disorder (n = 213), and all scores reflected anxiety prior to beginning treatment at NIMH.

Children’s age, sex, ethnicity, and family socioeconomic status (SES) were assessed using a demographics questionnaire. Highest parental educational attainment and annual income were used as markers of SES. The Wechsler Abbreviated Scale of Intelligence II (WASI [27]) was used to assess child IQ. The Family Risk Factor Checklist (FRFC [28]) is a 48-item measure that assesses children’s exposure to environmental/family-related risk (five subscales: adverse life events & instability, family structures & SES, parenting practices, parental verbal conflict, and mood problems). Higher scores indicate greater exposure to family risk factors. Demographic and clinical characteristics of the sample are summarized in Table 1.

Table 1 Sample characteristics

Analysis

Informant Agreement/Discrepancy

Parent–child agreement was indexed using intra-class correlation (ICC) for SCARED total scores and for each subscale. ICC1 values were calculated using the psych package [29] in R v3.3.1 [30] for the whole sample and separately for the healthy and anxious subsamples (Table 2). Four measures were used to characterize informant discrepancy. Raw discrepancy scores (RDS; raw parent score minus raw child score) characterize the magnitude and direction of discrepancy. Absolute values of these RDS (absRDS) characterize the overall magnitude of discrepancy regardless of directionality. Given that raw discrepancy scores tend to correlate with overall level of symptoms, and following recommendations by De Los Reyes and Kazdin [2], standardized mean difference scores (SMDS) were also calculated by separately z-scoring parent and child scores and then creating a within-dyad difference score (Z parent score minus Z child score). This approach aids in interpretation by centering and normalizing variance for each informant group. Additionally, absolute values of these SMDS (absSMDS) were calculated to assess the overall magnitude of discrepancy regardless of directionality. Differences between RDS and SMDS tend to be most salient when variance differs by informant groups. As variance was largely similar across informants in this sample, these measures yielded largely convergent results. Nonetheless, these different measures were presented to address concerns about the advantages and disadvantages of RDS and SMDS as well as to improve interpretability and comparison to other studies. A summary of all four measures of discrepancy (RDS, absRDS, SMDS, absSMDS) across the sample and by diagnostic group is presented in Table 1.
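
As a minimal sketch of the agreement and discrepancy metrics defined above, the following Python code computes a one-way ICC for parent–child dyads and the four difference scores. The study itself used the psych package in R; the function names, the use of the sample standard deviation for z-scoring, and the toy scores here are illustrative assumptions and are not taken from the study.

```python
from statistics import mean, stdev


def icc1(parent, child):
    """One-way random-effects ICC(1) for dyads rated by two informants."""
    n, k = len(parent), 2  # n dyads, k = 2 informants per dyad
    dyad_means = [(p + c) / 2 for p, c in zip(parent, child)]
    grand_mean = mean(dyad_means)
    # Between-dyad and within-dyad mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in dyad_means) / (n - 1)
    ms_within = sum(
        (p - m) ** 2 + (c - m) ** 2
        for p, c, m in zip(parent, child, dyad_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def z_scores(values):
    """Standardize scores within one informant group (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def discrepancy_scores(parent, child):
    """Return RDS, absRDS, SMDS, and absSMDS, all as parent minus child."""
    rds = [p - c for p, c in zip(parent, child)]
    smds = [zp - zc for zp, zc in zip(z_scores(parent), z_scores(child))]
    return {
        "RDS": rds,
        "absRDS": [abs(d) for d in rds],
        "SMDS": smds,
        "absSMDS": [abs(d) for d in smds],
    }


if __name__ == "__main__":
    # Hypothetical SCARED total scores for four dyads (not study data).
    parent_total = [12, 30, 8, 21]
    child_total = [20, 26, 15, 35]
    print("ICC(1):", round(icc1(parent_total, child_total), 3))
    for name, values in discrepancy_scores(parent_total, child_total).items():
        print(name, [round(v, 2) for v in values])
```

Negative RDS or SMDS values in this sketch mean the child reported more symptoms than the parent, which matches the parent-minus-child convention described above.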

Table 2 Parent–child agreement

Associations between informant discrepancy and demographic variables of interest were assessed using independent samples t-tests for binary predictors (sex, ethnicity) and Spearman’s rho correlations for continuous predictors (age, SES, IQ, FRFC) to account for potential deviations from normality among predictors and the ordinal nature of the SES variables. Independent samples t-tests were also used to compare RDS, absRDS, SMDS, and absSMDS between healthy and anxious parent–child dyads.

Measurement Invariance

Next, we conducted three tests of measurement invariance using confirmatory factor analysis (CFA) in MPlus [31]. The first analysis built on prior work [8] and used a within-subjects model to assess invariance across parent- and child-report. Further, to assess potential effects of age, we conducted separate multi-group measurement invariance models to test invariance of the SCARED-C, comparing younger and older children, and invariance of the SCARED-P, comparing parents of younger and older children. Specifically, dyads were separated into two groups at the median sample age of 12 (n = 557 < 12 years old and n = 535 ≥ 12 years old).

All CFAs used a mean- and variance-adjusted weighted least squares (WLSMV) estimator to account for the ordinal nature of the SCARED response scale [8, 32]. Additionally, as in the prior study of invariance on the SCARED [8], we fixed factor means to zero, factor variances to one, and residual variances to one to ensure model identification for both parents and youths [33]. Each measurement invariance test included CFAs at four levels of increasing stringency. First, we examined configural invariance, which tests the factor structure across the two groups/informants. The second level tested weak or metric invariance, constraining factor loadings to be equal across groups/informants. The third level tested strong or threshold invariance, constraining factor loadings and item thresholds to be equal across groups/informants. The fourth level tested strict or residual invariance, constraining item loadings, thresholds, and residual item variances to be equal across groups/informants. Given the oversensitivity of χ2 tests for model fit and model comparison (e.g. Mplus difference test function) to large sample sizes as in this study, we report χ2 values but rely on other measures of model fit. Specifically, good model fit was established by a comparative fit index (CFI) > 0.95 and root mean square error of approximation (RMSEA) < 0.06. Measurement invariance at each level was established by small changes in model fit, specifically a decrement in CFI < 0.01 and an increase in RMSEA < 0.015 [34,35,36].
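
The fit and invariance criteria above amount to a simple decision rule, sketched below in Python. The models themselves were fit in Mplus; this code only compares fit indices that have already been obtained. The function names and all fit values are hypothetical, and the CFI and RMSEA checks are reported separately because the two indices need not agree at every step.

```python
def good_fit(fit, min_cfi=0.95, max_rmsea=0.06):
    """Check the absolute fit criteria: CFI > 0.95 and RMSEA < 0.06."""
    return fit["cfi"] > min_cfi and fit["rmsea"] < max_rmsea


def invariance_step(prev_fit, next_fit, max_cfi_drop=0.01, max_rmsea_rise=0.015):
    """Compare a more constrained model against the previous level."""
    cfi_drop = prev_fit["cfi"] - next_fit["cfi"]
    rmsea_rise = next_fit["rmsea"] - prev_fit["rmsea"]
    return {
        "cfi_drop": cfi_drop,
        "rmsea_rise": rmsea_rise,
        "cfi_ok": cfi_drop < max_cfi_drop,
        "rmsea_ok": rmsea_rise < max_rmsea_rise,
    }


def evaluate_sequence(fits):
    """fits: ordered list of (level name, fit dict), configural first."""
    results = {}
    for (_, prev_fit), (name, next_fit) in zip(fits, fits[1:]):
        results[name] = invariance_step(prev_fit, next_fit)
    return results


# Hypothetical fit indices for illustration only (not the study's results).
HYPOTHETICAL_FITS = [
    ("configural", {"cfi": 0.970, "rmsea": 0.040}),
    ("metric", {"cfi": 0.968, "rmsea": 0.041}),
    ("strong", {"cfi": 0.965, "rmsea": 0.043}),
    ("strict", {"cfi": 0.950, "rmsea": 0.045}),
]

if __name__ == "__main__":
    print("configural fit acceptable:", good_fit(HYPOTHETICAL_FITS[0][1]))
    for level, check in evaluate_sequence(HYPOTHETICAL_FITS).items():
        print(level, check["cfi_ok"], check["rmsea_ok"])
```

In these made-up values the strict step exceeds the CFI threshold while staying under the RMSEA threshold, a mixed pattern that calls for judgment rather than a mechanical pass or fail.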

Test–Retest

We examined the test–retest reliability of SCARED scores and of informant discrepancy (SMDS; recalculated in this subsample). Reliability was assessed using linear mixed effects models using the lme4 package [37] in R. These models included within-subjects effects for participant and timepoint, controlling for the number of days between assessments. ICC values were extracted, indicating the proportion of participant-specific to total variance. Twelve test–retest reliability models were tested examining total and five subscale scores from parent report (n = 339) and from child report (n = 359). Test–retest reliability of informant discrepancy (SMDS of total and subscale scores) was assessed in n = 298 dyads that completed the SCARED at two timepoints.
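
The ICC described above is the share of total variance attributable to stable between-participant differences. A minimal Python sketch of that final step follows, assuming the participant-level and residual variance components have already been extracted from a fitted mixed model (the study used lme4 in R). The function name and the numeric values are hypothetical.

```python
def icc_from_variance_components(var_participant, var_residual):
    """ICC = participant-specific variance / (participant + residual variance)."""
    total = var_participant + var_residual
    if var_participant < 0 or var_residual < 0 or total == 0:
        raise ValueError("variance components must be non-negative and not both zero")
    return var_participant / total


if __name__ == "__main__":
    # Hypothetical variance components, not estimates from the study.
    print(icc_from_variance_components(var_participant=30.0, var_residual=10.0))
```

Values near 1 indicate that most score variation reflects stable differences between participants, whereas values near 0 indicate that scores fluctuate mainly within participants across administrations.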

External Validity

Finally, we examined the associations between parent/child report on the SCARED and clinician-rated PARS scores. First, we assessed Pearson’s correlations between the PARS, the SCARED-P, SCARED-C, and the SCARED-P/C mean scores. Next, we conducted a multiple linear regression to assess whether SCARED-P and SCARED-C predicted unique or shared variance in PARS scores.

Results

Informant Agreement/Discrepancy

Weak but significant informant agreement was found in the full sample between parent- and child-report on the SCARED (total and subscale scores ICCs = 0.14–0.19; Table 2). In the anxious subsample, the SCARED-P and SCARED-C showed weak but significant agreement for total scores and for all subscales besides the Panic subscale. Within the healthy subsample, parent–child agreement was only significant for the Social Phobia subscale.

Independent samples t-tests indicated significant differences in informant discrepancy between healthy and anxious participants (Table 1) on all four measures: raw difference scores (RDS), absolute values of raw difference scores (absRDS), standardized mean difference scores (SMDS), and absolute values of standardized mean difference scores (absSMDS). While parents under-reported symptoms relative to their child across both groups, the magnitude of the discrepancy was significantly greater in dyads with an anxious child. It is important to note, however, that this could be due to the relatively smaller range of scores present in the non-anxious group.

There were few meaningful correlations between the demographic variables of interest and informant discrepancy (Table 3). Age showed a weak negative correlation with absSMDS. IQ and income had weak but statistically significant positive associations with both SMDS and RDS. Sex differences in SMDS were also identified such that females (SMDS: M = − 0.08, SD = 1.25) reported more symptoms relative to their parents whereas males (SMDS: M = 0.08, SD = 1.30) reported fewer symptoms than their parents. Of the variables investigated, parental education was the strongest predictor of discrepancy (RDS and SMDS), with more highly educated parents reporting more symptoms than their children. Ethnicity (white vs. non-white) and differences in familial risk factors were not significant predictors of discrepant reporting.

Table 3 Correlates of the SCARED

Next, we conducted exploratory regression analyses (Table 4) to examine how much variance in discrepancy scores was accounted for jointly by all factors (excluding the Family Risk Factor Checklist (FRFC) to maintain a larger analysis sample size of n = 730 dyads). Age, sex, ethnicity, highest parental education, income, child IQ, SCARED-P/C mean scores, and diagnosis accounted for 8.27% of the variance in RDS (R2 = 0.08, F(8, 721) = 8.12, p < 0.001) and 7.41% of the variance in SMDS (R2 = 0.07, F(8, 721) = 7.22, p < 0.001). These factors accounted for 17.64% of the variance in absRDS (R2 = 0.18, F(8, 721) = 19.31, p < 0.001) and 19.26% of the variance in absSMDS (R2 = 0.19, F(8, 721) = 21.50, p < 0.001). Most of this was accounted for by positive associations with SCARED-P/C mean scores in all four regressions. RDS and SMDS differed by child’s diagnosis. Parental education predicted RDS, absRDS, and SMDS.
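
The reported percentages of variance and F statistics can be cross-checked with the standard identity for an ordinary least squares model with an intercept, R2 = (F × df1) / (F × df1 + df2). The Python sketch below applies it to the four regressions above; the function name is an illustrative assumption, and small differences from the reported percentages reflect rounding of the published F values.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover unadjusted R-squared from an overall F test of an OLS model."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


# F statistics reported in the text, each with df = (8, 721).
REPORTED_F = {"RDS": 8.12, "SMDS": 7.22, "absRDS": 19.31, "absSMDS": 21.50}

if __name__ == "__main__":
    for outcome, f_stat in REPORTED_F.items():
        r2 = r_squared_from_f(f_stat, df_model=8, df_resid=721)
        print(f"{outcome}: R2 = {r2:.4f}")
```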

Table 4 Regression analyses

Measurement Invariance

Examining the five-factor configural model across informants indicated good model fit (Table 5). Strict measurement invariance was found between parent and child SCARED reports, as evidenced by below-threshold changes in CFI and RMSEA at each level of invariance. Next, in two separate models splitting the sample at the median child age of 12 years, strict invariance was found for parent-report and for child-report (Table 6).

Table 5 Within-dyad parent–child invariance test
Table 6 Invariance tests by age group

Test–Retest Reliability

From the available sample, n = 339 parents and n = 359 children completed a second administration of the SCARED that fit the criteria noted in the methods. Parent-report forms were completed an average of 38.62 days apart (SD = 22.24) and child-report forms were completed an average of 40.29 days apart (SD = 23.72). SCARED total and subscale scores showed moderate to excellent test–retest reliability (Table 7). Specifically, children showed acceptable reliability across total and subscale scores (ICC = 0.59–0.61), while parent-report showed higher reliability (ICC = 0.74–0.86). Informant discrepancy (SMDS) was also moderately reliable over time (ICC = 0.59–0.66; Table 7).

Table 7 Test–retest reliability of SCARED scores

External Validity

A subset of n = 201 anxious patients were assessed by clinician interview using the Pediatric Anxiety Rating Scale (PARS; M = 15.8, SD = 4.09). SCARED-P (r = 0.25, p < 0.001), SCARED-C (r = 0.23, p < 0.001), and the SCARED-P/C mean (r = 0.32, p < 0.001) scores correlated significantly with PARS scores. In a multiple regression analysis, SCARED-C and SCARED-P explained 10% of the variance in PARS scores (R2 = 0.10, F(2,198) = 11.54, p < 0.001). Both parent- (β = 0.22, t = 3.39, p < 0.001) and child-report (β = 0.20, t = 3.00, p < 0.005) on the SCARED significantly predicted unique variance in PARS scores. As expected, collinearity between SCARED-C and SCARED-P was low (variance inflation factor = 1.01).
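
With exactly two predictors, the variance inflation factor is a direct function of their correlation, VIF = 1 / (1 − r2), so a reported VIF can be translated back into an approximate correlation magnitude. The Python sketch below performs that conversion; the function names are illustrative assumptions, and because the VIF above is rounded to two decimals, the implied correlation is only approximate.

```python
import math


def vif_two_predictors(r):
    """VIF for either predictor in a two-predictor regression."""
    return 1.0 / (1.0 - r ** 2)


def implied_abs_correlation(vif):
    """Invert VIF = 1 / (1 - r^2) to get |r| between the two predictors."""
    if vif < 1.0:
        raise ValueError("VIF cannot be below 1")
    return math.sqrt(1.0 - 1.0 / vif)


if __name__ == "__main__":
    # VIF of 1.01 as reported for SCARED-P and SCARED-C in the PARS subsample.
    print(round(implied_abs_correlation(1.01), 3))
```

A VIF this close to 1 corresponds to a correlation magnitude of roughly 0.1 between the two predictors, which is consistent with the weak parent–child agreement reported for the anxious subsample.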

Discussion

Informant Discrepancy

The current findings indicate that the association between parent and child reports of anxiety on the SCARED questionnaire is on the lower end of prior estimates [5, 6, 10, 38]. Parents tended to report fewer symptoms than their children overall, and this discrepancy was more pronounced in parent–child dyads with a clinically anxious child. However, it should be noted that this finding could be due in part to a “floor effect” in the healthy volunteer group, where anxiety severity was generally low. Although informant discrepancy was marked, it did not appear to vary systematically with the demographic variables of interest. All significant correlations between predictors and different measures of informant discrepancy were weak, and the large sample size contributed to statistical significance. These results differ from previous findings suggesting that myriad demographic and family-related variables more strongly predict discrepant reporting on the SCARED [6, 8, 9, 12]. The findings of this study support the utility of combining parent and child SCARED scores to obtain a comprehensive view of anxiety symptomology without introducing systematic bias from extraneous factors. Nonetheless, given low informant agreement, more research is warranted into what other factors may contribute to discrepancy, particularly given the reliability of discrepancy scores over time.

Measurement Invariance

The current results support strict measurement invariance across informants in a large sample missing no item-wise SCARED data. These results are supported by below-threshold changes in CFI and RMSEA across the four levels of invariance. While the χ2 difference tests were significant, these tend to be inflated with large sample sizes and thus we rely on CFI and RMSEA as measures of model fit. These findings largely support and expand on previous literature. Dirks and colleagues established partial invariance (freeing 22 item thresholds) between parent and child informants in a large, but considerably smaller sample (N = 408) [8]. Based on the current data and Dirks and colleagues’ prior findings, we argue that the SCARED likely exhibits strict invariance across informants, i.e. parent- versus child-report.

We also found evidence for strict measurement invariance between younger and older children and between the parents of younger and older children. In examining invariance of child-report across younger/older children, changes in CFI and RMSEA were below the set thresholds. However, comparing report from parents of younger/older children, the change in CFI from the strong to strict invariance model marginally exceeded the set threshold of 0.01. That said, the change in RMSEA was below the established threshold. These data suggest that the interpretation of the SCARED is not significantly impacted by the age of the child. This work may be further expanded in the future using newer approaches that allow invariance to be tested with age as a continuous covariate, e.g. Bauer [39]. Currently, this is not implemented for models fit with a mean- and variance-adjusted weighted least squares (WLSMV) estimator, which we use given the ordinal nature of the SCARED items and to maintain consistency with prior work.

Test–Retest Reliability

Since the initial SCARED reliability study [3], there has been limited research on the test–retest reliability of the revised SCARED questionnaire [4], and no previous studies have explored the reliability of the parent–child discrepancy over time. Reliability of the parent-report was higher than for child-report; however, both showed moderate to high ICC values, based on prior guidelines [40, 41]. This suggests that individuals respond similarly over time, which further supports the use of the SCARED as a stable measure of anxiety. Interestingly, informant discrepancy, i.e. the amount that informants disagree, also remained moderately consistent over time.

External Validity

Of note, in a subset of anxious patients, we found correlations between SCARED-P and SCARED-C scores and the Pediatric Anxiety Rating Scale (PARS) that were lower than prior estimates (e.g. r > 0.32 [23]). This could be related to methodological differences between the two studies. The prior study [23] completed all measures on one day, whereas measures could be completed on different days in the current study. Regardless, both SCARED-P and SCARED-C did predict unique variance in clinician-rated anxiety severity on the PARS. This suggests that the SCARED-P and the SCARED-C may capture some meaningfully different aspects of the child’s anxiety symptoms.

Despite the presence of statistically significant correlations, the low magnitude of these correlations is a concern that could be addressed through additional research on informant discrepancy in the assessment of anxiety. Future research could examine factors that influence associations among the SCARED-P, SCARED-C, and clinician-rated anxiety as well as explore associations with biological measures and other clinical measures, such as long-term outcome. Alternatively, novel assessment techniques that harness digital technology could be explored. Continued research in these and other areas may clarify the meaning of informant discrepancy.

There were several key limitations to our study. First, this was a secondary data analysis, limiting our analyses to the existing data. This presented us with several distinct issues. For one, we did not have data on which parent completed the questionnaire. As such, we were unable to examine whether mothers or fathers were more or less discrepant with child-report. Recent work by Jansen et al. [42] suggests that mothers are less discrepant with their child relative to fathers, illustrating the need for further research. Another shortcoming was that not all participants completed every form. Because only a smaller subset of our large sample also completed the FRFC (n = 358) and PARS (n = 213), we suggest that future research continue to examine these variables’ associations with informant discrepancy on the SCARED.

Recent work also suggests that the SCARED factor structure may differ as a function of informant ethnicity, although these findings are inconsistent [8, 17, 43, 44]. These conflicting findings could reflect cultural differences in the presentation, understanding, and stigma associated with these symptoms. Unfortunately, due to the largely homogeneous nature of our sample, we were unable to explore invariance as it relates to ethnicity. In addition, the stringent exclusionary criteria (i.e. the requirement that all participants be medication free, present with no co-occurring major depressive disorder, obsessive compulsive disorder, or post-traumatic stress disorder, have an IQ > 70, and present prior to treatment) limit the generalizability of these findings. Future research should continue to examine informant discrepancy, measurement invariance, and test–retest reliability of the SCARED in diverse samples to expand replicability and generalizability of findings.

In sum, using the largest sample to date, our clinical, demographic, and psychometric findings further support the reliability and validity of the SCARED. While measurement invariance analyses suggest that parents and children use and interpret the scale in similar ways, it is noteworthy that lower levels of informant agreement were observed in our sample compared to previous studies. These findings hold important clinical significance, supporting the use of the SCARED as a psychometrically valid tool for self- and parent-report of anxiety symptoms in children, but also highlight the need for further study of the determinants of informant discrepancy and the unique information captured by the parent- and child-report on the SCARED relative to clinician interview.

Summary

Self-report measures are a critical tool in psychological and psychiatric research that can offer insight into an individual’s level and characteristics of impairment. The Screen for Child Anxiety Related Emotional Disorders (SCARED) is one of the most commonly used questionnaires for assessing childhood anxiety. While the SCARED is a reliable, valid, and sensitive measure to screen for pediatric anxiety disorders, informant discrepancy can pose clinical and research challenges [2]. In a sample of N = 1092 anxious and healthy parent–child dyads, variables such as child’s age, sex, socio-economic status, symptom severity, and family stress did not systematically predict discrepant reporting. Further, the SCARED showed strict measurement invariance and strong test–retest reliability, and the SCARED-C and SCARED-P each predicted unique variance in a clinician-rated measure of anxiety. These findings suggest not only that item interpretation is not responsible for rater discrepancy, but also that the SCARED-C and SCARED-P may capture meaningful differences in the child’s anxiety symptoms. Given the widespread use of the SCARED by both practitioners and researchers, an understanding of its psychometric properties and potential factors driving discrepant reporting is important.