Introduction

Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by impairments in social interaction and communication as well as the presence of repetitive behaviors and sensory processing differences (American Psychiatric Association 2013). Recent estimates indicate that 31.6 % of individuals with ASD have intellectual disability (Christensen et al. 2016), and around 30 % are minimally verbal or non-verbal (Tager-Flusberg and Kasari 2013). This high proportion of less verbal individuals poses a challenge to assessment, as measures that attempt to tap internal states are often not applicable to non- or minimally verbal children with ASD.

Assessment of psychiatric symptoms in particular is a pressing problem in ASD. The vast majority of children with ASD meet criteria for at least one comorbid psychiatric disorder (e.g., Salazar et al. 2015), and individuals with ASD utilize psychiatric services at much higher rates than their non-ASD peers (Croen et al. 2006). Therefore, valid psychiatric assessments are needed to direct and monitor treatment. However, most psychiatric questionnaires use language such as “complains about,” “worries about,” or “has suicidal thoughts,” etc., which rely on a verbal response from the individual in question, limiting their utility to higher-functioning samples. Further, most verbal individuals with ASD have difficulty identifying and labeling their feelings (Samson et al. 2012) and are less likely to verbalize their emotions than typically-developing individuals (Lartseva et al. 2015).

With the exception of ongoing efforts to develop a measure of anxiety specifically for ASD (Bearss et al. 2015), most dimensional measures of psychiatric symptoms have not been validated for use in ASD, which potentially limits their interpretability for ASD samples. There is some evidence to suggest that the psychometric properties of questionnaires developed in typical samples differ when applied to individuals with ASD. For example, White et al. (2015) examined the measurement and factorial invariance of the Multidimensional Anxiety Scale for Children (MASC; March 1997). Although they found a similar latent structure of anxiety on the MASC self-report between children with and without ASD, the factor structure and association between factors differed in the ASD group. This finding suggests that anxiety is not measured in the same way in ASD samples as non-ASD samples, which could be partly due to atypical manifestations of anxiety in ASD (Kerns et al. 2014). Further, the possibility that psychiatric concerns present differently in ASD extends beyond anxiety (Leyfer et al. 2006; Magnuson and Constantino 2011), thereby suggesting that even our best evidence-based psychiatric questionnaires, when not adapted for ASD, may not capture the full range of presenting symptoms in individuals with ASD.

Focusing on processes that are shared across disorders, referred to as transdiagnostic factors, may promote more parsimonious models of psychopathology (Farchione et al. 2012). For example, over two decades of research supports increased negative affect as a transdiagnostic process implicated in both anxiety and depression (Watson and Clark 1992). Difficulty appropriately regulating affective experiences, often referred to as emotion regulation, also contributes to the development of a range of psychiatric disorders (Aldao et al. 2010). Treatments that target these underlying processes have been shown to be as effective as disorder-specific treatments, while simplifying treatment planning and delivery and increasing applicability to a wider population (McEvoy et al. 2009).

Despite the prominence of affective experience and regulation in models of developmental psychopathology in other populations and the potential impact of targeting these processes in treatment, measures of these constructs that are suitable for individuals with ASD are lacking. Studies of high-functioning individuals with ASD suggest that emotion regulation is impaired in ASD and associated with a range of poorer outcomes, such as worse social functioning, more symptoms of anxiety and depression, and maladaptive behavior (Mazefsky 2015). Having a dimensional measure of emotion dysregulation that does not rely on verbally conveyed information would enable a more complete understanding of the role of emotion dysregulation in ASD and open the door to new treatments. Toward this goal, this manuscript describes the development of a caregiver-report measure of the experience and regulation of emotion called the Emotion Dysregulation Inventory (EDI) that was designed to be suitable for use with individuals with ASD across the full range of verbal ability.

The EDI was created using guidelines from an NIH Roadmap Initiative focused on developing sensitive outcome measures, the Patient-Reported Outcomes Measurement Information System (PROMIS®) (http://www.nihpromis.org; Cella et al. 2007, 2010). The multiple-site PROMIS research network (2012) produced scientific standards for developing measures as well as a comprehensive battery of effective and sensitive outcome measures. There is evidence to suggest that measures developed through PROMIS methods assess a wider range of functioning with greater precision than legacy measures (e.g., Pilkonis et al. 2011). To our knowledge, PROMIS principles have not yet been applied to develop measures for use in ASD.

The overall goals of this study were to illustrate the value of utilizing PROMIS guidelines for measure development in an ASD sample and to develop a change-sensitive measure of emotion dysregulation that can be used across the full spectrum of verbal and cognitive abilities in individuals with ASD. The steps utilized to create the EDI are described (see Fig. 1), with particular emphasis on the development of the item pool (Phase I). Initial data on language or IQ biases, range of information captured, and ability to detect change are presented based on EDI data from psychiatric inpatients with ASD (Phase II). Item analyses are ongoing and information on reliability and validity will be provided in future reports.

Fig. 1

Protocol for development of the Emotion Dysregulation Inventory (EDI), following recommendations from the PROMIS® Project (http://www.nihpromis.org)

Phase I: Item Pool Development

Phase I Methods

Comprehensive Literature Search

First, a comprehensive literature search was conducted using PsycINFO, Medline, and Google Scholar to identify all existing caregiver- or provider-report instruments that cover related constructs. The search terms included all possible combinations of a measurement term (psychometric, validity, reliability) or “questionnaire,” along with an emotion term (irritability, emotion regulation, mood, coping, temperament, depression, anxiety, anger, frustration, reactivity, lability). All measures identified were reviewed by the first author. The PROMIS process involves creating a database of all items from every measure found during the literature search; these items are then systematically grouped by latent construct and later reduced and modified for consistency (described in detail in DeWalt et al. 2007). Given resource constraints, item banking for the EDI was completed at a later step for only those items likely to be utilized (see “Item Pool Development” section).
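As an illustration only, the pairing of measurement and emotion terms described above can be sketched in Python. The Boolean `AND` syntax and the function name `build_queries` are assumptions made for this sketch; the exact query strings entered into each database are not reported here.

```python
from itertools import product

# Terms taken from the search description; "questionnaire" is treated as a
# fourth measurement-type term.
MEASUREMENT_TERMS = ["psychometric", "validity", "reliability", "questionnaire"]
EMOTION_TERMS = [
    "irritability", "emotion regulation", "mood", "coping", "temperament",
    "depression", "anxiety", "anger", "frustration", "reactivity", "lability",
]

def build_queries(measurement_terms, emotion_terms):
    """Pair every measurement term with every emotion term.

    The quoted-phrase AND format is a hypothetical query syntax, not the
    syntax of any particular database.
    """
    return [f'"{m}" AND "{e}"' for m, e in product(measurement_terms, emotion_terms)]

queries = build_queries(MEASUREMENT_TERMS, EMOTION_TERMS)
print(len(queries))  # 4 measurement terms x 11 emotion terms = 44 pairs
```

Enumerating the pairs explicitly makes it easy to confirm that no combination of terms was skipped.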

Generation of Conceptual Model

A conceptual model for the EDI was developed by the first author, based on the literature review and extensive clinical experience working with individuals with ASD (see Fig. 2). In line with PROMIS, the model had four levels to support item development, including the primary domain (emotion dysregulation), subdomains (affective experience and affective control), factors, and facets. The intent of this hierarchical structure is not to eventually produce scores for each facet (specific subscores will be determined by later statistical analyses), but rather, to ensure sufficient item coverage for all theoretically-relevant constructs.

Fig. 2

Conceptual structure of the Emotion Dysregulation Inventory

Item Pool Development

A draft of items was created based on the literature search, conceptual model, and clinical experience. In order to support later item response theory (IRT) analyses, a minimum of four items were written for each facet and these were placed into an item hierarchy. The principles guiding item development were that items should: (1) adequately assess the full range of emotion dysregulation (i.e., different presentations and different degrees of severity); (2) be answerable regardless of the verbal ability of the person being assessed; and (3) be relevant to children and adults. Given the intention to be applicable to individuals with any verbal ability, items tapping emotion regulation included observable indicators of maladaptive or ineffective emotion regulation, rather than specific emotion regulation strategy use (e.g., types of coping strategies; see Table 1 for sample items).

Table 1 Sample items from the Emotion Dysregulation Inventory

Response Options

The response options were selected based on a review of recommendations from PROMIS and were refined based on input from parents and professionals (see “Qualitative Review” section). A 7-day recall period was utilized as this interval provides the optimal balance of minimizing bias while providing clinically relevant information (Cella et al. 2010). Five response options were included, as a five-point scale is considered superior for IRT analyses and may be more sensitive than fewer options (Alwin and Krosnick 1991; Cella et al. 2010). The rating scale encompasses both severity (degree of interference and intensity) and frequency. The EDI directions provide examples of ways that a seemingly minor behavior (e.g., tantrums) can be considered severe if it occurs very frequently, with high intensity, or long duration, and is significantly interfering. To increase the reliability of the ratings, specific descriptions with objective benchmarks were developed for each response option. Although not part of PROMIS, a visual representation of the response scale was also created (see Fig. 3). This was intended to support reliability as well as future self-report adaptations given the utility of visual aids in ASD (Quill 1995).

Fig. 3

Response option visual for the Emotion Dysregulation Inventory

Qualitative Review

The EDI was reviewed and refined by experts in the ASD field from more than ten academic and hospital sites, including a mixture of clinicians and researchers with various degrees (e.g., psychologists, psychiatrists, social workers, neuroscientists). Their suggested modifications to improve the measure (additional items to add, wording suggestions, etc.) were incorporated. Their review also served as initial evidence for face validity.

Cognitive Interviews

Based on an information processing model, cognitive interviewing methodology is designed to assess a participant’s comprehension of items as well as the information and decision process that participants use to choose a response (Irwin et al. 2009). Cognitive interviews were conducted with parents (n = 19) of children and young adults with ASD using a combined think-aloud and debriefing methodology. Specifically, parents were asked to read all items of the EDI aloud, think aloud as they chose their answers, and describe the directions, response sets, and items in their own words in order to determine the clarity and meaning of items and response sets. After completing the EDI aloud, parents were asked to provide any general feedback on aspects that were confusing or helpful to assess the complexity of the EDI. Parents were also asked for any suggestions for improvement and any ideas for relevant content that was not assessed. All parents but one completed the cognitive interview process for the full 67-item EDI, thereby exceeding the PROMIS guideline of a minimum of five interviews per item (Irwin et al. 2009; PROMIS 2012). Five different item orders were created to account for item order effects. All interviews were audio-recorded for later review. Following completion of the cognitive interviews, the EDI items and response options were reviewed by a subset of the earlier expert panel consisting of individuals with substantial experience in measure development and the assessment and treatment of emotion dysregulation in ASD. Item pool winnowing occurred at the end of this stage.

Phase I Participants

Cognitive interviews were completed with parents whose children with ASD included a representative range of verbal ability, intellectual ability, severity of behaviors, and past psychiatric history. They were recruited through advertisements including fliers in our ASD research programs and email listserves of local autism advocacy groups. Parents were compensated for their time. Key characteristics of the individuals with ASD are provided in Table 2. All but one child had a parent-reported current comorbid psychiatric diagnosis (unverified); 21 % had current clinical diagnoses of depression and 42 % had a current anxiety disorder diagnosis.

Table 2 Phase I cognitive interview sample characteristics (n = 19)

Phase I Results

The overall response to the EDI was positive. There was uniform enthusiasm for the response option visual aid. Parents found the items easy to understand and rate regardless of their child’s verbal ability. Minor adjustments for clarity were made to six items based on the interviews and one was removed. An additional four items were revised to simplify the reading level. The majority of changes stemming from the cognitive interviews were made to the response options and directions. Overall, the modifications simplified the directions and options. None of the parents had suggestions for additional content that was not already covered by the draft items.

The frequency distributions of responses to the items were explored to identify those with few endorsements at the extremes, and were determined to be adequate for future IRT analyses. Only three items were endorsed as Mild or higher by two or fewer participants (~10 %). Ten of the 67 items (such as “frustrates easily”) were endorsed as at least Mild by the majority (15 or more) of parents, though specific intensity ratings varied substantially. The remaining 54 items were endorsed as Mild or higher by an average of 9.5 parents.

As a final step, the reading level of the remaining 66 EDI items was tested. The complete EDI had a Flesch Reading Ease of 66.3 (scale of 0–100 with higher scores indicating easier to read) and a Flesch–Kincaid Grade Level of 7.6, which are both considered adequate. Without the instructions, which include some longer passages with guidance on how to interpret behavior, the items and response options have a Flesch Reading Ease of 100 and a Flesch–Kincaid Grade Level of 2.1.
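For readers unfamiliar with these indices, the standard Flesch formulas can be sketched as follows. The functions take pre-computed word, sentence, and syllable counts. The counts in the usage example are hypothetical and are not the EDI's actual counts, and the software used to score the EDI may count syllables differently.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher values indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical passage: 100 words, 10 sentences, 150 syllables.
ease = flesch_reading_ease(words=100, sentences=10, syllables=150)
grade = flesch_kincaid_grade(words=100, sentences=10, syllables=150)
print(round(ease, 1), round(grade, 1))
```

Both indices depend only on average sentence length and average syllables per word, which is why short, simply worded items score as much easier to read than the longer instruction passages.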

Phase II: Pilot Testing with an Inpatient Sample

The 66-item version of the EDI was administered in a sample of psychiatric inpatients with ASD. This sample was selected as it was expected that there would be a wide range of level of functioning, communication abilities, and emotion dysregulation (including the most severe forms). In addition, application to the inpatient sample provided an opportunity to explore sensitivity to change by comparison of admission and discharge scores. Given that all participants met medical necessity criteria for inpatient hospitalization (most intensive and restrictive form of care) at admission, we considered discharge to a less restrictive environment to be an indicator of improvement; this assumption is also consistent with research suggesting that children with ASD improve after specialized inpatient psychiatric treatment (Siegel et al. 2014).

Phase II Methods

The EDI was completed within 7 days of admission and again at discharge by parents or guardians of children who were admitted to specialized psychiatric hospital inpatient units that treat ASD and other neurodevelopmental disorders [through the Autism Inpatient Collection (AIC)]. The EDI is one of the core measures of the AIC, and the full methods of the AIC have been published previously (Siegel et al. 2015). Briefly, children between the ages of 4–20 years old with a score of ≥12 on the Social Communication Questionnaire (SCQ; Rutter et al. 2003) or high suspicion of ASD from the inpatient clinical treatment team were eligible for enrollment. ASD diagnoses were confirmed by administration of the Autism Diagnostic Observation Schedule-2 (ADOS-2; Lord et al. 2012). Exclusion criteria included not having a parent available who was proficient in English or the child with ASD being a prisoner. Non-verbal intellectual ability was measured by the Leiter International Performance Scale—Third Edition (Leiter-3; Roid et al. 2013).

Participants

Parents/guardians of 287 participants (80 % male) with ADOS-confirmed ASD completed the EDI at admission. Of these, 68 had some missing item-level EDI data and were excluded from analyses. In-depth item analyses will occur at a later stage, but no single item had missing data from more than 2.5 % of the sample. Those excluded from analyses did not significantly differ from the remaining participants in age (p = .549), total household income (p = .944), or non-verbal IQ (p = .715). A similar number of minimally verbal and verbal participants were missing data (p = .712). Although the retained and excluded participants did not differ in overall ADOS total score (sum of Social Affect and Restricted and Repetitive Behavior composites), p = .151, the retained participants had a slightly higher total SCQ score (M = 24, SD = 7) than excluded participants (M = 21, SD = 6), p = .027. Over three-quarters (79.5 %, n = 171) of participants completed the EDI again at discharge. Those with missed discharge assessments had similar EDI admission scores (p = .520), mean age (p = .725), total household income (p = .406), nonverbal IQ (p = .964), percent minimally verbal (p = .615), and length of stay (p = .789) to those with EDI discharge data.

The majority of reporters were biological mothers (71 %, n = 155), followed by biological fathers (12 %, n = 26) and step/foster/adoptive mothers (10 %, n = 21). Families had a representative range of total household income, from 19.4 % (n = 39) with an annual income less than $20,000 to 21.9 % (n = 44) over $100,000, and a mean in the $51,000–$65,000 range. The majority of participants were Caucasian (82.2 %, n = 180), followed by African–American (9.6 %, n = 21), and Asian (4.6 %, n = 10). The majority (94.2 %) were not Hispanic. The children with ASD ranged in age from 4 to 20 years old (M = 12.89). They had a wide range of nonverbal IQs (range = 30–145, SD = 29.4) with a mean IQ of 77. In order to understand how the EDI performed based on the verbal ability of the child, participants were split into non-/minimally verbal (defined as requiring an ADOS-2 module 1 or 2; 48.1 %, n = 104) or verbal (ADOS-2 modules 3 or 4; 51.9 %, n = 112).

Analyses

Analyses focused on an EDI total score that was created by summing the items. The EDI item scores ranged from 0 to 4, so the possible total score ranged from 0 to 264. An EDI change score was created by subtracting EDI discharge scores from EDI admission scores. Analyses were conducted in IBM SPSS Statistics Version 23 (IBM 2015). After checking assumptions, independent samples t tests were utilized to compare group means. Pearson correlations were utilized to determine the association between variables of interest.
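The scoring steps described above can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study: the function names are hypothetical, the ratings are synthetic, and returning `None` for a form with a missing item is an assumption that mirrors the exclusion of participants with missing item-level data.

```python
from math import sqrt
from statistics import mean

N_ITEMS = 66      # items in the pilot version of the EDI
MAX_RATING = 4    # each item is rated from 0 to 4

def total_score(ratings):
    """Sum the item ratings; return None if any item is missing."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item ratings")
    if any(r is None for r in ratings):
        return None  # assumption: forms with missing items are not scored
    if any(r not in range(MAX_RATING + 1) for r in ratings):
        raise ValueError("ratings must be integers from 0 to 4")
    return sum(ratings)

def change_score(admission_total, discharge_total):
    """Admission minus discharge, so positive values indicate improvement."""
    return admission_total - discharge_total

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Synthetic ratings for one participant (not study data).
admission = [3] * N_ITEMS
discharge = [1] * N_ITEMS
print(total_score(admission))                                         # 198
print(change_score(total_score(admission), total_score(discharge)))  # 132
```

With 66 items rated 0 to 4, the maximum possible total is 66 × 4 = 264, matching the range given above.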

Phase II Results

There was a wide range of EDI admission scores (range = 5–229, M = 123.6; SD = 39.7; median = 126.5; see Fig. 4a). Both skewness (−.28, SE = 0.16) and kurtosis (−.08, SE = 0.33) statistics suggested a normal distribution. Discharge scores (M = 61.5; SD = 41.6) were significantly lower than admission scores (M = 120.6, SD = 37.1), t (149) = 15.94, p < .001. Although length of stay was not correlated with EDI change scores (admission minus discharge), r (150) = −0.13, p = .101, the analysis comparing discharge and admission scores was repeated with only those participants whose length of stay was 14 days or longer, to be more certain that the results were not skewed by those with shorter stays or overlapping rating periods at admission and discharge. The finding was the same: discharge scores (M = 61.1; SD = 41.9) were significantly lower than admission scores (M = 125.0, SD = 34.4), t (100) = 15.10, p < .001. The mean change in EDI score from admission to discharge was −59.1 (i.e., a mean decrease of 59.1 points), with a standard deviation similar to that of admission scores (SD = 45.4; see Fig. 4b).
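As a rough consistency check, the paired t statistic and the standard errors of skewness and kurtosis can be recomputed from the summary values reported above. This sketch rests on assumptions: that the change-score SD of 45.4 belongs to the 150 participants implied by t (149), that the admission distribution is based on 219 participants (287 minus the 68 excluded), and that the standard errors follow the commonly used small-sample formulas. The function names are hypothetical.

```python
from math import sqrt

def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired-samples t statistic from the mean and SD of the differences."""
    return mean_diff / (sd_diff / sqrt(n))

def se_skewness(n):
    """Small-sample standard error of the skewness statistic."""
    return sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Small-sample standard error of the (excess) kurtosis statistic."""
    return 2.0 * se_skewness(n) * sqrt((n * n - 1) / ((n - 3) * (n + 5)))

# Mean decrease = admission mean minus discharge mean for the paired sample.
t_stat = paired_t_from_summary(mean_diff=120.6 - 61.5, sd_diff=45.4, n=150)
print(round(t_stat, 2))             # 15.94
print(round(se_skewness(219), 2))   # 0.16
print(round(se_kurtosis(219), 2))   # 0.33
```

Under these assumptions the recomputed values agree with the reported t statistic and standard errors. Both the skewness and kurtosis statistics are also smaller in magnitude than twice their standard errors, which is one common rule of thumb for treating a distribution as approximately normal.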

Fig. 4

Distribution of Emotion Dysregulation Inventory scores with the normal curve superimposed at admission (a) and scores representing the change from admission to discharge (b)

Admission EDI scores were not significantly correlated with age, r (218) = −0.07, p = .30. In addition, EDI admission scores were not significantly correlated with nonverbal IQ, r (193) = −0.05, p = .51. Participants with nonverbal IQ scores below 70 had EDI scores (M = 123.7, SD = 40.9) that were nearly identical to those of participants with nonverbal IQ scores of 70 or above (M = 123.5, SD = 39.7), t (217) = .008, p = .93. Also, EDI scores of non- or minimally verbal participants (M = 120.6, SD = 38.4) did not significantly differ from the scores of verbal participants (M = 126.3, SD = 41.3), t (214) = −1.04, p = .30.

Discussion

The primary aim of this study was to apply the approach utilized by the PROMIS network to develop a measure of emotion dysregulation that would be suitable for use with individuals with ASD regardless of verbal or cognitive ability, and be sensitive to change. The results presented demonstrate how the items were developed and suggest that the EDI is capturing a wide range of emotion dysregulation severity. Importantly, the EDI was completed in a diverse sample of psychiatric inpatients with ASD, and the EDI performed similarly in groups of minimally verbal and verbal individuals and was unrelated to nonverbal IQ. Given the central role of emotion dysregulation across various forms of psychopathology (Aldao et al. 2010), and frequency with which emotion dysregulation prompts parents and individuals with ASD to seek treatment in both inpatient (Siegel and Gabriels 2014) and outpatient settings (Arnold et al. 2003), we anticipate that the EDI will be useful in a variety of research and clinical contexts.

The systematic item development strategy used to create the EDI could be applied to develop measures of other constructs of interest in ASD. In particular, the PROMIS recommendation to develop a conceptual model with four tiers (domain, subdomain, factors, and facets) is useful for ensuring that there is sufficient item coverage for constructs of interest. Although it is not possible to determine with certainty, the combination of using this form of conceptual model, developing an item hierarchy that assigns items to each facet of the model, and expert review likely had a positive influence on the breadth of EDI scores. Indeed, none of the parents who participated in Phase I cognitive interviewing had any suggestions for content that was not adequately covered after completing these steps.

The use of qualitative research methods during Phase I was considered an essential step. Focus groups are one qualitative approach that has been used to capture stakeholder perspectives in ASD research (e.g., White et al. 2016). Focus groups have also been used for measure development in ASD, mostly to aid in item generation (e.g., Bearss et al. 2015). The cognitive interviewing process completed in this study differs from focus groups in that it was completed after the items were developed, and reporters were interviewed individually about each item to ensure comprehension and identify problems with specific wording. PROMIS provides specific guidance on how to apply cognitive interviewing during measure development and notes some general recommendations stemming from the network's experiences, such as problems with specifiers and different perceptions of time frames (e.g., the past week is not interpreted in the same way as the past 7 days; DeWalt et al. 2007). If the goal is to develop a measure that can be used across the ASD population, as was the case with the EDI, cognitive interviewing can be used to ensure that items are interpreted the same way regardless of the symptom severity or communication ability of the individual being rated. This process may have contributed to the absence of verbal ability and IQ biases, which we were able to confirm by using a sample with the complete range of verbal and cognitive abilities (the Autism Inpatient Collection; AIC). Because the AIC also includes those with the most severe forms of emotional and behavioral dysregulation, we were also able to ensure that the EDI has an adequate ceiling.

Given that the AIC is a unique sample, collection of EDI data in additional samples is required. Currently, the authors are collaborating with the Interactive Autism Network (IAN; Daniels et al. 2012), a large, national, validated and verified autism registry and database, to collect EDI and other questionnaire data on individuals with ASD across the country to ensure that the EDI is calibrated in a representative sample of children with community diagnoses of ASD. Finally, given that it is clinically informative to understand the degree of impairment relative to the general population, EDI data are also being collected from a large nationally-representative sample of census-matched individuals.

Once a fully representative sample has been collected, additional item-level and factor analyses, as well as equivalence testing, will be conducted. One of the distinguishing characteristics of the PROMIS battery is that IRT was utilized to identify which items are most sensitive to the latent construct of interest (e.g., offer the most precision and range). IRT refers to a class of psychometric techniques in which the probability of choosing a certain item response category is modeled as a function of a latent trait called theta (θ; Hambleton 1991). The underlying θ manifested by a set of items is assumed to explain individual item performance. IRT models focus attention at the level of individual items, provide estimates of information contributed by those items, and identify where each item contributes its maximum information along the continuum of the underlying construct being measured. IRT will be applied to identify the most sensitive EDI items to retain in order to shorten the EDI and also to create a short form of fewer than 10 items for repeated progress monitoring or use in studies where emotion dysregulation is not the primary focus. IRT will also be utilized to create empirically-derived subscale scores and to perform additional tests for any biases. Once the EDI item content is finalized through IRT, validity analyses will be conducted. These will include comparison to other established theoretically-related and unrelated measures, comparison of samples with expected mean differences (psychiatric inpatients with ASD versus community sample with ASD versus general community sample), and testing the association between EDI scores and emotional reactivity during structured emotion-eliciting tasks.
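To make the idea of modeling response-category probabilities as a function of θ concrete, the sketch below uses a graded response model for a single five-option item. The choice of this particular IRT model is an assumption made for illustration, since the model to be fitted to the EDI is not specified here, and the discrimination and threshold values are hypothetical rather than estimated from EDI data.

```python
from math import exp

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + exp(-x))

def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities under a graded response model.

    thresholds must be in increasing order; K - 1 thresholds define K
    ordered response categories (four thresholds for a 0-4 rating scale).
    """
    # Cumulative probabilities of responding in category k or higher.
    cumulative = [1.0]
    cumulative += [logistic(discrimination * (theta - b)) for b in thresholds]
    cumulative.append(0.0)
    # Probability of each category is the difference of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical item parameters (not estimated from EDI data).
A = 2.0
B = [-1.5, -0.5, 0.5, 1.5]

for theta in (-2.0, 0.0, 2.0):
    probs = grm_category_probs(theta, A, B)
    print(theta, [round(p, 3) for p in probs])
```

In this sketch, a respondent with low θ is most likely to receive the lowest rating and one with high θ the highest rating. Items with larger discrimination values separate neighboring levels of θ more sharply, which is the sense in which some items offer more precision than others.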

There are several directions for future research that would further our understanding of emotion dysregulation in ASD and performance of the EDI. For example, although the decrease in EDI scores from admission to discharge is a promising indicator of change-sensitivity, it was not possible to control how often caregivers visited their child while in the inpatient setting, and it is possible that some caregivers may have had limited exposure to their child’s emotion dysregulation over the sampling interval. Our future work will involve comparing the AIC EDI change scores to EDI change scores from 4-week repeated assessments in a community ASD sample with no treatment changes. It would also be useful to test the EDI in other interventions with more consistent caregiver contact and blinded conditions to further establish its sensitivity to change. Understanding how emotion dysregulation as measured by the EDI differs between children with ASD with and without comorbid psychiatric disorders is also an important direction for future research. In addition, given that the EDI was specifically developed for use regardless of cognitive ability, it does not assess the use of particular emotion regulation strategies, which are more easily measured by self-report. However, it will be important for future research to determine how the EDI corresponds to measures of emotion regulation used in future studies.

In summary, initial evaluation of the EDI suggests it will be an informative, change-sensitive measure of important transdiagnostic emotional processes (emotional reactivity and poor regulation) in ASD that can be applied across the spectrum of verbal and cognitive abilities. Presuming equally promising results from ongoing item, validity, and reliability analyses, the EDI may prove fruitful for a variety of research and clinical questions, including studies targeting the biological processes underlying emotion dysregulation and treatments that conceptualize emotion dysregulation as the primary outcome of interest, underlying mechanism, or maintaining factor. Although the EDI’s conceptual model and item content were specifically developed to ensure complete coverage of manifestations of emotion dysregulation observed in ASD, it should also apply to other populations. Thus, the EDI may be useful in cross-population studies or those embracing the NIH Research Domain Criteria (RDoC 2013) transdiagnostic research approach. Finally, the EDI development process (e.g., PROMIS framework, pilot collection with a sample encompassing all verbal and IQ abilities) provides a model that can be applied to address the dearth of sensitive and valid measures for ASD.