Introduction

Efforts to successfully identify and intervene for young children at greatest risk of poorer life outcomes have intensified internationally over the last few decades. During this time, early self-regulation skills have been identified as a foundational area of developmental functioning that confers longitudinal risk and benefit for a broad array of learning and wellbeing outcomes (Robson et al., 2020). Self-regulation enables control over attention, emotion, and behavior in ways that are adaptive to children’s immediate goals, context, and environment, the outcomes of which contribute to developmental trajectories for decades to come (Bailey & Jones, 2019; Blair, 2016). This has highlighted the need for methods to accurately appraise progress in children’s self-regulation development. While multiple self-regulation measurement approaches have been developed for use in early childhood (McCoy, 2019), rarely has their specific utility as an indicator of early risk been demonstrated. The current study thus aimed to evaluate prominent approaches to self-regulation assessment in terms of: (a) their longitudinal associations with children’s academic school readiness; and (b) their potential utility to pre-emptively identify children at risk of a poor school transition and ongoing learning trajectories.

Self-Regulation in Early Childhood

Debates have raged in numerous domains of human development about how to optimally capture characteristics, symptoms, and behaviors that are associated with longitudinal risk or benefit. A primary example is unhealthy weight, for which a substantial literature has sought to identify which measures (e.g., weight, BMI, body fat percentage, waist-to-height ratio, waist circumference) optimally quantify and identify (un)healthy levels of adiposity associated with longitudinal risk or benefit (Klein et al., 2007; Savva et al., 2000; Visscher et al., 2001). Once identified, these measures can be used to more accurately appraise risk and enact appropriate strategies for prevention, intervention, or education. These debates have also extended to key domains of child development, such as literacy and numeracy (as epitomized by national and international programs of standardized educational assessment; ACARA, 2017; DfE, 2017; NCES, 2010) and, more recently, to self-regulation (Montroy et al., 2016a, 2016b; Schmitt et al., 2014).

Self-regulation refers to a suite of volitional and automatic responses that serve to control cognition, behavior, and emotion in ways that support learning and wellbeing (Blair, 2016). Specifically, self-regulation (sometimes referred to as self-control: Hofmann et al., 2012) enables goal-directed behavior, despite contrary impulses or distractions. While very early in life infants require other-regulation and co-regulation by caregivers to satisfy their regulatory needs (e.g., caregivers soothe when an infant cries), a major change across the early years is for children to gain the skills for self-regulation (McClelland et al., 2010). These skills develop rapidly in the early years and, as the prefrontal cortex continues to mature in the pre-school years and beyond, more complex cognitive regulation processes (underpinned by executive functioning) that support learning and problem-solving develop (McClelland et al., 2010).

The nature of self-regulation in terms of its structure and constituent parts continues to be debated, and its exploration is further complicated by variable use of terminology across fields of research (Bailey & Jones, 2019; Blair, 2016; Hofmann et al., 2012). Regulation of emotion, for instance, is typically seen as relatively distinct from–although developmentally and reciprocally related to–cognitive/attentional aspects of self-regulation (Bailey & Jones, 2019; Blair, 2016; McClelland et al., 2010). Those cognitive components of self-regulation are often broken down into elements concerned with attentional focusing and higher-order cognitive control (e.g., executive functions; Blair, 2016). Behavioral regulation is often used to describe the readily observable ways that children enact broad and multiple self-regulation skills in everyday contexts (e.g., waiting their turn, following instructions as observed by adults; Howard et al., 2019). However, behavioral self-regulation has also been used to refer to the everyday combination and application specifically of executive function skills (e.g., McClelland & Cameron, 2012; Ponitz et al., 2009), despite executive functioning being considered cognitive in nature. This issue of fluid term use is perhaps best exemplified in the Head Toes Knees Shoulders task (Ponitz et al., 2009)–which we describe more fully later–as it has been considered both as a behavioral self-regulation measure (due to its requirement to override a dominant yet incorrect behavioral response; McClelland & Cameron, 2012; Ponitz et al., 2009) and executive function measure (given its executive demands, applied outside of a child’s everyday self-regulatory context; Gooch et al., 2016). In the current study, this task was explored in relation to its loading on a behavioral or cognitive self-regulation construct. 
Overall, factor analyses of self-regulation measures designed to tap these multiple aspects, including the measures used in the current study, often confirm separate emotional, cognitive, and behavioral self-regulation factors (Howard et al., 2019; Howard & Melhuish, 2017). However, the utility of these divisions remains unclear, given that real-world situations often involve regulation of cognition, behaviors, and emotion to achieve a desired outcome.

This debate notwithstanding, that self-regulation skills are critical for school readiness is no longer debated; in fact, self-regulation has been positioned as a key focus for research and intervention related to school readiness (Blair & Raver, 2015). In early learning settings and home learning environments, children who are able to regulate their attention in ways that resist distraction, maintain focus on learning activities, and persist in challenging tasks are best able to capitalize on the learning opportunities provided. Children with emotional regulation skills that support them to be less reactive to minor emotion-inducing events, or to recover easily from emotional arousal, will have more psychological resources to invest in attention and learning (Blair & Raver, 2015). Not surprisingly, multiple studies have found that poor early childhood self-regulation skills are linked with poor achievement trajectories over the early years of school (Finders et al., 2021), risky lifestyle choices in adolescence (Howard & Williams, 2018), and poorer health, wealth and criminality into adulthood (Moffitt et al., 2011).

Given the clear role that self-regulation plays in school readiness, and ongoing learning and wellbeing trajectories, attention has more recently been turned to accurate identification of children with poor self-regulation skills who would most benefit from early intervention (Montroy et al., 2016a, 2016b; Schmitt et al., 2014). Indeed for some interventions it is children with the poorest self-regulation skills that benefit the most (Tominey & McClelland, 2011). Focus on the early years is warranted given that earlier intervention (compared to later remediation) may potentiate a more pronounced, stable and lasting change (Wass et al., 2012), yield greater return on investment (Heckman, 2006), and any-cause improvements in childhood self-regulation are related to positive changes across a broad range of adult outcomes (Moffitt et al., 2011). Yet, although multiple measurement approaches for various sub-components of self-regulation have been developed and a large body of research suggests cross-sectional and longitudinal associations of scores on these measures with school readiness and success, the specificity and sensitivity of these measures to identify a meaningful ‘at-risk’ group of children is unknown.

Current Approaches to Self-Regulation Assessment

Approaches to assessing self-regulation vary, with studies adopting different approaches yet still ascribing their results to the same self-regulation construct, with often little attention to measure-specific implications for these results. The three broad approaches to assessing child self-regulation are adult-report, task-based, and observational, with no ‘gold standard’ measure yet established (McCoy, 2019). Here, the properties, affordances and challenges of each approach are briefly summarized.

Adult-report measures of self-regulation typically ask adults who know the child well (e.g., parents, educators) about the frequency of a child’s everyday self-regulatory behaviors across emotional, cognitive, and behavioral domains (e.g., sharing, temper tantrums; Howard & Melhuish, 2017; Matthews et al., 2009; Whitebread et al., 2009). Benefits of this approach include a focus on ecologically valid (everyday) self-regulatory behaviors, and ease and efficiency of data collection with no burden on child participants. Challenges are highlighted by the data collected, however: parent and educator responses on the same measure for the same child correlate poorly; and longitudinal data do not always capture age-related change within children (Howard et al., 2019). These nuances are likely due to the fact that adults’ reports of the frequency, severity or typicality of child behavior are necessarily couched in that adult’s frame of reference (which is different for parents and teachers), and the ways that adults reference behaviors against same-age peers (e.g., ‘she is average for a 3-year-old’) as opposed to progress relative to the developmental trajectories for self-regulation (Howard et al., 2019). Still, adult-report measures of early self-regulation have shown unique (over and above task-based measures) predictive utility in terms of school readiness (Vitiello & Greenfield, 2017) and achievement (Blair et al., 2015; Finders et al., 2021).

Task-based approaches attempt to introduce greater objectivity and consistency in the capture of child self-regulatory development. In this approach, children are asked to perform a task that is believed to require self-regulation to successfully complete, generating accuracy scores through specific behavioral criteria. Prominent examples include: the Kochanska battery of seven to 13 tasks (number of tasks depends on age of the child; Kochanska et al., 1996), conceptualized as tapping effortful control (a temperament construct related to self-regulation that includes executive attention, inhibitory control, and planning; Rothbart et al., 2011); and the Head-Toes-Knees-Shoulders (HTKS) task used in the current study, in which children must perform the opposite action to what they are asked to do (Ponitz et al., 2009). Task-based approaches to self-regulation assessment show strong developmental (Montroy et al., 2016a) and intervention-related sensitivity (Schmitt et al., 2015), as well as similar predictive validity to adult-report measures (although data are often less longitudinal given resource demands of using direct assessments within large-scale longitudinal studies) (McClelland et al., 2014; Ponitz et al., 2009). Challenges include questions of what exactly is being measured by these tasks. For example, HTKS is labelled as a measure of behavioral self-regulation in multiple studies (Ponitz et al., 2009), but can equally be viewed as a measure of executive function as it requires a child to hold and operate on a rule that is maintained in mind (working memory), inhibit the impulse to perform the action as directed (inhibition), and flexibly switch between body part correspondences (cognitive flexibility).
Indeed, the HTKS is conceptualized as an executive function measure in a number of recent studies (Keown et al., 2020; Liu et al., 2018), and this is not entirely discordant with the HTKS developers’ conceptualization of behavioral self-regulation as the everyday application of executive function (McClelland & Cameron, 2012). Moreover, these tasks may lack the complex and emotional investment that is intrinsic to the real-world contexts in which children must self-regulate (e.g., wait their turn when they really want to take their turn now, negotiate a fair resolution to conflict rather than lash out, redirect their attention rather than have an emotional outburst).

A third approach, utilizing observation, typically engages children in a semi-structured activity (e.g., obstacle course, memory card game) that approximates those a child would routinely engage in, aiming to ensure that observations focus on children’s authentic self-regulatory behaviors (similarly to adult-report measures). Data are generally collected by a trained observer and typically include ratings of children’s self-regulated responses, rather than accuracy and response times. Examples include: Preschool Self-Regulation Assessment – Assessor Report (PSRA; Smith-Donald et al., 2007), which engages children in a series of activities (a number of which were taken from the Kochanska battery, e.g., peg tap, gift unwrap) to appraise children’s attention, concentration, patience, planning and impulse control; and Preschool Situational Self-Regulation Toolkit (PRSIST) assessment (used in the current study), which engages children in activities that approximate everyday group and individual activities in early years contexts to rate their cognitive and behavioral self-regulation (Howard et al., 2019). Studies using observational approaches have found self-regulation to be associated with early academic skills, social competence, and behavior problems (e.g., Howard & Vasseleu, 2020; Howard et al., 2019; Smith-Donald et al., 2007). The key affordance of this approach is that data are thought to represent the ‘closest’ representation of children’s self-regulation in the everyday contexts that matter most for learning and wellbeing. Challenges include the requirement for trained observers and the potential conceptual ‘distance’ between children’s observed behavior in a semi-structured task and their self-regulatory behavior in everyday situations, with this ‘distance’ varying among the measures.

Implications for the Current Study

Taken together, it is clear there are a number of challenges to the study and translation of early childhood self-regulation research. First, there is not yet a clear consensus on the ideal approach(es) and dimensions for assessing self-regulation, in relation to real-world outcomes such as academic school readiness, and whether each approach/dimension offers some unique insight into self-regulation development. Second, there is not yet consensus on whether self-regulatory indices should be considered as separate construct scores, latent variables, or raw summed scores. Finally, there are still questions about the relative predictive utility of these various approaches to self-regulation measurement, and whether any of these measures have adequate sensitivity and specificity to identify children at-risk of poor academic outcomes.

In light of these issues, the current study sought to evaluate the degree to which indices from different approaches (adult-report, task-based, observational) and dimensions of self-regulation (cognitive, behavioral, emotional), taken at the start of the final pre-school year (both individually and when combined), predicted academic school readiness achievement and risk 7 months later, just prior to commencing school. Specifically, path analyses were used to evaluate the strength of school readiness prediction by each measure individually, when they were considered concurrently, and when combined into a latent self-regulation factor. We also investigated, using Receiver Operating Characteristic (ROC) curve analysis, whether and which measure(s)–or aggregation of measures–provided sufficient specificity and sensitivity for identification of children with poor academic school readiness, and thus a risk for poor academic outcomes. On the basis of prior research showing strong prediction of later school readiness by objective self-regulation measures with higher cognitive demands (e.g., HTKS), yet also predictive utility of less objective (e.g., adult-reported) measures that similarly include cognitive aspects of self-regulation, we hypothesized that: (a) measures of cognitive regulation would be more strongly associated with our measure of academic school readiness; (b) each measurement approach would show independent prediction of academic school readiness; and (c) strongest predictive associations would be for the direct assessment approaches (i.e., task and observation self-regulation measures), given increased objectivity over adult-report measures. Given that no prior studies, to our knowledge, have undertaken ROC curve analysis of these measures, we had no specific hypotheses regarding the sensitivity and specificity of these measures to identify children at risk of poor school readiness.
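To make the ROC terminology concrete, the following is a minimal illustrative sketch (not the study's analysis code) of how sensitivity, specificity, and the area under the ROC curve could be computed for a single self-regulation index. The function names, the example scores, and the rule that children scoring at or below a cutoff are flagged as at risk are all assumptions made for illustration.

```python
def sensitivity_specificity(scores, at_risk, cutoff):
    """Flag children scoring at or below `cutoff` (assumed rule) and compare
    the flags with their actual at-risk status."""
    pairs = list(zip(scores, at_risk))
    tp = sum(1 for s, r in pairs if s <= cutoff and r)       # at risk, flagged
    fn = sum(1 for s, r in pairs if s > cutoff and r)        # at risk, missed
    tn = sum(1 for s, r in pairs if s > cutoff and not r)    # not at risk, not flagged
    fp = sum(1 for s, r in pairs if s <= cutoff and not r)   # not at risk, flagged
    return tp / (tp + fn), tn / (tn + fp)


def area_under_curve(scores, at_risk):
    """AUC as the probability that a randomly chosen at-risk child scores lower
    than a randomly chosen child who is not at risk (ties count as half)."""
    pos = [s for s, r in zip(scores, at_risk) if r]
    neg = [s for s, r in zip(scores, at_risk) if not r]
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: six children, three of whom were later classified as at risk.
example_scores = [10, 20, 30, 40, 50, 60]
example_at_risk = [True, True, False, True, False, False]
sens, spec = sensitivity_specificity(example_scores, example_at_risk, cutoff=30)
auc = area_under_curve(example_scores, example_at_risk)
```

Sweeping the cutoff across all observed scores and plotting sensitivity against 1 − specificity traces the ROC curve; an AUC of 0.5 corresponds to chance-level identification and 1.0 to perfect separation of the two groups.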

Method

Design

This was a longitudinal observational and correlational study using developmental data collected from children at the beginning of the year prior to school, and again 7 months later.

Participants

Participants were 232 3- to 5-year-old children (Mage = 4.43, SD = 0.38; range = 3.20–5.24 at baseline) who were identified by their parent as likely entering school the following year. While this criterion meant that the majority of children were aged 4–5 years at baseline (86.6%), a minority of children aged 3 years (13.4%, with nearly all > 3.5 years of age) were identified as likely being enrolled in school the following year and thus were also included. All children were enrolled at one of 25 pre-school services in metropolitan and regional areas of Australia. All services: followed the Australian Early Years Learning Framework (DET, 2009), a national curriculum for prior-to-school settings; were structurally equivalent in terms of being long-day care services providing care to children aged 2 to 5 years, up to 5 days per week; and had at least one Bachelor-qualified educator (or government waiver). Centres were recruited to ensure a broad adherence to population proportions for geography (84% metropolitan), socio-economic decile for their catchment area (M = 5.91, range = 1–10) and statutory assessments of quality (against the National Quality Standard). Educators at these centres who reported on children’s self-regulation were similarly consistent with sector demographics. Respondents (N = 79) were: majority female (97.5%) and employed full-time (54.4%); had an average of 10.34 years’ experience in the sector (range = 0.17–31) and 4.57 years at their centre (range = 0.17–16); and had diverse qualifications (27.8% degree, 39.2% diploma, 32.9% certificate).

Children’s individual and family characteristics were largely reflective of population characteristics in terms of: child gender (46.1% girls); socioeconomic status, as determined by the postcode-level Socio-Economic Indexes for Areas Advantage and Disadvantage index (ABS, 2008) deciles–combining census data on factors such as education, household income, and unemployment, such that more affluent and resourced areas are placed in a higher decile (M = 5.80, SD = 2.24; range = 1–10); primary language spoken at home (68.0% English); and identification as Aboriginal or Torres Strait Islander (5.4%). Maternal education levels were: less than high school (9.1%); high school completion, trade or certificate (38.9%); tertiary qualification (34.3%); and postgraduate qualification (17.7%). No children had diagnoses of developmental delay. Eight children (3.4%) did not have educator-report data at baseline due to the educators not returning questionnaires for these children. The retention rate for the school readiness assessment at the end of the year was 93.5% (n = 217). Ethics approval was granted by the University’s Human Research Ethics Committee (HE2017/451), and all participants provided verbal assent and written parental consent as a condition of participation.

Measures

Preschool Situational Self-Regulation Toolkit (PRSIST) Assessment

The PRSIST assessment (Howard et al., 2019) is an observational measure of early self-regulation that involves an observer engaging children in routine activities and rating consequent behaviors pertaining to the child’s cognitive and behavioral self-regulation. The first activity was a group memory card game. In this activity children, in a group of four, take turns trying to find a matching pair of cards (e.g., 8 pairs for 4-year-olds, 14 pairs for 5-year-olds), taking around 10 min to complete. The second activity was an individual curiosity boxes game, in which children were presented with a series of three boxes of increasing size and they were asked to guess the contents of each box. The sequence of guessing occurred as follows: first, guess based on the size of the box (no touching); second, guess after gently lifting the box to feel its weight (no shaking); third, guess after shaking the box (no opening); and lastly, guess after closing their eyes and feeling the object inside (no peeking). This activity took ~ 5 min. to complete. Each child’s self-regulation was rated at the end of each activity, with the items rated along a 7-point Likert scale representing frequency and/or degree of the behavior. The items were scored in relation to each activity (e.g., “did the child sustain attention and resist distraction throughout the instructions and activity” for cognitive self-regulation and “did the child control their behaviors and stay within the rules of the activity” for behavioral self-regulation), and item ratings were then averaged for the two activities, before aggregating scores into cognitive and behavioral self-regulation subscales. To ensure high inter-rater reliability, all observers completed the online training module (at www.eytoolbox.com.au), which involves an inter-rater reliability check, in addition to five joint observations alongside a member of the research team prior to in-field data collection. 
Observers were required to achieve a minimum threshold of consistency against a benchmark rating as follows: mean difference in average rating ≤ 0.75 points; a correlation between item ratings of at least r = 0.70; and at least 80% of item ratings within 1 point. The PRSIST Assessment has shown good construct validity, reliability (α ranging from 0.86 to 0.95), and concurrent validity with task-based self-regulation measures (rs ranging from 0.50 to 0.63) and Bracken School Readiness Assessment (rs ranging between 0.66 and 0.75) (Howard et al., 2019).
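The three consistency thresholds described above can be expressed as a simple check. The sketch below is illustrative only: the function names and example ratings are hypothetical, and it assumes that the "mean difference in average rating" is the absolute difference between the observer's and the benchmark's mean item ratings.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined without variance; treated as failing
    return cov / sqrt(var_x * var_y)


def meets_consistency_threshold(observer, benchmark):
    """Apply the three criteria to item ratings on the 7-point scale."""
    n = len(observer)
    # Assumed reading: absolute difference between the two mean ratings.
    mean_difference = abs(sum(observer) / n - sum(benchmark) / n)
    correlation = pearson_r(observer, benchmark)
    within_one = sum(abs(o - b) <= 1 for o, b in zip(observer, benchmark)) / n
    return mean_difference <= 0.75 and correlation >= 0.70 and within_one >= 0.80


# Hypothetical item ratings for one observed child.
benchmark_ratings = [5, 6, 4, 7, 3, 5, 6, 2]
consistent_observer = [5, 5, 4, 6, 3, 6, 6, 2]
lenient_observer = [7, 7, 6, 7, 5, 7, 7, 4]
```

In this example the first observer passes all three criteria, whereas the second rates consistently higher than the benchmark and fails on both the mean difference and the proportion of ratings within 1 point.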

Head-Toes-Knees-Shoulders (HTKS)

HTKS is often considered a task-based measure of self-regulation (or executive function, given its emphasis on cognitive control), and has been shown to have good concurrent validity with other task- and adult-report measures of self-regulation, predictive validity of academic learning (Ponitz et al., 2009), and reliability (e.g., α ranging from 0.92 to 0.94 in McClelland et al., 2014). HTKS asks children to remember a correspondence between body parts (e.g., head and knees), and then perform the opposite action to what was instructed (e.g., touch their knees when the facilitator says ‘touch your head’). At higher levels of the task children must flexibly switch between correspondences. The task consists of six practice and 10 test trials at each of its three levels of difficulty: (1) a correspondence between head and toes; (2) a correspondence between head and toes and between knees and shoulders; and then (3) flexibly switching between the correspondences of head-knees and shoulders-toes. The task continues until completion (~ 8 min) or failing to achieve four points at any level (such that two points are awarded for a correct response and 1 point for a self-corrected response). Fieldworkers completed training prior to in-field data collection, as well as in-field practice with a member of the research team, to ensure accuracy of scoring and inter-rater reliability. Self-regulation was indexed by a sum of points awarded across all practice and test trials.
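The scoring and discontinuation rules described above can be sketched as follows. This is an illustrative reading rather than the official scoring procedure: the response labels and function name are hypothetical, and the sketch assumes that the four-point discontinuation criterion is applied to the test trials of each level.

```python
POINTS = {"correct": 2, "self_corrected": 1, "incorrect": 0}
MIN_TEST_POINTS_TO_CONTINUE = 4  # assumption: criterion applies to test trials


def score_htks(levels):
    """Sum points over practice and test trials, level by level.

    `levels` is an ordered list of (practice_responses, test_responses) pairs,
    one per difficulty level, where each response is a key of POINTS.
    """
    total = 0
    for practice, test in levels:
        test_points = sum(POINTS[r] for r in test)
        total += sum(POINTS[r] for r in practice) + test_points
        if test_points < MIN_TEST_POINTS_TO_CONTINUE:
            break  # task is discontinued; later levels are not administered
    return total


# Hypothetical child: perfect at level 1, discontinued after level 2.
level_1 = (["correct"] * 6, ["correct"] * 10)
level_2 = (["correct"] * 3 + ["incorrect"] * 3,
           ["self_corrected"] * 3 + ["incorrect"] * 7)
level_3 = (["correct"] * 6, ["correct"] * 10)  # never reached
child_total = score_htks([level_1, level_2, level_3])
```

Here the child earns 32 points at the first level and 9 at the second, giving a total of 41; the third level contributes nothing because the task ends once the child falls below the criterion.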

Child Self-Regulation and Behaviour Questionnaire (CSBQ)

CSBQ is a 34-item adult-report measure of children’s cognitive, behavioral, and emotional self-regulation and related behaviors (Howard & Melhuish, 2017). The items ask adults to rate the extent to which each statement reflects a child’s normal behavior (e.g., waits their turn in activities, regularly unable to sustain attention) on a 5-point scale ranging from 1 (Not True) to 5 (Very True). In the current study, respondents were the educator who self-identified as knowing the child best. Indices of cognitive, behavioral and emotional self-regulation were generated by averaging constituent items within each subscale. The subscales have shown good reliability (α ranging from 0.74 to 0.89) and convergent validity with other adult-report measures of young children’s behaviors (Howard & Melhuish, 2017).

Child Behavior Rating Scale (CBRS)

CBRS is an adult-report measure of children’s task and social behavior (Bronson et al., 1995), from which a reduced 10-item ‘omnibus’ scale of task self-regulation has been derived (contrasting discrete self-regulation subscales generated by CSBQ) (Matthews et al., 2009). Items ask adults to rate the frequency of target behaviors (e.g., attempts new challenging tasks, concentrates when working on a task) along a 5-point scale ranging from 1 (Never) to 5 (Always). Self-regulation was indexed by the average of all 10 items of the reduced CBRS. This scale has been found to have good internal consistency (α = 0.96), test–retest reliability (r = 0.67) and convergent validity with observational measures. Respondents were again the educator who self-identified as knowing the child best.

Bracken School Readiness Assessment (BSRA)

BSRA (3rd edition; Bracken, 2007) is a standardized assessment of academic areas deemed important for school readiness. It has been shown to be predictive of kindergarten teacher ratings of children’s school readiness and academic results (Panter & Bracken, 2009). This measure includes subscales of colours (10 items), letters (15 items), numbers/counting (18 items), sizes/comparisons (22 items), and shapes (20 items). For each domain, the test continues until completion or three consecutive incorrect responses. All subtests are administered regardless of individual subtest performance. This task takes 10–15 min to complete and has good validity and reliability (Bracken, 2007; Panter & Bracken, 2009). School readiness was indexed by raw scores (to evaluate improvements in performance across the year), age-adjusted standard scores (to evaluate relative changes in children’s performance), and a dichotomous risk of poor school readiness variable. For this latter variable, children were identified as ‘at risk’ if they fell within established BSRA thresholds for being ‘delayed’ or ‘very delayed’, which have been shown to be predictive of later clinical diagnoses of language delay/disorder (Bracken, 2007).

Demographic Covariates

Parents reported demographic information used as covariates for analyses. These were: child’s age; child’s sex (1 = male, 2 = female); maternal education (1 = less than high school completion; 2 = high school completion; 3 = diploma or trade typically involving 1–3 years of study, often with work integrated learning, and can be commenced with or without high school completion; 4 = undergraduate degree of 3–4 years of study, with entry following high school completion or equivalent; 5 = postgraduate degree); socioeconomic decile of area of residence (i.e., SEIFA, described above); and an index of the breadth and frequency of out-of-school enrichment activities (an index of quality of the home learning environment, used successfully in the Effective Provision of Pre-School Education study; Melhuish et al., 2008). Specifically, on a scale from 0 to 7, the home learning environment index asked parents to report the average weekly frequency of undertaking the following activities with their child: reading; going to the library; sport, dance, physical activity; teaching letters or the alphabet; teaching numbers or counting; teaching songs, rhymes, poems; supporting to paint or draw; or going to special or extra-cost activities (e.g., sports, music lessons, theatre). This yielded an index that ranged from 0 to 56, with higher scores indexing a higher quality home learning environment. Each of these covariates was included in the predictive models because of its association with children’s school readiness (Pratt et al., 2016; Razza et al., 2010; Sektnan et al., 2010). This study sought to examine the additional practicable value of using specific measures of self-regulation (a changeable construct amenable to intervention) within the early childhood education and care setting to predict children’s school readiness and risk, over and above these known and largely stable covariates.
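As a worked illustration of the home learning environment index, the eight activities each rated from 0 to 7 sum to a total between 0 and 56. The sketch below assumes a simple unweighted sum; the activity keys, function name, and example responses are hypothetical.

```python
ACTIVITIES = (
    "reading",
    "library",
    "physical_activity",
    "letters_alphabet",
    "numbers_counting",
    "songs_rhymes_poems",
    "painting_drawing",
    "special_activities",
)


def home_learning_index(weekly_frequency):
    """Sum the 0-7 weekly frequency ratings over the eight activities."""
    total = 0
    for activity in ACTIVITIES:
        rating = weekly_frequency[activity]
        if not 0 <= rating <= 7:
            raise ValueError(f"{activity} must be rated between 0 and 7")
        total += rating
    return total


# Hypothetical parent report: daily reading, otherwise occasional activities.
example_report = {
    "reading": 7, "library": 1, "physical_activity": 3, "letters_alphabet": 4,
    "numbers_counting": 4, "songs_rhymes_poems": 5, "painting_drawing": 2,
    "special_activities": 1,
}
example_index = home_learning_index(example_report)
```

The example report yields an index of 27, and a parent reporting all eight activities at the maximum frequency would reach the ceiling of 8 × 7 = 56.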

Procedure

All tasks were administered to children in a quiet area of their pre-school centre across two sessions in the same day, to maximize children’s attention and minimize their fatigue. Measures were administered in the following fixed order to all children: (1) PRSIST curiosity boxes and HTKS; and (2) PRSIST memory. Different trained fieldworkers were responsible for observation and task-based measures, to ensure observation ratings were not influenced by (and were blinded to) children’s task-based performance. Each session took 10–20 min to complete, and sessions were conducted near the start of children’s final prior-to-school year. Parent-reported self-regulation and demographic information were also collected at this time. School readiness assessment was conducted near the end of the year, again within a quiet area of the child’s pre-school centre.

Results

Initial Data Examination

There was a low level of missing values at baseline (between 0 and 3% for modelled variables) and high level of retention at follow-up (93.5%). There was evidence that these datapoints were missing at random based on comparison of available results. Data were next analysed to evaluate if the PRSIST factor structure initially documented in a smaller-scale study (Howard et al., 2019) was maintained in this larger dataset (using SPSS v24; IBM Corp. 2016). Exploratory factor analysis (EFA) using maximum likelihood estimation and a direct oblimin factor rotation was conducted on the PRSIST assessment items after averaging ratings for the two activities. The Kaiser–Meyer–Olkin (KMO) statistic indicated sufficient sampling adequacy, KMO = 0.90 (‘excellent’ according to common rules of thumb). Bartlett’s test of sphericity was significant, χ2(36) = 3316.51, p < 0.001, indicating that inter-item correlations were sufficiently large for EFA. The number of factors extracted was determined by the Guttman-Kaiser criterion (eigenvalues > 1; Kaiser, 1960) and by the scree plot. Results indicated two eigenvalues greater than 1 (explaining 73.9% of the variance), which was supported by the scree plot. This two-factor solution was identical to that previously found, consisting of a reliable 5-item cognitive self-regulation factor (α = 0.89) and 3-item behavioral self-regulation factor (α = 0.84). All CSBQ subscales also showed good reliability: cognitive self-regulation (α = 0.88); behavioral self-regulation (α = 0.87); and emotional self-regulation (α = 0.79). These results justified the inclusion of these subscale indices in subsequent path and ROC analyses. Descriptive statistics for self-regulation indices are presented in Table 1, along with mean self-regulation scores for each index, for children classified as delayed in school readiness (n = 34), compared to those not delayed (average or advanced for age; n = 183).
Independent-samples t-tests contrasting these groups showed significance differences in all self-regulation scores favouring the not delayed group.
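For readers less familiar with the reliability coefficients (α) reported above, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from item-level ratings. It is not the authors' SPSS procedure; the function name `cronbach_alpha` and the synthetic ratings are hypothetical and serve only to illustrate the formula.

```python
import random
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-child rating lists (one value per item)."""
    k = len(rows[0])  # number of items in the subscale
    # Sample variance of each item across children
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Synthetic data (an assumption, not study data): 200 children, 5 items that
# all reflect one underlying trait plus independent rating noise.
random.seed(0)
ratings = []
for _ in range(200):
    trait = random.gauss(0, 1)
    ratings.append([trait + random.gauss(0, 0.6) for _ in range(5)])

alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is commonly read as an index of internal consistency for a subscale.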

Table 1 Descriptive statistics for self-regulation and academic school readiness indices, overall and by risk group

Self-Regulation Measures Predicting Academic School Readiness

Subsequent path analyses in AMOS (v23; IBM Corp. 2015) sought to evaluate absolute and relative fit of the following a priori specified models, after controlling for child’s age and sex, maternal education, home learning environment quality, and area-level SES: (1) all self-regulation indices concurrently predicting school readiness; (2) individual significant self-regulation indices predicting school readiness; and (3) all significant self-regulation indices, modelled as a latent variable, predicting school readiness. In accordance with Hu and Bentler (1998), absolute fit was determined by chi-square statistics, while relative fit was assessed using Bentler’s comparative fit index (CFI, with values > 0.90 indicating good fit; Smith & McMillan, 2001), the root mean square error of approximation (RMSEA, with values < 0.05 indicating good fit; Browne & Cudeck, 1993) and Akaike’s information criterion (AIC, with lower values indicating comparatively better model fit). Correlations between measures are presented in Table 2.
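As a reading aid, the fit indices named above are conventionally defined as follows. These are standard textbook forms, not formulas stated by the authors; the symbols are introduced here for illustration: subscript M denotes the fitted model, B the baseline (independence) model, N the sample size, and q the number of freely estimated parameters.

```latex
\begin{align}
\mathrm{RMSEA} &= \sqrt{\frac{\max\left(\chi^2_M - df_M,\ 0\right)}{df_M\,(N - 1)}} \\
\mathrm{CFI} &= 1 - \frac{\max\left(\chi^2_M - df_M,\ 0\right)}{\max\left(\chi^2_M - df_M,\ \chi^2_B - df_B,\ 0\right)} \\
\mathrm{AIC} &= \chi^2_M + 2q
\end{align}
```

Some programs use N rather than N − 1 in the RMSEA denominator, so small discrepancies from hand calculation are expected. The AIC shown is the χ2-based form commonly reported by SEM software; the values reported for the first model are consistent with it, since 210.46 − 22.46 = 188, an even number as 2q requires.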

Table 2 Bivariate correlations for continuous variables

First examined was a model that loaded all self-regulation indices concurrently on school readiness scores, while controlling for child age and sex, maternal education, quality of home learning environment, and area-level SES (Fig. 1), and with all correlations among self-regulation measures modelled to account for covariance. Although this model’s significant chi-square statistic suggested poor model fit in absolute terms, χ2(10) = 22.46, p = 0.013, this is often deemed an overly conservative threshold, which should be considered in conjunction with relative fit metrics. The relative fit indices suggested good model fit to the data (CFI = 0.99, RMSEA = 0.07, and AIC = 210.46; Table 3), and an R2 statistic of 0.40 indicated that substantial variation in school readiness scores was accounted for by this model. Path loadings for all cognitive self-regulation indices (observed and teacher-reported cognitive self-regulation, and HTKS) were significant (Fig. 1), indicating that each index accounted for unique variance in children’s end-of-year school readiness standard scores. No other self-regulation indices were significant. Significant covariates were child’s age, child’s sex, and SES. Removing non-significant indices did not substantially alter model fit or path loadings.

Fig. 1

Path analysis model loading all self-regulation indices on age-standardized school readiness scores, controlling for covariates of age, sex, maternal education (MatEd), home learning environment (HLE), and area-level socioeconomic status (SEIFA). Note. χ2(10) = 22.46, p = .013, CFI = .99, RMSEA = .07. Not depicted here, yet still modelled, are error terms for each predictor, correlations between these errors, and an error term for the outcome. Factor loadings are standardized regression weights. Significant paths are indicated by full lines, while non-significant paths are denoted by dashed lines. HTKS = Head-Toes-Knees-Shoulders task. PRSIST = Preschool Situational Self-Regulation Toolkit Assessment. CBRS = Child Behavior Rating Scale. CSBQ = Child Self-Regulation & Behavior Questionnaire. CSR = cognitive self-regulation. BSR = behavioral self-regulation. SESR = social-emotional self-regulation

Table 3 SEM model fit indices

The second set of models examined each significant self-regulation index individually in relation to school readiness scores, again controlling for covariates (Fig. 2). As such, these models evaluated the cognitive self-regulation subscale of CSBQ, cognitive self-regulation subscale of PRSIST, and HTKS. While these models were more parsimonious than the initial model, fit statistics fell below thresholds suggesting good model fit to the data (Table 3). Path loadings for self-regulation indices predicting school readiness were similar for each model, although there was a slight increase with increased objectivity of measure: CSBQ cognitive self-regulation, β = 0.40, R2 = 0.22; PRSIST cognitive self-regulation, β = 0.41, R2 = 0.23; HTKS, β = 0.45, R2 = 0.26. When separated, each index remained a significant predictor of children’s school readiness, yet poor model fit suggested that substantial variation in school readiness scores remained unexplained by each of the three indices individually.

Fig. 2

Path analysis models loading individual self-regulation indices on age-standardized school readiness scores, controlling for covariates of age, sex, maternal education (MatEd), home learning environment (HLE), and area-level socioeconomic status (SEIFA). Note. (a) CSBQ-CSR on school readiness. χ2(10) = 22.84, p = .011, CFI = .81, RMSEA = .08. (b) PRSIST-CSR on school readiness. χ2(10) = 22.44, p = .013, CFI = .84, RMSEA = .07. (c) HTKS on school readiness. χ2(10) = 23.03, p = .011, CFI = .84, RMSEA = .08. Not depicted here, yet still modelled, are error terms for each predictor, correlations between these errors, and an error term for the outcome. Factor loadings are standardized regression weights. Significant paths are indicated by full lines, while non-significant paths are denoted by dashed lines. HTKS = Head-Toes-Knees-Shoulders task. PRSIST = Preschool Situational Self-Regulation Toolkit Assessment. CSBQ = Child Self-Regulation & Behavior Questionnaire. CSR = cognitive self-regulation. BSR = behavioral self-regulation. SESR = social-emotional self-regulation

A final model sought to integrate these self-regulation indices into a latent variable, controlling for covariates (Fig. 3). This model showed good fit, evidenced by CFI = 0.90 and RMSEA = 0.07, and was superior to previous models on other indices (i.e., AIC = 108.33 and R2 = 0.56). This was further improved with non-significant paths removed, CFI = 0.95, RMSEA = 0.06, AIC = 68.85, R2 = 0.55 (Table 3). In contrast to previous path loadings for self-regulation indices that ranged from β = 0.23 to 0.45, the standardized factor loading for the latent variable on school readiness was β = 0.80. Considering these improvements in model fit and factor loadings, this model was selected as providing the best fit to the data from among those models evaluated.

Fig. 3

Structural equation model loading a latent self-regulation variable on age-standardized school readiness scores, controlling for covariates of age, sex, maternal education (MatEd), home learning environment (HLE), and area-level socioeconomic status (SEIFA). Note. χ2(22) = 44.33, p = .003, CFI = .90, RMSEA = .07. Factor loadings are standardized regression weights. Significant paths are indicated by full lines, while non-significant paths are denoted by dashed lines. HTKS = Head-Toes-Knees-Shoulders task. PRSIST = Preschool Situational Self-Regulation Toolkit Assessment. CSBQ = Child Self-Regulation & Behavior Questionnaire. CSR = cognitive self-regulation. BSR = behavioral self-regulation. SESR = social-emotional self-regulation

Self-Regulation Measures Predicting Risk of Poor Academic Outcomes

Subsequent ROC analyses sought to evaluate whether the self-regulation measures were able to accurately identify and discriminate those children at risk of poor academic outcomes. Historically used for military applications, and now more commonly within medical research, ROC analysis provides a statistical test of the diagnostic accuracy of a measure (in this case, of self-regulation) for predicting a dichotomous outcome condition (in this case, risk of poor academic outcomes) (Metz, 1978). ROC analysis also provides, for every score, an estimate of the trade-off between sensitivity (rate of accurate detection of children with at-risk levels of school readiness; n = 35, 16.1%, in the current sample) and specificity (the rate of accurate exclusion of children with average to advanced school readiness scores; n = 182, 83.8%, in this sample), thereby suggesting a threshold that may be useful to pre-emptively identify children at risk of more-negative outcomes. Whereas self-regulation is a well-established predictor of school readiness, which was also supported by our first analyses, this analysis sought to evaluate whether self-regulation indices–individually or together–might be able to discriminate children who could benefit from additional support from those who may not require this remediation.

Preliminary ROC analyses sought to evaluate how well individual self-regulation indices distinguished between children with ‘delayed’ performance from those with ‘normal to very advanced’ performance, according to established BSRA performance bands. Resultant area under the curve (AUC) tests for each self-regulation index separately were all significant, and all met thresholds required to indicate a ‘fair’ diagnostic test (i.e., poor is denoted by statistics of 0.60 to 0.69, acceptable as 0.70 to 0.79, and excellent as 0.80+; Hosmer et al., 2013): HTKS, AUC = 0.72, p < 0.001, 95% CI [0.63, 0.80]; PRSIST cognitive self-regulation, AUC = 0.70, p < 0.001, 95% CI [0.61, 0.79]; CSBQ cognitive self-regulation, AUC = 0.73, p < 0.001, 95% CI [0.62, 0.83]. Given improvement in path models when indices were aggregated, a subsequent ROC analysis was conducted using self-regulation factor scores, derived from the combination of these three indices in EFA. Results of this analysis indicated a significant and improved ability to discriminate school readiness risk groups, AUC = 0.76, p < 0.001, 95% CI [0.68, 0.85]. It is noteworthy that an average of these indices, after their standardisation to place them on the same scale, yielded similar results, AUC = 0.78, p < 0.001, 95% CI [0.69, 0.87]. This procedure may be more easily performed within educational applications and provide more readily interpretable thresholds (average number of standard deviations from the mean) than would latent variable modelling or factor scores. As such, diagnostic cut-off evaluations that follow consider this composite standardized score.
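The composite described above (standardize each index, then average) and its AUC can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis: the variable names, the data-generating model, and the helper functions `zscores`, `composite` and `auc` are all assumptions. AUC is computed in its rank-based form, as the probability that a randomly chosen at-risk child scores lower than a randomly chosen child not at risk.

```python
import random
from statistics import mean, stdev


def zscores(values):
    """Standardize one index to mean 0 and SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(*indices):
    """Average the standardized indices child by child."""
    standardized = [zscores(index) for index in indices]
    return [mean(child) for child in zip(*standardized)]


def auc(scores, at_risk):
    """Probability that an at-risk child scores lower than a not-at-risk child."""
    pos = [s for s, r in zip(scores, at_risk) if r]
    neg = [s for s, r in zip(scores, at_risk) if not r]
    # Ties between an at-risk and a not-at-risk child count as half a "win"
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic sample (assumption): 217 children, three noisy indices of one
# underlying ability, and the 35 lowest readiness scores flagged as at risk.
random.seed(1)
ability = [random.gauss(0, 1) for _ in range(217)]
task = [a + random.gauss(0, 1) for a in ability]
observation = [a + random.gauss(0, 1) for a in ability]
teacher = [a + random.gauss(0, 1) for a in ability]
readiness = [a + random.gauss(0, 1) for a in ability]
risk_cutoff = sorted(readiness)[34]
at_risk = [r <= risk_cutoff for r in readiness]

composite_score = composite(task, observation, teacher)
for name, index in [("task", task), ("composite", composite_score)]:
    print(name, round(auc(index, at_risk), 2))
```

Because the composite is expressed in standard deviation units, any cut-off chosen on it can be read directly as a distance below the sample mean.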

Given the adequacy of the composite score, subsequent evaluation sought to identify the threshold at which there was an acceptable level of accurate identification (sensitivity) and low level of misidentification (1 − specificity). There are no established rules of thumb for selecting such a threshold, as the required balance of sensitivity and specificity is dictated by the context. For instance, in medical situations where non-identification of patients at risk is likely to result in death, a high degree of sensitivity is important to ensure that no cases are missed. In situations with risky treatment programs, in contrast, there is a greater emphasis on minimising misidentification (that is, maximising specificity). In the current context it would appear that a balance between the two is optimal, such that there is a sufficiently high level of correct identification (~70%) and accurate exclusion of those children not at risk (~80%). The threshold at which these criteria were best achieved was -0.44 (indicating a cut-off of 0.44 SD below the mean after averaging the indices). This provided a sensitivity of 0.71 (i.e., applying this cut-off to baseline self-regulation composite accurately identified 71% of children who would become classified as ‘delayed’ in their school readiness at the end of the year) and a specificity of 0.79 (i.e., 79% of those who were not delayed were correctly classified as such). While additional plausible cut-off points are provided in Table 4 (for various combinations of one, two and three indices to identify a minimally sufficient set), adopting a composite score of three measures using this threshold appeared to optimize diagnostic utility for later school readiness risk.
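The trade-off described above can be made concrete by tabulating sensitivity and specificity at each candidate cut-off. The sketch below uses a small invented dataset and hypothetical function names. It also assumes that children scoring at or below the cut-off are flagged; the source does not state whether the boundary is inclusive.

```python
def sensitivity_specificity(scores, at_risk, cutoff):
    """Sensitivity and specificity when scores at or below `cutoff` are flagged."""
    tp = sum(1 for s, r in zip(scores, at_risk) if r and s <= cutoff)
    fn = sum(1 for s, r in zip(scores, at_risk) if r and s > cutoff)
    tn = sum(1 for s, r in zip(scores, at_risk) if not r and s > cutoff)
    fp = sum(1 for s, r in zip(scores, at_risk) if not r and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def roc_table(scores, at_risk):
    """One (cutoff, sensitivity, specificity) row per observed score."""
    return [(c, *sensitivity_specificity(scores, at_risk, c))
            for c in sorted(set(scores))]


# Invented composite scores in SD units (illustration only, not study data)
scores = [-1.2, -0.8, -0.5, -0.3, 0.0, 0.2, 0.4, 0.9, 1.1, 1.5]
at_risk = [True, True, False, True, False, False, False, False, False, False]

sens, spec = sensitivity_specificity(scores, at_risk, cutoff=-0.44)
table = roc_table(scores, at_risk)
for cutoff, se, sp in table:
    print(f"{cutoff:+.2f}  sensitivity={se:.2f}  specificity={sp:.2f}")
```

Raising the cut-off flags more children, so sensitivity can only increase while specificity can only decrease. Choosing a threshold therefore means deciding which kind of error is more tolerable in context.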

Table 4 Selected classification thresholds from ROC curve analysis

To the question of risk identification, however, results indicated that this index provided better prediction of true-negatives (i.e., probability that children above the self-regulation cut-point were not at risk on BSRA; negative predictive power = 0.94) than for true-positives (i.e., probability that children below the self-regulation cut-point were at risk on BSRA; positive predictive power = 0.40). This indicates that, although the overall model provided acceptable diagnostic utility, the self-regulation index alone was insufficiently accurate in forecasting risk on the school readiness assessment (i.e., while 71% of children at risk on BSRA were captured by this self-regulation cut-point, an additional 21% of children not at risk on BSRA were also captured at this cut-point; 29% of children at risk on BSRA were not captured at this cut-point).
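The gap between negative and positive predictive power follows from combining sensitivity and specificity with the low base rate of risk. The sketch below applies Bayes' rule to the rounded values reported in this section. The prevalence of 35/217 is an assumption taken from the ROC group sizes (35 at risk, 182 not at risk), and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from test accuracy and base rate."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(at risk | flagged)
    npv = true_neg / (true_neg + false_neg)   # P(not at risk | not flagged)
    return ppv, npv


# Rounded inputs from the text; prevalence assumed to be 35 of 217 children
ppv, npv = predictive_values(sensitivity=0.71, specificity=0.79,
                             prevalence=35 / 217)
print(round(ppv, 3), round(npv, 3))
```

With these rounded inputs the results fall within about 0.01 of the reported 0.40 and 0.94, a gap that is expected when working from rounded rather than raw values. Because children not at risk greatly outnumber those at risk, even a 21% false-positive rate produces more false alarms than true detections, which is what holds the positive predictive power down.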

Discussion

International efforts to identify and intervene for young children who are at risk of poorer outcomes are undertaken in the context of a need to target limited resources to those who can benefit most. To achieve this, reliable early indicators with thresholds for identifying children at the highest risk are essential. Given the pervasive way that self-regulation skills impact on learning and wellbeing, early childhood self-regulation is an obvious target. This study thus aimed to evaluate contemporary approaches to self-regulation assessment across cognitive, behavioral, and emotional domains in terms of: (a) their longitudinal associations with children’s academic school readiness; and (b) their utility for pre-emptively identifying those children who may benefit from further support. Results indicated that only cognitive self-regulation indices–in each of the forms of teacher-report, observation, and task-based assessment–were related to school readiness 7 months later, just prior to school entry.

This is perhaps unsurprising given the academic nature of the current school readiness measure, which focuses on foundational academic content knowledge (i.e., shapes, colours, numbers, letters). International understandings of school readiness include these important learned skills, as well as approaches to learning, emotional competence, social behaviors, and motor skills (UNICEF, 2014). With a different measure of school readiness, it is possible that other aspects of self-regulation may have also been useful predictors. For example, early emotional self-regulation capabilities have been linked with more positive school adjustment (Herndon et al., 2013) and longer-term emotional wellbeing (Guhn et al., 2016). Further, it is important to note that multiple studies document the way that early emotional self-regulation supports the development of cognitive (Williams et al., 2016) and behavioral (Edossa et al., 2018) aspects of self-regulation over time. This means that while cognitive self-regulation as measured here was the only regulatory component directly and significantly linked with our academic school readiness measure, it is likely that higher levels of emotional and behavioral regulation have an important but indirect effect through supporting cognitive self-regulation development, as shown in those longitudinal models.

Still, the academic competencies measured here are strong predictors of future school achievement (Panter & Bracken, 2009), and thus remain an important outcome for children prior to school entry. Indeed, children with more developed self-regulation–given its critical role in directing and sustaining focus, resisting distractions, and working in a self-directed manner–are positioned to gain more from home and early education learning environments that offer exposure to this foundational academic knowledge (McClelland et al., 2006).

Each approach to measuring cognitive self-regulation (observation, task, teacher-report) independently accounted for unique variance in children’s school readiness standard scores, after controlling for the other measures, which largely aligns with prior studies documenting the predictive utility of tasks and teacher reports (Finders et al., 2021; Vitiello & Greenfield, 2017). Our expectation that direct assessment approaches (task- and observation-based) would be more highly associated with school readiness was only partly confirmed when modeled together (Fig. 1); however, differences in predictive validity when compared to teacher-report were very small when each predictor was modelled separately (Fig. 2). Instead, a combination of the three cognitive self-regulation measures accounted for greater variance in academic school readiness than any individual index. This aligns with common measurement wisdom that more indices provide better capture of a construct (Bollen & Lennox, 1991) and with recent suggestions that this is similarly true for self-regulation (Duckworth & Kern, 2011).

The ways that the self-regulation measures were combined in the current study have different conceptual interpretations and practical implications which are worth considering. The latent variable approach modelled what is shared among the indicators, thereby assuming that its indicators reflect (or are the effect of) an underlying latent construct of cognitive self-regulation. In the current study, bivariate correlations among cognitive self-regulation indices were moderate (r < 0.43) and the latent variable accounted for only 27% of the shared variance among the indicators. However, this latent variable was a strong predictor of later academic school readiness. Associations between the cognitive self-regulation indices are also notable given the substantial diversity in their measurement approach, which nevertheless yielded stronger correlations than are typically seen in similar domains of early development (e.g., correlations between task-based measures of executive function in pre-school often range from rs of 0.20 to 0.30; Howard & Melhuish, 2017).

In recognition of the need for accessibility and interpretability for any such results to have a practical implication, we also evaluated the utility of a composite standardized score derived from individual indices. Deriving a composite score from indicators positions them as formative, or causes of the broader construct of cognitive self-regulation. Low to moderate correlations are expected in such situations (Bollen & Lennox, 1991), and a composite score would more accurately reflect children’s rank order. Importantly, in addition to its stronger predictive validity, this composite score also yielded some of the strongest combinations of sensitivity and specificity at particular cut points. Combination in this manner can enhance interpretation (i.e., as number of standard deviations above or below the mean) and is more practicable for the field by removing the statistical requirements of latent variable modeling. There have been growing calls for greater consideration of whether indicators of executive functioning (related to self-regulation) in early childhood should be considered as reflective or formative indicators (Willoughby et al., 2013), and our findings concur with this advice.

In evaluating the extent to which self-regulation indices may be useful in pre-emptively identifying children who are likely to start school at risk of poor academic readiness, each of the cognitive self-regulation measures individually provided fair diagnostic utility. However, identifying a threshold for accurate identification of risk group membership tended to provide either insufficient capture of children “at risk”, or excessive inclusion of children not at risk. The composite score of the three measures similarly yielded fair diagnostic utility, but was the only index to yield an adequate balance of sensitivity and specificity. This level of diagnostic utility is most commonly considered as ‘fair’, because it fails to identify some children who are at risk and identifies some children who are not. To illustrate using current data, consider the case of a hypothetical pre-school with 60 children, for which we can expect 10 children to be at risk of poor school readiness (based on established BSRA risk bands). The composite self-regulation index at its optimized cut-off would correctly identify and enable response to seven of them. It would also identify an additional 10 (of the 50) children not at risk of poor school readiness. While this is not ideal, additional educational supports proffered to these children are unlikely to have a negative impact on these children and could be withdrawn as educators deem necessary. Yet problematic for the purposes of risk prediction, our measure at this cut-point failed to identify three children at risk for poor school readiness who could benefit from support, suggesting that the school readiness risk for these children may have been unrelated to self-regulation and its influence on learning (and/or was a consequence of measurement imprecision). This aligns with findings that self-regulation is an important (e.g., Zimmerman & Kitsantis, 2014), but a non-exclusive and perhaps even indirect, contributor to school readiness (e.g., via mental health; Panayiotou et al., 2019).
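The hypothetical pre-school illustration above is simple arithmetic on the reported sensitivity (0.71) and specificity (0.79). A short sketch, with a hypothetical function name, makes the expected counts explicit before rounding to whole children.

```python
def expected_counts(n_children, n_at_risk, sensitivity, specificity):
    """Expected screening outcomes for a group with a known number at risk."""
    n_not_at_risk = n_children - n_at_risk
    return {
        "identified_at_risk": sensitivity * n_at_risk,
        "missed_at_risk": (1 - sensitivity) * n_at_risk,
        "flagged_not_at_risk": (1 - specificity) * n_not_at_risk,
        "cleared_not_at_risk": specificity * n_not_at_risk,
    }


# 60 children, 10 expected to be at risk, as in the illustration in the text
counts = expected_counts(60, 10, sensitivity=0.71, specificity=0.79)
for label, value in counts.items():
    print(f"{label}: {value:.1f}")  # about 7.1, 2.9, 10.5 and 39.5
```

Rounded to whole children, these values correspond to the seven identified, three missed, and roughly 10 additionally flagged children described in the illustration.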

While the current results do not support use of these self-regulation measures for clinically useful prediction of risk (e.g., for diagnosis or for clinical intervention), either in isolation or in conjunction, they may be useful for lower-stakes and more-indicative purposes (e.g., informing practice and planning in early education settings). The potential benefit of targeting intervention efforts to children most at risk of poor outcomes is suggested, for example, by intervention studies that have found positive effects only for those who have the poorest self-regulation skills at baseline (Tominey & McClelland, 2011), yet have had little success in accurately identifying these children prior to study commencement. For instance, while a number of interventions have targeted children at risk for poor self-regulation and school readiness, to do so they have often relied on proxies such as socio-economic status (Bierman et al., 2008; O’Connor et al., 2014; Raver et al., 2011) that fail to specifically identify children low in self-regulation. As a consequence, largely modest intervention effects for self-regulation-focussed interventions globally (Jacob & Parkinson, 2015; Pandey et al., 2018) might be at least partly due to our inability to identify children most in need. Appropriately targeted intervention efforts might thus benefit from the appraisal of early self-regulation skills at the outset.

While this study makes an important contribution to our understanding of early child self-regulation and its indicators, it is not without its limitations. This study used only one measure from each approach to self-regulation assessment, and thus current results may not apply consistently across all measures within a given approach. However, our measures were selected on the basis of their known psychometric properties, in an attempt to select an ideal exemplar from each approach. Further, only aspects of school readiness related to academic knowledge were measured, and future studies should seek to replicate the findings here with other social-emotional aspects of school readiness (e.g., school adjustment, school liking, peer relations). Practical utility of studies such as this would also benefit from more fine-grained analysis of risk levels, such as the identification of children to monitor (i.e., who do not present as currently at-risk, but whose trajectories may necessitate later support). These studies would do well to adopt more comprehensive and longitudinal indices of academic performance. There is also what may seem like an anomaly in the negative loading of age on school readiness. That is, younger children appeared to be more school ready. However, this must be considered in light of the fact that school readiness standard scores are age-adjusted. Given that all children in this study were in the same learning environment (i.e., pre-school room), this indicates that younger children in their final pre-school year were performing better relative to children their age than were older children from this sample. This is as expected in this context, as children do not uniformly transition to school at exactly the same age; instead, children can enter school earlier or later based on parents’ and educators’ perceptions of their readiness. Results should also be interpreted with the specific population of this sample in mind. While our sample was largely representative of the Australian population, which is inherently culturally and linguistically diverse, it is not a given that the same findings would apply in other settings characterized by different diversity profiles. Finally, for most children substantial growth in self-regulation will occur over the prior-to-school year, supported by developmental processes and student–teacher interactions (Veraksa et al., 2020), classroom pedagogy and curriculum (Diamond et al., 2019), and peer behavior (Montroy et al., 2016b). Future studies may also seek to understand different levels of school readiness risk based not only on beginning-of-year self-regulation scores, but also trajectories of growth over time, and moderating and mediating effects of classroom and social factors.

Conclusion

Taken together, our analyses provide complementary and cumulative insight into the relationship of self-regulation with academic school readiness, and how this varies by self-regulation component assessed and approach to measurement. Specifically, results suggest each measurement approach–observation, task-based, adult-report–provided predictive utility for later academic school readiness, with shared and independent variance accounted for by each index. This finding suggests superior prediction of school readiness when these indices are combined, yet further analysis was needed to discern whether these indices (and at what cut points) could reliably predict “at-risk” levels of school readiness. Combining the indices improved prediction of risk, and also gave adequate–but not clinically useful–prediction of children at risk who could benefit from additional support. As has been noted in the literature (Duckworth & Kern, 2011), the use of multiple measures appears justified. However, where this is not feasible, even a combination of two measures may be sufficient for some purposes. It is recommended that clinicians and educators use valid and reliable adult-report measures appropriate for their context, as well as task and observation protocols where possible, to identify children most likely to benefit from early support. However, any efforts to support enhanced self-regulatory growth in early childhood are likely to have universal benefits for all children, and at least do no harm. A focus on evidence-based self-regulation support in clinical and educational contexts will optimize the likelihood of improved child outcomes, within available contexts and resources. As the field continues to develop, the ability to pre-emptively identify children at risk and track their developmental progress, over time and as a result of intervention, will be enhanced, with this study contributing important insight in this regard.