Introduction

Crying is one of the first modes of communication between a human infant and their primary caregiver. Infants communicate their needs through their cry and rely on the caregiver’s responsiveness for survival. Compromised communication and social skills are one of the diagnostic criteria for Autism Spectrum Disorder (ASD; American Psychiatric Association 2013). This has led researchers to further examine the quality and phonetic properties of vocal production in toddlers and young children diagnosed with ASD.

Several studies have observed atypical acoustic features in vocal productions of children with ASD (Esposito and Venuti 2009; Oller et al. 2010; Schoen et al. 2011; Sheinkopf et al. 2000; Woods and Wetherby 2003). For example, Esposito and Venuti (2009) analysed vocal productions from retrospective home videos obtained at approximately 12 months of age for infants subsequently diagnosed with either ASD, developmental delay (DD) or with no developmental disability (typically developing, TD). Infant crying behaviour and maternal responses were then coded using an acoustic rating system called the crying observation codes (Venuti et al. 2004). The infants with ASD were found to produce cry patterns characterized by fewer pauses and increased dysphonation (shorter aspiratory/expiratory phase) compared to the other two participant groups. It has been proposed that the underlying mechanism for these differences in cries of children with ASD may originate in neural abnormalities in the brainstem that disrupt the coordination of the larynx and the vocal tract during a cry episode (Rodier 2002; Rodier et al. 1996). While these data may indicate a very early manifestation of ASD-related difficulties, the study design is limited by its use of retrospective video footage. Retrospective data carry several limitations, including variability in the age of the participants, difficulty in determining the developmental level or ASD symptomatology at the time of the recording, and the challenge of controlling for variability in the length, content and structure of the video footage.

Prospective designs are becoming increasingly common in the study of the early development of ASD. Prospective research in ASD typically involves recruiting infant sibling cohorts as a means of investigating potential risk factors longitudinally, prior to diagnostic age, where recurrence risk of ASD in siblings is estimated to be 18.7% (Ozonoff et al. 2011). Sheinkopf et al. (2012) prospectively examined a cohort of infant siblings of children with ASD (high risk, HR) and infants with no family history of ASD (low risk, LR). Cry samples were derived from 17 HR and 11 LR infants when they were 6-months old, and were categorised as being either pain or non-pain related. HR infants were found to produce pain-related cries with higher and more variable fundamental frequency compared with LR infants. There were no differences between groups in the acoustic characteristics of non-pain related cries. The infant cohort was followed until 36 months of age, at which time an ASD diagnostic assessment was performed. HR infants who were later diagnosed with ASD (HR-ASD), recorded the highest fundamental frequency for both types of cries (pain and non-pain) and produced cries that were less phonated compared to TD infants (LR and HR-no ASD).

Several prospective studies have since sought to replicate these findings. Observations of higher fundamental frequency in cries of HR infants, relative to LR infants, have now been reported in several studies (Esposito and Venuti 2010; Esposito et al. 2014). Shorter duration of cry utterances (Esposito et al. 2014), increased variance in cry amplitude (Sheinkopf et al. 2012), shorter length of pauses between cry episodes and increased dysphonation (Esposito and Venuti 2010) have also been reported in cries of at-risk infants obtained during the 1st year of life. Further, HR-ASD have been found to have among the highest fundamental frequency values and produce cries that are more poorly phonated than those of non-autistic infants (Sheinkopf et al. 2012; Esposito et al. 2014; Esposito and Venuti 2010). These studies have advanced this area by using the prospective research design. However, one limitation of these studies is the relatively small samples of HR and LR infant siblings. For example, Sheinkopf et al. (2012) collected recordings from a sample of only 17 HR and 11 LR infant siblings. After categorizing cries into pain and non-pain related, the sample was reduced even further to seven HR infants and five LR infants for pain-related cries.

Previous research has also suggested that infants with ASD do not communicate their needs through crying as effectively as TD children (Esposito and Venuti 2008; Esposito et al. 2011; Venuti et al. 2012) making it more difficult for their parents to perceive what is causing them distress (Esposito and Venuti 2008). A recent study employed functional magnetic resonance imaging techniques to measure brain activity during adult processing of cries from retrospective video footage of infants later diagnosed with ASD compared with cries from TD infants. Using whole brain analysis, they found that cries of infants with ASD, compared to cries of TD infants, elicited enhanced activity in brain regions associated with verbal, prosodic and emotional processing (Venuti et al. 2012). Venuti et al. (2012) suggested that this may be due to altered acoustic patterns that render their cries more difficult to interpret and that the increase in emotional activity may suggest they are experienced more negatively and found to be more emotionally arousing. Taken together, these observations indicate that atypical acoustic properties of cries may represent a vocal signature of ASD in early life.

The aim in the present study was to examine the acoustic properties of multiple cries from a moderately sized sample of HR and LR infants at 12 months of age. Based on previous research, we hypothesized that this study would provide further support for an atypical acoustic profile in HR siblings of ASD. Specifically, we expected that cries from HR infants would be characterized by higher fundamental frequency, shorter cry durations and greater variance in amplitude for the cry utterance. We then investigated whether acoustic properties of infant cries recorded at 12-months old were predictive of ASD symptomatology at 2 years old.

Methods

Participants

Participants were part of the pregnancy investigation of siblings and mothers of children with autism (PRISM) study cohort in Perth, Western Australia. Two groups of pregnant women were recruited as part of this prospective, longitudinal study. The HR group comprised pregnant women who have an existing child (the ‘proband’) who had received a clinical diagnosis of either autistic disorder or pervasive developmental disorder-not otherwise specified according to DSM-IV guidelines (American Psychiatric Association 2000). The LR group comprised pregnant women who have at least one existing child of at least 3 years of age who has not received a diagnosis of any neurodevelopmental conditions.

A total of 33 HR and 44 LR families were recruited to the study during the mother’s pregnancy via advertisements in local newspapers or referrals from their obstetricians or gynaecologists. For each family, enrolment in the PRISM study involved collection of pre- and postnatal data over a period of 3 years. Two and three-dimensional fetal ultrasounds and umbilical cord blood were collected in the prenatal and neonatal stages, followed by a total of five postnatal follow-ups (Unwin et al. 2016). Acoustic cry recordings were collected by parents at the 12-month follow up. A small proportion of families had relocated at the time of the follow-up and were unable to participate. A subset of participants completed the 12-month home visit with a researcher, but due to their other commitments they did not collect acoustic recordings in the weeks following the visit. The final sample with available acoustic recordings was 23 HR and 33 LR infants. Informed consent was obtained from all individual participants included in the study.

Demographic Information

Prior to their involvement in the PRISM study, families were invited to the Telethon Kids Institute (University of Western Australia) for a face-to-face behavioural assessment. Mothers were asked to complete a comprehensive questionnaire about their pregnancy, family and medical history, and the early behavioural development of the proband.

Probands in the HR group were administered the autism diagnostic observation schedule-generic (ADOS-G; Lord et al. 2000) as part of the study protocol. All but four of the probands met criteria for ASD on the ADOS-G. In Western Australia, a diagnosis of ASD mandates consensus by a team comprising a pediatrician, psychologist and speech-language pathologist. Given the rigorous nature of clinical diagnosis in Western Australia, and that several years may have passed since the original clinical diagnosis (during which therapeutic benefits may have mitigated ASD behaviours), a decision was made to include all participants in further analyses. In all four cases, the study investigators cited the original diagnostic report and confirmed the diagnosis of ASD.

Motor skill and language development were also assessed in the proband children using the Mullen scales of early learning (MSEL; Mullen 1995). The MSEL is a standardized assessment of cognitive ability that can be administered to infants and children up to 68 months of age. The early learning composite (ELC) score was calculated by combining T-scores on the receptive language, expressive language, fine motor, and visual reception subscales. Probands were administered either the MSEL (Mullen 1995), the Wechsler intelligence scale for children (WISC-IV; Wechsler 2003) or the Wechsler abbreviated scale of intelligence (WASI; Wechsler 1999), depending on their age at the time of assessment.

PRISM Study 12-Month Follow-Up

At the 12-month postnatal follow-up, HR and LR infant siblings completed the MSEL and the autism detection in early childhood (ADEC; Young 2007) to assess for early DD and for the presence of any early ASD behavioural symptoms, respectively. The ADEC tool is comprised of 16 items; each item is scored from 0 to 2 with zero implying a typical response and scores of one or two indicating an inappropriate response. Total scores can range from 0 to 32, where a score of 11 or more is considered indicative of risk for ASD (Young 2007).
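The ADEC scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the published scoring procedure: the names `adec_total`, `N_ITEMS` and `RISK_CUTOFF` are hypothetical, and only the item count, the 0 to 2 item range and the cut-off of 11 are taken from the text.

```python
# Illustrative sketch of the ADEC total-score rule described in the text.
# All names are hypothetical; only the numbers (16 items, 0-2 per item,
# cut-off of 11) come from the description above.
N_ITEMS = 16
RISK_CUTOFF = 11


def adec_total(item_scores):
    """Return (total, at_risk) for a list of 16 item scores in {0, 1, 2}."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each item score must be 0, 1 or 2")
    total = sum(item_scores)
    # A total of 11 or more is treated as indicative of risk for ASD.
    return total, total >= RISK_CUTOFF


if __name__ == "__main__":
    example = [1] * 8 + [0] * 8  # hypothetical responses, not study data
    print(adec_total(example))
```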

Following the 12-month postnatal follow-up, parents were provided with a Sony 2 GB ICD-PX312 voice recorder and were asked to record samples of their infant crying. All cry recordings were saved in mp3 format. Cries were to be as naturalistic as possible, with recorded cries occurring spontaneously rather than being elicited by the carer. Parents were encouraged to hold the recorder as close as possible to the child in order to minimize surrounding sounds. Infant positioning was not manipulated; however, once a recording had been captured, parents were asked to note whether their child was sitting, standing or lying down, since this may have affected the quality of the recording (Lin and Green 2007), and to comment on the perceived reason for the cry. Parents were given the recorder for a period of 1–2 weeks and were encouraged to record at least one crying sample during the daytime and one during the evening. All recordings were collected by parents within 4 weeks of the infant siblings’ first birthday.

Acoustic Analysis Procedures

Acoustic analyses were performed using Praat voice analysis software (Boersma and Weenink 2005) by an independent researcher who was naïve to group membership. The sampling rate was 44,100 Hz, and the signal was low pass filtered at 10,000 Hz (Rautava et al. 2007). A total of 247 mp3 formatted cry recordings were collected by parents and for each infant, at least one cry episode was obtained. Cry episodes were often terminated at points where parents intervened and comforted their child. Thus, for acoustic analysis, cry units were extracted from the episodes of crying. A cry unit was defined as the expiratory phase of respiration during a cry which lasts a minimum of 0.5 s (Sheinkopf et al. 2012). All valid cry units were extracted from the episode of crying that was recorded for each infant. For each infant cry episode the aim was to extract and analyse three cry units.

A total of 146 cry units (three per participant with the exception of one participant with only two acceptable cry units) were extracted and identified as suitable for acoustic analysis based on the absence of background noise that would interfere with the analysis. Recordings with less than 30 dB difference between the mean intensity of the cry episode and the mean intensity of the nearest pause were excluded from the study, as per the guidelines proposed by Deliyski et al. (2005). Where the difference in intensity was less than 30 dB, cry units were extracted from the next recording and reanalysed for differences in intensity until each child had three acceptable cry units. If three units of acceptable quality could not be extracted from the same recording, units were extracted from multiple recordings.
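The 30 dB quality criterion and the fall-back to later recordings can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually run in Praat: the function names and the data layout (one `(cry_db, pause_db)` pair per candidate cry unit) are hypothetical, and the mean intensities are assumed to have been measured beforehand.

```python
# Hypothetical sketch of the 30 dB selection rule described in the text.
MIN_DIFF_DB = 30.0     # threshold stated in the text
UNITS_PER_INFANT = 3   # target number of cry units per infant


def is_acceptable(cry_db, pause_db, min_diff_db=MIN_DIFF_DB):
    """True when the cry exceeds the nearest pause by at least min_diff_db."""
    return (cry_db - pause_db) >= min_diff_db


def select_units(recordings, n_units=UNITS_PER_INFANT):
    """Walk recordings in order and keep the first n_units acceptable units.

    `recordings` is a list of recordings; each recording is a list of
    (cry_db, pause_db) pairs, one pair per candidate cry unit.
    Returns a list of (recording_index, unit_index) tuples.
    """
    selected = []
    for rec_idx, units in enumerate(recordings):
        for unit_idx, (cry_db, pause_db) in enumerate(units):
            if is_acceptable(cry_db, pause_db):
                selected.append((rec_idx, unit_idx))
                if len(selected) == n_units:
                    return selected
    return selected  # may hold fewer than n_units entries
```

An infant whose first recording yields only one acceptable unit would have the remaining units drawn from the next recording, which mirrors the multi-recording case mentioned in the text.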

Only voiced segments were included in analyses in order to prevent interference from non-cry characteristics, such as coughs and pauses during expiration. Extracting the voiced segments required identifying the voice boundaries within the cry unit. Boundaries were distinguished through spectrographic analysis focusing on waveform contours, pitch lines and onset and conclusion of pulses. Segments void of pulses and pitch lines indicate the absence of phonation and were therefore excluded from analysis. Voiced segments were extracted and then concatenated to produce an exclusively voiced cry unit for calculating acoustic measures.
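Once voice boundaries have been identified, the concatenation step is straightforward. The sketch below assumes the boundaries are already available as `(start, end)` times in seconds and that the audio is a plain sequence of samples; `concatenate_voiced` is a hypothetical helper, and boundary detection itself (done by spectrographic inspection in the study) is not reproduced here.

```python
# Hypothetical sketch: join the voiced stretches of one cry unit.
def concatenate_voiced(samples, sample_rate, voiced_intervals):
    """Return the samples that fall inside the voiced intervals, in time order.

    `voiced_intervals` is a list of (start_s, end_s) pairs in seconds,
    assumed to be non-overlapping and to lie within the cry unit.
    """
    voiced = []
    for start_s, end_s in sorted(voiced_intervals):
        start = int(round(start_s * sample_rate))
        end = int(round(end_s * sample_rate))
        voiced.extend(samples[start:end])  # unvoiced gaps are skipped
    return voiced
```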

Dependent measures produced by acoustic analyses were fundamental frequency (F0), amplitude, formant frequencies (F1, F2), and cry duration. The mean fundamental frequency of the voiced segment was calculated with the parameters set to 100 Hz for the pitch floor and 1000 Hz for the pitch ceiling. These boundaries were selected based on previous evidence indicating that spontaneous infant cries range from 200 to 600 Hz (e.g. Etz et al. 2014; Michelsson et al. 2002). The minimum and maximum fundamental frequency were obtained to allow the variance in F0 to be examined. As intensity scores can reflect neural control and the capacity of the respiratory system (LaGasse et al. 2005), the minimum, maximum and range of intensity values were recorded. The first and second formants (F1 and F2) were calculated as they relate to the F0 and reflect vocal tract control (Santos et al. 2013). Finally, the cry duration was documented after being extracted from the original recording as the expiratory phase of respiration lasting a minimum of 0.5 s. Coughs and pauses were included when originally segmenting the cries from the sample, but then excluded in the analyses for all variables except cry duration. The cry was determined to have ended at the time of inhalation. For each cry, acoustic parameters were computed with Praat software. Definitions of the acoustic variables and their corresponding biological mechanisms are described in Table 1.

Table 1 Description and biological mechanism for each of the acoustic variables

A total of 56 families collected voice recordings for HR (n = 23) and LR (n = 33) infant siblings. Recordings from seven participants were excluded as they either did not satisfy the established requirements for sound quality (the difference between the mean intensity of the cry episode and the mean intensity of the nearest pause was less than 30 dB) or the files were not originally saved in, and could not be converted to, mp3 format (n = 3) due to method of collection (i.e. a parent who attempted to collect extra recordings on their mobile phone device). For one participant, there were only two (instead of three) cry units that met criteria for acoustic analysis. In an effort to maximize the sample size, we decided to include this participant in further analyses. The final sample size consisted of 22 HR and 27 LR infant siblings (total of 146 cry units).

Coding of Cry Episodes

Additional coding was performed to detail parent-reported cause of infant distress (e.g. hunger, fatigue, frustration) and researcher’s perceived level of infant distress. To obtain a cry distress rating, one researcher who was blind to group assignment listened to one cry recording for each infant and rated their perception of the infant’s distress. Adapted from Esposito et al. (2015), perception of infant distress was recorded using a 7-point Likert scale (1 = lowest level of distress and 7 = the highest level of distress).

PRISM Study 2 Year Follow-Up

At the 2-year postnatal follow-up, the HR and LR infant siblings were administered the MSEL and the current ‘gold standard’ ASD assessment, the autism diagnostic observation schedule-generic (ADOS-G; Lord et al. 2000) module one. Infant siblings were then classified into one of two categories: autism spectrum or none. Standardized ADOS-G severity scores were calculated for the social affect (SA) and restricted and repetitive behaviours (RRB) domains (Hus and Lord 2014). All follow-up assessments were obtained within 3 weeks of the infant turning 12 or 24 months, respectively.

Statistical Analysis

Analyses first concentrated on comparing the characteristics of the probands and family demographics of the HR and LR groups using χ2 (categorical variables) and one-way analysis of variance (ANOVA; continuous variables). Prior to performing group analyses for acoustic variables, medians were calculated for each dependent variable across the three cry episodes. The median statistic was selected as a measure of central tendency to reduce the influence of variability, especially extreme scores, across the three episodes per infant, whilst still providing detailed information about each participant’s cry. For the one participant with only two cry episodes, a mean statistic was calculated for each dependent variable. Using ANOVA, our analyses then turned to comparing HR and LR groups on F0 (min, max, variance), F1, F2, amplitude (min, max, variance), and cry duration. Groups were also compared on differences in performance for MSEL, ADEC and ADOS-G severity at their respective follow-ups. If HR and LR groups differed on any potential confounding variables, analysis of covariance was used. Fisher’s exact test of independence was used to compare HR and LR groups when the expected number of infants was small. Pearson correlation analyses were then used to examine whether 12-month acoustic measures were correlated with 24-month outcome measures.
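The per-infant aggregation and the group comparison can be sketched in plain Python. This is a minimal sketch, not the analysis code used in the study: the function names are hypothetical, and the effect size is assumed to be the ordinary eta squared (between-group sum of squares divided by total sum of squares).

```python
# Hypothetical sketch of the aggregation and one-way ANOVA steps.
from statistics import mean, median


def summarise_infant(values):
    """Median across three cry units; mean when only two are available."""
    if len(values) == 3:
        return median(values)
    if len(values) == 2:
        return mean(values)
    raise ValueError("expected two or three cry units per infant")


def one_way_anova(groups):
    """Return (F, eta_squared) for a list of groups of per-infant values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared
```

Using the median for three units means a single extreme cry unit cannot shift an infant's summary value, which is the rationale given in the text.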

Results

Proband characteristics are presented in Table 2, and family demographics and HR-LR infant data are reported in Table 3. As expected, based on the higher incidence of ASD in males compared to females (Werling and Geschwind 2013), 79% of the HR proband group were male, compared with 48% of siblings in the LR group (p = .01). Results from one-way ANOVAs indicate that HR probands were significantly older at the time of their assessment (p < .05) and scored significantly lower on the MSEL (p < .001) compared to LR probands.

Table 2 Characteristics of probands in the two groups
Table 3 Characteristics of the high- and low-risk infant groups

Fisher’s exact test of independence identified no significant differences between the HR and LR groups in paternal age at conception, maternal smoking or alcohol use during pregnancy. However, maternal age at conception was significantly older in the HR group, relative to the LR group (p < .05). Analyses of infant sibling demographics observed no significant differences between groups on sex, gestational age at birth, age at time of assessment, nor in the average number of acoustic recordings collected by the parent(s). Family size was not significantly different between the HR and LR groups. There was also no statistically significant difference between groups in the proportion of HR (14%) and LR (0%) infants that were born preterm (prior to 37 weeks gestation, p > .05).

Between-Group Comparisons at 12-Month Follow-Up

Analyses identified no significant differences between groups for F0–F2 or any of the amplitude measurements. However, cry duration was significantly shorter for the HR group relative to the LR group (p < .05, \(\eta^2\) = 0.08). One-way ANOVA between the HR and LR groups identified significantly poorer performance on the MSEL composite score (p < .05, \(\eta^2\) = 0.12) and elevated risk scores on the ADEC (p < .05, \(\eta^2\) = 0.16) in the HR group relative to the LR group. At 12-month follow-up, there were no significant differences between the HR and LR groups on MSEL receptive or expressive language scores.

Four participants in the HR group (18.2%) and one participant in the LR group (3.7%) scored above the risk threshold, falling within the moderate-risk range on the ADEC (Table 4). Participants who scored above the ADEC risk threshold appeared to have the shortest cry durations. That is, infants with scores in the moderate-risk range (n = 5) obtained a maximum cry duration of 2.23 s compared to 4.76 s for infants scoring in the low-risk range (n = 44). Cry duration was not significantly correlated with ADEC scores for the total sample. Removing the HR infant siblings of the four probands who did not meet criteria for ASD on the ADOS-G did not change the pattern of effects observed.

Table 4 One-way analysis of variance between high- and low-risk groups for the acoustic measurements

A nonparametric Mann–Whitney U test was performed to compare the perceived level of distress ratings between the HR and LR groups. The difference between groups on researcher ratings of perceived level of distress was not significant (U = 270, p > .05).
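For readers unfamiliar with the Mann–Whitney U statistic, the sketch below shows how it is obtained from rank sums, using midranks for tied ratings (ties are likely on a 7-point scale). The function name is hypothetical and the p-value computation is omitted; this is not the software routine used in the study.

```python
# Hypothetical sketch of the Mann-Whitney U statistic for two groups.
def mann_whitney_u(x, y):
    """Return (U_x, U_y), using midranks for tied values."""
    x, y = list(x), list(y)
    pooled = sorted(x + y)
    # Map each distinct value to the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y
```

The two values always sum to the product of the group sizes, so either can be reported; conventions differ on which one is quoted.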

Between-Group Comparisons at 24-Month Follow Up

At 2 year follow up, the HR group were differentiated from LR infants on a number of measures, including lower scores on the MSEL, and higher ADOS-G severity scores. Five HR infants and one LR infant met ADOS-G criteria for ASD (p < .05; Table 5).

Table 5 Characteristics of high- and low-risk groups for continuous and categorical variables at 12 and 24-month follow up

Relationship Between Early Markers and 2 Year Outcome

Pearson correlation analyses revealed a trend towards significance for shorter cry durations correlating with more severe ADOS-G RRB severity scores (p = .08) and poorer performance on the MSEL language subscales (p = .07 and p = .06). The infant siblings who received a diagnosis of ASD at age two (n = 6) had amongst the shortest recorded cry durations, with a maximum cry duration of 2.72 s compared to a maximum duration of 3.80 s for infants who did not receive a diagnosis of ASD (n = 43). Length of phonation by ADOS-G diagnosis (ASD or none) and risk group (HR or LR) are presented in Fig. 1.
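The Pearson coefficient underlying these analyses can be written out directly. The sketch below is illustrative only: `pearson_r` is a hypothetical helper, and significance testing is left out.

```python
# Hypothetical sketch of the Pearson correlation coefficient.
from math import sqrt


def pearson_r(x, y):
    """Return Pearson's r for two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / sqrt(var_x * var_y)
```

A negative r between cry duration and a severity score would correspond to the pattern described above, in which shorter cries accompany higher severity.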

Fig. 1 Length of phonation by risk group and ADOS-G diagnostic classification. Blue dots correspond to infants who did not receive a diagnosis of ASD and black dots correspond to infants who did receive a diagnosis of ASD. (Color figure online)

Discussion

Previous research has observed atypical patterns in the acoustic properties of cries in at-risk infants, irrespective of whether they are later diagnosed with ASD. These studies have often relied on small participant numbers and have employed a retrospective study design, making it difficult to account for confounding variables. The present study aimed to overcome these previous limitations by prospectively analysing the acoustic properties of multiple cries of 12-month old HR and LR siblings enrolled in the PRISM study cohort in Western Australia. We hypothesized that this research would provide further support for an atypical acoustic profile characterized by higher fundamental frequency, shorter cry durations and increased amplitude in HR relative to LR siblings. We also investigated whether acoustic properties of infant cries recorded at 12-months old were predictive of ASD symptomatology at 2 years old.

The hypothesis was only partially supported, with cry duration found to be significantly shorter for the HR group relative to the LR group. Of this sample, the infant siblings who obtained elevated scores on ASD-risk measurements at 12 months of age had amongst the shortest cry durations recorded. These findings are comparable with previous research that observed infants subsequently diagnosed with ASD to show a smaller proportion of the cry sequence occupied by the aspiration/expiration phase compared to infants subsequently identified as either TD or developmentally delayed (Esposito and Venuti 2009). Furthermore, shorter cry durations have been previously recorded amongst 15-month old HR infants relative to LR infants, and of these, those toddlers who were diagnosed with ASD (HR-ASD), had among the shortest overall cry durations of the sample (Esposito et al. 2014).

Neural control of the respiratory system is thought to be the biological mechanism underlying shorter cry duration, where shorter utterances are indicative of increased tension and instability of neural control of the vocal tract and poorer control and capacity of the respiratory system (LaGasse et al. 2005). The same brainstem structures are involved in both the regulation of heart rate via the vagus (i.e., cardiac vagal tone) and vocalizations via the laryngeal and pharyngeal muscles (both efferents of the supradiaphragmatic vagus; Stewart et al. 2013). Thus, shorter cry durations have also been linked to disruptions in autonomic regulation (i.e., increased heart rate; Stewart et al. 2013). Research into neurological characteristics of ASD has described abnormalities within the brainstem, possibly due to complications at the time of neural tube closure (Rodier 2002). The specific areas of abnormality identified vary between studies and include shortening of the midbrain, major reduction of neurons in the facial nucleus (Bailey et al. 1998), and enlarged arcuate nuclei in the medulla, which is involved in respiratory regulation (Rodier et al. 1996). Instability in neural control of the respiratory system can impact the inspiration and expiration capabilities of an individual. In the case of infant cries, it can be reflected in the length and amplitude of expiration, which has been shown to be shorter and louder in some children with ASD (Esposito et al. 2014; Sheinkopf et al. 2012). The variability of findings makes it difficult to pinpoint an exact area of deficit in individuals with ASD; however, there does appear to be evidence to support brainstem malformation.

At 2 year follow-up, the HR group were differentiated from LR infants on a number of measures, including lower scores on the MSEL (indicative of poorer performance), and higher ADOS-G severity scores (indicative of elevated ASD-risk). The six infants who met ADOS-G criteria for ASD at 2 years old recorded some of the shortest cry durations at 12-months old. There was a trend towards significance for shorter cry durations to be associated with more severe ADOS-G RRB severity score and poorer performance on the MSEL language subscales. These results provide some preliminary evidence for a unique acoustic profile characterized by shorter overall duration of cries in infants at-risk of ASD. Further research is required to better understand the biological and physiological underpinnings of this finding in the context of ASD. Future research may consider investigating some of these behavioural risk indicators in a larger community risk sample. These cohorts may include a larger proportion of at-risk infants who receive a diagnosis of ASD at follow-up, which is required to facilitate in depth analyses of the relationship between early signs and later diagnosis.

There were no significant differences between the HR and LR infants in measures of fundamental frequency, first and second formants or in the amplitude of the cries. These observations are inconsistent with previous research that has reported higher fundamental frequency measurements for HR infants, relative to LR infants (e.g. Sheinkopf et al. 2012; Esposito et al. 2014). The Sheinkopf et al. (2012) study, which categorized cries as pain-related or non-pain related, observed the elevation in F0 for the HR infants (HR-ASD and HR-no ASD) in the pain-related cries only. Esposito et al. (2014) similarly observed elevations in fundamental frequency among 15-month old HR when compared to LR toddlers. In their study, cries were elicited during a social attachment paradigm known as the strange situation (Ainsworth and Wittig 1969; Ainsworth et al. 1978). Together, these findings indicate that the way in which cries are categorized (pain or non-pain), the specific cause of the cry (i.e. hunger or frustration) or the method for elicitation are all factors that may influence the acoustic properties of subsequent cries. The subtle differences in methodology or lack of detail used to explain coding criteria for acoustic variables provides additional challenges when comparing observations across studies.

In the present study, there was also no significant difference between the two groups on researcher-rated perceived level of distress for HR and LR infants. Venuti et al. (2012) provided evidence of a unique pattern of brain activity during adult processing of cries of infants later diagnosed with ASD. Their research indicated that characteristics of cries from infants at-risk of ASD (i.e. higher F0 and shorter duration) may be more negatively experienced by caregivers and may be perceived as more distressing. While this observation was not supported, the current study was limited by using researcher ratings that may have been influenced by their familiarity with caregiving (Esposito et al. 2015). Future research may consider employing magnetic resonance imaging as an alternative method for detecting subtle changes in perceptions of distress or negative experience.

High- and low-risk groups were significantly different in their performance on the MSEL and the ADEC assessments which were completed at the 12-month follow-up. HR infants obtained higher scores on the ADEC, which indicates that more ASD symptoms were endorsed. It is important to note that whilst the HR group have obtained elevated ADEC scores, relative to the LR group, a mean of 7.59 is still within the LR range according to the ADEC scoring criteria. However, this observation suggests that at 12-months old, these HR infants, at a group level, are displaying more symptoms of ASD irrespective of whether they later receive a diagnosis. HR infants also obtained lower scores on the MSEL, which suggests that their developmental milestones are slightly behind their LR peers at 12-months old. Follow-up of these HR infants until diagnostic age will provide further insights on their cognitive behavioural development including ASD symptomatology.

Using a naturalistic method of data collection, we found that HR infants had shorter cry utterances, elevated ADEC scores and lower MSEL scores, relative to LR infants. The absence of any differentiation between groups on the remaining acoustic variables that were investigated appears inconsistent with previous findings for pitch variables in at-risk infants. Thus far, it appears that the method of elicitation, the infants’ internal cues, and the categorizing of variables influences the subsequent acoustic analysis. Detailed reporting of the methods used in acoustic analyses is required in order to further this area of research.

To our knowledge, this was the largest prospective study to have collected acoustic cry data from infants at-risk of ASD. However, we acknowledge that the sample size under investigation is still relatively small, which may have limited the statistical power to identify true effects. Nonetheless, six (24%) of the HR infants were found to reach a clinical threshold for ASD on the ADOS-G at 24 months of age, which is consistent with previous estimates of risk for ASD among siblings of ASD probands. To maximize our statistical power, we examined 12-month behavioural variables in relation to continuous ADOS-G scores, though it is possible that a larger sample is still required to identify clear relationships over time. The limitation of multiple-comparison testing in the current study is also of note, especially given the small sample size. Further replication of these findings in a larger cohort of at-risk infants is required. A key strength of the study was the closely matched HR and LR groups. There were no significant differences in pregnancy variables such as smoking or alcohol use during pregnancy. Nor were there any differences in paternal or maternal education level or in the sex of the infants between groups. There was only a slight difference in maternal age at conception, with mothers conceiving at an older age in the HR group. Parents collected all cry recordings within the home, which facilitated the naturalistic method. One limitation of this methodology was the minimal reporting of infant positioning during a cry episode. Due to the competing demands already placed on parents and caregivers, it was an additional task for them to record infant crying and report details of their child’s position. Limited recording of these details, known to influence the quality of cry recordings, has been a weakness in this area of research and may best be addressed in the future by collecting samples of infant cry within a clinical setting. Due to the naturalistic method of data collection, it was not possible to control for the distance between the infant and the recording device when analysing measures such as amplitude. However, all of the participants were given detailed recording instructions, including both verbal and written guidance, so the impact of this is likely to be minimal.

This study provides further support for a unique acoustic profile in 12-month old infants at high-risk of ASD. Shorter cry duration has been replicated across multiple studies indicating that it may be a stable difference amongst infants at risk for disorder. Further research is required to better understand the physiology of shorter cries and also determine how these cries influence parental perception and response to their infant’s cry.