Introduction

Although the rate of unexpected cardiopulmonary arrest (CPA) in hospitalised children is relatively low at 0.19–2.45 cases per 1,000 admissions [1], mortality and morbidity remain high [2–5] despite advances in treatment. Emphasis is now shifting from treatment of CPA to prevention, after research in adults demonstrated that CPA and other serious adverse events (SAE) are often preceded by a period of physiological instability that, when recognised, offers a window of opportunity for the health-care team to intervene to improve outcome [6–9]. A similar window of opportunity may exist within which to identify hospitalised children at risk of SAE [10–12]. Evidence from the USA demonstrated that 95% of paediatric in-hospital CPAs were witnessed or monitored, suggesting that clinicians recognised that the child was at risk of a SAE [1]. Similarly, a UK study examining the clinical signs of children in the 24 h prior to admission to the paediatric intensive care unit (ICU) or high dependency unit (HDU) found that 87% had documented evidence that might represent physiological deterioration, although the study was limited by a lack of controls and a large number (>55%) of missing records [13]. Of perhaps greater concern are the findings of a detailed confidential review of 126 UK child deaths [14], which reported that of the 89 deaths occurring in hospital, 63 (71%) were deemed avoidable or potentially avoidable by the panel. Failure to recognise severity of illness was highlighted as a major factor, with failure to understand the importance of the clinical history, failure to examine and interpret physical signs correctly and failure to recognise complications cited as contributing factors [14]. This led the review panel to recommend the use of a standardised, rational monitoring system and/or early warning (EW) score for all children in hospital [14].

Early warning scores or systems aim to alert staff to patients at risk of SAE through periodic observation of clinical signs and predetermined criteria to prompt staff to call for urgent assistance. To augment this approach, some hospitals have assembled specialist teams, often based in the ICU, who have the knowledge, skills, experience and equipment to assess and treat deteriorating patients on hospital wards [15]. These teams vary in composition and name (e.g., medical emergency, critical care outreach, patient at risk, rapid response teams) and will be referred to as rapid response teams (RRT) throughout this review. The majority of RRTs are activated by ward staff in response to predetermined trigger or activation criteria that alert clinicians to patients at risk of a SAE in a similar way to early warning scores/systems. Although conceptually plausible, the research evidence for the effectiveness of the EW scores/systems to alert clinicians to children at risk of critical deterioration has not been subject to a systematic critical review. Therefore, the aim of this study was to systematically review the published research literature in order to identify the number and nature of paediatric alert criteria and evaluate their validity, reliability, clinical effectiveness and clinical utility.

Methods

The methodology for this review followed the 2009 NHS Centre for Reviews and Dissemination (CRD) guidance on conducting systematic reviews of interventions and clinical tests in health care with regard to the review question, inclusion criteria, search methods, data extraction, quality assessment and data synthesis [16].

Search strategy and data sources

A search of biomedical research published between January 1990 and February 2009 was conducted using the databases Cumulative Index of Nursing and Allied Health Literature (CINAHL), Cochrane Library, Database of Reviews of Effectiveness, EMBASE and MEDLINE. Papers were included if they were published in full and in English and described the development, testing or use of either an EW score/system, or activation/trigger criteria to mobilise a RRT in hospitalised children cared for on wards outside the critical care setting. For the purposes of this review, we will use the umbrella term paediatric alert criteria (PAC) for EW scores/systems or RRT trigger/activation criteria. Review papers and those primarily concerned with adult patients were excluded unless data relating to paediatric patients could be adequately separated.

A broad search strategy was used with free text searching using keywords in the title or abstract. Search terms were based on those identified in a systematic review of adult track and trigger systems (TT) [17] and terms relating to the various forms of RRT. Details of the keywords and filters are presented in the Electronic Supplementary Material (ESM). Abstracts of potentially eligible papers were reviewed against the inclusion criteria, and the full texts of all candidate citations were obtained and reviewed. The reference lists of included papers were hand searched for potential articles, and a citation search was performed on Web of Science [18]. Corresponding authors of included papers, and additional experts who have written papers on paediatric EW scores or RRTs that did not fulfil the criteria for inclusion in this review [12, 19, 20], were contacted and asked to review the list for completeness.

Data extraction and synthesis

A data extraction form was developed that included key elements relevant to the study, based on a previous systematic review of adult TT [17]. Key elements extracted were: hospital setting and country of origin, patient characteristics, the type, purpose and origin of the PAC, whether the PAC was dependent or independent of the child’s age and the age ranges identified, the number and type of physiological parameters included, the scoring system/trigger thresholds and the nature of the response. PACs were categorised according to classifications outlined in the systematic review of adult TTs as: single-parameter systems: periodic observation of selected clinical signs, which are compared to a simple set of criteria with predefined thresholds, with a response algorithm being activated when any criterion is met; multiple-parameter systems: where the response algorithm involves more than one criterion being met or differs according to the number of criteria met; aggregate weighted scoring systems: where weighted scores are assigned to physiological values and clinical signs and compared to pre-defined trigger thresholds; or combination systems: involving single- or multiple-parameter systems in combination with aggregate weighted scoring systems [17]. PACs were then classified as age dependent, where some or all of the scoring of the parameters varied based on the child’s age or age independent, where scoring of all parameters was standard regardless of the child’s age.
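The four categories differ mainly in how individual criteria are combined into a trigger. The following minimal Python sketch illustrates that logic only. The function names, the default of two criteria for the multiple-parameter system and the aggregate threshold of 4 are hypothetical choices for illustration; they are not taken from any of the reviewed PACs or from the adult review [17].

```python
from typing import Dict


def single_parameter_trigger(criteria_met: Dict[str, bool]) -> bool:
    """Single-parameter system: respond as soon as any one criterion is met."""
    return any(criteria_met.values())


def multiple_parameter_trigger(criteria_met: Dict[str, bool], required: int = 2) -> bool:
    """Multiple-parameter system: respond when at least `required` criteria are met.

    The default of 2 is a hypothetical value chosen for illustration.
    """
    return sum(criteria_met.values()) >= required


def aggregate_weighted_trigger(scores: Dict[str, int], threshold: int = 4) -> bool:
    """Aggregate weighted system: sum the weighted scores and compare the total
    with a predefined trigger threshold (4 is a hypothetical value)."""
    return sum(scores.values()) >= threshold


def combination_trigger(
    criteria_met: Dict[str, bool], scores: Dict[str, int], threshold: int = 4
) -> bool:
    """Combination system: a single-parameter rule used alongside an aggregate score."""
    return single_parameter_trigger(criteria_met) or aggregate_weighted_trigger(
        scores, threshold
    )
```

Under these assumptions, a child meeting one criterion would trigger the single-parameter rule but not the multiple-parameter rule, which is one reason the category of a tool matters when comparing activation rates.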

Parameters within each PAC were classified as one of the following seven categories: diagnostic: where the parameter related to a specific diagnosis (e.g., cerebral palsy); event: occurrence of a specific event (e.g., seizure); intervention: where the parameter related to a specific intervention (e.g., central venous catheter in situ); intuitive: knowledge without the need for rational or conscious reasoning (e.g., ‘worried’); objective finding: clinical finding with an objective measure (e.g., oxygen saturation below 92%); subjective finding: clinical finding with a subjective measure (e.g., increased work of breathing); or mixed: where the category was a combination of types. Parameters were counted as a single parameter if the guidance indicated that either could be fulfilled (e.g., increased work of breathing or cyanosis) or as two distinct parameters if both must be fulfilled (e.g., increased work of breathing and cyanosis).
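The counting rule in the last sentence can be written down as a short sketch. The representation below, in which each guidance item is a pair of a connective and a list of signs, is a hypothetical encoding introduced here for illustration and was not part of the data extraction form.

```python
from typing import List, Tuple

# Hypothetical encoding of one guidance item:
# ("or", signs)  -> any one of the listed signs fulfils the item
# ("and", signs) -> all of the listed signs must be present
GuidanceItem = Tuple[str, List[str]]


def count_parameters(items: List[GuidanceItem]) -> int:
    """Count parameters using the rule described above.

    Signs joined by "or" count as one parameter; signs joined by "and"
    count as one parameter each.
    """
    total = 0
    for connective, signs in items:
        if connective == "or":
            total += 1
        elif connective == "and":
            total += len(signs)
        else:
            raise ValueError(f"unknown connective: {connective!r}")
    return total
```

For example, `count_parameters([("or", ["increased work of breathing", "cyanosis"])])` gives 1, whereas the same two signs joined by `"and"` give 2.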

Quality assessment

The data extraction form for studies concerned with the development and evaluation of PAC incorporated additional elements based on recommendations for developing clinical decision rules [21] and guidance for undertaking systematic reviews of tests of diagnostic and prognostic accuracy [16, 22, 23]. If a PAC is to add useful clinical information [16], it must allow sufficient time for clinicians to assess the child and intervene before occurrence of SAEs; therefore, data were extracted on the ‘time to event’ and time period where data collection was censored [24]. Estimates of diagnostic accuracy extracted included positive predictive value (the probability of the target condition among people with a positive test result), sensitivity (the proportion of people with the target condition who have a positive test result) and specificity (the proportion of people without the target condition who have a negative test result) [16]. In addition to the data extracted about the PAC, the following items related to the development and testing of the PAC were extracted: study design, sample and follow-up of patients, outcome measures, prognostic variable and statistical analysis. Papers were subsequently assessed for quality based on criteria related to the study design and rated as adequate, unclear or inadequate by two authors (S.C. and L.F.) in accordance with methodological quality standards of the 2009 CRD and other guidance for undertaking systematic reviews [16, 21].
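The three estimates defined above all derive from the same two-by-two table of test result against outcome. The sketch below shows the arithmetic. The class and function names are hypothetical, and the counts in the usage note are invented purely to illustrate the calculation; they are not data from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating PAC result against the target condition."""

    tp: int  # triggered the PAC and had the target condition
    fp: int  # triggered the PAC but did not have the target condition
    fn: int  # did not trigger the PAC but had the target condition
    tn: int  # did not trigger the PAC and did not have the target condition


def _ratio(numerator: int, denominator: int) -> float:
    if denominator == 0:
        raise ValueError("estimate is undefined when the denominator is zero")
    return numerator / denominator


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of those with the target condition who test positive."""
    return _ratio(t.tp, t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Proportion of those without the target condition who test negative."""
    return _ratio(t.tn, t.tn + t.fp)


def positive_predictive_value(t: TwoByTwo) -> float:
    """Probability of the target condition among those who test positive."""
    return _ratio(t.tp, t.tp + t.fp)
```

With invented counts `TwoByTwo(tp=8, fp=12, fn=2, tn=78)`, sensitivity is 8/10, specificity is 78/90 and positive predictive value is 8/20. The sketch also makes explicit that sensitivity cannot be estimated without the false negatives, nor specificity without the false positives.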

Results

General characteristics of published paediatric alert criteria

A summary of the literature search results is presented in Fig. 1. Eleven papers were identified that met the inclusion criteria, describing ten PACs published from 2005 onwards [25–35]. The studies were set in the USA [25, 29, 31, 32, 35], England [29, 31], Australia [33, 34], Canada [26] and Wales [27]. The majority of the studies were conducted in children’s hospitals [25, 26, 28–35] with a single study conducted in a tertiary centre for paediatrics within a university hospital [27]. Six papers described the introduction of a PAC as part of the implementation of a RRT or equivalent system [29–34], four focused on the development and testing of a PAC [26–28, 35] and a single paper combined both aspects [25]. The purpose of the PAC varied and included activation of a RRT [25, 29, 31–34], screening of the acutely ill child [27, 28, 30, 35] or identification of the child at risk of a code blue (i.e., a request for immediate assistance for imminent or actual CPA) [26]. An overview of the ten published PACs is presented in Table 1. Four PACs were described as original [25–27, 29, 32], three were adapted from paediatric tools [31, 34, 35] and two were adapted from adult tools [30, 33]. One PAC was modified from both adult and paediatric tools [28].

Fig. 1 Summary of search results. Papers not identified in electronic search are indicated by asterisks. Double asterisks indicate authored papers on paediatric EW scores or RRTs that did not fulfil the criteria for inclusion in this review [12, 19, 20]

Table 1 Overview of paediatric alert criteria

Number and type of parameters

Seven tools were single-parameter [25, 27–29, 31–34], with the remaining three classified as aggregate weighted [26, 30, 35]. The PACs were equally divided between age-independent [25, 29–32, 35] and age-dependent tools [26–28, 33, 34]. All of the age-dependent PACs identified five age bands, but the specified age ranges were inconsistent between the tools, other than considering children over 12 years of age as a single group. Three tools included age ranges that overlapped [26–28].

The details of individual parameters within the PAC are presented in Table 2. When examining the types of parameters within each PAC, all contained subjective clinical findings and most included objective findings [25–28, 33, 34]. Intuitive and event parameters were confined to the single-parameter systems. One tool featured a relatively large number of intervention parameters [26], whilst another included all six types of parameters [28]. The most complex PAC had 19 separate parameters [26] with a complicated matrix for determining the score. The tool requires calculation of the Glasgow Coma Score as well as a medication subscore derived from the number of medications administered in 24 h. It also includes three age-dependent parameters and seven weighted scores. In total, the tool has 111 individual rules, and the complexity of this tool has previously been criticised [36]. The remaining PACs had between 5 and 14 parameters.

Table 2 Detail of paediatric alert criteria parameters

All PACs contained a measure of consciousness, and the majority included a measure of respiratory rate, heart rate and oxygen saturation, with three tools specifying lower oxygen saturation levels for children with congenital heart disease [28, 33, 34]. All of the age-dependent tools included heart rate, systolic blood pressure and respiratory rate as the age-related parameters [26–28, 33, 34]. Temperature was an item in only one tool [26].

The cutoff point for activation of PAC for the commonly monitored vital signs is presented in Table 3. The cutoff points of all parameters showed considerable diversity, particularly around systolic blood pressure and oxygen saturation measurement. For example, for oxygen saturation levels, the threshold for triggering a response varied from values below 96% to below 90%. Some tools specified saturation levels in the presence of supplemental oxygen therapy [25, 28], whilst others did not [26, 31, 33, 34]. One tool used the subjective measure of an acute change in oxygen saturation level [31], whilst another referred to a decrease in saturations despite first-line interventions (the nature of this intervention was not stated) [29, 32]. One tool focused on giving supplemental oxygen to keep saturations above 90% [27], and two tools did not include oxygen saturation [30, 35]. Overall, there was a lack of consistency in the type and definition of parameters in the PAC. Where tools were age dependent, there was a lack of agreement on age groupings. Although most tools made a reference to commonly measured vital signs, there was no concurrence on the method of assessment or the threshold or cutoff point that should trigger action.

Table 3 Cut-points for commonly monitored vital signs

Validity

An overview of the papers reporting methods of development and testing and diagnostic accuracy of PAC [25–28, 35] is presented in Table 4. Three studies used a retrospective case note review methodology [25, 26, 28]. Two studies used a prospective design [27, 35] but failed to determine which predictors were the most powerful and which could be omitted from the PAC without loss of predictive power.

Table 4 Overview of papers reporting method of development and diagnostic accuracy of paediatric alert criteria

Positive predictive value was reported for three PACs [26, 27, 35], and sensitivity and specificity were reported for four PACs [26–28, 35]. However, for one PAC [28], data were not collected on children who triggered the PAC but did not require intervention (false positives) nor on those who did not trigger the PAC and did require intervention (false negatives). This renders the reported sensitivity and specificity invalid, and it has been recommended that these results be disregarded [37]. Only one paper [26] addressed the issue of ‘time to event’ by excluding data in the final hour before the outcome of interest (i.e., ‘code blue’).

No study reported the impact of introducing a PAC on patient outcome, although five papers [25, 29, 31, 33, 34] reported the effect of the RRT activated as a result of the PAC on rates of cardiac arrest [25, 29, 31, 33, 34], respiratory arrest [25, 29, 31, 34] and hospital-wide mortality rates [31, 33, 34].

Clinical effectiveness

The use in clinical practice of seven PACs was described in eight papers [25, 29–35], of which six reported patient outcomes [25, 29, 31, 33–35]. Five papers [25, 29, 31, 33, 34] focused primarily on the effect of introducing a RRT activated by a PAC using a before and after intervention study design. Two studies demonstrated statistically significant improvements post-RRT introduction, including a reduction in hospital-wide mortality [31, 34] and code rates [31]. One of these studies reported a statistically significant reduction in ‘preventable’ ward CPA (defined as CPA in children who transgressed the PAC), but the overall ward CPA rate showed no improvement [34]. A high proportion of these CPAs were deemed ‘non-preventable’ as they did not transgress the PAC prior to the event (58%) [34]. Similarly high figures of non-preventable CPA (83%) [29] and code rates (61%) [25] were also reported in two of the studies that did not achieve statistical significance. None of these studies presented data on the number of children who transgressed the PAC but were not reported to the RRT, nor those for whom the RRTs were activated but no SAE occurred. A single paper [35] reported the prospective evaluation of the PAC without the introduction of a RRT.

Reliability

Only one study evaluated interrater reliability [35]. Fifty-five patients were independently assessed by two registered nurses, and interrater reliability was found to be high (intraclass coefficient = 0.92, P < 0.001).

Clinical utility

No papers examined the ease and efficiency of use and user acceptability. One paper assessed staff satisfaction with the RRT, but did not make reference specifically to the PAC [25].

Discussion

This is the first systematic review of PAC, and we found ten published tools with considerable diversity in the number and type of parameters monitored. We also found wide variation in the thresholds for action by health care staff. All of the tools included measurement of some commonly monitored vital signs, but some PACs used trends, and others used absolute values. Current analysis of validity around vital sign criteria is confined to PACs with absolute values only [26, 27], but a study of adult TT criteria suggests that activation of RRT is most frequently based on subjective rather than objective criteria, leading the authors to suggest that objective criteria may lack sufficient sensitivity and specificity [38]. A few of the PACs included additional parameters related to diagnosis, observations of clinical status or clinical interventions; however, the added value of these additional parameters in increasing prediction of critical deterioration remains unclear. Furthermore, we found weak evidence for the differing choices of age groupings and cutoff point values to trigger action in the eight PACs. Only one PAC [35] was examined for interrater reliability, and none were evaluated for clinical utility.

A systematic review and meta-analysis of adult TTs identified 25 distinct TTs, but stated that the number in clinical practice was far higher, acknowledging the diversity of the clinical datasets submitted for meta-analysis [17]. It is likely that a similar situation exists for PACs, as demonstrated by two recent surveys of hospitals with significant paediatric activity [39, 40]. In the USA and Canada, 6 out of the 29 (21%) hospitals with a RRT reported the use of specific activation criteria [40], whilst a UK survey identified that 31 (21.5%) of the 144 hospitals caring for children had an early identification system for children in need of urgent help [39]. The UK study highlighted 36 different parameters currently in use, concurring with our findings of significant variability in the structure and content of the PAC in the literature.

Although five papers described the development and testing of a PAC, only three reported accurate values for positive predictive value, sensitivity, specificity and area under the ROC curve [26, 27, 35]. Furthermore, only two were prospectively tested [27, 35], and it appears that the introduction of PAC is in danger of taking a similar path to that of TTs in adults, where there has been a proliferation of tools with weak evidence supporting their effectiveness [17]. A number of PACs have been reported as part of the evaluation of RRT introduction [25, 29, 31, 33, 34]; however, none have examined the diagnostic accuracy of the PAC, the optimal level and frequency of patient monitoring, or whether staff always alert the RRT if the PAC is transgressed. Evidence from adult studies suggests these may be important factors in demonstrating statistically significant benefits of RRT introduction [41].

If a PAC is to identify children at risk in order that clinicians may intervene and prevent SAE, then it must be activated sufficiently early to allow the intervention to take place. Only one paediatric study acknowledged this by ceasing data collection 1 h before the SAE in order to evaluate if the score could act as an EW mechanism [26]. Studies that collect data up to the time of the SAE risk over-estimating the performance of the PAC. The high incidence of CPA and code rates deemed ‘non-preventable’ in a number of studies [25, 29, 34] also raises concerns that current PACs may simply have insufficient sensitivity to identify children at an early stage to allow time for clinicians to intervene and prevent SAE.
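The effect of censoring can be illustrated with a short sketch. It assumes, hypothetically, that each observation is a pair of a timestamp and a PAC score; the function name, the data layout and the example values are introduced here for illustration and do not reproduce the analysis in [26].

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical layout: (time the observation was recorded, PAC score)
Observation = Tuple[datetime, int]


def censor_before_event(
    observations: List[Observation],
    event_time: datetime,
    window: timedelta = timedelta(hours=1),
) -> List[Observation]:
    """Keep only observations recorded at least `window` before the event.

    Scores recorded inside the window are discarded, so the remaining data
    reflect what staff could have acted on with time left to intervene.
    """
    cutoff = event_time - window
    return [(t, score) for t, score in observations if t <= cutoff]


# Invented example: the highest score is recorded 30 min before the event.
event = datetime(2009, 1, 1, 12, 0)
obs = [
    (datetime(2009, 1, 1, 9, 0), 2),
    (datetime(2009, 1, 1, 10, 59), 3),
    (datetime(2009, 1, 1, 11, 30), 7),
]
kept = censor_before_event(obs, event)
max_uncensored = max(score for _, score in obs)
max_censored = max(score for _, score in kept)
```

In this invented example the maximum score falls from 7 to 3 once the final hour is excluded, which shows how including data up to the time of the SAE can flatter the apparent performance of a PAC.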

The systematic review of published adult TTs identified 25 distinct TTs, but found only one that was developed using recognised statistical techniques to select the most powerful predictors of outcome followed by further analysis to determine which predictors could be omitted without loss of predictive power [17, 21]. The authors reported a lack of evidence on the validity, reliability and utility of adult TTs, but concluded there was no reason not to use them. Current UK national guidance from the National Institute for Health and Clinical Excellence (NICE) recommends the use of TTs for all acutely ill adults, advocating multiple-parameter or aggregate-weighted systems that include heart rate, respiratory rate, systolic blood pressure, level of consciousness, oxygen saturation and temperature, together with clear and explicit thresholds for activation [42]. NICE did not recommend single-parameter systems on the basis of low sensitivity, low positive predictive value and the inability to track a patient’s progress in order to facilitate a graded response.

If the NICE guidance were extended to PAC, only one [26] would fulfil the recommendations. However, this PAC is substantially more complicated than other published PACs and is likely to require more staff time and carry a greater risk of error due to miscalculation or incomplete uptake. Most of the other PACs would be excluded because they are single-parameter [25, 27–29, 31–34], with two further PACs not achieving the minimum parameters suggested by NICE [30, 35]. However appealing it may be to extend the NICE guidance from adults to children, the differing physiology and nature of critical illness in children may require a different approach. Robust primary research is therefore required to establish the validity and reliability of PAC in detecting critical deterioration in children prior to generating any national guidance about implementation in practice.

Implications for practice

Paediatric alert criteria that alert clinicians to potential deterioration in a child’s condition and trigger corrective action intuitively seem to be a good idea, but the current lack of evidence raises concerns about their widespread adoption without more robust research. For hospitals with a PAC already in use, there should be ongoing performance monitoring to ensure early identification of children at high risk of critical deterioration (defined by their subsequent clinical course), without falsely identifying those at low risk (which would lead to inappropriate use of resources and unnecessary patient concern). Ongoing review of individual cases of critical deterioration or RRT intervention, particularly those who fail to trigger PAC criteria, may highlight modifications that could improve the performance of the tool. For hospitals considering introducing a PAC, clinicians should consider the PAC that best meets their local needs and patient population, as evidence on validity and reliability is currently limited [17, 19]. Of the current tools available, three [26, 27, 35] have undergone a more rigorous evaluation; however, there remain issues of complexity, user acceptability, resource use, and inter- and intra-user reliability.

Implications for research

For existing PACs, further validation studies are needed to accurately determine levels of sensitivity and specificity in a variety of settings, taking into account the impact of age and level or types of illness on PAC performance. The role of commonly monitored vital signs in identifying physiological instability is an area that warrants closer examination, particularly in light of their prevalence within the published PACs. The lack of agreement on cutoff points and age-related thresholds for vital signs within the PACs is a further aspect that would benefit from future research.

New PACs should be developed using recognised statistical techniques to determine the clinical signs most predictive of critical deterioration, followed by prospective analysis to determine which parameters might be omitted from the PAC without loss of predictive power [21]. PACs must be prospectively validated in a variety of settings with attention paid to missing data, false positives and selection of an appropriate control group before implementation into clinical practice. Reliability of PAC must be established, and the tools should then be prospectively validated in adequately powered multi-centred studies to establish generalisability [17]. Clinical utility, including qualitative examination of user (ward staff and RRT team members) acceptability, resource requirement and cost-benefit burden also need to be described and compared amongst PAC. Finally, health services research is needed to examine the role of the PAC in mobilising expert assessment and treatment of patients and whether the use of these tools does indeed provide added value in improving outcome through early identification of high-risk children [21]. Readers are referred to Laupacis [43] and McGinn [21] for a more complete description of rigorous methodology to develop clinical decision rules, and this guidance is recommended to anyone developing a new PAC or considering further validation of an existing PAC. New technologies such as continuous and remote (wireless) monitoring may offer a more efficient or effective approach to screening of patients (children or adults) for signs of critical deterioration in the future [44]. For example, a relatively simple PAC (or adult TT) may provide an initial alert that a patient would benefit from continuous or remote monitoring of vital signs. Sophisticated analysis of the monitor output using signal variability or patterns/trend analysis would then be employed prior to mobilisation of the expert teams, thus reducing the number of false call-outs.

Conclusion

The number of published PACs is currently small, and divergent in purpose, content and thresholds for activation. This limits comparison between centres and undermines the development of an evidence base for PACs. The potential of PACs to improve the care of hospitalised children by aiding earlier identification of those at risk of critical deterioration and thereby improved outcome has not, as yet, been demonstrated. The ideal PAC would utilise existing routinely monitored clinical signs, be simple to use, have a high level of sensitivity and specificity, and be triggered at an early enough point in the child’s illness to allow sufficient time for interventions to improve outcome. A more homogenous approach to PACs may produce wider benefits, in terms of training, clinical practice and research. Evidence supporting the validity, reliability and utility of current PACs is lacking, and further well-designed and conducted studies are needed before the widespread adoption of these tools into clinical practice can be recommended.