Introduction

Concussions remain a concern for athletes participating in contact sports, bringing the role and effectiveness of sports safety equipment into focus.5,15,40,43 Laboratory testing to compare and quantify the protective capabilities of football helmet models is frequently used as a surrogate for a helmet model’s on-field effectiveness.37,39,49 In order to predict the associated on-field performance of a helmet model, laboratory tests should mimic the on-field environment and evaluate the helmet’s ability to reduce severity metrics relevant to the injury mechanisms involved. Recently, test methodologies have been updated with additional rotational severity metrics given the growing body of scientific evidence suggesting their role in concussion causation.33,46

The National Operating Committee on Standards for Athletic Equipment (NOCSAE) utilizes laboratory testing as a pass/fail requirement for football helmet performance.32,33 The governing bodies of football at various levels of play, such as the National Football League (NFL) and National Collegiate Athletic Association (NCAA), require that players use equipment that meets NOCSAE standards. These standards ensure that all helmets exceed a threshold level of protection for preventing injury.33 While these standards have proven to be very effective at reducing the rate of serious and fatal football head injuries,8,48 their effectiveness in reducing sports concussions has not been documented to the same degree.2,12,13,45 In addition to NOCSAE, other research institutions have designed laboratory tests to rate or rank helmet performance based on a helmet’s ability to reduce measures of head impact severity.3,39,49

While the development of laboratory test methods has been informed by the characteristics of on-field helmet impacts, relatively few investigations have examined the relationship between the laboratory performance of the helmets and the associated on-field injury rates for football players wearing those helmets. Several studies have examined the influence of helmet model on the risk of concussion.10,26,40,50 Zemper compared concussion rates for ten helmet models used in collegiate play and found statistically different rates of concussion among players wearing different helmet models. This finding led Zemper to advocate for monitoring concussion rates of helmets as a supplement to traditional laboratory testing.50 Collins et al. observed significant differences in the percentage of concussions suffered by players wearing the Riddell Revolution helmet compared to those wearing other models across three seasons of high school football,10 but did not account for player-specific exposure information such as the number or types of plays. More recently, McGuine et al. found no statistical difference in the rate of concussions by game and practice exposures for helmets produced by different manufacturers.26 Given that most football helmet manufacturers produce a number of different helmet models, aggregation of helmet performance by manufacturer provides insufficient resolution for assessing on-field injury rates associated with specific helmet models.25 The most helmet model-specific study to date was one by Rowson and Duma that analyzed the on-field performance of two helmet models used in collegiate play, the Riddell VSR-4 and Riddell Revolution helmets, and found a statistical difference in the number of concussions sustained per head impact between the two helmets.40

Several hurdles exist for quantifying the on-field performance of helmets. First, a large data set is required to distinguish between the many helmet models used on-field. Both incidence of concussion and a measure of exposure (e.g., game participation, number of impacts) must be collected, with the ability to map each to individual helmet models. Further, other risk factors such as play type, position and prior injury should also be considered in the analysis as they have been shown to be associated with increased rates of concussion.7,22,34,50 Kickoffs have been associated with higher concussion rates than other play types since the larger distances between players result in higher closing speeds at impact.22,27 Likewise, certain positions such as defensive backs and wide receivers typically experience higher-severity open-field impacts.27,36 Several studies have reported that prior concussion increases concussion risk, with players who sustain a concussion reported to be 3 to 5.8 times more likely to sustain a second concussion than a player with no prior concussion history.21,26,50,51

While this limited number of studies has attempted to examine helmet performance with laboratory testing or on-field usage, none has investigated the relationship between the laboratory and on-field performance of helmets across a broad range of helmet models while accounting for these confounding factors. As a first step towards evaluating this relationship, on-field concussion data from five seasons (2015–2019) of NFL games were collected along with player participation and helmet use data. Performance scores were generated from laboratory tests performed on 31 helmet models worn by NFL players during that same period. The relationship between laboratory performance of helmet models and concussion rates for players wearing those helmet models on-field was assessed.

Materials and Methods

Helmet Testing and Data Analysis

Two helmet samples of each of 31 football helmet models (size large or equivalent) were selected for testing based on on-field usage by NFL players during the 2015–2019 seasons. Each helmet was donned on an instrumented 50th percentile Hybrid III male head and neck on a sliding table (Fig. 1). The helmets were tested using a pneumatic ram and slider table test fixture (Biokinetics and Associates Ltd., Ottawa, ON, Canada) in 18 test conditions comprised of the full combination of three impact speeds and six impact locations (Table 1).3 The test conditions were designed to represent impacts characteristic of those that caused concussions in NFL games, as previously described.3 The pneumatic ram was outfitted with a compliant end cap comprised of a vinyl nitrile foam puck with a spherically shaped nylon face meant to represent the stiffness characteristics of a helmet-to-helmet impact.

Figure 1 Laboratory test fixture highlighting the pneumatic ram and slider table, as well as the ram end cap used to simulate an opposing player’s helmet.

Table 1 Description of laboratory test impact locations and coefficients for calculating helmet performance score (HPS).

Six degree-of-freedom head kinematics were calculated from a nine-accelerometer array package and transformed to the center of gravity of the Hybrid III head. Accelerations were filtered using a Channel Filter Class (CFC) 180 filter.16,42 Processed head kinematics were used to calculate the head acceleration response metric (HARM), a linear combination of the head injury criterion (HIC) and the Diffuse Axonal Multi-Axis General Evaluation (DAMAGE), for each test condition. The coefficients were based on the results of a matched pair study of head kinematics estimated for injured and non-injured players using dummy reconstructions (Eq. 1).19,20,47 Subsequently, the 18 HARM scores associated with each speed and impact location were weighted to calculate a composite Helmet Performance Score (HPS) for each individual helmet (Eq. 2).3,20 Test results for both helmet samples of a given model were averaged. Since HARM was derived from metrics associated with head kinematics, a lower HPS indicates better overall helmet performance.

$${\text{HARM}} = 0.0148\;{\text{HIC}} + 15.6\;{\text{DAMAGE}} \tag{1}$$
$${\text{HPS}} = \sum_{i = 1}^{18} M_{i} ({\text{HARM}})_{i} \tag{2}$$

The weight assigned to each test condition in the HPS aimed to reflect the general incidence of concussion by impact velocity and location from concussions sustained in NFL games from 2015–2017 (Fig. 2). The composite of six impact locations at each of the two lowest test velocities (5.5 and 7.4 m s−1) were assigned weights of 25% each while the combination of impact locations at the highest test velocity (9.3 m s−1) accounted for 50% of the weighting for the HPS calculation. Across the three velocities, individual impact locations were given weights of 36% for Side Upper (SU), 13% for Oblique Front (OF), 16% for Side (C), 8% for Facemask Side (FMS), 15% for Facemask Central Oblique (FMCO), and 12% for Rear Oblique (D).
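The weighting described above can be illustrated with a short sketch of Eq. 1 and Eq. 2. It assumes that each condition weight M_i is simply the product of its speed share and its location share; the actual coefficients are given in Table 1, are not reproduced here, and may include additional scaling. The function names and example inputs are hypothetical.

```python
# Sketch of Eq. 1 and Eq. 2 under an ASSUMED weighting:
# M_i = (speed share) x (location share). The published Table 1
# coefficients may differ, e.g., by including a normalization.

SPEED_WEIGHTS = {5.5: 0.25, 7.4: 0.25, 9.3: 0.50}  # m/s -> share of HPS
LOCATION_WEIGHTS = {
    "SU": 0.36, "OF": 0.13, "C": 0.16,
    "FMS": 0.08, "FMCO": 0.15, "D": 0.12,
}


def harm(hic, damage):
    """Eq. 1: head acceleration response metric for a single test."""
    return 0.0148 * hic + 15.6 * damage


def condition_weights():
    """Assumed M_i for all 18 speed/location combinations."""
    return {
        (speed, loc): w_speed * w_loc
        for speed, w_speed in SPEED_WEIGHTS.items()
        for loc, w_loc in LOCATION_WEIGHTS.items()
    }


def hps(harm_by_condition):
    """Eq. 2: weighted sum of the 18 HARM values for one helmet."""
    weights = condition_weights()
    if set(harm_by_condition) != set(weights):
        raise ValueError("expected one HARM value per test condition")
    return sum(weights[cond] * value for cond, value in harm_by_condition.items())


# Hypothetical input: the same HARM value in every test condition.
uniform = {cond: 2.0 for cond in condition_weights()}
print(hps(uniform))
```

Because the assumed weights sum to one, a helmet with identical HARM in every condition receives that same value as its score, which makes the sketch easy to sanity-check.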

Figure 2 Target contribution of each test condition to the Helmet Performance Score (HPS) by speed and location weighted according to on-field concussion incidence. Weights are included for three speeds (5.5, 7.4, and 9.3 m/s) and six impact locations (side upper (SU), oblique front (OF), rear oblique (D), side (C), facemask side (FMS), and facemask central oblique (FMCO)).

On-Field Injury Rate Analysis

Helmet usage and injury data were obtained for five seasons of NFL games. The NFL Game Statistics & Information System (GSIS) provided a listing of each player participating in every play during the 2015 to 2019 seasons. Additionally, this dataset provided basic play type information (e.g., run, pass, kickoff) and the roster positions for each player. Kickoff and punt plays were further distinguished by the returned status of the ball (i.e., “returned” or “not returned”) using details provided in GSIS.

Helmets used by individual players were assigned using the NFL Game Management System (GMS) database. The GMS provides a listing of the helmet worn by each player for each game. For 2015 and 2016 this information was provided through a survey with equipment managers during the season across all teams indicating the helmet worn by each player during the season. The helmet model worn by a player when injured was confirmed by the team’s equipment managers using video. For the 2017 to 2019 seasons, helmet use was tracked on a per-game basis using a radio frequency identification (RFID) tag tracking system (Helmet Tracker, Lenexa, KS) attached to the helmet shell. For these seasons, helmets were scanned by equipment managers and assigned to each player on a game-by-game basis.

Over the same time period, concussions diagnosed using the NFL Game Day Concussion Diagnosis and Management Protocol were documented in the NFL electronic medical record (EMR) system.17,18 Helmets worn by players diagnosed with concussion were confirmed by direct communication with the team equipment staff after every game. The EMR was used to determine prior concussion count on a game-by-game basis for each player. A video-review process was conducted to identify the play in which each concussion occurred.25 This process involved using game footage, unaffiliated neurotrauma consultants, and athletic trainers to identify head impacts and signs of post-impact neurological impairment.25 In total, player information, helmet model, and play information were collected for 5,986 players and 1,040 concussions across 5,905,882 player-plays (i.e., a single player involved in a play constitutes one player-play such that there are 22 player-plays per snap). The helmet model was known for all 1,040 concussions and an individual play was identified for 908 (87%) of the concussions using the video review process.

Concussion rates in terms of concussions per player-play were calculated for individual helmet models that were used for greater than 50,000 player-plays. This minimum number of plays was selected because it corresponds to roughly ten expected concussions based on the average concussion rate. This “rule of ten” is often used by statisticians for a general estimation of the number of events required per predictor variable to provide reliable estimates within a statistical model.11,35
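The arithmetic behind this threshold can be checked directly from the totals reported above. The sketch below is illustrative only; the variable and function names are hypothetical.

```python
# Totals reported in the text for the full 2015-2019 data set.
TOTAL_CONCUSSIONS = 1040
TOTAL_PLAYER_PLAYS = 5_905_882
MIN_PLAYER_PLAYS = 50_000  # inclusion threshold per helmet model


def rate_per_10k(concussions, player_plays):
    """Concussions per 10,000 player-plays."""
    return 10_000 * concussions / player_plays


overall_rate = rate_per_10k(TOTAL_CONCUSSIONS, TOTAL_PLAYER_PLAYS)
# Expected number of concussions for a helmet model sitting exactly at the
# threshold, if it had the average rate.
expected_at_threshold = overall_rate * MIN_PLAYER_PLAYS / 10_000

print(f"{overall_rate:.2f} concussions per 10,000 player-plays")
print(f"{expected_at_threshold:.1f} expected concussions at the threshold")
```

At the average rate of about 1.76 concussions per 10,000 player-plays, a model with 50,000 player-plays would be expected to accumulate close to nine concussions, consistent with the "roughly ten" stated in the text.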

The relationship between laboratory performance and on-field injury rates was assessed using a logistic regression model that controlled for player position, play type, season year, season type (i.e., preseason, regular season, or postseason), and each player’s number of prior concussions in the NFL as recorded by team staff in the player’s medical record. Previous studies indicated a relationship between concussion risk and play type, player position, and prior concussion history.7,22,34,50 Season year was included in the regression analysis to account for different baseline levels of concussion risk associated with a season due to changes in the rules or style of play. The distinction between season types was incorporated to account for differences in play style and experience level of the players participating in different parts of the season.

Since the small number of injuries for an individual helmet model in the data set limits the ability to precisely measure the on-field performance, helmets were further grouped based on whether or not they were prohibited for use in the NFL based on the NFL NFLPA 2020 Helmet Laboratory Testing Performance Results (henceforth “helmet poster”).30 The complete list of prohibited helmet models can be found on the NFL’s Play Smart Play Safe website.30 A categorical designation of “prohibited” or “not prohibited” was assigned to each helmet model. A logistic mixed regression model was fit using the lme4 package for R version 4.0.0 (R Foundation for Statistical Computing, Vienna, Austria)4 to compare the probability of a concussion outcome in a given player-play for helmets prohibited for use in the NFL compared to non-prohibited helmet models. Random effects were included to account for repeated measurements on individual players. Statistical significance was assessed using likelihood-ratio tests.

Players wore one of the models included on the helmet poster in 5,492,908 player-plays (93% of all player plays in the data set). All of these player-plays were included in the analysis except those which had an unknown play type in the GSIS (n = 32,520) leaving a final sample size of 5,460,388. Plays with unknown play type were typically those that were whistled dead prior to the play developing. If a concussion occurred during a play with an unknown play type in GSIS, video review was used to determine play type for that play. Each player-play that resulted in a concussion with a play subsequently confirmed through the video review process was treated as an injury data point within the statistical model.

Play information was known for 879 of the 1,008 concussions included in this reduced data set of 5,460,388 player-plays. In order to maintain the total number of concussions in the data set, the remaining concussions for which a specific play could not be identified were randomly imputed to one of the player’s plays using an estimated per-play concussion probability accounting for play type and quarter, and further accounting for all information available about the timing of the concussion within the game. For example, if the player indicated that the concussion occurred in the 3rd quarter, the concussion was assigned to one of the player’s 3rd quarter plays. Imputing the play for the 12.8% of concussions where the exact play could not be identified generated a data set that captured the total number of concussions associated with each helmet and player position, and captured, on average, the observed rate of concussions by play type and game quarter. Ten iterations of imputation were performed to check the stability of the regression estimate related to helmet model prohibition.
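One way such an imputation step could be implemented is sketched below. This is not the authors' code: the record layout (`play_id`, `quarter`, `p_concussion`) and the fallback behavior when no play matches the reported quarter are assumptions made for illustration.

```python
import random


def impute_concussion_play(plays, reported_quarter=None, rng=random):
    """Choose one of a player's plays to carry a concussion whose play is unknown.

    plays: list of dicts with hypothetical keys "play_id", "quarter", and
        "p_concussion" (an estimated per-play concussion probability for
        that play's type and quarter).
    reported_quarter: quarter in which the injury was reported, if known.
    rng: the random module or a random.Random instance (for reproducibility).
    """
    candidates = plays
    if reported_quarter is not None:
        in_quarter = [p for p in plays if p["quarter"] == reported_quarter]
        # Assumed fallback: use all plays if none match the reported quarter.
        candidates = in_quarter or plays
    weights = [p["p_concussion"] for p in candidates]
    # Weighted random draw: higher-risk plays are more likely to be selected.
    return rng.choices(candidates, weights=weights, k=1)[0]["play_id"]


# Hypothetical plays for one player in one game.
example_plays = [
    {"play_id": 101, "quarter": 2, "p_concussion": 0.0002},
    {"play_id": 102, "quarter": 3, "p_concussion": 0.0009},
    {"play_id": 103, "quarter": 3, "p_concussion": 0.0001},
]
print(impute_concussion_play(example_plays, reported_quarter=3, rng=random.Random(1)))
```

Repeating the draw with different seeds and refitting the model each time mirrors the idea of checking that the estimate is stable across imputations.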

Results

Laboratory HPS for the 31 helmet models included in the study ranged from 0.874 to 1.163, with an average of 1.03 and a standard deviation of 0.07. Of the 31 helmet models tested, 24 had an HPS within one standard deviation of the average. The helmet models tested in the laboratory study were worn in 89% of the total player-plays in games over the 5-year span. Concussion rates for individual helmet models with greater than 50,000 player-plays ranged from 1.23 to 3.11 concussions per 10,000 player-plays. A weighted linear regression (weighted by number of player-plays) revealed a positive correlation between the on-field concussion rates for helmet models with more than 50,000 exposure player-plays and HPS for individual helmet models (r² = 0.61, p < 0.001) (Fig. 3).
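For readers unfamiliar with exposure-weighted regression, the following sketch shows how a weighted least-squares line and its r² can be computed. The input values are made up for illustration and are not the study data; the function name is hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r2), where r2 is the weighted coefficient
    of determination.
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r2 = sxy ** 2 / (sxx * syy)
    return intercept, slope, r2


# Made-up example: HPS, concussions per 10,000 player-plays, and exposure.
hps_values = [0.90, 1.00, 1.05, 1.15]
rates = [1.4, 1.7, 1.9, 2.8]
player_plays = [400_000, 900_000, 250_000, 60_000]

intercept, slope, r2 = weighted_linear_fit(hps_values, rates, player_plays)
print(f"slope={slope:.2f}, r2={r2:.2f}")
```

Weighting by player-plays means that helmet models with more exposure, and therefore more precisely estimated concussion rates, have more influence on the fitted line.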

Figure 3 On-field concussion rate (concussions per 10,000 player-plays) compared to laboratory helmet performance score (HPS) by helmet model for the 2015–2019 NFL seasons. Bubble size indicates number of player-plays.

The raw (unadjusted) concussion rate for the group of helmets prohibited by the 2020 helmet poster was 1.62 times higher than that of non-prohibited helmet models. After accounting for effects of play type, player position, prior concussion count, season year, and season type, the regression model showed that the concussion rate for prohibited models was significantly higher than for helmets that were not prohibited, with the odds of sustaining a concussion while wearing a prohibited helmet being 1.24 (95% CI 1.04–1.47; p = 0.017) times higher than if wearing a non-prohibited helmet model.
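As a rough consistency check on the reported odds ratio, the standard error and an approximate p-value can be recovered from the confidence interval. The sketch below uses a Wald (normal) approximation on the log-odds scale, which is an assumption of this example; the study assessed significance with likelihood-ratio tests, so the result need not match the reported p-value exactly.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-odds standard error, z statistic, and two-sided
    p-value from an odds ratio and its 95% confidence interval,
    assuming a symmetric interval on the log scale."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = wald_from_ci(1.24, 1.04, 1.47)
print(f"SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

The approximate p-value falls in the same range as the reported one, which suggests that the odds ratio, interval, and significance level are mutually consistent.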

The percentage of player-plays attributed to each of the helmet models with more than 50,000 player-plays was tabulated for each of the covariates used in the regression analysis (Table 2). All covariates included in the regression model were statistically significant to a significance level of 0.05 (Table 3). The model indicated that returned kickoffs were associated with the highest probability of concussion, while point-after-touchdown, non-returned kickoff, and non-returned punt plays had the lowest probability of concussion. The preseason had the highest odds ratio for concussion compared to the regular and postseasons. The highest and lowest concussion probabilities by position were associated with tight ends and defensive linemen, respectively. In general, concussion probability trended downward with season year with the 2018 and 2019 seasons having a significantly lower concussion rate than the 2015 season. Lastly, the model indicated that concussion probability increased with the number of prior concussions a player sustained.

Table 2 Percentage of player-plays by player position, play type, season type, and season year attributed to each of the helmet models with more than 50,000 player-plays. Percentages are based on the whole data set of 5,460,388 player-plays.
Table 3 Results from logistic regression model to assess relationship between helmet model prohibition, position, season year, play type, season type, and prior concussion count and concussion risk.

Discussion

This study assessed the relationship between laboratory performance of helmet models and on-field concussion rates in the NFL by analyzing raw concussion rates for individual helmet models and through a logistic regression analysis that accounted for various factors contributing to concussion risk. The laboratory performance of helmets tested correlated positively with the incidence of concussion for helmet models worn during professional football games, with worse-performing (i.e., higher HPS) helmet models exhibiting a statistically significantly higher concussion rate on-field. When accounting for additional factors that may affect concussion risk, helmet models prohibited by the NFL based on poor laboratory performance were associated with odds of concussion 1.24 times those of non-prohibited helmet models (p = 0.017). Since other factors influenced concussion rates to a degree similar to or greater than helmet model, it was necessary to incorporate these explanatory variables into a logistic regression model to assess the effect of helmet model on concussion risk.

Previous studies have investigated the relationship between helmet model and injury rate using more limited data.9,10,26,28,40 A study by Rowson and Duma (2014) demonstrated a significant difference between on-field helmet performance of two helmet models in terms of concussions per impact.40 Moran et al. analyzed helmet-specific concussion rates in the NCAA and concluded that modern helmets had lower concussion rates per athlete exposure (presence in a game or practice) than older models.28 The present study expands upon these previous efforts by calculating concussion rates for a larger number of helmet models (n = 31), measuring exposure on a more granular level (the player-play), conducting an in-depth video review to identify concussion plays,25 and accounting for a number of important confounding variables (e.g., season year and play type) in the statistical regression model.

Concussion frequency varied substantially by player position and play type in this study, which is in agreement with previous studies.7,34 For example, the concussion rate is higher on kickoffs than run plays because players generally approach each other over larger distances traveled and at higher speeds. One of the primary confounding factors in this analysis was that helmet model choices varied substantially by player position. In general, speed players preferred smaller, lighter helmets like the Schutt Air XP Pro, which accounted for 25% of wide receiver and cornerback player-plays. Only 3% of linemen player-plays were in the Air XP Pro (1.75 kg), while 37% were in helmets with the Riddell SpeedFlex shell (2.03–2.21 kg), which were among the largest and heaviest helmets tested. Likewise, offensive and defensive linemen exhibit very different impact profiles and concussion rates compared to cornerbacks and wide receivers.25 On average, linemen experience a high number of low-severity head impacts at the line of scrimmage, while cornerbacks and wide receivers tend to experience fewer head impacts overall, but a higher number of higher-severity open-field impacts.7,21,27,36 For these reasons, player position and play type were included as covariates in the logistic regression analysis in the absence of specific information about the frequency and severity of individual impacts for individual players.

Season year was included in the regression analysis to account for different baseline levels of concussion risk by season that may have been influenced by rule changes. Rule changes implemented during this time included stricter regulations about lowering the head to initiate contact with the helmet (2018), kickoff rule changes (2016, 2018), and the prohibition of the blindside block (2019).29 In 2018, the NFL and NFLPA initiated the prohibition of certain helmet models.31 Beyond the rule changes, the inclusion of the season year served as a catch-all for season-dependent changes to the exposure environment, which include changes to play style as well as the introduction of new helmet models each season. Regression model odds ratios trended downward, with the lowest probability of concussion being associated with the 2018 season. Players during the 2018 and 2019 seasons each had significantly lower odds of concussion than those who played during the 2015 season (p < 0.05).

Preseason plays were found to have the highest odds of concussion compared to regular and postseason plays. This finding is consistent with the results of a recent NCAA study by Stemper et al. that observed higher concussion rates during the preseason.44 Though Stemper et al. hypothesized that this may be due to more frequent repetitive head impacts during preseason practices, many factors distinguish the preseason from the remainder of the season such as a higher participation of rookie players.44

An individual player’s history of concussion was incorporated into the model by including the number of prior concussions a player had sustained. While these data alone cannot distinguish players exhibiting risky behavior from an underlying change in their susceptibility to concussion, the risk of concussion has independently been shown to increase with the number of prior concussions sustained.1,41 The present study found prior concussion count to be a significant predictor of concussion (p < 0.001). While this regression model treated the number of prior concussions as a linear regressor, future efforts should investigate alternative modeling approaches since the effect may be nonlinear.

An on-field study of the distribution of head kinematics by helmet model directly measured via head impact sensors would greatly improve the understanding of the relationship between on-field and laboratory helmet performance. Such a study would provide a measure of impact severity for each impact exposure. While such studies have been performed using the Head Impact Telemetry System (HITS), these data sets have been limited to helmet models compatible with HITS and have studied only lower levels of play.6,14,38 Studies that have collected data from head impacts using other sensor types have not collected helmet model information and have also been limited to lower levels of play.23,24 In the absence of sensor data, the present study has utilized exposure by play type as a surrogate for helmet impact exposure, which the authors recognize as a limitation.

Another limitation of this study is that the laboratory testing was limited to a finite number of impact conditions, specifically designed to mimic helmet-to-helmet collisions, which are the most common impact scenario for concussions in the NFL but account for only about 36% of concussions.25 Helmet-to-ground and helmet-to-body (or -shoulder) impacts represent other observed injury scenarios. While the test protocol gave some consideration to impact locations associated with non-helmet-to-helmet impact sources, a helmet’s ability to protect its wearer in a helmet-to-helmet collision may not reflect its protective capabilities in other types of collisions, such as helmet-to-body and helmet-to-ground collisions, due to differences in the interface characteristics (i.e., stiffness and effective mass). The laboratory test protocol also weighted the importance of each test speed and impact location by concussion incidence independent of player position. It is recognized, however, that different positions experience different types of concussive impacts. Therefore, it is hypothesized that position-specific weighting of different impact locations and speeds may strengthen the relationship between laboratory and on-field performance of helmets, but this was beyond the objective of assessing general performance of helmets for this evaluation. Additional limitations of the laboratory test method arise from the fact that only one size of helmet per model was tested. Helmet performance may vary by size in scenarios where two sizes of helmet share the same shell size but the thickness of the padding material filling the space between the head and shell varies. Furthermore, helmets that are customized for a particular player’s head size and shape (e.g., the Riddell SpeedFlex Precision and Schutt F7 UR1 models) may perform differently due to different effective thicknesses or stiffnesses of padding in regions of the helmet despite having the same helmet model designation. These factors may influence the ability of the laboratory test to predict on-field performance of different helmet models.

In summary, concussion rates for players wearing specific helmet models have been analyzed to assess a laboratory test protocol’s ability to relate to on-field performance. The relationship between laboratory and on-field performance demonstrated by this study supports the use of laboratory test results for ranking helmet performance and for prohibiting helmets that demonstrate poor performance. A major finding of this study was that in addition to performing worse than other helmets in laboratory testing, helmets prohibited by the NFL have a statistically significantly higher concussion rate than non-prohibited models. As stated throughout the paper, it is important to note that the laboratory testing performed for this study was tailored to impact conditions characteristic of concussive impacts sustained by NFL players in games. The information on helmet performance from this study (both laboratory and on-field) is indicative of performance at the professional level of play and should not be applied to lower levels of football (e.g., youth, high school, or collegiate).