Key Points

Student-athletes participating in contact sports do not show clinically meaningful differences in pre-season cognitive and balance testing compared with their peers in limited and non-contact sports.

Data distributions are presented on a large, racially diverse cohort of male and female uninjured college athletes for commonly implemented concussion assessments.

There are potential limitations when using published ImPACT norms for injured athletes.

1 Introduction

Concussion, or mild traumatic brain injury (mTBI), is a major concern in both sport and military medicine. The media has alerted the public to this health issue not only in the United States but also worldwide, with growing focus and concern among clinicians, researchers, sporting organizations, and athletes [1,2,3,4,5,6]. Current estimates indicate that nearly four million sport and recreation-related concussions occur in the US annually [4]. While the medical community remains vigilant in its search for improved methods of treatment and injury management, there continue to be concerted efforts to improve the initial identification and diagnosis of mTBI and to refine the decision-making process for a safe return to activity that protects the athlete against potential long-term negative consequences.

The clinical effects of concussion can be subtle and difficult to detect with conventional assessment tools [7, 8], and in the absence of well-validated diagnostic biomarkers, most of the information available to the clinician comes from patient-reported symptoms. These symptoms are frequently under-reported, whether to hide the injury, because of competing messages from stakeholders who pressure medical personnel for early return to play [9], or in hopes of a rapid return to competition [10, 11].

To address these challenges, the Concussion Assessment, Research, and Education (CARE) Consortium was established to define the short-term (i.e., 6-month) natural history of concussion in college-age participants across the entire spectrum of competitive collegiate sport. The CARE Consortium represents the largest multicenter cohort of its kind to date and is designed to compare clinical, biomechanical, and imaging data on male and female athletes who participate in diverse sports. The structure of the CARE Consortium has been described previously [12]. The aims of this initial report from a national dataset of collegiate student-athletes were threefold: (1) detail the cohort’s demographics and characteristics, (2) define normative baseline data for concussion symptoms, postural stability, and cognitive and neurological status, and (3) assess differences among sports by contact level.

2 Methods

At the time of writing, the CARE study enrolled all consenting varsity athletes participating in National Collegiate Athletic Association (NCAA) sports at 26 colleges and universities, and all consenting cadets (NCAA athletes and non-NCAA athletes) at three US military service academies. This report focuses on only the NCAA student-athletes at the colleges, universities, and service academies.

Fourteen schools began enrollment in the fall of 2014, six additional schools in 2015, and another nine schools in 2016; all three NCAA divisions are represented among the participating schools. Institutional Review Board approval was obtained from all research sites and data centers, with additional approval provided by the Human Research Protection Office (HRPO) of the Department of Defense. The study was performed in accordance with the standards of ethics outlined in the Declaration of Helsinki. All athletes provided written informed consent prior to data collection. During each academic year and prior to the start of the competitive season, consenting student-athletes provided self-reported demographic information and medical history. This included information about self-reported concussion-related symptoms, concussion history (diagnosed and undiagnosed), household income, years in the primary sport, age, and race. Participants also underwent assessment of neurocognitive function, neurological status, and postural stability administered by trained research personnel.

2.1 Data Collection

To better understand post-concussion performance across multiple neurocognitive assessment platforms, sites were permitted to use the platform with which they were most familiar. Each site chose one test and used only that test. The majority of CARE sites used Immediate Post-Concussion Assessment Testing (ImPACT®) (n = 24) [13], and thus the ImPACT analyses were limited to those sites. All other analyses used data from all 29 sites. The neurocognitive data used for this analysis were limited to ImPACT’s four major composite scores (verbal memory, visual memory, visual motor speed, and reaction time). In addition to ImPACT, the Standardized Assessment of Concussion (SAC) was implemented as a neurocognitive screening tool [14,15,16,17], and the Balance Error Scoring System (BESS) was used to evaluate postural stability [18]. The Sport Concussion Assessment Tool (SCAT) symptom checklist [19] was employed to assess common post-concussion symptoms, and the Brief Symptom Inventory–18 (BSI-18) was used to evaluate psychological health status [20]. All site principal investigators underwent standardized training on these measures at annual meetings and then trained the research personnel at their sites. Tests were administered in small groups (e.g., neurocognitive testing), one on one (e.g., SAC and BESS), or completed individually by the athletes (e.g., symptom reports). Invalid tests were excluded and, when available, a retaken test was considered as the baseline.

2.2 Statistical Analyses

Routine descriptive statistics were used to summarize categorical and continuous data. The primary factor of interest for this report was contact category. Athletes were placed in one of three contact categories (contact, limited contact, non-contact) based on the amount of contact typically associated with their sport according to the American Academy of Pediatrics [21], and these were slightly modified by moving track and cross country to limited contact for consistency with the NCAA’s Injury Surveillance System (Table 1). Violin plots [22] are also presented for the tests by sport category to illustrate the shape of the data distributions, the range of values, interquartile range (white rectangle), and the median.

Table 1 Sport classification

Comparisons among the three contact categories were assessed using statistical models adjusting for gender [23], concussion history [24], and household income [25]. Years of participation in the primary sport was also considered, but the quality of the data was not sufficient for analysis. For example, 500 athletes (3%) reported 0 or 1 year in their primary sport. In addition, estimating the age at which athletes started their sport by subtracting the number of years playing their primary sport from their age resulted, for some athletes, in a starting age at which they would not yet have been able to walk or run.
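The data-quality problem described above can be illustrated with a short screening routine. This is a minimal sketch rather than the procedure used in the study: the record layout, the function names, and the minimum plausible starting age of 3 years are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed threshold (not from the study): youngest age at which starting
# an organized sport is treated as plausible.
MIN_PLAUSIBLE_START_AGE = 3


@dataclass
class AthleteRecord:
    athlete_id: str              # hypothetical identifier
    age: int                     # self-reported age in years
    years_in_primary_sport: int  # self-reported years of participation


def implied_start_age(record: AthleteRecord) -> int:
    """Age at which the athlete would have started, per self-report."""
    return record.age - record.years_in_primary_sport


def flag_implausible(records, min_start_age=MIN_PLAUSIBLE_START_AGE):
    """Return the ids of records whose implied start age is too young."""
    return [
        r.athlete_id
        for r in records
        if implied_start_age(r) < min_start_age
    ]


# Made-up records, for illustration only.
sample = [
    AthleteRecord("A1", age=19, years_in_primary_sport=10),
    AthleteRecord("A2", age=18, years_in_primary_sport=18),
    AthleteRecord("A3", age=20, years_in_primary_sport=1),
]
print(flag_implausible(sample))
```

A screen of this kind only identifies records that are internally inconsistent; it cannot say which of the two self-reported values is wrong.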

Most of the primary scores had distributions that were amenable to analysis using linear regression. These included SAC, BESS, and the four ImPACT domains (visual memory, verbal memory, reaction time, and processing speed). The SCAT symptom count and severity score and the BSI-18 all exhibited extreme right skewness because of an excess of zero values and overdispersion. Therefore, the most appropriate model was a zero-inflated negative binomial (ZINB) model [26].
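Two simple diagnostics show why a count-type score might call for a zero-inflated, overdispersed model: the fraction of zero values, and the ratio of the variance to the mean (a ratio well above 1 suggests overdispersion relative to a Poisson model). The sketch below is illustrative only; the function names and the scores are made up and are not CARE data.

```python
import statistics


def zero_fraction(scores):
    """Proportion of scores that are exactly zero."""
    return sum(1 for s in scores if s == 0) / len(scores)


def dispersion_ratio(scores):
    """Variance-to-mean ratio; equals 1 in expectation for Poisson counts."""
    return statistics.pvariance(scores) / statistics.fmean(scores)


# Hypothetical symptom-severity scores: many zeros and a long right tail.
scores = [0, 0, 0, 0, 0, 0, 1, 2, 3, 5, 8, 21]

print(f"zero fraction: {zero_fraction(scores):.2f}")
print(f"variance / mean: {dispersion_ratio(scores):.2f}")
```

These are descriptive checks, not formal tests; they only motivate the model choice described in the text.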

The ZINB model combines two processes. The first process assumes that the excess zeros come from a binary distribution (i.e., two outcomes: zero and not zero), and the second process models zero and the other values with a negative binomial distribution. The model can be written statistically as

$$f_{\text{ZINB}}(y; x, z, \beta, \gamma) = \begin{cases} f_{\text{zero}}(0; z, \gamma) + \left(1 - f_{\text{zero}}(0; z, \gamma)\right) f_{\text{count}}(0; x, \beta) & \text{if } y = 0 \\ \left(1 - f_{\text{zero}}(0; z, \gamma)\right) f_{\text{count}}(y; x, \beta) & \text{if } y \ge 1 \end{cases}.$$
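The mixture density can be evaluated numerically. The following is a minimal sketch and not the fitting code used in the study. It assumes the negative binomial count part is parameterized by a mean `mu` and a dispersion `theta` (the text does not state the parameterization), and it treats `pi_zero`, standing in for $f_{\text{zero}}(0; z, \gamma)$, as a fixed number rather than as the output of a logistic regression on covariates. All parameter values are hypothetical.

```python
import math


def nb_pmf(y: int, mu: float, theta: float) -> float:
    """Negative binomial probability of count y (mean mu, dispersion theta)."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y: int, pi_zero: float, mu: float, theta: float) -> float:
    """Zero-inflated negative binomial probability of count y.

    pi_zero plays the role of f_zero(0; z, gamma): the probability that an
    observation comes from the always-zero process.
    """
    if y == 0:
        # A zero can arise from either process.
        return pi_zero + (1.0 - pi_zero) * nb_pmf(0, mu, theta)
    # A positive count can only arise from the count process.
    return (1.0 - pi_zero) * nb_pmf(y, mu, theta)


# Hypothetical parameter values, chosen only for illustration.
PI_ZERO, MU, THETA = 0.3, 4.0, 1.5

total = sum(zinb_pmf(y, PI_ZERO, MU, THETA) for y in range(2000))
print(f"P(Y = 0) under ZINB: {zinb_pmf(0, PI_ZERO, MU, THETA):.3f}")
print(f"P(Y = 0) under NB alone: {nb_pmf(0, MU, THETA):.3f}")
print(f"Sum of probabilities over 0..1999: {total:.6f}")
```

In a fitted model, `pi_zero` and `mu` would each depend on the explanatory variables, which is why two sets of coefficients are obtained.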

Thus, the model can be thought of in two parts: first, a logistic regression model for the binary (zero/non-zero) part and then a negative binomial regression model. Both models include explanatory variables, so we obtained two sets of coefficients for the independent variables in the model. All pairwise follow-up comparisons in the regression models (linear and ZINB) among the three sport categories were adjusted using Tukey’s multiple comparison procedure within each outcome. Adjusted effect sizes (ES) were calculated as the difference between the adjusted group means divided by the overall standard deviation of the score. These can be interpreted along a scale similar to that of Cohen’s d (e.g., 0.2 is small, 0.5 is medium, and 0.8 is large), which uses the unadjusted difference in the means. Ninety-five percent confidence intervals (95% CIs) for the ES are also presented. For the ZINB logistic regression, adjusted differences in the percentage of zeroes (DZ) are presented. For example, if the estimated percentages of zeroes from the model, adjusted for covariates, were 30% for the contact sport group and 20% for the non-contact group, then the DZ is 10%.
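The two summary measures defined above reduce to simple arithmetic once the adjusted estimates are available. The sketch below assumes the adjusted means and percentages have already been obtained from a fitted model. The function names are hypothetical, the DZ inputs are the 30% and 20% from the example in the text, and the adjusted means and standard deviation are made-up values.

```python
import math


def adjusted_effect_size(adj_mean_a, adj_mean_b, overall_sd):
    """Difference in covariate-adjusted group means over the overall SD."""
    return (adj_mean_a - adj_mean_b) / overall_sd


def difference_in_zeros(pct_zero_a, pct_zero_b):
    """DZ: difference in adjusted percentages of zero scores."""
    return pct_zero_a - pct_zero_b


def describe_magnitude(es):
    """Label |ES| against the conventional 0.2 / 0.5 / 0.8 benchmarks."""
    size = abs(es)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# DZ example from the text: 30% zeros (contact) vs 20% (non-contact).
dz = difference_in_zeros(30.0, 20.0)

# Hypothetical adjusted means and overall SD, for illustration only.
es = adjusted_effect_size(27.2, 27.0, 2.0)

print(f"DZ = {dz:.0f} percentage points")
print(f"ES = {es:.2f} ({describe_magnitude(es)})")
```

Dividing by the overall standard deviation, rather than a pooled within-group value, keeps the denominator the same for every pairwise comparison of a given score.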

3 Results

3.1 Baseline Characteristics

Only the first baseline evaluation completed by the athlete was used, as some subjects have had multiple assessments as part of this ongoing study. From August 2014 through September 2016, we enrolled 15,681 unique athletes from the 29 institutions, 6423 females (41%) and 9258 males (59%). Fifty-three percent of the athletes were in the contact sport group, 31% were in the limited contact group and 17% were in the non-contact group (Table 2). Amongst the females, 2193 (34%) participated in contact sports, 2555 (40%) in limited contact sports, and 1675 (26%) in non-contact sports. For the males, 6073 (66%), 2234 (24%), and 951 (10%) participated in contact, limited contact, and non-contact sports, respectively. The three sports with the greatest number of participants were football (n = 3514, 22% of all participants), cross country and track (n = 1944, 12%), and soccer (n = 1774, 11%). Within genders, the three sports with the greatest number of male participants were football (n = 3514, 38% of all males), cross country and track (n = 927, 10%), and baseball (n = 899, 10%). The three sports with the greatest number of female participants were cross country and track (n = 1017, 16%), soccer (n = 911, 14%), and rowing (n = 832, 13%). Of African American athletes, 75% were involved in contact sports as compared with only 48% of White athletes. Over half of all African American athletes played football. Table 2 summarizes the number of participants by contact category, gender, race, and sport.

Table 2 Number (%) of athletes by contact category, sport, gender, and race

The project began in August 2014, when we tested all consenting athletes regardless of their year of academic standing. As such, the class with the greatest participation was freshmen (44%), with participation decreasing for sophomores (21%), juniors (19%), and seniors (13%). A small fraction were fifth-year seniors or graduate students. This pattern was consistent across each contact category.

Table 3 presents the self-reported concussion history for the participants. A total of 11,218 (72%) participants reported never having a concussion. Twenty percent of all subjects reported a single concussion diagnosis (females 18%, n = 1121; males 21%, n = 1939). Fewer than 3% (n = 323) reported a history of three or more diagnosed concussions. By gender, 74% of females (n = 4733) and 70% of males (n = 6485) reported no previous concussions. The contact sport group had the largest percentage of athletes who reported previous concussions (34%), followed by the limited contact group (21%) and the non-contact group (15%). The sport with the most participants with at least one previously diagnosed concussion was football; 1185 of 3514 football players (34%) reported at least one concussion. For soccer, 32% of the males (269/848) and 42% of the females (375/896) reported at least one previously diagnosed concussion. For basketball, 31% of the females (130/420) and 29% of the males (124/422) reported at least one previously diagnosed concussion. Soccer had the highest percentage of women with at least one prior concussion; among men’s sports, ice hockey had the highest percentage (43%, 73/168) of participants with at least one concussion. In every limited contact sport, 29% or fewer of the athletes had a prior concussion, and in every non-contact sport, at most 20% of athletes had at least one prior concussion.

Table 3 Self-reported concussion history frequency (%) by contact category, sport, and gender

3.2 Primary Assessments at Baseline

The means and standard deviations of ImPACT test performances for 11,611 athlete baselines by sport category and sport are presented in Table 4. The linear regression model for the two ImPACT memory composite scores (verbal and visual) showed significantly higher (better) mean values for the contact group compared with the non-contact group (ES 0.08, 95% CI 0.02–0.13, p = 0.018; and ES 0.09, 95% CI 0.05–0.15, p = 0.001, respectively). In addition, athletes in limited contact sports scored higher on the verbal and visual scores than the non-contact group (ES 0.09, 95% CI 0.03–0.14, p = 0.007; and ES 0.11, 95% CI 0.05–0.16, p < 0.001, respectively). For reaction time, the non-contact and limited contact groups were significantly faster than the contact group (ES 0.08, 95% CI 0.03–0.12, p = 0.008; and ES 0.15, 95% CI 0.08–0.18, p < 0.001, respectively). There were no contact category differences for processing speed. The full distributions for these composite scores are visualized in Fig. 1.

Table 4 Mean (SD) of ImPACT test performance by contact category, sport, and gender
Fig. 1 Violin plots of ImPACT composite scores illustrate the shape of the data distribution over the full range of values, the interquartile range (white rectangle) covering the middle 50% of the values, and the median (line in the rectangle). ImPACT Immediate Post-Concussion Assessment Testing, NCAA National Collegiate Athletic Association

Table 5 presents the results for the clinical tests by contact category, sport, and gender. The contact sport group generally had lower mean scores for the SCAT symptom count and severity score and BSI-18, but worse scores for the SAC (lower) and the BESS (higher). These differences were small and of questionable clinical significance. The full distributions for these instruments are presented in Fig. 2. The distributions of scores for the SAC and the BESS are symmetric. The BSI-18 and the two SCAT scores each have a large proportion of zero values, as evidenced by the large flat base of the plots and the skewness from the long tail.

Table 5 Mean (SD) clinical test performance by contact category, sport, and gender
Fig. 2 Violin plots of clinical evaluations illustrate the shape of the data distribution over the full range of values, the extreme right skewness of the BSI-18 and the SCAT scores, the interquartile range (white rectangle) covering the middle 50% of the values, and the median. BESS Balance Error Scoring System, BSI-18 Brief Symptom Inventory–18, NCAA National Collegiate Athletic Association, SAC Standardized Assessment of Concussion, SCAT Sport Concussion Assessment Tool

The linear regression model for the SAC showed a significantly lower (worse) adjusted mean score for the contact group than either the limited or non-contact groups (ES 0.06, 95% CI 0.02–0.10, p = 0.015; ES 0.08, 95% CI 0.13–0.13, p = 0.008). There was no difference between the two latter groups. For BESS, the regression model showed no statistically significant differences among the three sport categories.

The BSI-18 and the two SCAT scores (symptom count and severity score) were analyzed using the ZINB model, so conclusions about significant differences are made for both parts of the model (the proportion of zero values and the group means). For all three instruments (BSI-18, SCAT symptoms, SCAT symptom severity), the proportion of zero values (indicating no symptoms or no problems) for the contact group was significantly higher than both the limited contact group (DZ 6%, p < 0.001; DZ 4%, p = 0.008; DZ 4%, p = 0.004, respectively) and the non-contact group (DZ 9, 7, and 8%; p < 0.001 for all three instruments).

Athletes competing in contact sports had significantly better scores than both the limited contact and non-contact sport athletes for the BSI-18 (ES 0.02, 95% CI 0.01–0.04; and ES 0.04, 95% CI 0.03–0.06; p < 0.001 vs either group), SCAT symptoms (ES 0.02, 95% CI 0.01–0.04, p = 0.007; and ES 0.06, 95% CI 0.03–0.06; p < 0.001, respectively) and for SCAT symptom severity (ES 0.01, 95% CI 0.004–0.06; and ES 0.03, 95% CI 0.02–0.04; p < 0.001 vs either group). For the SCAT symptom and symptom severity scores, the limited contact group had, on average, fewer (ES 0.04, 95% CI 0.03–0.06, p < 0.001) and less severe (ES 0.02, 95% CI 0.01–0.03, p < 0.001) symptoms than the non-contact group.

4 Discussion

The most valuable study characteristic, which was a design intent of the CARE Consortium, is simply the extensive number of athletes enrolled in the study (N = 15,681 at time of writing) and the wide variety of sports (N = 24), including those with documented higher risks of concussion (e.g., football, soccer, basketball), but also those with low risk (e.g., tennis, golf, sailing). The largest number of enrolled athletes participated in football (n = 3514), followed by cross-country and track (n = 1944), and soccer (n = 1774).

Overall, 6485 males (70.0% of all men in the study) and 4733 females (73.7% of all women in the study) reported no concussion history, a substantial number on which to establish cognitive and clinical norms for college athletes with no history of concussion. Of note in CARE, we ask about both diagnosed and undiagnosed concussions and the combined number was used in these analyses. Although the accuracy of recalled concussion history has been questioned in the literature [27,28,29], this is the best available estimate for previous injury. Approximately 34% (2769/8063) of contact sport athletes reported having sustained at least one concussion prior to enrolling in the study. A history of at least one concussion was reported by 21% (967/4696) and 15% (388/2593) of athletes in limited contact and non-contact sports, respectively. While history of three or more concussions has been implicated as a risk for future concussive injury in collegiate football [30], only 2.7% of the athletes in contact sports (and 2.6% of the football players) reported a history of three or more concussions. The frequency of athletes with a history of three or more concussions was similar in the limited contact (1.8%) and non-contact (1.4%) groups.

4.1 Neurocognitive Evaluation (ImPACT)

Most of the published literature on the effect of concussion on cognition compares cognitive function in concussed versus non-concussed control subjects or compares post-concussion results with each subject’s personal baseline, usually obtained prior to training and competition. In either case, the number of control subjects or individual baseline results tends to be relatively small and study-specific. Sample-specific data make comparisons across studies a challenge and make interpretation of differences or similarities with published studies impractical and sometimes even futile.

To establish a normative dataset, we first compared the findings from this cohort with published norms in the ImPACT Clinical User’s Manual [31]. This manual presents tables according to commonly used classifications (e.g., low average, average, high average) as well as by percentiles, with the 25th and 75th percentiles as the range for ‘average.’ For male university students (n = 410), the manual shows the range of scores that would be ‘average’ for verbal memory (83–94; median 88.82), visual memory (69–94; median 77.78), reaction time (0.52–0.60; median 0.553), and processing speed (32.5–42.0; median 37.23). Table 4 shows that the mean baseline results for males in the cohort are all within those ranges: at or marginally above the median for verbal (52nd percentile) and visual memory (57th), while reaction time (29th) was closer to the borderline between average and low average and processing speed (75th) was closer to the borderline between average and high average. For females (n = 97), the test manual reports average baseline ranges of 87–97 (median 91.67), 70–88 (median 77.8), 0.52–0.59 (median 0.541), and 34.4–42.1 (median 38.65), respectively. Table 4 also shows that the mean values for females in the cohort were in the middle of the average range for verbal (48–51st) and visual memory (45–48th), processing speed (30th) was in the lower end of the average range, and reaction time (72–67th) was at the higher end of the average range.

It is worth noting that the percentile ranges for each category, as defined by ImPACT and then applied to the CARE data, show some discrepancies (Table 6). For example, the range of ‘average’ (25–75th percentile) for verbal memory is 45% wider for males in the CARE dataset, while for females it is 60% narrower. For visual memory, males in the CARE study had a 24% narrower range and the range for the females was similar (+ 5%). This could simply represent random variation or may be a function of the study samples. ImPACT defines their sample as ‘university students’ without description of geographic diversity or athletic status. The CARE sample included student-athletes from across the spectrum of NCAA member institutions and sports. The CARE dataset may therefore be more representative of collegiate athlete neurocognitive performance, and clinicians may find that it better reflects their own athletes. The injured male student-athlete who scores a 77 for verbal memory would be within the upper range of ‘low average’ based on the CARE norms, but ‘borderline’ using the ImPACT norms. The female student-athlete who scores 93 on verbal memory would be ‘low average’ using CARE norms versus ‘average’ using the ImPACT norms. This presents a potential limitation of the ImPACT norms for use with an injured athlete, despite the typical use of test results over time to document change as the athlete recovers.
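The comparison of ‘average’ ranges described above amounts to computing each sample’s 25th to 75th percentile band and expressing the change in band width relative to the reference norms. The sketch below illustrates that calculation only. The function names are hypothetical, the reference band is the 83–94 range for male verbal memory quoted from the manual, and the cohort band of 80–96 is a made-up value chosen so that the arithmetic is easy to follow; it is not taken from Table 6.

```python
import statistics


def average_band(scores):
    """Return the (25th, 75th) percentile band of a list of scores."""
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    return q1, q3


def relative_width_change(reference_band, cohort_band):
    """Percent change in band width relative to the reference band.

    Positive values mean the cohort band is wider than the reference band.
    """
    ref_width = reference_band[1] - reference_band[0]
    cohort_width = cohort_band[1] - cohort_band[0]
    return 100.0 * (cohort_width - ref_width) / ref_width


reference_band = (83.0, 94.0)  # manual's 'average' range, male verbal memory
cohort_band = (80.0, 96.0)     # hypothetical cohort band, illustration only

change = relative_width_change(reference_band, cohort_band)
print(f"Band width change vs reference: {change:+.0f}%")
```

With raw scores available, `average_band` would supply the cohort band directly; percentile conventions differ slightly between software packages, so small discrepancies between implementations are to be expected.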

Table 6 Comparison of ImPACT norms versus CARE norms using ImPACT’s grading criteria by test results and gender

4.2 Clinical Assessment Tools

The CARE cohort includes a racially and socioeconomically diverse group of collegiate athletes from a wide variety of sports (N = 24), including those with documented higher risks of concussion as well as those with low risk, and it includes the largest number of women athletes to date. Not surprisingly, the results of the preseason neurocognitive and concussion-related battery of tests are overwhelmingly within standard norms. What was unexpected is that, for many of the measures, the raw scores for those participating in contact sports were consistently better than the scores for those who participated in limited contact and non-contact sports.

Statistical modeling of these results (adjusted for gender, concussion history, and household income) showed significantly better mean scores for the contact group for many of these clinical measures, including the SCAT and BSI-18, which measure symptoms. The exceptions were the SAC, where the contact group had significantly lower scores, and the BESS, where there were no differences. It should be noted, however, that the average differences between contact categories for all the assessments were quite small, and therefore not likely to be clinically significant. Taken together with comparable data in the literature, these results indicate that the contact sport cohort does not differ in a clinically meaningful way on any clinical or neurocognitive measure from their peers in sports that involve lesser degrees of contact.

Several factors unrelated to health or mental condition may influence athlete responses to these instruments. For example, it is common medical practice for interscholastic contact sport athletes to receive a preseason evaluation, which might introduce a practice effect that results in higher performance relative to their limited and non-contact athlete peers, who might not have undergone a similar preseason evaluation. Other factors that can affect performance on clinical and neurocognitive assessments include athlete age, the number of athletes present during group administration, fatigue and anxiety, motivation, misunderstanding of test directions, and room or computer noise distractions, among others [32].

4.3 Use of Normative Data

An important consideration raised by these results (see Tables 4 and 5), relative to the published literature, is when to use sport-specific norms to interpret the results for an individual athlete and when it is acceptable to use baseline norms in lieu of pre-injury results. For example, one could use the overall means or the gender-specific means of the various tests, but in doing so might miss the nuances inherent within and between the sports. A woman participating in sailing might report a symptom severity of 10, which is 30% higher than the overall mean for females, but that score is about 40% below the baseline norm for the 19 females in sailing (Table 5). Thus, a careful evaluation of these measures, along with more data for some sports, is needed to determine whether sport-specific interpretations are preferable in the long run. We hope to be able to address these questions as the database matures.
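The arithmetic behind the sailing example can be made explicit. Taking the quoted percentages at face value, and writing $\bar{s}_{F}$ for the overall female mean symptom severity and $\bar{s}_{\text{sail}}$ for the mean among female sailors (both symbols are introduced here only for illustration), a score of 10 implies approximately

$$10 = 1.30\,\bar{s}_{F} \;\Rightarrow\; \bar{s}_{F} \approx 7.7, \qquad 10 = (1 - 0.40)\,\bar{s}_{\text{sail}} \;\Rightarrow\; \bar{s}_{\text{sail}} \approx 16.7.$$

These back-calculated means are rough values derived only from the rounded percentages in the text, not figures read from the tables. They show how the same score can sit above one reference mean and well below another.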

An extensive dataset such as this presents a multitude of comparative possibilities beyond just gender, contact category, or sport comparisons. For example, previous work has suggested that a history of concussion influences results on tests such as ImPACT [24, 33, 34], as do socioeconomic status [35], race [36], sex [23], and sport [37]. The size and breadth of this dataset offer the opportunity to query the data for comparative information about any number of interactions among the various demographic variables; for example, white soccer players with a history of two or more concussions who come from a household with a lower family income.

4.4 Limitations

One limitation of the statistical models is that contact category, gender, concussion history, and household income together explain only a small proportion of the variance in these scores, about 1% on average. It is more difficult to explain variance in a relatively homogeneous, symptom-free population, but factors beyond those we considered may be affecting the scores on these instruments. A second limitation is that the cohort includes only athletes still playing their sports. Not all freshman athletes compete for all four years; some drop out of sport for academic, social, economic, personal, or health-related reasons. While the number of athletes who had problems resulting from sport-related concussions or multiple head impacts is unknown, some athletes who had enrolled at the participating universities may have already left their sport for any number of reasons, including head trauma. This introduces the potential for selection bias, because the sample may not include the student-athletes most affected by head trauma. Another data-related limitation is that the quality of the participation data did not allow years of participation to be considered in the models. The reliance on self-report for concussion history is a further limitation.

5 Conclusion

Herein, we present the basis for norms for demographics, concussion history, and neurocognitive and basic clinical test results for a large sample of collegiate athletes participating in 24 sports from the 29 colleges and universities that make up the CARE Consortium. Of note, the data suggest potential limitations of using published ImPACT norms when evaluating injured athletes.

Initial review of the data finds no evidence that a history of participation in contact sports results in neurocognitive or clinical assessment deficits among college-level athletes at the group level. Probably the most interesting finding is that athletes competing in contact sports performed among the best on several of the measurements under study, although the differences were small. This is somewhat reassuring in light of concerns raised about the potential impact of repetitive concussions as well as exposure to repetitive head impacts associated with some contact sports. Although these baseline results are intriguing, it is important to follow the athletes who sustain a concussion, both for comparison with suitable controls in the short term and through the recovery process, and also over the long term, after college and beyond.