Key Points

Many retrospective correlations have been made to assess the role of training-induced increases in muscle size and strength; however, these studies were not originally designed to answer this question.

We detail how to appropriately test this hypothesis and explain why correlational analyses will never be able to detail the causative effect that training-induced increases in muscle size have on increases in strength.

Data from our laboratory suggest that training-induced increases in muscle size appear to play little if any role with respect to training-induced increases in muscle strength.

1 Introduction

When researchers seek to determine the causative effect that one variable has on another variable, a study is designed with the intent that the intervention produces differential effects on one variable based on group membership (i.e., experimental vs. control) and proceeds to observe how this impacts the other variable of interest. In an animal model, this is often done by genetic modification to see how the overexpression or deletion of a specific gene impacts the animal’s phenotype. As this would be unethical in human models, and does not mimic the effect of resistance exercise, the researcher must design an intervention that produces differential effects on one variable (e.g. muscle size), and proceed to observe how this impacts the other variable (e.g. muscle strength). Rather than focusing on a continuous independent variable in a sample of individuals (i.e., a regression analysis), the independent variable is categorical, with the category being the specific group that an individual is assigned to (i.e., experimental or control). This is likely well known to most researchers as linear regression analyses are rarely used in randomized controlled trials unless of course one includes a categorical dummy variable to account for group membership, but then this analysis becomes analogous to an analysis of variance or t test as there are multiple groups. This concept is important when attempting to decipher the causative effect of different resistance training variables, particularly as it pertains to the role of exercise-induced changes in muscle size and increases in strength following resistance training [1, 2].
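The equivalence between a regression on a dummy-coded group variable and a comparison of group means can be checked numerically. The sketch below is illustrative only: the strength changes are hypothetical values (not data from any study), and `ols_slope_intercept` is a helper name introduced here. With group membership coded 0/1, the fitted slope should equal the difference between the two group means and the intercept should equal the control mean.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical strength changes (kg); not data from any study.
control = [1.0, 2.0, 0.5, 1.5]
training = [6.0, 7.5, 5.0, 8.5]

# Dummy-code group membership: 0 = control, 1 = training.
group = [0] * len(control) + [1] * len(training)
change = control + training

slope, intercept = ols_slope_intercept(group, change)
mean_difference = mean(training) - mean(control)

print(f"slope = {slope:.3f}, difference in group means = {mean_difference:.3f}")
print(f"intercept = {intercept:.3f}, control mean = {mean(control):.3f}")
```

Because the slope is simply the difference in group means, testing whether it differs from zero addresses the same question as a two-group t test.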

The purpose of this article is to question the idea that increases in strength from resistance training are neurological at first before becoming predominantly driven by increases in muscle size [3]. There is currently no direct evidence to support this hypothesis as this suggestion has been made based on retrospective correlational analyses, which we will illustrate cannot be used to infer the causal effect of exercise-induced increases in muscle size with respect to exercise-induced increases in strength. While higher levels of muscle size are associated with higher levels of muscle strength (i.e., correlational studies between individuals), the question of interest is whether training-induced increases in muscle size within a given individual play a role with respect to training-induced increases in muscle strength. This question can only be answered by designing a study in which multiple groups perform resistance exercise but differ on the magnitude of muscle growth that occurs.

2 How do we Test if Training-Induced Muscle Growth Causes Training-Induced Muscle Strength?

Consider the following question: are increases in muscle size that occur during resistance training responsible for the increases in muscle strength? There are probably several study designs that would be appropriate to answer this question, but it would be most appropriately tested in humans. One potential problem with using animal models is that the magnitude of muscle growth following stretch-induced muscle hypertrophy far exceeds that observed in humans (200 vs. 15%) [4], while models used to mimic resistance exercise (i.e., ladder climbing [5], ankle plantar flexion exercise [6], and even squat exercises [7]) tend to produce significantly less and often no muscle growth [4]. Given that human strength is thought to be important for all individuals whether it relates to improving sports performance [8] or increasing an individual’s life span [9], strength testing should be conducted using a test with the central nervous system intact. After all, the interest for most is in understanding the contribution that increasing muscle mass has with respect to increasing voluntarily tested strength in an intact human.

It could be argued that while testing a muscle fiber’s strength capacity outside of the human body provides important information, it may have little relevance for answering ‘when an adult human participates in resistance training and the muscle hypertrophies, does that change in muscle size produce a concomitant change in strength?’ As such, it is important to assess the strength of animals in a similar fashion (i.e., testing task-specific strength while the animal is alive) as this is what many researchers are interested in and is what is most commonly studied in humans. As expected, all of these studies illustrated that animals increased muscle strength in the task performed regardless of whether muscle growth was [7] or was not [5, 6] present. While animal models may help detail the role that exercise-induced muscle growth has on strength increases from training, the true question of interest is whether this holds true in adult humans performing resistance exercise.

In a human model, researchers must carefully manipulate the magnitude of muscle growth that occurs based on group membership (i.e., the independent variable). Researchers can do this by having multiple groups exercise, with the exercise protocols performed by each group producing different magnitudes of muscle growth. Thus, if increases in muscle strength were driven by increases in muscle size, any blunting or abolishing of muscle growth should also blunt or abolish the increase in muscle strength. Importantly, our research group has conducted two such studies that would seemingly detail that the increases in muscle strength observed after a resistance training protocol do not result from increases in muscle size. We tested this hypothesis by comparing two exercise protocols: a traditional resistance training protocol involving four sets of knee extension and chest press exercises performed at 70% of one-repetition maximum (1-RM), and a protocol in which another group simply performed a 1-RM test [10]. We observed that the individuals performing the traditional resistance exercise protocol increased both muscle size and strength, but the group performing only repeated 1-RM testing increased muscle strength to a similar extent despite not increasing muscle size. Therefore, the resulting increases in muscle strength cannot be the result of increases in muscle size.

Notably, this study was the same duration as the study often cited as evidence for this ‘neural first’ followed by ‘hypertrophy’ hypothesis [3]. This is important as it suggests that our findings were not limited by the duration of the study. Using a within-subject design, we also observed that performing an elbow flexion 1-RM test compared with performing a 1-RM test in addition to three sets of exercises at 70% 1-RM yielded similar results in that only the condition performing multiple sets of exercises increased muscle size, despite both conditions producing similar increases in 1-RM strength [11]. Therefore, one cannot make the argument that the increases in muscle strength would have been additive had maximal strength testing and traditional training been performed together. Notably, these similar findings were observed in both trained [11] and untrained [10] individuals. In addition, there are numerous protocols that produce differential strength results despite producing similar muscle growth, and this is dependent on the load that is being lifted [12, 13] as well as the type of strength test employed (i.e., concentric vs. isometric) [14].

3 How do we Currently Test This?

Despite the importance of creating multiple groups to manipulate one of the dependent variables (i.e., using group membership to produce differences in muscle size), it is still common for researchers to simply correlate the change in muscle size with the change in muscle strength among a group of individuals performing the same resistance exercise intervention [1, 2]. One reason this is often done is that the initial purpose of the study was not to test whether exercise-induced muscle growth causes exercise-induced increases in muscle strength.

For example, one study examining the correlation between exercise-induced changes in muscle size and strength [1] was actually designed to compare different contraction types [15]. Another similar study assessing this correlation [2] was originally designed to assess the importance of whey protein supplementation combined with resistance training [16]. Therefore, these studies testing correlations between exercise-induced changes in muscle size and strength [1, 2] were simply an afterthought using data that were already collected from studies not designed to answer this question [15, 16]. If one considers the previous paper from our laboratory, a within-subject correlational analysis performed on the group performing traditional sets of resistance exercise would yield a large correlation (β = 51.9, R² = 0.47, p = 0.0015) between training-induced increases in muscle size and strength (this is typically the only group included), while the group performing the repeated 1-RM testing (which is not commonly included) yielded no correlation (β = −78.8, R² = 0.06, p = 0.2488). Again, this details the importance of including multiple groups performing different exercise protocols in an attempt to produce different magnitudes of muscle growth. This also details that the squared correlations (i.e., R²) should not be termed ‘variance explained’ because these squared correlations simply illustrate the shared variance between two variables, and neither variable truly explains any of the variance in the other variable.

There are two major limitations when correlating the change in muscle size with the change in muscle strength, and both must be considered: (1) the level of measurement error that is present; and (2) the level of variability that is present in the two variables being correlated. It is well known that a high degree of measurement error and/or limited amount of variability in the variables being assessed will result in a reduced correlation [17]. For example, consider the differing magnitude of variability that exists when correlating baseline levels of muscle size and strength, to that which is observed when correlating the change in muscle size to the change in strength in response to training. The correlation will almost always be stronger when considering baseline levels because the magnitude of variability that exists in the change in muscle size and change in strength from training tends to be about half of the variability that exists in baseline levels of muscle size and strength (this is illustrated in Fig. 1). Using an oversimplified example, it would not be shocking for one individual to have a chest press that is 100 kg greater than another individual, but it would certainly be extraordinary if one individual trained and gained 100 kg more strength in the chest press than another individual over the course of a study.
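One way to see how these two limitations interact is through Spearman's attenuation formula from classical test theory, in which the observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below is a simplified illustration under stated assumptions, not an analysis of any data set: the true correlation (0.8), the error standard deviation (1 unit), and the true-score standard deviations (4 vs. 2 units) are all hypothetical, and the true correlation is deliberately held constant so that only the amount of variability differs between the two scenarios.

```python
import math


def reliability(true_sd, error_sd):
    """Proportion of observed variance that is true-score variance."""
    return true_sd ** 2 / (true_sd ** 2 + error_sd ** 2)


def observed_correlation(true_r, rel_x, rel_y):
    """Spearman's attenuation formula (classical test theory, independent errors)."""
    return true_r * math.sqrt(rel_x * rel_y)


TRUE_R = 0.8     # assumed true correlation, held constant for illustration
ERROR_SD = 1.0   # assumed measurement error SD, identical in both scenarios

# Baseline values: large between-person spread (hypothetical SD of 4 units).
rel_baseline = reliability(true_sd=4.0, error_sd=ERROR_SD)
# Change scores: about half the spread (hypothetical SD of 2 units).
rel_change = reliability(true_sd=2.0, error_sd=ERROR_SD)

r_baseline = observed_correlation(TRUE_R, rel_baseline, rel_baseline)
r_change = observed_correlation(TRUE_R, rel_change, rel_change)

print(f"observed r, baseline values: {r_baseline:.3f}")
print(f"observed r, change scores:   {r_change:.3f}")
```

Under these assumptions, halving the true variability while leaving the measurement error unchanged lowers the observed correlation even though the underlying relationship is identical. In practice, change scores also combine the error of two measurements, which this sketch does not model.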

Fig. 1

Illustration of why the baseline correlations will almost always tend to be greater than correlations around the changes in the same variables in response to training. This example was created using data from a previous study [18] in which individuals performed 12 weeks of a full-body resistance training program. Consider the magnitude of variability relative to the error of the measurement being assessed. The gray box illustrates the minimal difference on both sides of the mean. Black dots represent individual data points that exceed measurement error from the mean, while gray dots represent individuals who fall within the error of the measurement. Thus, all gray dots represent the correlation between two variables, and at least one of these variables is simply measurement error given these individuals did not actually respond differently from the group mean. Therefore, only the black dots truly represent differential responses and can appropriately assess the correlation of muscle size and strength independent of measurement error. Minimal differences for muscle size were as follows: chest 2.5 mm; elbow flexors 2.4 mm; knee extensors 2.9 mm. Minimal differences for muscle strength were as follows: chest press 23.4 kg; elbow flexors 7.6 kg; knee extension 21.9 kg

Not only will the small amount of variability limit the correlation that will be present [17], but also this limited variability may not even exceed the error of the measurement. To illustrate this concept, we used data from a previous study [18] in which individuals performed 12 weeks of a full-body resistance training program. The minimal difference in this sample data set was calculated using methods described previously [19] and was obtained from a time-matched control group that was included in the study design. We have taken the minimal difference for all strength and muscle size measurements and have added or subtracted it from both the baseline level of muscle size and strength and the training-induced changes in muscle size and strength to reveal which individuals responded differently from the mean (i.e., which individuals can be classified as differential responders). That is, all individuals that fall within the minimal difference cannot actually be considered as differential responders (this concept is explained in detail elsewhere [20, 21]).
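The classification step described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used to build Fig. 1: the muscle thickness changes are hypothetical, the function name `differential_responders` is introduced here, and the minimal difference is supplied as an input (the 2.4 mm elbow flexor value from the Fig. 1 caption) rather than computed from a control group.

```python
from statistics import mean


def differential_responders(values, minimal_difference):
    """Flag values that differ from the group mean by more than the minimal difference."""
    group_mean = mean(values)
    return [abs(v - group_mean) > minimal_difference for v in values]


# Hypothetical changes in elbow flexor muscle thickness (mm); illustrative only.
size_change_mm = [1.0, 2.5, 3.0, 3.5, 4.0, 7.0]
MINIMAL_DIFFERENCE_MM = 2.4  # elbow flexor value reported in the Fig. 1 caption

flags = differential_responders(size_change_mm, MINIMAL_DIFFERENCE_MM)
n_differential = sum(flags)

for change, is_differential in zip(size_change_mm, flags):
    label = "differential responder" if is_differential else "within measurement error"
    print(f"{change:4.1f} mm -> {label}")
print(f"{n_differential} of {len(flags)} individuals exceed the minimal difference")
```

In this hypothetical sample every individual increased muscle size, yet only the two individuals furthest from the group mean can be distinguished from it.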

Therefore, a large part of the correlation that is being assessed is not truly the relationship between the change in muscle size and the change in muscle strength, but rather, the association between error/random biological variability related to muscle size and error/random biological variability related to muscle strength. For example, a large portion of the individuals who increased muscle size are still within the error of the measurement relative to the mean group change from baseline, and we cannot be confident these individuals responded any differently from one another. In other words, individuals do indeed increase muscle size and strength following resistance training, but almost all of the individuals cannot be classified as differential responders from one another as they do not exceed the minimal difference from the mean change from baseline. This same statement holds true for muscle strength albeit to a lesser extent (as evident by the greater number of individuals exceeding the minimal difference for muscle strength measures as shown in Fig. 1). Figure 1 shows that a large portion of these correlations is simply attempting to correlate measurement error, and only a small number of individual data points can even be used to assess the association between the change in muscle size and the change in muscle strength. If we perform the exact same correlational analysis using the baseline values for muscle size and strength, this will yield a stronger correlation because more data points can be distinguished from one another as the magnitude of difference separating individuals from the group mean exceeds the minimal difference (Fig. 1).

For this reason, it is important to consider that different measurement techniques will have different measurement error and thus may increase or decrease the number of individuals who can be detected as differential responders. To illustrate, a previous study [22] reported coefficients of variation to be 4.3% for the ultrasound machine-measured cross-sectional area, 2.1% for the cross-sectional area measured using magnetic resonance imaging, and 1% for dual X-ray absorptiometry-measured lean tissue mass.

While we again would like to detail that no correlational analyses can infer causality, it should also be mentioned that the correlational analyses often being computed are not necessarily answering the question that many think they are. To illustrate, people are generally interested in whether an increase in muscle size is associated with an increase in muscle strength (i.e., within subjects) [23]. As there are multiple time points that represent the same individual, it is important to perform a within-subject correlation as opposed to just running a correlation on the pre-post change value. In such an analysis, each individual is treated as their own group using a within-group correlation. A previous example has illustrated that multiple groups can each have a within-group correlation of 0.9, for example, while the total correlation may equal zero [24]. Again, it should be mentioned that while using a within-subject correlation is more appropriate, it is still simply a correlation and thus causality cannot be inferred. This concept can also be seen in a subset of individuals pulled from the same dataset used in Fig. 1. While all individuals have a positive slope in response to the interventions, creating a change score and performing a total correlation would illustrate a negative correlation (Fig. 2).
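The contrast between within-subject slopes and a correlation on change scores can be reproduced with a small constructed example. The values below are hypothetical pre/post pairs chosen for illustration (they are not the data shown in Fig. 2), and `pearson_r` is a helper written for this sketch. Every individual increases in both muscle size and strength, so each within-subject slope is positive, yet the correlation computed across individuals on the change scores is negative.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical (pre, post) values per individual; illustrative only.
size = [(30.0, 31.0), (32.0, 34.0), (28.0, 31.0), (35.0, 39.0), (31.0, 36.0)]
strength = [(50.0, 59.0), (60.0, 70.0), (45.0, 51.0), (70.0, 75.0), (55.0, 58.0)]

size_change = [post - pre for pre, post in size]
strength_change = [post - pre for pre, post in strength]

# Within-subject slope: change in strength per unit change in size.
slopes = [ds / dz for ds, dz in zip(strength_change, size_change)]

# Traditional correlation computed across individuals on the change scores.
r_change = pearson_r(size_change, strength_change)

print("within-subject slopes:", [round(s, 2) for s in slopes])
print(f"correlation of change scores: {r_change:.2f}")
```

With only two time points per individual, the within-subject slope reduces to the ratio of the two changes; with more time points a within-subject (repeated-measures) correlation would be estimated across all of them.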

Fig. 2

Illustration of why performing a traditional correlation is inappropriate when repeated-measures designs are used. These are ten individuals pulled from a subset of the individuals used in Fig. 1 [18] and include those who did and did not exceed measurement error. Gray dots indicate pre-values while black dots indicate post-values. a Pre- to post-changes in muscle size and strength for ten individuals in a subset of individuals. Each individual has their own slope in this analysis. b Pre- to post-changes in muscle size and strength in which each individual represented in a is now represented as an individual plotted point using the change in muscle size and the change in muscle strength. Notably, while all individuals have a positive slope as detailed in a, the traditional correlation in b is negative

4 Conclusion

As with all experimental studies attempting to assess the causative relationship between two variables, the independent variable (i.e., group membership) must produce differential results on one dependent variable (e.g., muscle size) and test how this in turn impacts the other dependent variable (e.g., muscle strength). Causal evidence simply cannot be obtained from correlational analyses, and as such, these types of studies should not be used to support the notion that training-induced increases in muscle size result in the training-induced increases in muscle strength. Further, until such evidence is presented, this statement should not be used without a proper citation, which currently does not exist. The purpose of this study is not to say that baseline levels of muscle size do not impact muscle strength, but rather, the purpose is to question if the marginal increases (relative to baseline levels) in muscle size observed with resistance training in healthy adults play any role with respect to the training-induced increases in strength. More appropriately designed experimental studies are needed to answer this question of interest.