1 Introduction

The identification of forces that regulate the dynamics of ecological communities has important implications for both theory (Hairston et al. 1960; Paine 1980; Oksanen et al. 1981; Hunter and Price 1992; Strong 1992; Mutshinda et al. 2009) and management (Rabalais et al. 2002; Smith et al. 2010). These forces include environmental stochasticity, demographic stochasticity and density-dependent regulation (Lande et al. 2003; Wilson and Lundberg 2006). In this paper, we restrict our attention to density-dependent regulation among “seemingly” unregulated populations. We use the word “seemingly” because although populations may be regulated in the long run, they do not appear to be regulated statistically when time series are short. From such data, we try to identify two types of population interactions, top-down and bottom-up controls. A top-down control is herein defined as one where the consumer population abundance regulates the abundance of its resource population, and a bottom-up control is defined as one where the abundance of the resource population regulates that of its consumer. The interaction between two populations may include both trophic and non-trophic effects.

Empirical approaches to identify bottom-up and top-down processes in multi-species communities commonly involve correlation analyses of time series data on abundance indices, e.g., biomass, or catch-per-unit-effort (CPUE; Shiomoto et al. 1997; Worm and Myers 2003; Frank et al. 2005; Laundré et al. 2014). This method associates top-down forcing with a negative correlation between time series of consumer and resource populations, and bottom-up forcing with a positive correlation between the two. However, both the non-stationarity in the time series of community dynamics and the inability to observe all the interacting species pose serious problems in making appropriate inferences regarding community dynamics as described below.

The prevalence of non-stationarity in ecological and fisheries data is one problem inherent in the analysis of population time series (Steele 1985; Pimm and Redfearn 1988; Inchausti and Halley 2001; Stergiou 2002; Halley and Stergiou 2005; Wilberg and Bence 2006; Niwa 2007; Rouyer et al. 2008; Knape and De Valpine 2012). Non-stationarity often violates model assumptions and produces spurious results. The correlations among non-stationary time series are spurious in the sense that the estimated coefficients are statistically significant when there are no real relationships among variables (inflated type I error rate). For example, in the analysis of fisheries time series data, spurious results are often produced by correlating cumulative-sum (CUSUM)-transformed variables because the CUSUM transformation generates non-stationary time series from stationary time series (Cloern et al. 2012).

One approach for dealing with non-stationary time series is to model them as unit-root processes (Enders 2008). A time series is considered to have a unit-root when it can be made stationary by taking a first difference. The most common unit-root processes in the ecology and fisheries literature are random walks. The first difference of a random walk is white noise, which is stationary (note that the first difference of a unit-root time series is not necessarily a white noise sequence). The variance (lag-0 autocovariance) of a random walk process is not constant; it increases over time. A random walk process also has a power spectrum that approaches infinity with decreasing frequency (spectral reddening). These features are commonly observed in fisheries and other population time series (Steele 1985; Pimm and Redfearn 1988; Inchausti and Halley 2001; Stergiou 2002; Halley and Stergiou 2005; Niwa 2007). This suggests that it is appropriate to model each univariate non-stationary population time series as a unit-root process.
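The two random-walk properties mentioned above (variance that grows with time, and a stationary first difference) can be checked with a small simulation. The following is a minimal sketch, not part of the original analysis; the step distribution, the number of replicate paths, and all function names are assumptions made for illustration.

```python
import random
import statistics


def random_walk(n, rng):
    """Return x_1..x_n with x_t = x_{t-1} + e_t, e_t ~ N(0, 1), x_0 = 0."""
    x, path = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def first_difference(series):
    """Return the series of successive differences x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


rng = random.Random(1)          # fixed seed so the sketch is repeatable
n_steps, n_paths = 200, 2000    # illustrative sizes, chosen arbitrarily
paths = [random_walk(n_steps, rng) for _ in range(n_paths)]

# Variance across replicate paths at an early and a late time point.
var_early = statistics.pvariance([p[9] for p in paths])     # t = 10
var_late = statistics.pvariance([p[199] for p in paths])    # t = 200

# Differencing a single path recovers the white-noise steps.
var_diff = statistics.pvariance(first_difference(paths[0]))

print(f"variance across paths at t=10: {var_early:.1f}, at t=200: {var_late:.1f}")
print(f"variance of the first differences of one path: {var_diff:.2f}")
```

The across-path variance should grow roughly in proportion to time, while the differenced series keeps a variance close to that of the steps.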

Persistent natural populations should show evidence of regulation in the long term historic trends of population abundances, whether they are regulated towards an equilibrium point or an equilibrium zone (Strong 1986; Krebs et al. 1994; Murdoch 1994). Their population time series are expected to be stationary in the long run. However, regulation may not be evident in short time series, especially if we only deal with single time series. For example, the existence of an equilibrium point or equilibrium zone was questioned by Strong (1986) in the dynamics of some populations, and the phenomenon of a lack of density dependence in persistent populations was coined “vague” density dependence. However, it is generally acknowledged that density dependence is necessary for population regulation (Murdoch 1994). Many factors have been considered as potential causes of the density “vagueness” phenomenon, which include the mismatch of the spatial scale between the analysis and the ecological process (Ray and Hastings 1996), the lack of statistical power of statistical tests (Knape and de Valpine 2012), and the short length of time series data. Here, we explore the situation when there exists an equilibrium zone within which the population abundance is weakly regulated. A relatively short observation of the dynamics would reveal no or insufficient information on population regulation to conclude density dependence. Therefore, when we have non-stationary time series, instead of assuming each population is individually regulated, we test for stationarity with two or more non-stationary time series together. When such a relation is identified, it is included in the further analysis to investigate population interactions.

Fig. 1

Diagram showing an example of three-species interactions. This bottom-up driven community has two trophic levels. Each circle represents a population, and arrows represent population interactions and the direction of that interaction, e.g., the arrow running from population V to population \(P_1\) denotes a positive numerical response from the resource to the consumer population, and this interaction is unilateral. The two consumer populations at the higher trophic level are competitors for available resources. The sign at the left hand side of each circle represents one scenario where a spurious inference could arise from the correlation method. The gray area denotes the population not observed, or excluded from the model formulation, and the white area denotes available data. See the main text for details

The other problem in identifying bottom-up and top-down forcing is that, invariably, analyses are performed on a subset of the trophic web and inferences must be made based on our knowledge of a partially-observed system (Stenseth et al. 1997). Furthermore, trophic “cross-links” (Paine 1980) can act as indirect pathways to produce unexpected indirect effects in ecological communities (Wootton 1994). Problems resulting from the inability to observe potentially important species interactions can be illustrated conceptually by a hypothetical ecological community with two consumers and one common resource denoted by \(P_1\), \(P_2\) and V (Fig. 1). In this example, consumer and resource abundances in a bottom-up forced community could be either positively correlated, negatively correlated or uncorrelated. \(P_1\) and \(P_2\) compete, with \(P_1\) being competitively superior to \(P_2\), and both \(P_1\) and \(P_2\) are affected by changes in V (bottom-up). If \(P_2\) can be observed, but not \(P_1\), and the positive impact of the bottom-up forcing is weaker than the negative impact from competition with \(P_1\), then fluctuations in V and \(P_2\) will be correlated negatively, leading to the spurious inference of top-down forcing. An example of 3-species indirect facilitation, in which there was a positive indirect relation between a keystone predator and a consumer population, has been reported in a freshwater benthic community by Holomuzki et al. (2010). Thus, the identification of bottom-up versus top-down forcing of a food web based on the sign of the correlation coefficient is difficult, if not impossible. However, the direction of the effect of species interaction does not change under indirect species interactions, e.g., in the above example, a change in the resource population drives a change in the consumer population, whether positive or negative. This example illustrates the importance of using a Granger causality based method (Granger 1969; Detto et al. 2012), rather than using the sign of the correlation coefficient. Granger causality is a statistical concept based on prediction. If variable x helps the prediction of variable y, then x causes y in the sense of Granger causality. In this example, although the correlation between populations V and \(P_2\) is negative, V still causes \(P_2\) because knowing the time series of population V improves the prediction of the time series of population \(P_2\) through the species interaction. Here, we define the link between the statistical model and the top-down and bottom-up hypotheses as follows. If the resource population time series improves the prediction of its consumer population time series, we say there is a bottom-up effect; similarly, if the consumer population time series improves the prediction of its resource population time series, we say there is a top-down effect. This interpretation is the same as the one used in the literature (Ives et al. 2003). However, we here introduce the co-integration term, which allows us to model non-stationarity.
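The prediction-based definition can be illustrated with a minimal sketch for two stationary series. It is not the co-integration procedure developed in this paper: it compares a lag-1 autoregression of the effect series with and without the lagged cause series, and reports an F statistic. The simulated coefficients (including the negative effect of V on P), the lag order of one, and all function names are assumptions chosen only for illustration.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def rss(rows, target):
    """Residual sum of squares of an ordinary least squares fit of target on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, target))


def granger_f(cause, effect):
    """F statistic for adding lag 1 of `cause` to a lag-1 autoregression of `effect`."""
    target = effect[1:]
    restricted = [[1.0, e] for e in effect[:-1]]
    full = [[1.0, e, c] for e, c in zip(effect[:-1], cause[:-1])]
    rss_r, rss_u = rss(restricted, target), rss(full, target)
    return (rss_r - rss_u) / (rss_u / (len(target) - 3))


# Hypothetical bottom-up system: V drives P (here with a negative sign), P never drives V.
rng = random.Random(3)
v, p = [0.0], [0.0]
for _ in range(300):
    v_next = 0.5 * v[-1] + rng.gauss(0.0, 1.0)
    p_next = 0.5 * p[-1] - 0.8 * v[-1] + rng.gauss(0.0, 1.0)
    v.append(v_next)
    p.append(p_next)

f_bottom_up = granger_f(v, p)   # does V help predict P?
f_top_down = granger_f(p, v)    # does P help predict V?
print(f"F (V -> P): {f_bottom_up:.1f}   F (P -> V): {f_top_down:.1f}")
```

In this sketch the statistic for V predicting P should be large even though the effect is negative, while the statistic for the reverse direction should be small, which is the asymmetry that the sign of a correlation coefficient cannot reveal.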

In the sections that follow, we describe a new practical approach to identify bottom-up and top-down control, which overcomes the statistical difficulties associated with the non-stationarity of ecological time series data and also reduces the difficulties resulting from the inability to observe potentially important ecological interactions. We first provide an overview of our approach to the analysis of multivariate time series of CPUE data, which are potentially non-stationary. We then describe a quantitative framework for a linear multi-species community model. Then, we demonstrate an application of our approach by analyzing CPUE time series data from the shrimp/ground fish fishery in the Gulf of Mexico. Finally, we discuss some implications of our results.

Fig. 2

Steps in analyzing multivariate time series data. a After plotting each individual time series, we first check visually for non-stationary behavior. Then, various stationarity and unit-root tests, e.g., KPSS test and Dickey–Fuller test, can be conducted to confirm the stationarity or the existence of a unit-root in the time series. A time series having a unit-root in its characteristic equation is not stationary. There are other types of non-stationarity as mentioned in the introduction, but we do not discuss them here. b Next, linear models are built. When some (\({\ge }2\)) of the time series are integrated, we use multivariate unit-root tests to determine the long-run relationship among time series. c Given the long-run relationship, ecological hypotheses can be tested. If test results indicated that only one time series \(\{x_t\}\) is non-stationary, other time series should be linearly independent of \(\{x_t\}\) in the long term relation because a non-trivial linear combination of a non-stationary time series and stationary time series cannot be stationary. When all the individual time series are stationary, vector autoregressive models can be used, and all the hypothesis tests outlined above can be conducted in a similar fashion

2 Materials and methods

2.1 Overview of the proposed method

An overview of the steps involved in our approach to identifying bottom-up and top-down control by analyzing multivariate time series data is presented in Fig. 2. First, non-stationarity in the time series is checked visually based on time series plots. Then, to confirm the observation, univariate unit-root tests, e.g., the KPSS test (Kwiatkowski et al. 1992) and the Dickey–Fuller tests (Dickey and Fuller 1979; Said and Dickey 1984), are applied. Unit-root tests indicate whether the time series is stationary or a unit-root exists. We note that there are other processes generating non-stationarity in the data (Stenseth et al. 2004; Engle 1982), but they are beyond the scope of this paper. If the time series is produced by a unit-root process, it is not stationary, but its first difference is stationary. For this reason, a unit-root process is also considered integrated of order one, denoted as I(1); the first difference of a unit-root process is considered integrated of order zero, denoted by I(0), because the difference is stationary without further differencing.
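The idea behind a Dickey–Fuller type test can be sketched with a single regression. The example below is a simplified illustration with no lagged differences and no lag selection, so it is not the augmented test used in this paper; the simulated series and all function names are assumptions. The statistic is the t-ratio of \(\gamma\) in \(\varDelta y_t=a+\gamma y_{t-1}+e_t\), where \(\gamma =0\) corresponds to a unit root.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-ratio of gamma in dy_t = a + gamma * y_{t-1} + e_t (no lagged differences)."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    mean_x, mean_dy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    intercept = mean_dy - gamma * mean_x
    rss = sum((di - intercept - gamma * xi) ** 2 for xi, di in zip(x, dy))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


def simulate_ar1(phi, n, rng):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal e_t; phi = 1 is a random walk."""
    y, out = 0.0, []
    for _ in range(n):
        y = phi * y + rng.gauss(0.0, 1.0)
        out.append(y)
    return out


rng = random.Random(7)
stat_rw = dickey_fuller_stat(simulate_ar1(1.0, 500, rng))   # unit-root series
stat_ar = dickey_fuller_stat(simulate_ar1(0.2, 500, rng))   # stationary series
print(f"random walk: {stat_rw:.2f}   stationary AR(1): {stat_ar:.2f}")
```

A strongly negative statistic is evidence against a unit root. The statistic has to be compared with Dickey–Fuller critical values rather than with Student's t tables, because its null distribution is non-standard.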

Next, linear models are built. If every time series is stationary, vector autoregressive (VAR) or multivariate autoregressive (MAR) models (Ives et al. 2003; Hampton et al. 2013) can be applied directly. If some or all of the time series are integrated of order one, multivariate unit-root tests (Johansen 1991) are used to test if a non-trivial linear combination of I(1) processes is integrated of order zero. If such a linear relationship exists, the linear combination is considered to be the co-integration relation. Because non-stationarity is common among population time series, our approach will expand the applicability of the analysis by incorporating non-stationarity into a VAR or MAR model. Given the presence or the lack of co-integration relation(s) among non-stationary time series, VAR models that incorporate various ecological hypotheses related to community structures are built. This is described in the next section.

2.2 A multi-species linear population model

Vector autoregressive (VAR) models are routinely used to model population time series data (Hampton et al. 2013). Due to the presence of non-stationary time series, here, the VAR model is written in a vector error correction model (VECM) form (Engle and Granger 1987). A pth order VAR model for s populations is

$$\begin{aligned} {\varvec{N}}(t)=\sum _{i=1}^{p}{\varvec{\varPhi }}_i{\varvec{N}}(t-i) + {\varvec{C}}+{\varvec{W}}(t) \end{aligned}$$

where \({\varvec{N}}(t)\) is an \(s\times 1\) vector of log-transformed population abundances at time t, \({\varvec{\varPhi }}_i\) (\(i=1,2,\dots ,p\)) are \(s\times s\) coefficient matrices, vector \(\varvec{C}\) is an intercept term, and vector \({\varvec{W}}(t)\) is a stochastic term. It has a VECM representation

$$\begin{aligned} \varDelta {\varvec{N}}(t)={\varvec{B}} {\varvec{N}}(t-1)+\sum _{i=1}^{p-1}{\varvec{\varTheta }}_i\varDelta {\varvec{N}}(t-i)+{\varvec{C}}+{\varvec{W}}(t) \end{aligned}$$
(1)

where

$$\begin{aligned} {\varvec{B}}= & {} \sum _{i=1}^{p}{\varvec{\varPhi }}_i-I,\\ {\varvec{\varTheta }}_j= & {} -\sum _{i=j+1}^{p}{\varvec{\varPhi }}_i \end{aligned}$$

where \(j=1,2,\dots ,p-1\), \(\varDelta {\varvec{N}}(t)={\varvec{N}}(t)-{\varvec{N}}(t-1)\), and \({\varvec{B}}\) and \({\varvec{\varTheta }}_i\) (\(i=1,2,\dots ,p-1\)) are \(s\times s\) coefficient matrices. Matrix \({\varvec{B}}\) (\(s\times s\) with rank r) can be written as a matrix product \(\alpha \beta ^\mathrm{T}\) with \(\alpha \) and \(\beta \) each with dimension \(s\times r\), where T denotes the matrix transpose. Each column of \(\beta \) contains the coefficients for one co-integration relation, and each column of \(\alpha \) contains the corresponding adjustment coefficients, which determine how strongly each population responds to that co-integration relation. \(\beta ^\mathrm{T}{\varvec{N}}\) is called the disequilibrium error, and it plays an important role in determining the long term community dynamics as described later in this section. The left hand side of Eq. 1 is the change of the s-dimensional state vector at time t. This change is decomposed into four components on the right hand side of the equation. From left to right, we interpret these terms as the long-run relationship, short-run relationship, intercept term and stochastic term, respectively. Each individual time series in vector \({\varvec{N}}(t)\) can be either integrated of order one or zero.
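The mapping from the VAR coefficients to the VECM coefficients can be checked numerically. The sketch below uses hypothetical coefficient matrices for two populations and \(p=2\) (all numerical values and function names are assumptions made for illustration) and confirms that both forms give the same one-step prediction when the stochastic term is set to zero.

```python
def mat_add(a, b):
    """Element-wise sum of two matrices stored as lists of rows."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(a, c):
    """Multiply every element of matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def mat_vec(a, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def identity(s):
    return [[1.0 if i == j else 0.0 for j in range(s)] for i in range(s)]


def var_to_vecm(phis):
    """Return (B, [Theta_1, ..., Theta_{p-1}]) for VAR matrices phis = [Phi_1, ..., Phi_p]."""
    s = len(phis[0])
    total = phis[0]
    for phi in phis[1:]:
        total = mat_add(total, phi)
    b = mat_add(total, mat_scale(identity(s), -1.0))      # B = sum_i Phi_i - I
    thetas = []
    for j in range(1, len(phis)):
        tail = phis[j]                                    # phis[j] holds Phi_{j+1}
        for phi in phis[j + 1:]:
            tail = mat_add(tail, phi)
        thetas.append(mat_scale(tail, -1.0))              # Theta_j = -sum_{i>j} Phi_i
    return b, thetas


# Hypothetical coefficients for s = 2 populations and p = 2 lags.
phis = [[[0.6, 0.2], [0.1, 0.5]], [[0.3, -0.1], [0.0, 0.2]]]
c = [0.05, -0.02]
n1, n2 = [1.0, 2.0], [0.5, 1.5]          # N(t-1) and N(t-2)

b, thetas = var_to_vecm(phis)

# One-step prediction from the VAR form.
var_next = [x + y + z for x, y, z in
            zip(mat_vec(phis[0], n1), mat_vec(phis[1], n2), c)]

# The same prediction from the VECM form: N(t) = N(t-1) + Delta N(t).
dn_lag = [x - y for x, y in zip(n1, n2)]
delta = [x + y + z for x, y, z in
         zip(mat_vec(b, n1), mat_vec(thetas[0], dn_lag), c)]
vecm_next = [x + d for x, d in zip(n1, delta)]

print(var_next, vecm_next)
```

The two predictions agree up to floating-point rounding, which reflects that the VECM is a re-parameterization of the VAR rather than a different model.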

In the main text, we focus on inferences on the long-run component of the model, and defer the results on the short-run component to the “Appendix”. Inferences on the short-run component of the model are well known and covered elsewhere (Lütkepohl 2007; Hampton et al. 2013). We expect that when regulation is not strong at the individual population level, i.e., the non-stationarity of some univariate population time series cannot be rejected, the long-run component of the model plays the major role in community dynamics.

The number of linear equilibrium relationships among potentially non-stationary time series, i.e., the rank of matrix \(\varvec{B}\), can be estimated by Johansen’s test of co-integration, which has been verified with simulated data sets and extensively used in econometrics (Johansen and Juselius 1990; Johansen 1991, 1995). First, VAR models were fitted to the data, and order p was chosen based on BIC (Tsay 1984). Then, likelihood ratio tests were used to determine the rank of matrix \(\varvec{B}\). The test statistic (the trace statistic) was calculated successively, starting from the null hypothesis of no equilibrium relationship (\(r=0\)) against at least one equilibrium relationship (\(r\ge 1\)) and increasing the number of equilibrium relationships under the null by one at each step. With the rank of matrix \(\varvec{B}\) set to r, different hypotheses based on \(\alpha \) and \(\beta \) were formed and tested using likelihood ratio tests (Johansen 1995).
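For orientation, the trace statistic is usually written in the econometrics literature as shown below. This is the standard textbook form rather than a formula reproduced from the scripts used here, and the symbols T (the number of observations used in the fit) and \(\hat{\lambda }_i\) (the ordered eigenvalues from Johansen’s reduced rank regression, \(\hat{\lambda }_1\ge \dots \ge \hat{\lambda }_s\)) are introduced only for this illustration.

$$\begin{aligned} \lambda _\mathrm{trace}(r)=-T\sum _{i=r+1}^{s}\ln \left( 1-\hat{\lambda }_i\right) \end{aligned}$$

A large value of \(\lambda _\mathrm{trace}(r)\) is evidence against the null hypothesis of at most r equilibrium relationships. Because the statistic does not follow a \(\chi ^2\) distribution under the null, it is compared with tabulated critical values.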

The contribution of the long-run component of the model to community dynamics is determined by the rank of \(\varvec{B}\). The equilibrium states are in the null space of matrix \(\varvec{B}\), which is the set of solutions to \({\varvec{B}}x=0\), and their dimension depends on the number of I(1) time series and the linear dependency of the time series. We consider the following three cases: (1) all the time series in \({\varvec{N}}\) are stationary, (2) some of the time series are I(1) and subsets of them form co-integration relationships, and (3) all the time series are I(1) and they do not co-integrate. When matrix \(\varvec{B}\) is of full rank (i.e., \(r=s\)), the equilibrium state is a point in the s-dimensional space. When this equilibrium point is stable, it attracts community trajectories in the long run. On the other hand, when some of the components of vector \({\varvec{N}}\) are not stationary, matrix \(\varvec{B}\) is of reduced rank, and the long-run component plays a major role in community dynamics. When some of these time series co-integrate (Engle and Granger 1987), i.e., there is a non-trivial linear combination of I(1) time series that produces a stationary time series, the community dynamics are regulated. When all the time series are integrated of order one and do not co-integrate, the rank is zero (Johansen 1995) and the community dynamics are not regulated. In addition, we consider that an equilibrium relationship exists among all the species considered if the disequilibrium errors, i.e., each component of \(\beta ^\mathrm{T}{\varvec{N}}\), follow a stationary process (Banerjee et al. 1993). The disequilibrium errors are calculated as \(\hat{\beta }^\mathrm{T}{\varvec{N}}\), where \(\hat{\beta }\) is estimated through Johansen’s procedure. The equilibrium states of the system defined by Eq. 1 can be found by setting stochastic terms to zero, substituting \({\varvec{N}}(t)\) and \({\varvec{N}}(t-1)\) by \({\varvec{N}}^*\), and solving for \({\varvec{N}}^*\).
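The substitution described above can be written out explicitly. The following is a sketch under the assumption that every lagged state also equals \({\varvec{N}}^*\), so that all difference terms in Eq. 1 vanish:

$$\begin{aligned} {\varvec{0}}={\varvec{B}}{\varvec{N}}^*+{\varvec{C}} \end{aligned}$$

When \({\varvec{B}}\) is of full rank, this gives the single point \({\varvec{N}}^*=-{\varvec{B}}^{-1}{\varvec{C}}\). When \({\varvec{B}}=\alpha \beta ^\mathrm{T}\) has reduced rank \(r<s\) and the intercept can be written as \({\varvec{C}}=\alpha \rho \) for some \(r\times 1\) vector \(\rho \) (a symbol introduced here only for illustration), the condition becomes \(\alpha (\beta ^\mathrm{T}{\varvec{N}}^*+\rho )={\varvec{0}}\). Because \(\alpha \) has full column rank, this reduces to \(\beta ^\mathrm{T}{\varvec{N}}^*=-\rho \), which constrains only r linear combinations of the populations and leaves the remaining \(s-r\) dimensions free.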

When every component of vector \(\varvec{N}\) is stationary, a VAR model can be applied directly to the population data. The test of bottom-up forcing becomes a test of the significance of the coefficients of the resource population terms in the consumer equations, and the test of top-down forcing becomes a test of the significance of the coefficients of the consumer population terms in the resource equations. Likelihood ratio tests can then be constructed to test the significance of each hypothesis (Lütkepohl 2007).

When not all the components of vector \(\varvec{N}\) are stationary, and some of those non-stationary components co-integrate, matrix \({\varvec{B}}\) becomes singular and can be factored into \(\alpha \beta ^\mathrm{T}\), where \(\alpha \) and \(\beta \) are both s-by-r, s is the number of time series, and r is the rank of matrix \(\varvec{B}\). The term \({\varvec{B}}{\varvec{N}}\) can then be written as \(\alpha \beta ^\mathrm{T} {\varvec{N}}\), where \(\varvec{N}\) is the vector of original time series. The coefficients in each column of \(\beta \) make a linear combination of the time series in \(\varvec{N}\) stationary, and there are r such linear combinations. These r linear combinations (i.e., \(\beta ^\mathrm{T} {\varvec{N}}\)) are similar to factors, into which the equilibrium relationship for each component population is partitioned, because they are unobserved and the number of them is less than the number of original time series. Thus, \(\beta \) is similar to factor score coefficients, and \(\alpha \) is similar to factor loadings. By testing constraints (restrictions) on \(\beta \), we can determine which species have an equilibrium relationship. Similarly, by testing restrictions on \(\alpha \), we can determine the direction of species interactions.
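A small simulation can illustrate how a rank-one \({\varvec{B}}=\alpha \beta ^\mathrm{T}\) produces populations that wander individually while their linear combination stays bounded. The sketch below assumes a hypothetical two-population system (resource V and consumer P) with no short-run terms and no intercept; the values of \(\alpha \) and \(\beta \), the noise distribution, and the function names are illustrative assumptions and not estimates from the data.

```python
import random
import statistics

alpha = [0.0, 0.5]     # adjustment coefficients for V and P (only P responds)
beta = [1.0, -1.0]     # co-integration vector, so beta^T N = V - P

# B = alpha beta^T is 2 x 2 with rank 1, so its determinant is zero.
b = [[a * bj for bj in beta] for a in alpha]
det_b = b[0][0] * b[1][1] - b[0][1] * b[1][0]


def simulate_final_state(n_steps, rng):
    """Iterate Delta N(t) = alpha * (beta^T N(t-1)) + W(t) and return N at the last step."""
    n = [0.0, 0.0]
    for _ in range(n_steps):
        z = sum(bj * nj for bj, nj in zip(beta, n))      # disequilibrium error
        n = [nj + a * z + rng.gauss(0.0, 1.0) for nj, a in zip(n, alpha)]
    return n


rng = random.Random(11)
finals = [simulate_final_state(400, rng) for _ in range(500)]

var_v = statistics.pvariance([f[0] for f in finals])           # resource
var_p = statistics.pvariance([f[1] for f in finals])           # consumer
var_z = statistics.pvariance([f[0] - f[1] for f in finals])    # beta^T N

print(f"det(B) = {det_b}")
print(f"variance after 400 steps: V {var_v:.0f}, P {var_p:.0f}, V - P {var_z:.2f}")
```

In this sketch the variance of each population keeps growing with the length of the series, whereas the variance of \(\beta ^\mathrm{T}{\varvec{N}}\) stays small. The zero entry of \(\alpha \) for V means that only P adjusts to the disequilibrium error.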

Finally, when all the time series are non-stationary and linearly independent, the rank of matrix \(\varvec{B}\) is zero. In this case, there is no simple equilibrium relationship among these time series, and bottom-up and top-down hypotheses cannot be tested.

\({\varvec{N}}\) may also have a drifting trend. This trend is incorporated into \(\varvec{C}\), and its significance can then be tested (Johansen 1991). When the population dynamics do not have a drift term, the intercept term (\(\varvec{C}\)) in Eq. 1 can be moved inside the matrix product \(\varvec{B N}\), and it will be stacked at the first row of \(\beta \) with \(\varvec{N}\) augmented by a constant. Such a model is called a restricted intercept model.

2.3 Significance of a focal population in the long-run component of the assemblage dynamics

Sometimes it is necessary to test the significance of a focal population in the equilibrium relationship(s). Suppose we have time series data of one resource population and multiple consumer populations. When an equilibrium (co-integration) relation is identified in the data set, either of the following scenarios is possible: (1) the equilibrium relation is among those consumer populations; (2) the equilibrium relation is between the resource and consumer populations. Testing the significance of the resource population in the equilibrium relation rules out the first scenario where the interaction is competitive.

When the community assemblage has one or more equilibrium relationships identified through testing the rank of \(\varvec{B}\), the significance of a focal population in the equilibrium relationship(s) can be tested. The testing procedure is as follows. The null hypothesis assumes that the focal population does not have any equilibrium relationship with the rest. This hypothesis can be constructed by restricting the coefficient(s) corresponding to focal population(s) in each equilibrium relationship to zero (coefficients in the columns of \(\beta \) for the co-integration relation). The likelihood ratio test is used to test the significance of the focal population in the equilibrium relationship.

2.4 Significance of bottom-up forcing and top-down forcing in the long-run component

If the resource population is found to contribute significantly to the dynamics, we then test the significance of bottom-up forcing and top-down forcing. These hypotheses are formed by restricting elements of \(\alpha \).

Before testing the restrictions on \(\alpha \), we have already tested the significance of the co-integration relation and the significance of the focal population in that relation. Non-zero entries of \(\alpha \) thus indicate that the corresponding variable is responding to changes in the long term co-integration relation (or the disequilibrium errors), which is a linear combination of both resource and consumer variables. A row of zeros indicates that the corresponding variable is not responding to the changes in the long term relation. Therefore, finding non-zero entries in \(\alpha \) determines which population is responding to the changes in the long term relation. When we find non-zero entries in the row of \(\alpha \) for the resource equation, we say there is a top-down effect in the long term dynamics; similarly, when we find non-zero entries in the row of \(\alpha \) for the consumer equation, we say there is a bottom-up effect in the long term dynamics.

The top-down hypothesis assumes that only the resource population is responding to deviations from the equilibrium relationship(s), and other populations are not. This hypothesis can be constructed by restricting the adjustment coefficients for the consumer populations to zero, while allowing the coefficient for the resource population to vary freely. When we have only one resource population, the restriction on \(\alpha \) forces \(r\le 1\). When there is more than one equilibrium relationship, this identified equilibrium relationship is a linear combination of all the equilibrium relationships. The likelihood ratio test can be constructed, and the test statistic has an asymptotic \(\chi ^2\) distribution. Deviations from the null hypothesis indicate bottom-up forcing.

Meanwhile, the bottom-up hypothesis assumes that only the consumer group is responding to deviations from the equilibrium relationship(s). Similar to the top-down hypothesis, in the null hypothesis the adjustment coefficients for the resource group are set to zero, and those of the consumer populations are allowed to vary freely. Deviations from the null hypothesis indicate top-down forcing.
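As an illustration, consider one resource and two consumers with a single equilibrium relationship (\(r=1\)), with the populations ordered as the resource first and then the two consumers. This ordering, the subscripts, and the labels TD and BU are assumptions made only for this sketch. The two null hypotheses then restrict \(\alpha \) as

$$\begin{aligned} H_0^\mathrm{TD}:\ \alpha =(\alpha _1,0,0)^\mathrm{T}, \qquad H_0^\mathrm{BU}:\ \alpha =(0,\alpha _2,\alpha _3)^\mathrm{T} \end{aligned}$$

where the remaining entries are estimated freely. Rejecting \(H_0^\mathrm{TD}\) means that at least one consumer responds to the disequilibrium error, and rejecting \(H_0^\mathrm{BU}\) means that the resource responds to it.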

For the density dependence in the short-term component of the model, see the “Appendix”. In addition, it is also possible for a community to have both significant bottom-up forcing and top-down forcing. For such a community, the dynamics are interactive: both the resource and the consumer populations are affecting the dynamics of each other. Moreover, in the “Appendix”, a nonlinear predator-prey model was used to explore small sample properties of the co-integration method and bottom-up and top-down tests, and these tests showed reasonable performance even with short time series.

2.5 Data

The analysis testing ecological hypotheses is illustrated using survey data collected for fisheries management. The data were collected by the United States National Marine Fisheries Service (NMFS) Southeast Area Monitoring and Assessment Program (SEAMAP; Stuntz et al. 1985). Our analysis used the semi-annual shrimp/ground fish surveys conducted from 1986 to 2011 in the northern Gulf of Mexico between Mobile Bay, Alabama and Brownsville, Texas. Each year, approximately 300–400 bottom trawl samples were collected. Although the SEAMAP shrimp/ground fish program started in 1982, the data from 1982 to 1985 were omitted from the analysis due to inconsistent coverage and changes in sampling methods. Surveys were conducted once during the summer (May–July) and once during the fall (October–November). The northern part of the Gulf of Mexico is divided into 21 statistical zones for fisheries surveys. Our analysis used data from 10 statistical zones in the west, zones 11 through 21, excluding zone 12. The location of bottom trawl stations was randomly selected at each sampling occasion within each zone. At each sampling station, time and environment variables, e.g., water temperature, salinity and depth, were recorded at the time of sampling.

Fig. 3

Standardized CPUE time series of brown shrimp (a), Atlantic croaker (b), silver seatrout (c) and sand seatrout (d) from statistical zone 15 in the Gulf of Mexico from 1986 to 2011

Penaeid shrimps found in the northern Gulf of Mexico include brown shrimp, Farfantepenaeus aztecus, white shrimp, Litopenaeus setiferus, and pink shrimp, Farfantepenaeus duorarum (Perez-Farfante and Kensley 1997). The commercial shrimp fishery targeted all of these species. In 2009, landings were estimated at 118 million kg and valued at 340 million US dollars (National Marine Fisheries Service 2010). Brown shrimp was the most abundant among the three species.

Three finfish species, Atlantic croaker, Micropogonias undulatus, silver seatrout, Cynoscion nothus, and sand seatrout, Cynoscion arenarius, were included in the analysis. Their distribution patterns, ecological roles, and status in bottom fisheries, with an emphasis on their relation with penaeid shrimps, are outlined in the “Appendix”. These fish species were chosen because they ranked high in catches in the surveys as well as in bycatch in the shrimp fishery (Diamond 2004). Both juvenile and adult individuals are well represented in the data (Monk et al. 2015). The fact that these fish species coexisted in large numbers with brown shrimp suggests the potential for ecological interactions between each of these fish species and brown shrimp.

2.6 CPUE standardization of shrimp and fish populations in the Gulf of Mexico from SEAMAP shrimp/ground fish program

Data were standardized by fitting a delta generalized linear model (Fletcher et al. 2005) to each species in each statistical zone to extract annual abundance indices. This method is commonly used in this region (see the “Appendix” for details). We modelled season as a categorical variable (summer is modelled as no effect), and removed the effect of fall sampling from the data. All the following analyses were conducted on the natural logarithms of the standardized CPUE time series.

The existence of a unit root was tested using the augmented Dickey–Fuller (ADF) test (Said and Dickey 1984). The null hypothesis was that the tested time series was a unit root process. We used Bayesian Information Criterion (BIC) for model selection from a set of models (Schwert 2002), which had different time lags. All the statistical analyses were performed using R statistical software (R Core Team 2014), and the scripts used in this paper can be found in the supplementary material.

3 Results

The CPUE time series of brown shrimp and three fish species in zone 15 from 1986 to 2011 are shown in Fig. 3. Time series of other zones can be found in the supplementary material. The zone-specific results that follow are for zone 15 unless specifically noted otherwise. Brown shrimp showed an intermediate level of fluctuations among all the species considered over the 26-year period, with CPUE peaking in 1993 and 2010. Atlantic croaker was the most abundant among all the species, and its CPUE abruptly increased from 2006 to 2007. Silver seatrout and sand seatrout showed patterns similar to brown shrimp, with peaks around 1995 and 2009, but their time series appeared to exhibit a higher frequency of fluctuations.

ADF tests showed that the existence of a unit-root could not be rejected, and the tests on first differenced data all rejected the existence of a unit-root at the 1 % level (Table 1). This suggested that the original log-transformed time series were non-stationary, and were integrated of order one. Unit-root tests on the data from other zones also showed the prevalence of unit-root non-stationarity in the CPUE time series (see supplementary material).

The likelihood ratio test statistic of a restricted intercept model against the unrestricted model was 0.68 (\(p=0.88\)). This test supports the assumption that there is no drift term, \({\varvec{C}}=0\). The tests from other zones all supported the model with no drift term. The Johansen rank test showed that, under the restricted model, the null hypothesis of no equilibrium relationships was rejected at the 5 % level, and the null of at most one equilibrium relationship could not be rejected (Table 2). For this case, we accept that there was one equilibrium relationship in the community assemblage dynamics. Estimates of \(\alpha \) and \(\beta \) from the intercept-restricted model with \(r=1\) were then obtained (Table 3). The co-integration relation, i.e., the disequilibrium errors, is plotted in Fig. 4. The disequilibrium errors appear stationary, which suggests density dependence at the community level even though the single populations lacked density dependence.

Table 1 Results for augmented Dickey–Fuller tests on variables from statistical zone 15
Table 2 Johansen cointegration test of the rank of matrix B from the intercept restricted model from statistical zone 15 with critical values at the 5 and the 1 % level

The hypothesis tests showed that the species interaction pattern was location-dependent in the Gulf of Mexico. Community assemblages in zones 14 and 19 did not show any equilibrium relationship, those in zones 16 and 17 showed two equilibrium relationships, and the rest of the locations showed one significant relationship (Table 4). The test of the significance of the brown shrimp population in the equilibrium relationship(s) was significant in zones 16, 17 and 18 (Table 4), indicating a significant contribution of brown shrimp in those zones. The tests of bottom-up and top-down forcing were then conducted on these three zones. The contribution of the brown shrimp population to the equilibrium relationship was not significant in zones 11, 13, 15, 20 and 21, where the recognized equilibrium relationship was among the fish species.

Table 3 Maximum likelihood estimates of \(\alpha \) and \(\beta \) in Model \(H_1^*(2)\) of statistical zone 15
Fig. 4 Cointegration relation from statistical zone 15 between 1986 and 2011 in the Gulf of Mexico. Stationary fluctuation suggests community density dependence among populations

Table 4 Number of equilibrium relationships (ERs) and p-values of the test of the significance of brown shrimp in community assemblage dynamics

Among the three assemblages in which the brown shrimp population contributed significantly to the long-run component of the community dynamics, significant bottom-up forcing was identified in two, zones 16 and 18 (Table 5). Top-down forcing was not significant in any of the three assemblages. In zone 17, neither bottom-up nor top-down forcing was identified.

Table 5 Tests of bottom-up forcing and top-down forcing in zones 16–18

4 Discussion

In this paper, we demonstrated a new practical approach to testing bottom-up and top-down processes in a multi-species community with short time series that appear to be unregulated. This approach explicitly tests for and incorporates non-stationary dynamics in multivariate time series, which are common in ecological and fisheries data. Compared with the traditional correlation approach, results of this method are robust against complex and/or partially observable community structure. The method demonstrated here is a type of restricted VAR model, and this restriction allows explicit incorporation of non-stationary dynamics into the model formulation and estimation. For systems exhibiting high-frequency fluctuation due to strong nonlinearity, e.g., chaotic dynamics, a method based on nonlinear state space reconstruction can be used (Takens 1981; Sugihara et al. 2012).

The method presented in this paper is related to the MAR models in fisheries (Hampton et al. 2013). Our method adds to traditional MAR modeling by using the non-stationarity of the data to highlight long-term relationships within the community. It is a data-orientated modeling approach. The utility of the method lies not only in the addition of non-stationarity to the statistical model, but also in the ecological questions we can ask based on this modeling framework. Our approach could be applied to a large number of existing data sets to test hypotheses of species interaction patterns in community dynamics. In addition, as a type of vector autoregressive model, the model presented in this paper also has the advantage of being able to model age- or stage-structured population dynamics via delay-embedding (Nisbet and Gurney 1982).

The analysis of reduced rank VECM as presented in this paper is also closely related to dynamic factor analysis (DFA; Zuur et al. 2003) in the fisheries literature. Both methods are motivated by the analysis of non-stationary dynamics in time series, but they differ in how they represent the process. DFA characterizes time series in terms of a linear combination of unobserved random walks or common trends, whereas the co-integration method constructs time series using the disequilibrium error from equilibrium relationships. It is well known in econometrics that these two approaches are equivalent (Johansen 1995; Engle and Granger 1987; Stock and Watson 1988; Escribano and Peña 1994). Dynamic factor analysis focuses on the extraction of common trends, which carry no immediate ecological meaning other than the fact that they are shared across multiple time series. The method presented here finds linear combinations of the original time series that cancel out the shared common trends in order to extract a more stable time series. Finally, our method is built upon a multi-species model, so various ecological hypotheses, e.g., top-down and bottom-up controls, can be readily constructed.
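The duality between common trends and equilibrium relationships can be illustrated with a toy simulation. Everything in this sketch is assumed for illustration: two hypothetical series load on one shared random walk with weights 1.0 and 0.5, the noise is bounded, and the vector `beta` is chosen by hand so that the trend cancels. In practice \(\beta \) is estimated, as in Table 3.

```python
import random

random.seed(7)
n = 26                                   # same length as the 1986-2011 series
trend = [0.0]
for _ in range(n - 1):
    trend.append(trend[-1] + random.gauss(0.0, 1.0))   # shared random walk (common trend)

# Two hypothetical series loading on the same trend, each with bounded noise.
x1 = [1.0 * w + random.uniform(-0.5, 0.5) for w in trend]
x2 = [0.5 * w + random.uniform(-0.5, 0.5) for w in trend]

# Hand-picked combination: 1.0 * 1.0 + (-2.0) * 0.5 = 0, so the trend drops out.
beta = (1.0, -2.0)
z = [beta[0] * a + beta[1] * b for a, b in zip(x1, x2)]   # disequilibrium error

print("range of x1:", round(max(x1) - min(x1), 2))
print("range of z: ", round(max(z) - min(z), 2))
```

The series `x1` and `x2` inherit the wandering of the common trend, while `z` contains only the noise terms and therefore stays within fixed bounds. The DFA view describes the same data through `trend`; the co-integration view describes it through `beta` and `z`.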

Can unit-root processes represent the non-stationary dynamics exhibited by natural populations? For example, a random walk model (Lewontin and Cohen 1969) has been used to model a population under a stochastic environment (Niwa 2007; Cohen 2013). As in the random walk model, unit-root processes are statistical approximations to the dynamics; therefore, they should not be taken as representing the true underlying mechanism. For example, according to this model, individual population abundances can grow arbitrarily large given enough time, but for natural populations, there is always some upper bound imposed on population growth, e.g., by food shortage or lack of suitable habitat. On a longer time scale, this behavior of natural populations contradicts the model. However, when the observation period is relatively short for ecological studies, e.g., 26 years in this study and 31–60 years in Niwa (2007), unit-root models can provide reasonable approximations to the underlying processes and can be used to gain insights into the population and community dynamics.
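The unbounded growth can be made concrete with the simplest unit-root process, a driftless random walk on the log scale. The symbols \(x_t\), \(\varepsilon _t\) and \(\sigma ^2\) are introduced here for illustration only, and the innovations \(\varepsilon _t\) are assumed to be independent with mean zero and variance \(\sigma ^2\):

\[
x_t = x_{t-1} + \varepsilon _t = x_0 + \sum _{s=1}^{t} \varepsilon _s, \qquad \mathrm {Var}(x_t \mid x_0) = t\,\sigma ^2 .
\]

The variance grows linearly with \(t\), so no fixed bound contains the process in the long run. Over a short window such as \(t=26\), however, the standard deviation is only \(\sqrt{26}\,\sigma \approx 5.1\,\sigma \), which is why the approximation can remain reasonable for short series.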

When top-down processes exist, they are often referred to as trophic cascades, which implies that top-down forces mainly occur in communities with chain-like trophic structures, or trophic ladders (Strong 1992). However, we suggest that trophic ladders are a special case of top-down processes, one that arises in communities with simple structures. This simplistic view of top-down processes is the assumption behind the correlation analysis. Top-down processes as defined in the introduction of this paper, on the other hand, can incorporate species interactions more generally.

In marine ecosystems, bottom-up processes are considered more prevalent (Cushing 1975; Aebischer et al. 1990; Verity and Smetacek 1996; Chavez et al. 2003) and sometimes are treated as the “normal state” (Strong 1992; Frank et al. 2006), whereas top-down processes are considered to be limited to near-shore and inter-tidal communities with simple trophic structures (Chapin et al. 1997; Pinnegar et al. 2000). In the Gulf of Mexico marine communities, our results showed a diversity of species interaction patterns in the community dynamics. Brown shrimp did not have an equilibrium relationship with the three fish species investigated on either the western or the northern side of the study area. In the central zones of the study area, brown shrimp contributed significantly to the community assemblage dynamics. Significant bottom-up forcing was identified in two of those three central zones. In this study, fish species were selected based on their high population density, and this might have limited our ability to detect top-down forcing. Additionally, as pointed out by one of the reviewers, spatial heterogeneity of the ecological processes may also have affected the detection of species interactions in the example. It would be worthwhile to explore further the effect of spatial autocorrelation on the detection of species interactions. Further research should also examine how environmental covariates, including anthropogenic factors, affect observed community dynamics.