Introduction

State of Michigan geography and climate

Michigan is located in the northern United States, centrally located from east to west (Fig. 1). The state consists of two peninsulas, northern and southern. The southern (Lower) Peninsula is bordered by Lake Michigan to the west and Lake Huron to the east, and lies north of the states of Indiana and Ohio. The northern peninsula, usually called the Upper Peninsula (U.P.), is separated from the Lower Peninsula by the strait that connects Lakes Michigan and Huron. The U.P. borders the state of Wisconsin to the south and Lake Superior to the north.

Fig. 1

All precipitation stations in Michigan (solid black dots represent stations with more than 20 data points; hollow dots represent neglected stations due to insufficient data. Study area is where dots occur.)

The climate is classified as humid continental (NWS 2021), which means that it has distinct seasons and precipitation that is distributed fairly evenly throughout the year. The mean monthly temperature ranges from 24.5 °F in January to 73.5 °F in July for Detroit, the largest city, located in the southern portion of the Lower Peninsula. Precipitation falls primarily as snow in the winter, while showers and thunderstorms are frequent in the summer. Michigan is unique in that it is largely surrounded by the Great Lakes and is therefore particularly subject to precipitation and related streamflow phenomena. Michigan has 5292 km (3288 miles) of freshwater shoreline, the most of any state in the US (State of Michigan 2021).

Literature review of precipitation and streamflow trends

Several studies have examined precipitation and streamflow trends; a partial list of the most recent and most relevant is given in Table 1. Of note is Hagedorn and Meadows (2021), which studied only the undisturbed watersheds in Michigan. The current study is unique in studying all Michigan watersheds.

Table 1 Recent and relevant studies on precipitation and stream flow trends

Study purpose

The purpose of this study is to analyze the trends for precipitation and streamflow throughout the State of Michigan to shed light on the following questions:

  1. Are precipitation and streamflow increasing, as shown by increasing linear trend lines of the data?

  2. Are the extremes of precipitation and streamflow increasing, as shown by increasing slopes of the moving average and moving standard deviation of the data?

These questions have implications for future management of watersheds for flooding, transportation, recreation, and water supply. The novelty of this work is that this kind of analysis has not previously been carried out for the entire State of Michigan.

Materials and methods

Description of data

The state of Michigan has 548 precipitation stations, as shown in Fig. 1. There are multiple types of precipitation stations with some measurements taken automatically and some manually (NOAA 2021). Within all Michigan watersheds, there are 209 streamflow gaging stations, as shown in Fig. 2. Streamflow gaging stations are managed by the U.S. Geological Survey (USGS) and measure the water elevation which is used in conjunction with a stage-streamflow rating curve to calculate the streamflow (USGS 2021a). It can be seen that the spatial coverage of measurement stations is broad. Only data sets with 20 or more data points were used, to ensure meaningful statistical value.

Fig. 2

All streamflow gaging stations in Michigan watersheds (solid black dots represent stations with more than 20 data points; hollow dots represent neglected stations due to insufficient data. Study area is where dots occur.)

Methods

All available precipitation data, starting from the first year of record for each station, were obtained from the National Oceanic and Atmospheric Administration (NOAA) online weather database for all precipitation stations in Michigan (NOAA 2021). All available streamflow data were obtained from the United States Geological Survey (USGS) website for all Michigan watersheds dating back to 1901 (USGS 2021b). Precipitation and streamflow stations with fewer than 20 data points were discarded, since they contain insufficient data.

The mean, standard deviation, and coefficient of variation of all data were calculated and recorded for each station. These data were plotted as precipitation over time for each individual station and the slope of the linear regression was recorded. The slope of the linear regression line represents a long-term trend. Linear regression assumes normality, which may introduce some error in the regression line slope values. Future analysis could use the Mann–Kendall test, for example, which does not assume normality. When the nonparametric Theil–Sen slope estimator is applied instead of linear regression, the results are similar to those of linear regression on streamflow data (Wasko et al. 2020). Trends were sought, however, not exact values for predictive purposes. The number of stations with positive slopes (increasing trends) was counted and expressed as a percentage of the total number of stations that contain more than 20 precipitation data points. The 11-day moving average (also a metric of long-term trends) and the moving standard deviation, which is a metric of long-term trends in extreme values, were calculated as well. These are hereafter referred to as the “moving statistics.” The moving statistics were also plotted over time for each individual station and the slopes of their linear regressions were recorded. Again, the number of positive slopes among all of the stations was counted and expressed as a percentage of the total number of pertinent precipitation stations.
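The trend calculations described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the study: the function names, the synthetic record, the use of the sample standard deviation, and the choice to index each moving window by its centre year are all assumptions. The window length of 11 points follows the window length given in the text. A Theil–Sen estimator is included only to show the nonparametric alternative mentioned above.

```python
from statistics import mean, median, stdev


def ols_slope(years, values):
    """Least-squares slope of values against years (value units per year)."""
    x_bar, y_bar = mean(years), mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    sxx = sum((x - x_bar) ** 2 for x in years)
    return sxy / sxx


def theil_sen_slope(years, values):
    """Median of all pairwise slopes; a nonparametric alternative to OLS."""
    slopes = [
        (values[j] - values[i]) / (years[j] - years[i])
        for i in range(len(years))
        for j in range(i + 1, len(years))
        if years[j] != years[i]
    ]
    return median(slopes)


def moving_stats(values, window=11):
    """Moving mean and moving sample standard deviation over full windows."""
    means, stds = [], []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        means.append(mean(chunk))
        stds.append(stdev(chunk))
    return means, stds


def percent_positive(slopes):
    """Share of stations, in percent, whose slope is greater than zero."""
    return 100.0 * sum(s > 0 for s in slopes) / len(slopes)


# Hypothetical record (not study data): 30 values rising by 0.1 per year
# with a small alternating wiggle.
years = list(range(1990, 2020))
totals = [30.0 + 0.1 * (y - 1990) + (0.5 if y % 2 else -0.5) for y in years]

raw_slope = ols_slope(years, totals)
mov_mean, mov_std = moving_stats(totals, window=11)

# Assumption: each moving value is assigned to the centre year of its window.
centre_years = years[5:5 + len(mov_mean)]
mean_slope = ols_slope(centre_years, mov_mean)
std_slope = ols_slope(centre_years, mov_std)
```

Note that a record of 30 points yields only 20 moving values with an 11-point window, which is why fewer stations remain available for the moving statistics than for the raw trend.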

To investigate the presence of change points in precipitation and streamflow, a two-sample change point analysis based on visual inspection was performed to identify the point in each data set that best represents a split in the data. A change point is defined here as a point in time where a change in trends occurred. The change could be from a lower slope to a higher one or vice versa. In some cases, this point was chosen to be where data gaps exist, if any. The prefixes “pre” and “post” were used to identify the data prior to and after, respectively, the determined change point. t tests that assume unequal variances were conducted for all pertinent precipitation and streamflow stations using the Excel spreadsheet t test function. The t test also assumes normally distributed data, which may not be true here. The hypothesized mean difference was set to zero, since it was assumed that the “pre” and “post” data for each station originate from the same data set. Alpha, the significance level, was set to 0.05, since it is a commonly accepted value for statistical analysis. The number of precipitation and streamflow stations that had P(T ≤ t) two-tail values less than 0.05 was counted and expressed as a percentage of the total number of t tests performed. The number of t Stat values greater than t Critical two-tail values was also counted, as well as the number of stations that had both P(T ≤ t) two-tail values less than 0.05 and t Stat values greater than t Critical two-tail values. These counts were both expressed as a percentage of the total number of t tests that were conducted. The plots of streamflow over time were examined and unnatural data sets were flagged. This included data that appeared to be cyclic and/or not scattered. The number of flagged data sets was recorded. Change point analysis assumes a sudden change, which may or may not be true for climate change. The climate is changing more gradually due to the increase in greenhouse gas emissions. Nevertheless, many studies have used change point analysis to gauge trends in data.
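As an illustration of the unequal-variance (Welch) t test described above, the following sketch splits a record at a chosen change year and computes the t statistic and the Welch–Satterthwaite degrees of freedom. It is not the Excel procedure used in the study; the function names, the example values, and the convention that the change year itself belongs to the “post” data are assumptions made here for illustration.

```python
from math import sqrt
from statistics import mean, variance


def split_at(years, values, change_year):
    """Split a record into 'pre' (before change_year) and 'post' (from it on)."""
    pre = [v for y, v in zip(years, values) if y < change_year]
    post = [v for y, v in zip(years, values) if y >= change_year]
    return pre, post


def welch_t(pre, post):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(pre), len(post)
    v1 = variance(pre) / n1    # squared standard error of the 'pre' mean
    v2 = variance(post) / n2   # squared standard error of the 'post' mean
    t_stat = (mean(pre) - mean(post)) / sqrt(v1 + v2)
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, dof


# Hypothetical annual values (not study data) with a change year of 2005.
years = list(range(2000, 2010))
values = [30.1, 31.4, 29.8, 30.9, 31.2, 33.0, 34.1, 32.7, 33.8, 34.4]

pre, post = split_at(years, values, change_year=2005)
t_stat, dof = welch_t(pre, post)
```

The two-tailed p value then follows from Student's t distribution with `dof` degrees of freedom. In Python, `scipy.stats.ttest_ind(pre, post, equal_var=False)` performs the same test and returns the p value directly.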

Contour maps were created of the precipitation P(T ≤ t) two-tail values obtained from the t tests and of the slope of the linear regression line of the dataset for each station. To construct these, the latitude and longitude coordinates of each precipitation station were taken from the NOAA “Find a Station” data tool webpage. The P(T ≤ t) two-tail and linear regression slope contour maps use contour intervals of 0.05 and 0.01, respectively, and 5 and 32 color classes, respectively. Each streamflow gaging station was examined to determine whether any upstream dams exist that would potentially affect the streamflow data. Any notable structures were recorded with the corresponding gaging station.

A compilation was created of all Michigan watershed maps. A map of the state of Michigan was shown alongside each watershed map with the location of the watershed marked. The gaging stations within each watershed were marked as were any dams and rivers/streams. An arrow was placed pointing to each gaging station to identify the corresponding slope of the linear regression of the streamflow data for each location. These are all given in the Supplemental Information related to this article.

Results

Precipitation

In total, there are 548 precipitation stations within the state of Michigan. Of these 548 stations, 117 have at least 20 data points. The precipitation station at the Marquette Weather Forecast Office (WFO) in Michigan is used as an example in the following results because it is representative, has an ample amount of data, and is located relatively close to the streamflow gaging station used as an example in Sect. 3.2.

The mean of all precipitation data points for each station ranges from 28.46 inches in Houghton Lake, MI, to 40.34 inches in Niles, MI. The lowest standard deviation is found at the Gaylord Otsego County Airport in Michigan at 3.22 inches, while the highest standard deviation is 7.02 inches in Herman, MI. The Gaylord Otsego County Airport station also has the lowest coefficient of variation of precipitation data at 0.11, while the largest coefficient of variation is 0.21 at the Harbor Beach 1 SSE, MI, station. Figure 3 shows the precipitation data plot for the Marquette WFO, MI, station. This type of plot was created for all 117 precipitation stations and can be found in the supplemental information. The linear regression for this particular location displays a slope of 0.0902 in/year.
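The per-station summary statistics quoted above can be computed as in the short sketch below. The function name and the sample values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def summarize(values):
    """Mean, sample standard deviation, and coefficient of variation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample (n - 1) standard deviation
    return {"mean": m, "std": s, "cv": s / m}


# Hypothetical annual precipitation totals in inches (not study data).
stats = summarize([30.0, 34.0, 38.0])
```

Because the coefficient of variation is the standard deviation divided by the mean, it is dimensionless and allows stations with different mean precipitation to be compared directly.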

Fig. 3

Precipitation data for the example station at Marquette WFO, MI

Of all precipitation stations, the slope of the linear regressions ranged from −0.1448 in/year in Morenci, MI, to 0.4039 in/year at the Gaylord 9SSW, MI, station. Positive linear regression slopes were found at 89.74% of the stations. When this slope is divided by the average precipitation of each location and multiplied by 100, these same two stations had the smallest and largest values of this statistic, at −0.3860/year and 1.0532/year, respectively. It was found that 89.74% of the 117 stations had positive values of linear regression slope divided by average precipitation.
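The normalized statistic described above expresses each trend as a percentage of the station mean per year, which makes wet and dry stations comparable. A minimal sketch follows; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def relative_slope(slope, mean_value):
    """Regression slope as a percentage of the station mean, per year."""
    return 100.0 * slope / mean_value


# Hypothetical station: trend of 0.09 in/year on a mean of 36 inches,
# i.e. roughly a quarter of a percent of the mean per year.
example = relative_slope(0.09, 36.0)
```

Since the mean is positive, the sign of the normalized statistic always matches the sign of the raw slope, so the percentage of stations with positive values is expected to be the same for both.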

After calculating the 11-day precipitation moving averages, moving standard deviations, and moving coefficients of variation, it was seen that 46 stations had at least 20 data points and could be further evaluated. The slope of the linear regressions for the moving average data sets ranged from −0.0891 in/year at the Pellston Regional Airport in Michigan to 0.2671 in/year at the Caro Wastewater Treatment Plant in Michigan. Additionally, 76.09% of the 46 stations had positive linear regression slopes for moving averages. The moving standard deviation linear regression slopes ranged from −0.1465 in/year at the Manistee 3SE, MI, station to 0.1262 in/year at the Detroit City Airport in Michigan. Of the 46 precipitation stations, 54.35% had positive linear regression slopes for the moving standard deviations. The Manistee 3SE, MI, station and the Detroit City Airport station also have the smallest and largest slopes of the moving coefficient of variation regression, at −0.0053/year and 0.0040/year, respectively. Of the 46 precipitation stations, 41.30% had positive linear regression slopes for the moving coefficients of variation.

An example of splitting the data into two datasets based on a visual inspection of changing slopes is shown in Fig. 4 for the Marquette WFO, MI, station. For this example, the “pre” data have a slope of 0.5679 in/year and the “post” data exhibit a slope of 0.2634 in/year. This corresponds to a decrease in slope of 53.62%. The mean percentage of change between all of the “pre” and “post” precipitation slopes was found to be −123.13%. Table 2 displays the results of the t test for this precipitation station, where the first three rows of column one are for the “pre” data and the first three rows of column two are representative of the “post” data. A summary of all the statistics for all precipitation gage locations is given in Table 3.
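The percentage quoted for the example station can be reproduced from the two slopes. The sketch below assumes the percentage is the reduction relative to the “pre” slope; the sign convention is not stated in the text, so the function name and its sign convention are assumptions.

```python
def slope_change_percent(pre_slope, post_slope):
    """Reduction from the 'pre' slope to the 'post' slope, in percent of 'pre'."""
    return 100.0 * (pre_slope - post_slope) / pre_slope


# Slopes reported for the Marquette WFO example station, in in/year.
marquette = slope_change_percent(0.5679, 0.2634)
print(round(marquette, 2))  # 53.62
```

With this convention a positive result means the trend flattened after the change point, and a value above 100% would mean the slope changed sign.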

Fig. 4

“Pre” (circle) and “post” (triangle) precipitation data for the example station at Marquette WFO, MI

Table 2 Precipitation T test results for the example station at Marquette WFO, MI
Table 3 Summary of precipitation statistical results for all gages

In total, there were 43 precipitation stations, or 36.75%, that had P(T ≤ t) two-tail values less than 0.05. Three of these stations also had t Stat values greater than t Critical two-tail values; therefore, the hypothesis that the data of each individual precipitation station come from a singular dataset is rejected for 2.56% of the 117 precipitation stations that t tests were run for.
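The station counts above can be tallied as in the following sketch, which also reproduces the two reported percentages from the counts. The function names and the three example tuples are hypothetical. The sketch compares the signed t Stat with t Critical, as the text describes. For a two-tailed test, a p value below alpha corresponds to the absolute value of t exceeding t Critical, so a station with a large negative t Stat passes the first criterion but not the second; whether this accounts for the difference between the two counts is not stated in the text.

```python
def share_percent(count, total):
    """Count expressed as a percentage of the total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def tally(results, alpha=0.05):
    """Count stations meeting each criterion.

    `results` holds one (p_two_tail, t_stat, t_critical) tuple per station.
    """
    low_p = sum(p < alpha for p, _, _ in results)
    high_t = sum(t > t_crit for _, t, t_crit in results)
    both = sum(p < alpha and t > t_crit for p, t, t_crit in results)
    return low_p, high_t, both


# Hypothetical t test outputs for three stations (not values from the study).
example = [(0.01, 2.9, 2.1), (0.03, -2.4, 2.1), (0.40, 0.8, 2.1)]
counts = tally(example)

print(share_percent(43, 117))  # 36.75
print(share_percent(3, 117))   # 2.56
```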

Figure 5 shows the contour map that was created using all of the obtained precipitation P(T ≤ t) two-tail values. Similarly, Fig. 6 shows the contour map that was created using the slope of the linear regression of each station’s precipitation values.

Fig. 5

Contour map of precipitation P(T ≤ t) two-tail values (each black dot is a precipitation station and the contour lines are colored with the red lines representing the highest P(T ≤ t) two-tail values and the purple lines representing the lowest. Study area is where dots occur.)

Fig. 6

Contour map of precipitation linear regression slope values (the red contour lines represent the highest values, while the purple contour lines represent the lowest. Study area is where dots occur.)

A map of the Escanaba River Basin is shown in Fig. 7 with the precipitation stations marked. Each station is labeled and those that had enough data include the value of the slope of the linear regression of the respective station’s data.

Fig. 7

Escanaba River Basin precipitation stations with the slope of the linear regression, m, for those gages that had sufficient data (Google Maps)

Streamflow

In total, there are 209 streamflow gages in Michigan watersheds. Of these 209 gages, 143 have at least 20 data points. The Escanaba River Basin is used as an example in the following results because it is representative and contains multiple streamflow gaging stations and dams. The analysis for all watersheds is given in the supplemental information.

The mean of all streamflow data points for each gaging station ranges from 5.59 ft³/s in Oshtemo, MI, to 4831.24 ft³/s in Saginaw, MI. These same gaging stations also hold the lowest and highest standard deviations of 0.79 ft³/s and 1224.69 ft³/s, respectively. The lowest coefficient of variation of streamflow data is found in East Jordan, MI, at 0.05 while the highest is found in Palmer, MI, at 0.48. Figure 8 shows the streamflow data plot for gaging station 04057800 in the Middle Branch Escanaba River located in Humboldt, MI. This type of plot was created for all 143 gaging stations and can be found in the supplemental information. The linear regression for this particular location displays a slope of −0.1822 cfs/year.

Fig. 8

Streamflow data for gage 04057800 in the Middle Branch Escanaba River located in Humboldt, MI

Of all streamflow gaging stations, the slope of the linear regressions ranged from −5.7041 cfs/year in Pembine, WI, to 46.09 cfs/year in Banat, MI. Positive linear regression slopes were found at 75.68% of the gages. When this slope is divided by the average streamflow of each location and multiplied by 100, the minimum of this statistic is found in Oshtemo, MI, with a value of −2.2064/year. This is the same gaging station that exhibits the smallest average streamflow and standard deviation. The same relationship does not hold for the largest value of this statistic, which is found in Hastings, MI, at 2.6554/year. It was found that 74.58% of the 143 gaging stations had positive values of linear regression slope divided by average streamflow.

After calculating the 11-day streamflow moving averages and moving standard deviations, it was seen that 118 gaging stations had at least 20 data points and could be further evaluated. The slope of the linear regressions for the moving average data sets ranged from −25.767 cfs/year in Pembine, WI, to 18.0 cfs/year in Grand Rapids, MI. The location in Pembine is the same gaging station that had the linear regression with the smallest slope from the raw data. Additionally, 66.10% of the 118 gages had positive linear regression slopes for moving averages. The moving standard deviation linear regression slopes ranged from −7.3222 cfs/year at the same Pembine location to 5.0578 cfs/year in Vulcan, MI. The gaging station in Vulcan is on the Menominee River, the same river as the Pembine gage. Of the 118 gages, 27.97% had positive linear regression slopes for the moving standard deviations. The smallest slope of the moving coefficient of variation regressions is −0.0081/year in Birmingham, MI, while the largest is 0.0047/year in Palmer, MI. The gaging station in Palmer is the same one that had the largest coefficient of variation of its raw data. Of the 118 gages, 33.05% had positive linear regression slopes for the moving coefficients of variation.

There were 29 data sets that appeared to have cyclic/unnatural streamflow data, perhaps due to a dam gate opening and closing according to some set algorithm. Regardless of the cause, such data exist for a portion of the gaging stations, and this should be noted. An example of these data is shown in Fig. 9.

Fig. 9

Example of cyclic data of streamflow at gage 04126740

An example of splitting the data into two datasets based on a visual inspection of changing slopes is shown in Fig. 10 for gaging station 04057800 in the Middle Branch Escanaba River located in Humboldt, MI. For this example, the “pre” data have a slope of 0.2885 cfs/year and the “post” data exhibit a slope of 0.0108 cfs/year. This corresponds to a decrease in slope of 96.26%. The mean percentage of change between all of the “pre” and “post” streamflow slopes was found to be 252.65%. Table 4 displays the results of the t test for this gaging station. A summary of all the statistics for all streamflow gage locations is given in Table 5.

Fig. 10

“Pre” (circle) and “post” (triangle) streamflow data for gage 04057800 at Mid Branch of the Escanaba River at Humboldt, MI

Table 4 Streamflow T test results for gage 04057800
Table 5 Summary of streamflow statistical results for all gages

In total, there were ten streamflow gaging stations, or 7.09%, that had P(T ≤ t) two-tail values less than 0.05. All ten of these gaging stations also had t Stat values greater than t Critical two-tail values. Therefore, the hypothesis that the data of each individual streamflow gaging station come from a singular dataset is rejected for 7.09% of the 141 streamflow gaging stations that t tests were run for. Figure 11 shows the contour map that was created using all of the obtained streamflow P(T ≤ t) two-tail values.

Fig. 11

Contour map of streamflow P(T ≤ t) two-tail values (each black dot is a streamflow gaging station and the contour lines are colored with the red lines representing the highest P(T ≤ t) two-tail values and the purple lines representing the lowest)

Twenty percent of streamflow gaging stations were found to have nearby dams. It is possible that these dams impact the streamflow data and any related calculations.

The Escanaba watershed is shown in Fig. 12. Each streamflow gaging station is labeled with the slope of the linear regression of its streamflow data.

Fig. 12

Escanaba River Basin streamflow with slope of streamflow linear regression, m (Google Maps). The rest can be found in the Supplemental Information (the rivers/streams are blue lines, orange dots represent the streamflow gaging stations, and black rectangles represent dams)

Discussion

The results of this study, namely that precipitation is increasing, agree with climate change theory, which says that there will be more precipitation in this region of the world as the climate warms. The result that the river flow rate is not increasing agrees with hydrologic theory, in that surface storage is able to absorb some of the increased precipitation.

This study quantified changing trends in precipitation and streamflow using the linear regression best-fit line for the whole data set and for the data before and after a change point, together with the moving mean and moving standard deviation. Other statistical methods could have been used and have been used in other studies. These include frequent patterns mining and Random Forest methods (Zeng et al. 2021), the Budyko hypothesis and the TUW model (Zhong et al. 2021), the annual time series of 7-day average minimum streamflow, the scaled average deficit at or below the 2% mean daily streamflow value relative to a base period, and the annual number of days below the 2% threshold (Fleming et al. 2021), different nonparametric Mann–Kendall trend tests (Adib and Tavancheh 2019; Yan et al. 2017; Asarian and Walker 2016), gridded data set comparison (Henn et al. 2018), Mann–Kendall and Pettitt's tests and the double mass curve method (Guo et al. 2018), multiple linear regression (Shrestha et al. 2021), wavelet transfer methods (Zhang et al. 2017), detrended fluctuation analysis and multifractal DFA (Tan and Gan 2017), and m-DMC and m-SCARQ approaches (Swain et al. 2021).

Conclusions

A statistical analysis of all precipitation and streamflow data for the entire state of Michigan shows the following:

  1. The vast majority of gaging locations in Michigan have increasing precipitation (90%) and streamflow (76%).

  2. A lower, yet still significant, percentage of precipitation gage locations have increasing moving standard deviation values (54%).

  3. A minority of streamflow gage locations have an increasing moving standard deviation (28%).

  4. The hypothesis that precipitation and streamflow are increasing is, therefore, confirmed.

  5. The hypothesis that extremes of precipitation are increasing is also confirmed.

  6. The hypothesis that extremes of streamflow are increasing is not confirmed.

  7. Dams and reservoirs help keep some precipitation from reaching rivers, thereby reducing the possible effects of a changing climate on water management.

  8. Values of precipitation P(T ≤ t) two-tail, precipitation linear regression slope, and streamflow P(T ≤ t) two-tail occur in concentrated regions.

  9. Water managers may need increased budgets in the future to handle greater streamflow values for flooding, transportation, recreation, and water supply.