Introduction

Municipal solid waste (MSW) is one of the most prominent byproducts of modern lifestyles. MSW generation is a continuous process that is largely driven by rapid urbanization and economic growth [1]. The quantity of waste generated reflects the urbanization, industrialization, and socio-economic development of a country [2]. Because of high economic growth and rapid urbanization in China, MSW generation is a significant concern for local governments seeking to protect public health. Compared with other Asian countries, the MSW generation rate in China is considerably higher [3]. Consequently, China needs accurate predictions of MSW generation for proper management, future planning, and utilization of MSW [4]. Estimating future waste generation is the basis for developing an MSW management (MSWM) plan. Improper forecasting may lead to problems in selecting priority management techniques, establishing infrastructure, setting management policies, and limiting environmental impacts. However, forecasting long-term MSW generation is challenging because of significant changes in MSWM policy and the adoption of new technologies by the authorities. In China, most published research has focused only on the socio-economic factors influencing MSW generation in a specific province or municipality over a specific period; few studies have investigated the influencing factors for Chinese MSW generation, or predicted future MSW generation, at the national scale.

Up to now, several methods have been applied to forecast MSW quantities. They can be categorized into five major groups: descriptive statistical methods [5], regression analysis [4], material flow methods [6, 7], time series analysis [8,9,10], and artificial intelligence methods [11, 12]. From 2006 to 2014, a number of prominent models were used to predict MSW generation, including support vector machines, wavelet transforms, artificial neural networks, system dynamics, multiple-regression analysis, single regression analysis, fuzzy logic, geographical information systems, the analytic hierarchy process, the grey model, and time series analysis models [13].

The traditional descriptive statistical method is generally used to forecast per-capita average waste production and population growth. This method is limited by the dynamic characteristics of MSW generation [14]. Regression analysis is the most popular method for forecasting MSW generation: between 2006 and 2014, it was the approach most commonly used by researchers (20%) [13]. However, it does not consider all the factors responsible for waste generation. Unlike descriptive statistics and regression methods, the material flow model explicitly characterizes the dynamic process of waste generation and encompasses every input and output of that process. Many researchers have also applied time series analysis to forecast MSW generation and obtained better results [15]. The time series method employs historical data and uses the relationship between input and output variables to predict future target values [16]. Conventional time series analysis requires a large amount of data to forecast accurately, even for a short period [9]. To overcome this limitation and to forecast over long-term periods, the grey model (GM) was developed [9, 17]. The GM (1, 1) is a univariate model that follows the grey exponential law [18]; its name is read as "grey model, first order, one variable" [19]. Unlike conventional models, this model does not depend on social and other predictor parameters [20]. The GM (1, 1) model is the most widely used grey model for MSW forecasting, time series analysis, and other applications [21, 22].

Artificial intelligence (AI), alternatively known as computational intelligence or machine learning, encompasses the science and engineering of creating intelligent machines with human-level intelligence (or better) to solve a wide range of practical problems [23]. In recent years, AI has become a more popular and accepted way to forecast MSW generation. Several AI models have been employed for this purpose, including artificial neural networks (ANN), fuzzy logic, and genetic algorithms [13]. An ANN is a computer program inspired by the information processing system of the human brain. It is composed of numerous interconnected elements, called artificial neurons, working in unison to solve a specific problem. The key feature of this structure is the way it processes data. Unlike other computer programs, an ANN acquires knowledge by detecting the patterns and relationships within a dataset and learns through experience, a process called training [24]. Because of its learning abilities, the ANN is a significant tool for forecasting short-, medium-, and long-term MSW generation [14, 20, 25]. The non-linear nature of the ANN yields more accurate output than other forecasting methods [12]. However, data over-fitting, the difficulty of interpreting the network architecture, and poor generalization performance limit the application of ANNs [26]. Recently, fuzzy inference systems have also been widely used in the field of waste management [20, 27]. Fuzzy logic works with fuzzy rules, which are produced while the inference system is trained on data. The fuzzy rules extract knowledge from the training data, which is an effective and natural way to assist human reasoning and decision-making [28]. The above-mentioned models have been employed for MSW generation (MSWG) in different cities or provinces around the world, but they have not been tested at large scale for Chinese MSWG.

In light of these challenges, this study examines the influence of selected socio-economic factors responsible for MSWG and forecasts MSWG in China to 2030 using a multi-model approach (i.e., artificial neural network, grey model, and linear regression).

Background of MSWM process in China

Municipal solid waste management is one of the most important issues for the state and the environment. Proper MSWM must cover all related issues in the waste management planning process, from MSW generation to final disposal and its environmental consequences. Most developing countries lack adequate data on waste generation [29], which affects the MSWM process. Without detailed knowledge of all waste-related issues, the MSWM process is unlikely to succeed. We propose an interconnected framework for a proper MSWM process in which MSWG forecasting and the identification of socio-economic factors play a significant role (Fig. 1). The framework is intended to help improve future waste management plans. It shows an MSWM plan that incorporates MSWG forecasting, the budget, and a review of the status and limitations of existing MSWM. The limitations of existing management indicate where future improvement is required. MSWG forecasting gives an estimate of the future waste load and helps to set a management strategy within the budget. The selection of MSWM strategies, technologies, or both depends on waste generation, particularly on the factors, amount, and nature of MSWG.

Fig. 1 Interconnected framework for MSWM process

In China, MSWM has adopted several strategies, such as the 3R techniques (reduce, reuse, and recycle) [30], resource recovery, hazard reduction, and PPR (Producer Pay Responsibility) [31], introduced in 2009 and 2007, respectively. However, the application of these strategies is not evident in practice, MSW disposal fees have not yet received considerable attention, and no volume- or weight-based charging system is practiced in China. By contrast, many European countries have found that an effective waste disposal charging system has a significant effect on increasing MSW recycling and reducing MSWG [32, 33]. In a developing country, a fixed-rate waste disposal charge may not suit people across a diverse range of incomes [34]. Thus, an effective volume- or weight-based disposal charge may improve MSW recycling and reduce MSWG in China.

As with the strategies, recent research has found drawbacks in the technologies and methods used for MSWM in China. Specifically, the limitations relate to inefficient source-separated waste collection [35], greenhouse gas (GHG) emissions [36, 37] and environmental hazard emissions during MSW disposal (incineration and landfill) [38,39,40], toxic heavy metal emissions from open burning of MSW [41], and groundwater contamination from landfill leachate [42, 43], all of which are significant threats to the environment.

Methodology

The historical data obtained from the China Statistical Yearbook (2000–2016) were used for the socio-economic factor analysis and MSWG forecasting [44]. Because the dataset is short, we selected the GM (1, 1), linear regression, and ANN models for forecasting. The GM (1, 1) model is well suited to short datasets and does not require social or other predictor values. Linear regression gives better results for a large historical dataset. The artificial neural network can be applied to a short dataset because an ANN generates output by acquiring knowledge from the patterns and relationships in the data. Several researchers have employed ANN models for forecasting based on short datasets [45].

The models used in this study are described below.

Evaluation of socio-economic factors

Factors responsible for MSW generation were initially evaluated using the Fuzzy Logic Toolbox of Matlab. Data were trained in the fuzzy editor to generate a basic Sugeno fuzzy inference structure. In each test, two inputs \({x_1}\) and \({x_2}\) drawn from the independent variables (urban population, GDP, and energy consumption (EC)) and one output \({y_i}\) (MSWG) were used to train the fuzzy inference system. For example, for the influence of urban population vs GDP on MSWG, \({x_1}\) = urban population, \({x_2}\) = GDP, and \({y_i}\) = MSWG. The same setup was used for EC vs urban population and for EC vs GDP. Nine rules were employed in the fuzzy inference to generate the fuzzy inference surface map.

The rules of the fuzzy model are listed below:

$${\text{if}}~{x_1}\;{\text{is}}~{A_1}~{\text{and}}~~{x_2}\;{\text{is}}~{B_1}~{\text{then}}~~{y_1}=~{p_1}{x_1}+{q_1}{x_2}+{r_1},$$
(1)
$${\text{if}}~{x_1}\;{\text{is}}~{A_2}~{\text{and}}~~{x_2}\;{\text{is}}~{B_2}~{\text{then}}~{y_2}=~{p_2}{x_1}+{q_2}{x_2}+{r_2},$$
(2)
$${\text{if}}~{x_1}\;{\text{is}}~{A_n}~\;{\text{and}}~{x_2}\;{\text{is}}~{B_n}~{\text{then}}~{y_n}=~{p_n}{x_1}+{q_n}{x_2}+{r_n},~~\;n=9,$$
(3)

where \({x_1}\) and \({x_2}\) are input variables, and \({A_i}\) and \({B_i}\) are fuzzy sets defined by triangular-shaped membership functions on those variables. \({p_i}\), \({q_i}\), and \({r_i}\) are design parameters, which are determined in the training process.

In the second layer of the Adaptive Neuro-Fuzzy Inference System (ANFIS) model structure, every node computes the degree of activation of a rule as the product of the membership degrees:

$${w_i}={\mu _{{A_i}}}\left( {{x_1}} \right) \times {\mu _{{B_i}}}\left( {{x_2}} \right)$$
(4)

where \({\mu _{{A_i}}}\left( {{x_1}} \right)\) and \({\mu _{{B_i}}}\left( {{x_2}} \right)\) are the membership degrees of \({x_1}\) and \({x_2}\) in the fuzzy sets \({A_i}\) and \({B_i}\), respectively.

In the third layer, the activation degree \({w_i}\) of the ith rule is normalized over all rules:

$${\bar {w}_i}=\frac{{{w_i}}}{{\mathop \sum \nolimits_{j} {w_j}}}$$
(5)

The fourth layer calculates the output of each node, and the fifth layer generates the overall output as the sum of all incoming signals. The output of the ith node is

$${\bar {w}_i}{y_i}={\bar {w}_i}\left( {{p_i}{x_1}+{q_i}{x_2}+{r_i}} \right)$$
(6)

and the overall output is

$$\mathop \sum \limits_{i} {\bar {w}_i}{y_i}=~\frac{{\mathop \sum \nolimits_{i} {w_i}{y_i}}}{{\mathop \sum \nolimits_{i} {w_i}}}$$
(7)

The fuzzy inference used in this study follows the interpretation of Abbasi et al. [20]. Conceptually, fuzzy logic is flexible, simple, and easy to understand. It can build understanding into the process rather than tacking it onto the end, and it can approximate non-linear functions of arbitrary complexity. The ANFIS toolbox can easily create a fuzzy system to match any set of input–output data, which is a convenient way to map an input space to an output space [46].
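As an illustration of Eqs. (1)–(7), the forward pass of a first-order Sugeno system can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Matlab toolbox implementation used in the study: the triangle positions, the [0, 1] input scale, and the consequent parameters \((p, q, r)\) are hypothetical placeholders (in ANFIS they would be fitted during training), and the names `trimf`, `sugeno_output`, `TRIANGLES`, and `RULES` are introduced here for illustration only. The normalization sums over all nine rules, in line with the denominator of Eq. (7).

```python
import itertools
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership degree of x for a triangle with feet a, c and peak b."""
    return float(max(0.0, min((x - a) / (b - a), (c - x) / (c - b))))

def sugeno_output(x1, x2, rules):
    """Forward pass of a first-order Sugeno system.

    rules: list of (A, B, (p, q, r)), where A and B are (a, b, c) triangles.
    """
    # Eq. (4): firing strength of each rule is the product of membership degrees
    w = np.array([trimf(x1, *A) * trimf(x2, *B) for A, B, _ in rules])
    if w.sum() == 0.0:
        raise ValueError("no rule fires for this input")
    # Eq. (5): normalize the firing strengths over all rules
    w_bar = w / w.sum()
    # Eqs. (1)-(3): linear consequent of each rule
    y = np.array([p * x1 + q * x2 + r for _, _, (p, q, r) in rules])
    # Eqs. (6)-(7): weighted sum of the rule outputs
    return float(np.sum(w_bar * y))

# Hypothetical setup: three triangles per input on a [0, 1] scale give 3 x 3 = 9 rules.
TRIANGLES = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]
# Placeholder consequents; identical for every rule so the result is easy to check.
RULES = [(A, B, (1.0, 2.0, 3.0)) for A, B in itertools.product(TRIANGLES, TRIANGLES)]

print(sugeno_output(0.3, 0.6, RULES))
```

Because every rule here shares the same consequent, the output reduces to \(x_1 + 2x_2 + 3\) whatever the firing strengths are, which makes the normalization step easy to verify by hand.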

In addition, a multiple-regression analysis and a correlation matrix were computed for the independent variables and MSWG.
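The correlation matrix and multiple regression mentioned above can be sketched as follows. This is an illustrative sketch only: the data are synthetic random numbers, not the yearbook data, the function name `fit_multiple_regression` is hypothetical, and the study itself does not state which software produced its regression results.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficients (one per predictor, intercept last) and adjusted R^2.
    """
    n, k = X.shape
    design = np.column_stack([X, np.ones(n)])        # append intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    ss_res = float(residual @ residual)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return coef, adj_r2

# Synthetic stand-in for 16 yearly observations of three predictors (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 3))
y = X @ np.array([0.5, 2.0, -1.0]) + 4.0 + rng.normal(scale=0.1, size=16)

# Correlation matrix among the three predictors and the response (4 x 4).
corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
coef, adj_r2 = fit_multiple_regression(X, y)
print(corr.round(3))
print(coef.round(3), round(adj_r2, 3))
```

Adjusted \(R^2\) is reported rather than plain \(R^2\) because it penalizes the number of predictors, which matters when only a handful of yearly observations are available.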

MSW generation forecasting

MSW generation was forecast using the grey model GM (1, 1), linear regression, and an artificial neural network (ANN). The procedure for each model is detailed below:

Grey model GM (1, 1)

The GM (1, 1) model describes the unknown system using a first-order differential equation. The basic mathematical steps of the GM (1, 1) model are as follows.

Let \({X^{\left( 0 \right)}}\) be the real discrete time series:

$${X^{\left( 0 \right)}}=\left( {{X^{\left( 0 \right)}}\left( 1 \right),{X^{\left( 0 \right)}}\left( 2 \right), \ldots ,{X^{\left( 0 \right)}}\left( i \right), \ldots ,{X^{\left( 0 \right)}}\left( n \right)} \right),\quad 1 \leq i \leq n,\; n \geq 4$$
(8)

where \({X^{\left( 0 \right)}}\) is a non-negative sequence and \({X^{\left( 0 \right)}}\left( i \right)\) is the ith value of the time series.

A new series \({X^{\left( 1 \right)}}\) is obtained by subjecting the sequence \({X^{\left( 0 \right)}}\) to the accumulating generation operation (AGO). The new series \({X^{\left( 1 \right)}}\) is monotonically increasing:

$${X^{\left( 1 \right)}}=\left( {{X^{\left( 1 \right)}}\left( 1 \right),{X^{\left( 1 \right)}}\left( 2 \right), \ldots ,{X^{\left( 1 \right)}}\left( i \right), \ldots ,{X^{\left( 1 \right)}}\left( n \right)} \right),\quad 1 \leq i \leq n,\; n \geq 4$$
(9)

where

$${X^{\left( 1 \right)}}\left( k \right)=\mathop \sum \limits_{{i=1}}^{k} {X^{\left( 0 \right)}}\left( i \right)\;k=1,2,3, \ldots ,n.$$

The first-order differential equation of grey model GM (1,1) is:

$$\frac{{{\text{d}}{X^{\left( 1 \right)}}}}{{{\text{d}}t}}+a{X^{\left( 1 \right)}}=b$$
(10)

The coefficients a and b can be obtained using the least squares method:

$$A=\left[ {\begin{array}{*{20}{c}} a \\ b \end{array}} \right]={\left( {{B^{\text{T}}}B} \right)^{ - 1}}{B^{\text{T}}}Y$$
(11)

where,

$$Y={\left[ {{X^{\left( 0 \right)}}\left( 2 \right),{X^{\left( 0 \right)}}\left( 3 \right), \ldots ,{X^{\left( 0 \right)}}\left( n \right)} \right]^{\text{T}}}$$
$$B=\left[ {\begin{array}{*{20}{c}} { - \frac{1}{2}\left( {{X^{\left( 1 \right)}}\left( 1 \right)+{X^{\left( 1 \right)}}\left( 2 \right)} \right)}&1 \\ { - \frac{1}{2}\left( {{X^{\left( 1 \right)}}\left( 2 \right)+{X^{\left( 1 \right)}}\left( 3 \right)} \right)}&1 \\ {\begin{array}{*{20}{c}} \vdots \\ { - \frac{1}{2}\left( {{X^{\left( 1 \right)}}\left( {n - 1} \right)+{X^{\left( 1 \right)}}\left( n \right)} \right)} \end{array}}&{\begin{array}{*{20}{c}} \vdots \\ 1 \end{array}} \end{array}} \right]$$

Once the coefficients a and b are obtained, the solution of the grey differential equation, Eq. (10), can be used to forecast the accumulated value at time \(\left( {k+1} \right)\):

$$X_{P}^{{\left( 1 \right)}}\left( {k+1} \right)=\left[ {{X^{\left( 0 \right)}}\left( 1 \right) - \frac{b}{a}} \right] \times {{\text{e}}^{ - ak}}+\frac{b}{a},\quad k=0,1,2, \ldots ,n.$$
(12)

The predicted value of the original series can then be estimated using the inverse accumulated generating operation (IAGO):

$$X_{P}^{{\left( 0 \right)}}\left( {k+1} \right)=X_{P}^{{\left( 1 \right)}}\left( {k+1} \right) - X_{P}^{{\left( 1 \right)}}\left( k \right).$$
(13)

Note that AGO and IAGO are the most important features of grey system theory, because they reduce the randomness of the data series. AGO uses the original time series as intermediate information for the grey prediction model. It reduces the noise of the data series by converting the ambiguous original time series into a monotonically increasing series. Because AGO makes the systematic regularity of the data easy to identify, grey prediction requires only a minimal dataset to construct a grey differential equation for prediction [47].
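The GM (1, 1) steps in Eqs. (8)–(13) can be sketched in Python as below. This is a minimal sketch rather than the code used in the study; the function names `gm11_fit` and `gm11_predict` and the example series (a synthetic sequence growing 5% per step) are assumptions introduced for illustration.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate the GM(1,1) coefficients a and b by least squares, Eq. (11)."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                        # AGO: accumulated series X^(1)
    z = 0.5 * (x1[:-1] + x1[1:])              # means of consecutive X^(1) values
    B = np.column_stack([-z, np.ones_like(z)])
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]
    return a, b

def gm11_predict(x0, a, b, steps):
    """Return fitted values for the n observations followed by `steps` forecasts."""
    x0 = np.asarray(x0, dtype=float)
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # Eq. (12)
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)                        # IAGO, Eq. (13)
    return x0_hat

# Synthetic example (assumption): ten observations growing 5% per step.
series = 100.0 * 1.05 ** np.arange(10)
a, b = gm11_fit(series)
forecast = gm11_predict(series, a, b, steps=5)
print(a, b)
print(forecast[-5:])
```

For a growing series the fitted development coefficient a is negative, so the term \(e^{-ak}\) in Eq. (12) increases with k. The least squares solve uses `np.linalg.lstsq` instead of forming \((B^{T}B)^{-1}\) explicitly, which gives the same estimate with better numerical behavior.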

Linear regression

The mathematical formulas of linear regression used to forecast MSW generation are given below.

Let the time series of actual values be \({x^\prime }\):

$${x^\prime }=\left( {{x^\prime }\left( 1 \right),{x^\prime }\left( 2 \right), \ldots ,{x^\prime }\left( n \right)} \right)$$
(14)

The linear regression equation is:

$${X^\prime }=\left( {\alpha +\beta T} \right)\quad T=1,2,3, \ldots ,n$$
(15)

where \(T\) is the time index, \(\alpha\) is the \({X^\prime }\) intercept, and \(\beta\) is the slope. \(\alpha\) and \(\beta\) can be obtained from the equations below:

$$\beta =\frac{{N\mathop \sum \nolimits^{} T{x^\prime } - \left( {\mathop \sum \nolimits^{} T} \right)\left( {\mathop \sum \nolimits^{} {x^\prime }} \right)}}{{N\mathop \sum \nolimits^{} {T^2} - {{\left( {\mathop \sum \nolimits^{} T} \right)}^2}}}$$
(16)
$$\alpha =\frac{{\mathop \sum \nolimits^{} {x^\prime } - \beta \mathop \sum \nolimits^{} T}}{N}$$
(17)

where \(N\) is the number of observations in the actual data.

The predicted value \(X_{p}^{\prime }\) of the \({x^\prime }\) time series can be obtained by substituting \(\alpha\) and \(\beta\) into the regression equation below.

$$X_{p}^{\prime }=\left( {\alpha +\beta T} \right)$$
(18)
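Equations (15)–(18) translate directly into code. The following Python sketch is illustrative only; the function names `fit_linear_trend` and `predict_linear_trend` and the example values are assumptions, not part of the original study.

```python
import numpy as np

def fit_linear_trend(x):
    """Closed-form intercept (alpha) and slope (beta) for T = 1..N, Eqs. (16)-(17)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    T = np.arange(1, N + 1, dtype=float)
    beta = (N * np.sum(T * x) - np.sum(T) * np.sum(x)) / (N * np.sum(T**2) - np.sum(T) ** 2)
    alpha = (np.sum(x) - beta * np.sum(T)) / N
    return alpha, beta

def predict_linear_trend(alpha, beta, T):
    """Predicted value at time index T, Eq. (18)."""
    return alpha + beta * np.asarray(T, dtype=float)

# Synthetic example (assumption): an exactly linear series x'(T) = 3 + 2T.
alpha, beta = fit_linear_trend([5.0, 7.0, 9.0, 11.0, 13.0, 15.0])
print(alpha, beta, predict_linear_trend(alpha, beta, 10))
```

Forecasting a future year amounts to evaluating Eq. (18) at a time index beyond N, for example \(T = N + 1\) for the first year after the observed period.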

Artificial neural network

The backpropagation (BP) neural network is the most extensively used network for forecasting applications [45]. It can be used with one or multiple ('R') inputs and neurons. Each input in the BP network is weighted by a suitable weight 'W'. The input to the transfer function f is the sum of the weighted inputs (Wp) and the bias b. To evaluate the output a, a neuron can use any differentiable transfer function f, such as purelin or sigmoid. The structure of a basic NARX network is illustrated in supplementary Figure S1. In this study, a non-linear autoregressive (NAR) network was employed. The steps followed to use the NAR network were data processing, training the network, finding the acceptable error, testing the network, and data validation. After testing several numbers of hidden neurons, delays, and epochs, the most suitable combination was used to obtain an acceptable output with the lowest error for this dataset. The NAR network was trained using the Levenberg–Marquardt backpropagation algorithm in Matlab 2013.
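To make the NAR structure concrete, the sketch below shows how lagged inputs are built from a series and how a single-hidden-layer network maps them to the next value through \(a = f(Wp + b)\). It is a sketch under stated assumptions, not a reproduction of the Matlab network: the weights are random placeholders, no training is performed (the Levenberg–Marquardt step is not implemented here), `tanh` stands in for the sigmoid-type hidden transfer function, and the names `make_lagged` and `nar_forward` are hypothetical.

```python
import numpy as np

def make_lagged(series, delays):
    """Build NAR pairs: each input row holds the previous `delays` values, the target is the next value."""
    series = np.asarray(series, dtype=float)
    inputs = np.array([series[i:i + delays] for i in range(len(series) - delays)])
    targets = series[delays:]
    return inputs, targets

def nar_forward(p, W1, b1, W2, b2):
    """Forward pass: tanh hidden layer followed by a linear (purelin) output layer."""
    hidden = np.tanh(p @ W1.T + b1)      # a1 = f(W1 p + b1)
    return hidden @ W2.T + b2            # a2 = W2 a1 + b2

# Illustrative sizes (assumption): 1 delay and 20 hidden neurons; untrained random weights.
rng = np.random.default_rng(0)
delays, n_hidden = 1, 20
W1 = rng.normal(size=(n_hidden, delays))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(1, n_hidden))
b2 = rng.normal(size=1)

inputs, targets = make_lagged([1.0, 2.0, 3.0, 4.0, 5.0], delays)
outputs = nar_forward(inputs, W1, b1, W2, b2)
print(inputs.shape, targets.shape, outputs.shape)
```

In a trained network the weights would be adjusted so that `outputs` approaches `targets`; multi-step forecasts are then produced by feeding each prediction back in as the next input.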

Forecasting model performance evaluation

The performance of the different forecasting models was measured using the mean absolute percentage error (MAPE), root mean-squared error (RMSE), mean absolute error (MAE), and coefficient of determination (\({R^2}\)), which were computed as follows:

$${\text{MAPE}}=\frac{1}{n}\mathop \sum \limits_{{k=1}}^{n} \left| {\frac{{{x^{\left( 0 \right)}}\left( k \right) - x_{p}^{{\left( 0 \right)}}\left( k \right)}}{{{x^{\left( 0 \right)}}\left( k \right)}}} \right|$$
(19)
$${\text{RMSE}}=\sqrt {\frac{1}{n}\mathop \sum \limits_{{k=1}}^{n} {{\left( {{x^{\left( 0 \right)}}\left( k \right) - x_{p}^{{\left( 0 \right)}}\left( k \right)} \right)}^2}}$$
(20)
$${\text{MAE}}=\frac{1}{n}\mathop \sum \limits_{{k=1}}^{n} \left| {{x^{\left( 0 \right)}}\left( k \right) - x_{p}^{{\left( 0 \right)}}\left( k \right)} \right|$$
(21)
$${R^2}=1 - \frac{{\mathop \sum \nolimits_{{k=1}}^{n} {{\left( {{x^{\left( 0 \right)}}\left( k \right) - x_{p}^{{\left( 0 \right)}}\left( k \right)} \right)}^2}}}{{\mathop \sum \nolimits_{{k=1}}^{n} {{\left( {{x^{\left( 0 \right)}}\left( k \right) - \overline {{{x^{\left( 0 \right)}}\left( k \right)}} } \right)}^2}}}$$
(22)

MAPE: The mean absolute percentage error is the most widely used measure of forecasting accuracy and is recommended by many textbooks, the M-competition, and the previous literature [48]. However, it may produce infinite or undefined values when the actual value is zero or close to zero.

RMSE: The root mean-squared error is the most popular criterion for judging the performance of a multivariate calibration model, and it is often the sole criterion [49]. According to K. H. Jockel and P. Pflaumer, the RMSE is nearly invariant with the time period over which the forecast was made; therefore, the observed RMSE can be used to construct confidence limits for future projections [50].

MAE: From a statistical point of view, the mean absolute error measures the difference between two continuous variables. In time series forecasting, the MAE is the mean of the absolute differences between the original and predicted time series.

\({R^2}\): The coefficient of determination is the proportion of the variance in the dependent variable that is predictable from the independent variable(s).

The process of testing model performance through statistical metrics in this study is similar to that of the previous studies by S. Farzana et al. [45] and M. Abbasi et al. [20].
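The four performance indicators can be computed together, as in the following Python sketch. It is illustrative only: the function name `forecast_metrics` and the example numbers are assumptions. MAPE is returned as a fraction, as in Eq. (19), and can be multiplied by 100 to express it in percent; RMSE is computed in its standard form, the square root of the mean squared error.

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """Return MAPE (as a fraction), RMSE, MAE, and R^2 for two equal-length series."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    return {
        "MAPE": float(np.mean(np.abs(err / actual))),   # undefined if any actual value is 0
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((actual - actual.mean()) ** 2)),
    }

# Made-up example values (assumption), chosen so the results are easy to check by hand.
metrics = forecast_metrics([100, 200, 300, 400], [110, 190, 310, 390])
print(metrics)
```

Every error in this example has magnitude 10, so RMSE and MAE coincide; they diverge as soon as the errors differ in size, because RMSE weights large errors more heavily.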

Results and discussion

Socio-economic factors of MSWG

Identifying the factors that influence MSW generation is one of the most important and challenging problems in MSW forecasting. The socio-economic factors generally considered to influence MSW generation include GDP, urban population, urban paved roads, gardens, green areas, per capita consumption expenditure, energy consumption, and geographical location [51]. However, paved roads, gardens, green areas, and geographical location may not be suitable for a national-scale analysis of MSWG factors in China. Moreover, some previous studies reported that GDP and urban population growth are the major socio-economic factors for MSWG and that the contribution of other factors is negligible [51,52,53,54]. Therefore, a correlation matrix and a multiple regression were computed for the dependent variable MSWG and the independent variables GDP, urban population, and energy consumption to investigate the influence of socio-economic factors on MSWG (Tables S1 and S2). MSWG shows a significant correlation with all of the variables; the strongest relationship is with urban population growth. The multiple-regression model fitted with these predictors produced adjusted \({R^2}\) = 0.905, F (3, 12) = 48.85, p < 0.0005. This indicates that about 90% of the variance in the dependent variable MSWG can be explained by the independent variables (GDP, urban population, and energy consumption), and the probability value shows that the regression is highly significant. The coefficient values presented in Table S2 give the multiple-regression equation MSWG = (− 0.011 × GDP) + (0.764 × urban population) + (− 0.038 × energy consumption) − 16,179. The coefficients indicate that MSWG increases with increasing urban population and decreases with increasing GDP and energy consumption. The P value is below 0.05 for urban population, which is statistically significant. However, reported multiple correlation coefficients tend to be small in magnitude, indicating poor prediction, and partial regression coefficients are often unstable when studies are replicated or cross-validated [55]. Hence, investigating the influence of socio-economic factors on MSWG with a more advanced method, such as fuzzy logic, gives a clearer understanding of this issue. The three-dimensional fuzzy inference surface map gives a clear picture of the relationship between the dependent and independent variables. Figure 2 illustrates the fuzzy-logic-generated surface maps of MSW generation in relation to urban population, GDP growth, and energy consumption. In the surface map in Fig. 2a, MSW generation decreases as GDP increases along the Y-axis and increases as urban population increases along the X-axis. A bend can be observed in the middle of the surface map for both variables (GDP and urban population), but the overall trend indicates that MSWG increases with urban population growth. A similar pattern is found in the three-dimensional chart of urban population vs energy consumption (Fig. 2b): MSWG rises sharply with increasing population along the X-axis and decreases along the Y-axis. In contrast, Fig. 2c does not show any clear-cut relation between MSWG and the independent variables energy consumption and GDP. Therefore, it can be concluded that urban population growth is the main socio-economic factor affecting MSWG in China. Table 1 presents MSWG, GDP, urban population, and energy consumption in China from 2000 to 2015 [44]. It shows that the urban population is increasing gradually year after year, which suggests that the growing urban population will place an extra load on future MSW generation.

Fig. 2 Relationship between MSW generation and socio-economic factors GDP and urban population growth

Table 1 Municipal solid waste generation, GDP, urban population, and energy consumption in China from 2000 to 2015 [44]

Multi-model forecasting of MSWG

China, the largest country in Asia, consists of 23 provinces, 4 municipalities, 5 autonomous regions, and 2 special administrative regions. Waste generation patterns and waste collection modes differ among cities and municipalities. Both formal and informal MSW collection systems exist in China, and the amounts of waste collected and recycled informally are not properly recorded. In a large country with diverse waste collection systems, accounting for every waste input to the MSWM stream is challenging, costly, and time-consuming. Therefore, the MSW collected yearly by the municipalities is taken as the MSW generation in China. In this study, yearly MSW generation data obtained from the China Statistical Yearbook (2000–2015) are used to predict and forecast future MSWG with multiple forecasting models. Forecasting MSWG with a single model may lack accuracy. Hence, comparative forecasting with multiple models can indicate the most accurate and applicable model for MSWG projection.

The data processing results obtained from GM (1, 1) and linear regression are given in the Supplementary data. For the ANN model, 20 hidden neurons, 1 delay, and 13 epochs were found to be the most suitable for obtaining an acceptable output with the lowest error for this dataset. Figure 3 illustrates the comparative forecasting of MSW generation in China using the three models. Each model shows an increasing trend of MSW generation over time. The GM (1, 1) and ANN forecasts are approximately similar from 2016 to 2030. However, linear regression gives a lower MSWG in 2030 than the GM (1, 1) and ANN models.

Fig. 3 Comparative multi-model forecasting of MSWG in China

The three forecasting models give distinct prediction trends. The ANN model gives an irregular, non-linear increasing trend, and its predicted data for 2000–2015 are similar to the actual values. As discussed in the introduction, the ANN model predicts a time series from the knowledge it gains about the structural pattern of the data series during training. Therefore, the ANN prediction in Fig. 3 shows an irregular increasing trend similar to that of the actual data, because the ANN captures the patterns and relationships in the actual dataset. In contrast, the GM (1, 1) model gives a noise-free, smooth, non-linear, and monotonically increasing prediction trend. The key idea of the GM is to study the uncertainty of a system with a small amount of data or incomplete data. It therefore avoids the inherent defects of conventional methods and works well on poor, incomplete, or uncertain data when estimating the behavior of an uncertain system or time series. Furthermore, because the actual data pass through AGO and IAGO as intermediate steps of GM data processing, the resulting prediction trend is noise-free and monotonically increasing. The GM (1, 1) model is one of the most popular models for forecasting time series data in a wide range of research [56]. Linear regression, on the other hand, gives a linearly increasing prediction trend from the historical data. This trend is far from the non-linear GM and ANN predictions, and it may be limited by the randomness and shortness of the data series.

Overall, the model results indicate that MSW generation in China will be in the range of 23431.16 to 24666.65 (\({10^4}\) tons) in 2030. The actual MSW generation values, model predictions, and model performance values are given in Table S3. Based on the model performance indicators, the ANN model was found to be the most accurate model for forecasting MSWG in China. For the ANN model, the MAPE (%), RMSE, MAE, and \({R^2}\) were 0.0143, 450.84, 228.53, and 0.931, respectively. The model performance indicators \({R^2}\), MAPE (%), RMSE, and MAE are compared in Fig. 4. The \({R^2}\) value is higher for the GM (1, 1) and ANN models, indicating good agreement between the predicted and observed MSW generation data. The superior \({R^2}\) value of the ANN indicates the closest match between the observed and predicted data series. The MAPE (%) represents how far the predicted data deviate from the observed data; the lower the error (%), the more accurate the prediction. For forecasting MSW generation in China, the highest \({R^2}\) and lowest error were found for the ANN model. According to model performance, the ranking of the MSW generation forecasting models is ANN > GM (1, 1) > linear regression. Therefore, the short-term MSW generation forecast was conducted using the ANN model. According to the ANN model, China will produce 24666.65 (\({10^4}\) tons) of MSW in 2030. The predicted amount is 2.1 and 1.3 times the waste generated in 2000 and 2015, respectively.

Fig. 4 Comparisons of multi model forecasting performance indicators

However, if the Chinese MSWM authorities change their management strategies in the future, for example by strengthening the 3R techniques, this could affect future MSWG forecasts. At present, MSW recycling in China is conducted entirely by the informal sector, and the total amount of waste recycled, reused, and reduced has not been calculated at the national scale [57]. Although 3R techniques significantly influence MSWG, the lack of data makes it impossible to incorporate them into MSWG forecasting.

Considering the forecasting output of this study and the existing MSWM limitations reported in the literature, it can be speculated that, in the near future, the Chinese MSWM system will face several complications, which will be aggravated by the continuous increase in MSWG. MSWM planners should consider possible solutions for reducing waste generation to make MSWM in China effective.

To deal with the increasing MSWG, several previous research works [34, 58], including our own study [1], have proposed various methods to reduce MSWG and improve MSWM in China. A short list of suggestions for MSWG reduction is given below:

1. Implement 3R techniques effectively for waste management.
2. Increase waste recycling, and take the total amount of waste recycled each year into account in national waste management documentation and planning. Most of the leading MSWM countries give high priority to waste recycling.
3. Implement a proper waste disposal charging system for MSW (volume or weight based).
4. Separate the disposal of biological waste from the mainstream of MSW.
5. Improve public awareness of waste reduction and of material recycling and reuse.

Conclusion

To improve MSWM, a state needs to investigate all waste-related issues, from the source of the waste to its final disposal and its consequences for the environment. Identifying the influencing socio-economic factors and forecasting MSWG form the basis of municipal solid waste management operation and planning. Hence, appropriate tools and techniques for identifying influencing factors and forecasting MSWG are highly desired by MSW planners and decision-makers. In this study, the factors influencing MSWG in China were first tested using fuzzy logic, and waste generation was then forecast using the GM (1, 1), linear regression, and artificial neural network (ANN) models. The analytical results show that urban population growth is the most significant socio-economic factor for MSWG in China and that waste generation is expected to increase gradually in the future. The GM (1, 1), linear regression, polynomial regression and artificial neural network (ANN) models can describe the forecasting process well. Based on the model performance indicators MAPE, RMSE, MAE, and coefficient of determination (\({R^2}\)), the artificial neural network was found to be the most accurate model for forecasting MSWG in China. According to the ANN model, by 2030 MSWG in China will increase by 108.7% and 28.9% relative to the waste generated in 2000 and 2015, respectively.