Introduction

China’s rapid economic growth since the 1980s has been accompanied by a fast increase in CO2 emissions. Over the last decade, China has been the world’s second largest economy and its largest CO2 emitter. In 2018, it produced 15.8% of global gross domestic product (GDP) and emitted 29.3% of global greenhouse gases (GHG) (International Monetary Fund 2019; BP 2019). At present, China is struggling to reduce its high CO2 emissions while maintaining fast growth, in line with its ambitious sustainable development targets. Specifically, the country aims to cut its CO2 emissions per unit of GDP by 60–65% from the 2005 level before 2030 while keeping annual GDP growth above 6% (State Council 2019).

It has been widely recognized that improving carbon emissions performance is a crucial way to reduce CO2 emissions during economic activities and promote sustainable development (Tang et al. 2016a; Yang et al. 2018; Chen and Xu 2019; Iram et al. 2020; Tang and Hailu 2020). Many researchers have focused on measuring and evaluating carbon emissions efficiency (CEE) using various approaches in different countries/regions (e.g., Mirza and Kanwal 2017; Tang et al. 2019; Chang 2020). Due to the geographic differences in carbon emission levels and performance, the continuous high emissions and low CEE in some countries have increased the difficulty in global carbon emission reduction (Battisti et al. 2015). Hence, scholars have tried to explore whether countries with low CEE can catch up with those with high efficiency. This is the main research field of international carbon emissions efficiency convergence.

Since traditional β convergence, σ convergence, and random convergence tests (e.g., Barro and Sala-I-Martin 1992; Liu et al. 2017) do not consider individual heterogeneity, Barro et al. (1991) and Quah (1997) delivered the club convergence theory, saying that there exists convergence in the economic growth of regions which share similar initial development and economic structural characteristics. Exploring the club convergence of CEE can demonstrate the dynamics and influencing factors of regional CEE in different convergence club areas (Haider and Akram 2019). This will provide theoretical insights and decision-making reference for reducing regional CEE differences and promoting the coordinated low-carbon development among regions. Therefore, it is essential to know the club convergence of CEE in different regions, especially within a country. However, few have conducted empirical studies on this topic.

In terms of Chinese context, the majority of the existing studies utilize provincial-level data or sectoral-level data (Yang et al. 2017; Lin and Wang 2019; Wang et al. 2019b; Zhou et al. 2020a). Nevertheless, few have investigated the trend of China’s carbon emissions performance from a city perspective. In China, cities contribute more than 70% of the national CO2 emissions (Wang et al. 2018). Accordingly, cities are expected to transform from the current emission-intensive growth path to a more sustainable way. As such, it is necessary to understand carbon emissions performance trend across China’s cities which will provide insights into this country’s sustainable transition patterns. Moreover, city-level analysis, compared with provincial-level or sectoral-level studies, provides a more robust evidence support since it shows intra-provincial heterogeneity and distribution dynamics. There are substantial heterogeneities in resources endowment, energy structure, sectoral structure, development pattern, and other socioeconomic factors for cities in China, a country with vast territory and 1.4 billion population (Wu et al. 2019; Tang et al. 2020a). This leads to distinct differences in both energy intensity (Zhu and Lin 2020) and CEE (Tian and Zhou 2019) among various types of cities. Therefore, it is useful to conduct city-level convergence analysis in terms of carbon emissions performance, which will help form more diverse and targeted reduction and sustainable development policies for different types of cities.

To fill this knowledge gap, we attempt to analyze the carbon emissions performance trend across China’s cities using a total factor framework. A super-efficient SBM model is applied to capture the CEE (Tone 2001; Tran et al. 2019; Tang et al. 2020b). Then, the distribution dynamics model is introduced to analyze the club convergence of CEE. Because the traditional Markov model usually only examines the situation of one-step change (Herrerias 2012; Pan et al. 2015), it is impossible to know the dynamic characteristics of the region over a long period. The distribution dynamics model, which contains a transfer matrix with variable time length of change, can solve this problem. Moreover, we explain the club convergence of urban CEE from the perspective of space spillover. By doing so, we could have more comprehensive and detailed understandings on the carbon emissions performance trend across Chinese cities and develop practical insights for improving the effectiveness of related reduction and sustainable development policies.

This study contributes to the literature in several ways. Firstly, it depicts a detailed picture of the carbon emissions performance trend across China’s cities, helping policymakers refine policy instruments with a more comprehensive view. Secondly, it constructs transition probability matrices with different time lengths to examine how the club convergence characteristics of urban CEE change over time, revealing more spatiotemporal trends in the convergence of urban CEE. Thirdly, we explain the club convergence of CEE from a spatial spillover perspective, providing new evidence to characterize the trend of carbon emissions patterns. Fourthly, we compile a new city-level panel dataset including dynamic energy use data of 262 Chinese cities from 2003 to 2016.

The remainder of our study is organized as follows. Literature is reviewed in Section 2. Methods and data are described in Sections 3 and 4, respectively. Results are presented in Section 5. Finally, we conclude in the last section.

Literature review

Evaluating carbon emissions performance is usually within a multiple-factor production process; energy and other inputs are used to produce products and other undesirable by-products including CO2. Therefore, early studies employing various ratio methods (e.g., Yamaji et al. 1993; Lu et al. 2007), which simply use the ratio of a certain economic value to CO2 emissions to evaluate carbon emissions performance, are considered to have limits since they are hardly able to reflect the multiple-factor characteristic. In recent years, there is a growing literature which measures carbon emissions performance within a total factor framework. This framework is thought to be capable of dealing with multiple-factor production issues (Tang et al. 2016b, 2018; Chen et al. 2019; Tang et al. 2020a) and therefore employed in our study.

Two efficiency evaluation methods, stochastic frontier analysis (SFA) and data envelopment analysis (DEA), are popular for measuring carbon emissions performance. The parametric SFA is favored by many researchers because it can handle statistical noise and is not sensitive to outliers (Wang et al. 2019a). It has been applied to measure CEE at national (Bai et al. 2019), regional (Dong et al. 2013), provincial (Wang et al. 2013; Lin and Du 2015), and sectoral levels (Chen et al. 2019; Li et al. 2019). Nevertheless, SFA requires assumptions about the specific form of the production function and the distribution of the random component, as Wu et al. (2019) comment. Such requirements mean that the evaluated results may be influenced by a potentially wrong choice of function or distribution. In practice, it is difficult for researchers to know whether the chosen function and distribution are correct and to avoid arbitrary choices.

The nonparametric DEA overcomes this weakness; it does not need to assume a function form or random component distribution in analysis. Besides, DEA is flexible in application and does not need big samples. Accordingly, DEA is gradually employed in estimating carbon emissions performance. Various DEA-based methods have been used in this field, including non-radial distance function (Kang et al. 2018), output-based DEA (Pang et al. 2015), fuzzy DEA (Egilmez et al. 2016), slacks-based DEA (Zhou et al. 2020b), Malmquist-based DEA (Liu et al. 2017), and super-efficient SBM model (Tone 2001; Tran et al. 2019; Tang et al. 2020b). Compared with other DEA-based methods, the super-efficient SBM model has the advantage of considering radial as well as non-radial possible enhancements in the production process, thereby delivering more comprehensive performance evaluation results. Therefore, this study also employs super-efficient SBM model to evaluate carbon emissions performance, avoiding the unnecessary assumptions and fully taking into account potential production improvements.

It is generally believed that the neoclassical growth model introduced the concept of convergence in economic analysis (Solow 1956), including absolute convergence and conditional convergence. Baumol (1986) first conducted an empirical analysis on this topic using Maddison data. Since the 1990s, the convergence of economic growth has gradually become a hot issue in development economics and international economics. Generally speaking, the early empirical studies of convergence mainly focus on σ convergence, absolute β convergence, conditional β convergence, and random convergence using regression methods (Barro and Sala-I-Martin 1992; Friedman 1992; Mankiw et al. 1992; Islam 1995; Caselli et al. 1996). Additionally, some studies used time series methods to test random convergence, such as the co-integration test (Bernard and Durlauf 1995) and the unit root test (Carlino and Mills 1993; Carlino and Mills 1996).

It is believed that only the economies with the same structural characteristics and similar initial states will eventually converge to the same steady state, which is called club convergence (Barro et al. 1991). The main distinction between club convergence and other convergence concepts is that club convergence analysis takes into account heterogeneity across the individuals. Some scholars empirically tested the club convergence under the neoclassical framework (Durlauf and Johnson 1995; Masanjala and Papageorgiou 2004). However, some argued that this regression-based approach is not reliable due to the possibility of existence of Galton’s fallacy (Friedman 1992; Quah 1993; Hart 1995). Thus, Quah (1996) proposed the distribution dynamics approach to tackle this problem, which has been applied in various areas (López-Bazo et al. 1999; Rey 2001; Herrerias 2012; Pan et al. 2015). In addition to the distribution dynamics approach, Phillips and Sul (2007) proposed a method in determining endogenous convergence club based on clustering, which has been used in some recent studies (Camarero et al. 2013; Apergis and Christou 2016; Haider and Akram 2019). Compared with the method of Phillips and Sul (2007), the advantage of the distribution dynamics method is that it can examine the mobility between differing clubs and quantify the solidification within different clubs, allowing to incorporate spatial factors into club convergence analysis.

There are a growing number of studies in the existing literature that test the existence of convergence in environmental indicators, such as emissions and energy consumption. Those studies on environmental and energy convergence have been inspired by the abovementioned economic growth literature. They can be roughly divided into three categories. The first is using the unit root test method to test random convergence, which has been used to analyze carbon emissions (Strazicich and List 2003; Aldy 2006; Barassi et al. 2011; Acaravci and Erdogan 2016; Karakaya et al. 2019), energy consumption (Lean et al. 2016), and ecological footprint (Yilanci and Pata 2020). The second category uses discrete trends for σ convergence tests or the regression analysis methods for β convergence (Aldy 2006). Related studies have discussed energy intensity between Eastern Europe and EU countries (Markandya et al. 2006), carbon intensity of China’s industrial sectors (Yu et al. 2018), and China’s provincial energy intensity (Jiang et al. 2018). The third category is about club convergence. Since the study on the first two kinds of convergence tests do not consider individual heterogeneity, more and more researchers apply club convergence test to avoid this drawback, especially in the Chinese context. For instance, both Wang et al. (2014) and Huang et al. (2015) examine the club convergence of China’s provincial CO2 emissions for different periods. They confirm the club convergence to multiple steady equilibriums for those provinces. However, the city-level study on the club convergence of China’s CO2 emissions is still rare.

Regarding the robustness of results, convergence evaluation usually needs relatively large samples, which implies that a city-level study has its own advantage. Additionally, a city-level study can capture intra-provincial heterogeneity, an essential feature of the Chinese socioeconomic context (Wu et al. 2019). City-level panel data on energy consumption and related CO2 emissions, however, are virtually unavailable. This probably explains why related studies are scarce. We try to fill this gap by conducting a club convergence analysis based on a newly compiled panel dataset including 262 Chinese cities from 2003 to 2016. This dataset is constructed by analyzing major energy use sources including natural gas, liquefied petroleum gas, and electricity. Conducting such an analysis could help us better understand the heterogeneity and dynamics of city-level CO2 emissions.

Little literature has tried to provide explanations of the club convergence in terms of China’s CO2 emissions. To the best of our knowledge, only Wang et al. (2014) utilize the ordered logit model to explore the determinants underlying the club convergence in provincial emissions. They find that economic level, energy use pattern, and energy intensity have an impact on the long-term dynamics of provincial CO2 emissions. However, their method, in essence, is a static regression and therefore does not reflect the dynamic process of club convergence. In our analysis, the club convergence of CEE is explained from a spatial spillover perspective, which is dynamic and provides new insightful evidence. This is another knowledge gap we try to fill.

In total, our study differs from the existing literature in several ways. Firstly, instead of focusing on provincial- or sectoral-level analysis, we depict a detailed picture of carbon emissions performance trend across China’s cities based on the super-efficient SBM and convergence analyses. Secondly, the club convergence of emissions efficiency is explained from a spatial spillover perspective. Thirdly, a new panel city-level dataset is constructed including dynamic energy use data of 262 Chinese cities for 14 years (2003 to 2016).

Methods

Estimating carbon emissions efficiency

The super-efficient SBM model (Tone 2002) is an increasingly popular method for estimating efficiency (Tran et al. 2019; Zhang et al. 2019). Since it can not only effectively tackle pollutants redundancy but also distinguish the efficient DMU at the frontier, this study also uses the super-efficient SBM model to explore CEE across China’s cities as follows:

$$ \begin{array}{l}\min \rho =\dfrac{\frac{1}{m}\sum \limits_{i=1}^{m}\frac{\overline{x}_i}{x_{ik}}}{\frac{1}{s_1+s_2}\left(\sum \limits_{r=1}^{s_1}\frac{\overline{y}_r^d}{y_{rk}^d}+\sum \limits_{l=1}^{s_2}\frac{\overline{y}_l^u}{y_{lk}^u}\right)}\\ {}\mathrm{s.t.}\left\{\begin{array}{l}\overline{x}_i\ge \sum \limits_{j=1,j\ne k}^{n}x_{ij}\lambda_j;\quad \overline{y}_r^d\le \sum \limits_{j=1,j\ne k}^{n}y_{rj}^d\lambda_j;\quad \overline{y}_l^u\ge \sum \limits_{j=1,j\ne k}^{n}y_{lj}^u\lambda_j\\ {}\overline{x}_i\ge x_{ik};\quad \overline{y}_r^d\le y_{rk}^d;\quad \overline{y}_l^u\ge y_{lk}^u\\ {}\lambda_j\ge 0;\quad i=1,2,\cdots, m;\quad j=1,2,\cdots, n,\ j\ne k\\ {}r=1,2,\cdots, s_1;\quad l=1,2,\cdots, s_2\end{array}\right.\end{array} $$
(1)

where \( \rho \) is the urban CEE and n is the number of cities. There are m input variables, \( s_1 \) expected outputs, and \( s_2 \) undesired outputs. x, \( y^d \), and \( y^u \) represent the input, desirable output, and pollutant vectors, respectively, and k indicates the decision-making unit (DMU) under evaluation.

Evaluating club convergence

Markov chain is a discrete time stochastic process. It studies the characteristic probability of movement from a discrete state to another. This study uses the Markov method proposed by Zhou et al. (2018) to study the club convergence of carbon emissions efficiency. The method expands the Markov transition probability matrix to a given time period. Thus, the trend of CEE can be examined more accurately.

Specifically, the method to construct the Markov transition probability matrix of a multi-year duration is as follows. Assuming that all cities are classified into k types according to the difference in CEE, the transition probability of a d-year period is recorded as \( {P}_{ij}^{t,t+d}=P\left\{{X}_{t+d}=j\left|{X}_t=i\right.\right\} \). It represents the probability that a city’s CEE transfers from type i in year t to type j after d years. Combining all cities and all possible transitions during the whole study period, we obtain the Markov transition probability \( {P}_{ij}^d \) in the study period T and estimate it with Eq. 2.

$$ {P}_{ij}^d={\sum}_{t={t}_0}^{T-d}{n}_{ij}^{t,t+d}/{\sum}_{t={t}_0}^{T-d}{n}_i^t\kern0.1em $$
(2)

In Eq. 2, \( {n}_{ij}^{t,t+d} \) is the number of cities whose CEE moves from type i in year t to type j in year t + d, and \( {n}_i^t \) is the number of cities whose CEE is of type i in year t. By estimating the transition probabilities for all pairs of types, we obtain the Markov transition probability matrix for a multi-year duration, as shown in Eq. 3:

$$ \left[\begin{array}{ccccc}\frac{n_{11}^d}{n_{1\cdotp}^d}& \cdots & \frac{n_{1j}^d}{n_{1\cdotp}^d}& \cdots & \frac{n_{1k}^d}{n_{1\cdotp}^d}\\ {}\frac{n_{21}^d}{n_{2\cdotp}^d}& \cdots & \frac{n_{2j}^d}{n_{2\cdotp}^d}& \cdots & \frac{n_{2k}^d}{n_{2\cdotp}^d}\\ {}\vdots & & \vdots & & \vdots \\ {}\frac{n_{k1}^d}{n_{k\cdotp}^d}& \cdots & \frac{n_{kj}^d}{n_{k\cdotp}^d}& \cdots & \frac{n_{kk}^d}{n_{k\cdotp}^d}\end{array}\right]=\left[\begin{array}{ccccc}{p}_{11}^d& \cdots & {p}_{1j}^d& \cdots & {p}_{1k}^d\\ {}{p}_{21}^d& \cdots & {p}_{2j}^d& \cdots & {p}_{2k}^d\\ {}\vdots & & \vdots & & \vdots \\ {}{p}_{k1}^d& \cdots & {p}_{kj}^d& \cdots & {p}_{kk}^d\end{array}\right] $$
(3)

In Eq. 3, \( {n}_{i\cdotp}^d \) is the total number of type i observations, i.e., \( {\sum}_{t={t}_0}^{T-d}{n}_i^t \) in Eq. 2. \( {P}_{ii}^d \) is the probability that a type i city remains in type i after d years. A larger value means a higher degree of CEE solidification, i.e., a more pronounced club convergence of cities’ CEE.
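The pooled estimator in Eqs. 2 and 3 can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the implementation used in this study: the function name `transition_matrix`, the coding of types as integers 0 to k − 1, and the balanced city-by-year array layout are all hypothetical choices made for the example.

```python
import numpy as np

def transition_matrix(states, k, d=1):
    """Pooled d-year transition probabilities in the spirit of Eq. 2.

    states: integer array of shape (n_cities, n_years); entry [c, t] is the
            CEE type (0 .. k-1) of city c in year t (assumed layout).
    Returns (probabilities, counts), both of shape (k, k).
    """
    states = np.asarray(states)
    n_years = states.shape[1]
    if not 1 <= d < n_years:
        raise ValueError("d must satisfy 1 <= d < number of years")
    start = states[:, : n_years - d].ravel()  # type in year t
    end = states[:, d:].ravel()               # type in year t + d
    counts = np.zeros((k, k))
    np.add.at(counts, (start, end), 1)        # n_ij pooled over all start years
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows without any observation stay at zero instead of dividing by zero.
    probs = np.divide(counts, row_totals,
                      out=np.zeros_like(counts), where=row_totals > 0)
    return probs, counts

# Toy panel: 3 cities, 4 years, 2 types (illustrative values only).
toy = [[0, 0, 1, 1],
       [1, 1, 1, 0],
       [0, 1, 1, 1]]
p1, n1 = transition_matrix(toy, k=2, d=1)
p2, n2 = transition_matrix(toy, k=2, d=2)
```

Calling the function with d = 1, …, 5 gives one matrix per duration, and the diagonal entries of each matrix are the probabilities of remaining in the same club.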

Explaining the club convergence from a spatial spillover perspective

In this study, we explain the club convergence of CEE from a spatial spillover perspective. Accordingly, a commonly employed analysis in regional science, spatial autocorrelation, is conducted (Liu et al. 2019). The global spatial autocorrelation is tested using the Moran index to understand the spatial correlation of emissions as Eq. 4 (Ma et al. 2019):

$$ I=\frac{n{\sum}_{i=1}^n{\sum}_{j=1}^n{w}_{ij}\left({y}_i-\overline{y}\right)\left({y}_j-\overline{y}\right)}{\sum_{i=1}^n{\sum}_{j=1}^n{w}_{ij}{\sum}_{i=1}^n{\left({y}_i-\overline{y}\right)}^2} $$
(4)

where n is the number of cities; \( y_i \) and \( y_j \) represent the emissions efficiency of cities i and j, respectively; \( \overline{y} \) represents the average of the emissions efficiency; and \( w_{ij} \) is an element of the spatial weight matrix w. The specific weight is set as in Eq. 5 (Ding et al. 2017):

$$ {w}_{ij}=\left\{\begin{array}{c}1\left(i\ne j\right)\\ {}0\left(i=j\right)\end{array}\right. $$
(5)

The global Moran index ranges between −1 and 1. Positive values imply positive global spatial autocorrelation, and negative values imply negative autocorrelation. In our study, a positive correlation implies that high-efficiency cities cluster with other high-efficiency cities and low-efficiency cities with other low-efficiency cities (high-high and low-low agglomeration). A negative correlation, in contrast, indicates that high-efficiency cities tend to be located next to low-efficiency cities (high-low agglomeration).
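As an illustration of Eq. 4, the following Python sketch computes the global Moran index for a vector of efficiency scores and a spatial weight matrix supplied by the caller. The function name `morans_i` and the toy chain of four neighboring units are assumptions made for the example, and no significance test is included.

```python
import numpy as np

def morans_i(y, w):
    """Global Moran index of Eq. 4 for values y and spatial weights w.

    y: length-n sequence of efficiency scores.
    w: (n, n) spatial weight matrix with zeros on the diagonal.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.size
    z = y - y.mean()                   # deviations from the mean
    numerator = n * (z @ w @ z)        # n * sum_ij w_ij z_i z_j
    denominator = w.sum() * (z @ z)    # (sum_ij w_ij) * sum_i z_i^2
    return numerator / denominator

# Hypothetical example: four units on a line, each linked to its direct neighbors.
w_chain = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]])
scores = [1.0, 2.0, 3.0, 4.0]
moran = morans_i(scores, w_chain)      # positive: similar values sit side by side
```

Because the weight matrix is passed in as an argument, the same function can be reused with any weighting scheme.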

The spatial Markov model is obtained by introducing the concept of “space lag” into the transition probability matrix (Rey and Montouri 1999). This model avoids neglecting the spatial interaction among research regions and allows us to study the role of spatial factors in the evolution of city CEE. Measuring the emissions levels of neighboring cities, \( \sum_j w_{ij} y_j \), requires the spatial weight matrix constructed in the previous section.

Here, k spatial lag types are used as the transition conditions of a city, so the Markov transition matrix is decomposed into k conditional transition matrices. The values in these matrices are denoted as \( {P}_{\left. ij\right|\lambda}^{t,t+d} \), as shown in Eq. 6:

$$ {P}_{\left. ij\right|\lambda}^{t,t+d}=P\left\{{X}_{t+d}=j\left|{X}_t=i,\ \mathrm{X\_spatial\_lag}_t=\lambda \right.\right\} $$
(6)

It indicates the probability of CEE in this region to shift from i type to j type after d years under the condition of the spatial lag of carbon emissions efficiency type λ in the current year. Considering all possible transfer situations during the whole study period, we obtain the Markov transition probability of the conditions and estimate it by Eq. 7:

$$ {P}_{\left. ij\right|\lambda}^d={\sum}_{t={t}_0}^{T-d}{n}_{\left. ij\right|\lambda}^{t,t+d}/{\sum}_{t={t}_0}^{T-d}{n}_{\left.i\right|\lambda}^t\kern0.1em $$
(7)

By comparing the corresponding entries of the Markov and spatial Markov transition probability matrices, we can explore how the CEE of a city is related to that of its neighbors. For example, \( {P}_{12}^d>{P}_{12\left|1\right.}^d \) means that a low-efficiency city is less likely to move up one level when its neighbors are also inefficient, indicating that neighbors with low CEE hinder the improvement of a city’s emissions efficiency. The spatial Markov model adopted here can also capture spatial spillover effects over different time lengths.
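The conditional estimator of Eq. 7 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `spatial_transition_matrices` is hypothetical, types are coded as integers 0 to k − 1, and the spatial lag type of each city in each year is assumed to have been obtained beforehand by discretizing \( \sum_j w_{ij} y_j \) into k classes.

```python
import numpy as np

def spatial_transition_matrices(states, lag_types, k, d=1):
    """Conditional d-year transition probabilities in the spirit of Eq. 7.

    states:    (n_cities, n_years) integer array of CEE types (0 .. k-1).
    lag_types: same shape; type of the spatial lag of each city in each year
               (assumed to be precomputed from the weighted neighbor average).
    Returns (probabilities, counts) of shape (k, k, k), indexed [lag, i, j].
    """
    states = np.asarray(states)
    lag_types = np.asarray(lag_types)
    n_years = states.shape[1]
    if not 1 <= d < n_years:
        raise ValueError("d must satisfy 1 <= d < number of years")
    start = states[:, : n_years - d].ravel()   # own type in year t
    end = states[:, d:].ravel()                # own type in year t + d
    lag = lag_types[:, : n_years - d].ravel()  # neighbor type in year t
    counts = np.zeros((k, k, k))
    np.add.at(counts, (lag, start, end), 1)    # n_{ij|lag} pooled over years
    totals = counts.sum(axis=2, keepdims=True)
    # Rows without any observation stay at zero instead of dividing by zero.
    probs = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    return probs, counts

# Toy panel: 3 cities, 3 years, 2 types (illustrative values only).
toy_states = [[0, 0, 1], [0, 1, 1], [1, 1, 0]]
toy_lags = [[0, 0, 0], [1, 1, 1], [1, 0, 0]]
p_cond, n_cond = spatial_transition_matrices(toy_states, toy_lags, k=2, d=1)
```

Each slice `p_cond[lam]` is the transition matrix for cities whose neighbors are of type `lam`, so its entries can be compared directly with the corresponding entries of the nonspatial matrix.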

In this study, the chi-square test is used to test the significance of the difference between the two types of Markov transition matrices. The statistics are constructed as follows (Le Gallo 2004):

$$ Q=-2\log \left\{{\prod}_{h=1}^k{\prod}_{i=1}^k{\prod}_{j=1}^k{\left[\frac{p_{ij}^d}{p_{ij}^d(h)}\right]}^{{n_{ij}}^d(h)}\right\} $$
(8)

where k is the number of CEE types in each region and also the number of spatial lag types; \( {p}_{ij}^d \) is the transition probability with a duration of d years estimated over the entire period without conditioning on the spatial lag; \( {p}_{ij}^d(h) \) is the spatial transition probability when the spatial lag type is h and the duration is d years; and \( {n}_{ij}^d(h) \) is the number of cities corresponding to \( {p}_{ij}^d(h) \) (h = 1, 2, 3, …, k). The statistic Q asymptotically follows a chi-square distribution whose degrees of freedom equal k × (k − 1) minus the number of transitions with a probability of zero.
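A minimal Python sketch of the statistic in Eq. 8 is given below. It assumes that the conditional transition counts \( {n}_{ij}^d(h) \) are already available as a three-dimensional array indexed by lag type, origin type, and destination type; the function name `q_statistic` is hypothetical, and the comparison with the chi-square critical value is left to the reader.

```python
import numpy as np

def q_statistic(cond_counts):
    """Likelihood-ratio statistic of Eq. 8 from conditional transition counts.

    cond_counts: array of shape (k, k, k) indexed [h, i, j], i.e. the number of
                 i -> j transitions observed when the spatial lag type is h.
    """
    n = np.asarray(cond_counts, dtype=float)
    # Pooled (nonspatial) probabilities p_ij, summing the counts over lag types.
    pooled = n.sum(axis=0)
    pooled_tot = pooled.sum(axis=1, keepdims=True)
    p = np.divide(pooled, pooled_tot, out=np.zeros_like(pooled), where=pooled_tot > 0)
    # Conditional probabilities p_ij(h).
    cond_tot = n.sum(axis=2, keepdims=True)
    p_h = np.divide(n, cond_tot, out=np.zeros_like(n), where=cond_tot > 0)
    # Cells without observed transitions have exponent zero and contribute nothing.
    mask = n > 0
    p_full = np.broadcast_to(p, n.shape)
    return -2.0 * np.sum(n[mask] * np.log(p_full[mask] / p_h[mask]))

# Identical conditional matrices: no spatial effect, so Q is zero.
same = np.array([[[3, 1], [2, 2]], [[3, 1], [2, 2]]])
q_same = q_statistic(same)
# Opposite conditional matrices: strong dependence on the neighbors' type.
opposite = np.array([[[10, 0], [0, 10]], [[0, 10], [10, 0]]])
q_opposite = q_statistic(opposite)
```

A value of Q close to zero means that conditioning on the neighbors’ type does not change the transition probabilities, while a large value points to spatial dependence.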

Data

In this article, we compile a dataset including China’s prefecture-level city data on inputs and desirable and undesirable outputs. The dataset includes 262 prefecture-level cities for the period of 2003–2016, and 2003 is set as the base year (Table 1).

Table 1 Descriptive statistics of variables

We use the perpetual inventory method (Wu et al. 2019) to calculate the cities’ capital stock based on yearly fixed asset investment data. The calculated capital data are adjusted using 2003 constant price. The labor input data, represented by the sum of employees, are obtained from China Urban Statistic Yearbook. Since there is currently no available city-level energy consumption data, the annual electricity consumption is chosen to represent energy input according to Han et al. (2018) and Fullerton Jr and Walke (2019). We use annual GDP of each city, derived from China Urban Statistic Yearbook (National Bureau of Statistics of the People’s Republic of China, 2013-2017) using 2003 constant price, as the expected output. CO2 emissions are used as undesirable output and calculated based on an accounting method using urban natural gas, liquefied petroleum gas, and electricity data (Han et al. 2018).
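For readers unfamiliar with the perpetual inventory method, the recursion it relies on, \( K_t=(1-\delta )K_{t-1}+I_t \), can be sketched in Python as below. The depreciation rate, the base-year capital stock, and the investment series in the example are purely hypothetical placeholders, not the values used in this study, and the investment figures are assumed to be already deflated to constant prices.

```python
def capital_stock(investment, k0, delta):
    """Perpetual inventory recursion K_t = (1 - delta) * K_{t-1} + I_t.

    investment: real fixed asset investment for each year after the base year.
    k0:         capital stock in the base year (hypothetical input).
    delta:      annual depreciation rate (hypothetical input).
    Returns the list of capital stocks for the years covered by `investment`.
    """
    stock = []
    previous = k0
    for inv in investment:
        previous = (1.0 - delta) * previous + inv  # depreciate, then add new investment
        stock.append(previous)
    return stock

# Illustrative numbers only: investment exactly offsets depreciation.
example = capital_stock([10.0, 10.0], k0=100.0, delta=0.1)
```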

Results and discussions

Carbon emissions efficiency

Using the newly compiled data, MaxDEA software is utilized to measure the CEE of 262 cities. Using the natural breakpoint classification method in the ArcGIS 10.0 software, this article makes a visual comparative analysis on the distribution of CEE in 2003 and 2016, respectively. The results are shown in Fig. 1.

Fig. 1 Chinese city-level carbon emissions efficiency in 2003 and 2016

Figure 1 shows that the CEE of northwestern cities is generally low and that of the central region and coastal areas is relatively high. This is consistent with Cai et al. (2019). The environment of northwestern cities is fragile and the infrastructure is less developed (Liu et al. 2017). Local communities have excessively exploited and highly relied on natural resources (especially coal) which reduced CEE. Moreover, northwestern cities have become the main region to undertake the transfer of China’s high-pollution industry from the coastal areas (Cheng and Zhao 2018), causing industrial carbon emissions to surge.

Although the CEE of the northwestern region is relatively low, its improvement is substantial. In the northeast region, the CEE of some cities has always been relatively high, and this radiation effect has helped neighboring cities improve their emissions efficiency faster. With rapid economic and technological development, as well as China’s recent emphasis on environmental protection, China’s cities have begun to adjust their industrial structure, encourage technological innovation, and make their economies greener. Western cities are gradually realizing the importance of sustainable development and paying attention to improving CEE. Additionally, the recent “Belt and Road” construction has effectively expanded the scale of low-cost and low-emission industries and improved the efficiency of energy use in those cities (Fan et al. 2019), thus improving CEE. Moreover, the overall growth rate of CEE in various cities has accelerated significantly since 2008. This may be due to the decline in demand for Chinese exports after the 2008 financial crisis, which prompted China to further optimize its industrial structure, increase investment in energy saving and emission reduction, and develop a sustainable economy.

Club convergence of carbon emissions efficiency

This article first examines whether there is a club convergence phenomenon in China’s urban CEE over different time spans. Considering that China carries out its official development plan every 5 years, we evaluate club convergence over periods of 1 to 5 years. Cities’ CEE is discretized into 4 levels, namely low, medium-low, medium-high, and high, to measure the club convergence (Du et al. 2018). The Markov model based on the transition probability (Eq. 2) is applied to calculate the transition probability matrix of CEE for each year.

In the transition probability matrix, each value represents the probability of the CEE of a certain type of city in China shifting from one state to another. The diagonal value is the probability that the city will remain in this state for different periods. Bigger values mean more obvious club convergence.

The results show that the diagonal probabilities are higher than the probabilities of moving to other states (Table 2). In particular, the probabilities that cities with low efficiency as well as high efficiency converge to their own club are relatively high (between 0.591 and 0.818). This implies that there exists convergence of high-level and low-level clubs in the CEE. The convergence level of cities with intermediate CEE is much lower than those with high-level or low-level values, indicating that the efficiency competition among intermediate-level cities is fierce; they actively improve CEE and frequently transfer to other types.

Table 2 The club convergence test results of urban carbon emissions efficiency in China

In China, cities with high CEE usually have advanced production technology, high energy use efficiency, and a well-developed industrial structure. These advantages help them balance economic growth and carbon emissions well. For cities with low CEE, multiple disadvantages, including less favorable economic and environmental endowments, obstruct the sustainable transformation of local growth patterns. For example, many northwestern cities (e.g., Baotou and Yinchuan) still rely heavily on heavy industries with massive CO2 emissions. On the one hand, they lack the funds and green production technology needed for the transformation. On the other hand, for decades those heavy industries have provided large numbers of jobs for local less-educated labor as well as substantial revenue. Therefore, it is hard for those cities to improve their CEE. For cities with intermediate-level CEE, the differences in the above aspects are not particularly obvious, resulting in relatively high mobility in efficiency. Liu et al. (2018) reach a similar conclusion using the PS convergence method: the high and low levels of industrial SO2 and soot emissions of Chinese cities are relatively stable.

Unlike previous studies that only consider 1-year transfer (Du et al. 2018), this article analyzes transfers over 1–5 years. The results show that the diagonal probabilities decline gradually as the transfer span increases, indicating that the club convergence of CEE weakens over time. With a 5-year period, the probability of the low-, medium-low-, medium-high-, and high-level clubs maintaining their own states decreases by 0.197, 0.285, 0.345, and 0.227, respectively. This means that the solidification of differences in CEE among cities has eased. Nevertheless, the urban CEE still shows obvious club convergence for the high level (0.616) and the low level (0.591). During the study period, cities including Xuchang, Zhoukou, Xiaogan, Huanggang, Shaoyang, Maoming, Neijiang, Hulunbeier, and Jiamusi have always been at a high level of CEE. Dongying, Qingyuan, and Baoji have always been in the middle-level clubs. Besides, there exists a “low carbon efficiency trap” problem for the cities of the low-level club in China, such as Baotou, Tongchun, and Yinchuan.

Spatial explanation of club convergence

This study attempts to explain the club convergence of CEE from a spatial perspective. We first examine the spatial correlation of urban CEE and test the spatial agglomeration degree through Moran’s I value (Li et al. 2017).

Table 3 shows a clear spatial agglomeration in the CEE. Except for 2003, the values in all other years pass the significance test at the 0.05 level. Over time, the spatial distribution of CEE is characterized by “high-high agglomeration” and “low-low agglomeration.” The results are in line with Wang et al. (2019c), who find that urban carbon emissions in China exhibit spatial agglomeration characteristics.

Table 3 Moran index value of urban carbon emissions efficiency

Table 4 gives the transfer probability matrices of urban CEE in China for both 1-year and 3-year spans. Each value in the matrix indicates the probability of a city’s CEE moving between clubs, conditional on the level of its neighboring cities. The diagonal gives the probability that a city remains in the same club over the given time span. By examining the difference between the spatial and nonspatial Markov transition probability matrices at different time spans, we can explore how spatial factors affect the club convergence of CEE.
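The conditioning step behind a spatial Markov matrix can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: it assumes that each city-year has already been assigned both its own club and the club of its neighborhood (for example, the class of the weighted average CEE of its neighbors), and that transitions are grouped by the neighborhood class in the initial year. The names `spatial_markov`, `own`, and `neigh` are hypothetical.

```python
def spatial_markov(states, lag_states, lag=1, k=4):
    """Transition counts and probabilities conditional on the neighbor class.

    states[c][t]     : club of city c in year t (0 = low, ..., k-1 = high)
    lag_states[c][t] : club of city c's neighborhood in year t
    Returns (counts, probs), each indexed as [neighbor class][from][to].
    Conditioning on the neighbor class in the initial year is an assumption.
    """
    counts = [[[0] * k for _ in range(k)] for _ in range(k)]
    for series, neighbors in zip(states, lag_states):
        for t in range(len(series) - lag):
            counts[neighbors[t]][series[t]][series[t + lag]] += 1
    probs = []
    for block in counts:
        rows = []
        for row in block:
            total = sum(row)
            # Rows with no observations are left as zeros.
            rows.append([n / total if total else 0.0 for n in row])
        probs.append(rows)
    return counts, probs


# Hypothetical toy data: 3 cities, 3 years.
own = [[0, 0, 1], [0, 1, 1], [3, 3, 3]]
neigh = [[0, 0, 0], [3, 3, 3], [3, 3, 3]]
counts, probs = spatial_markov(own, neigh, lag=1)
# probs[0][0][0]: a low-level city with low-level neighbors stays low.
# probs[3][0][1]: a low-level city with high-level neighbors moves up one club.
```

Setting `lag=3` gives the 3-year counterpart; comparing each conditional matrix with the nonspatial matrix is the comparison the text describes.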

Table 4 The spatial Markov transfer probability matrix of urban carbon emissions efficiency in China

Table 4 shows that when the neighboring cities are in the low-level club, the probability that a low-level city stays at the low level after 1 year is 0.839, 2.6% higher than the average level without the spatial effect. Moreover, upward migration of this type of city becomes less likely. The probability of moving up to the medium-high level is 0.016, lower than when the spatial factor is not considered (0.018). The probability of moving up to the medium-low level is also lower than the level without considering the spatial factor. When the city is at the medium-low level, it has a probability of 0.145 of being pulled down to the low-level club, higher than the average level of 0.118. For cities that are in the high-level club and whose neighbors are also in the high-level club, the probability of maintaining the original efficiency level is 0.836, higher than the average of 0.818. Moreover, the probability of these high-level cities moving down to the medium-high, medium-low, or low level is lower than the average level. Evidently, when the neighboring cities are at a low level, a city is less likely to improve its CEE, whereas when the neighboring cities are at a high level, a city is less likely to see its CEE decline.

To test the reliability of the results, the transition probability of CEE over a 3-year period is further examined. When the emissions efficiency levels of a city and its neighbors are both low, the probability of the city remaining at the low level after 3 years is 0.769, which is 6.5% higher than the average. If the city’s efficiency is at a high level, the probability of it being at a high level after 3 years is 0.650, 2.2% lower than the average. It can be seen that, after 3 years of accumulation, cities with low CEE have a “drag effect” on their neighbors. When the level of the neighboring cities is high, the probability of low-efficiency cities remaining at a low level after 3 years is 0.546, and the probability of moving upward is 0.454, 15.8% higher than the average, indicating that high-level cities have a positive impact on their neighbors’ CEE.

Moreover, we compare the spatial effects over 1 year and 3 years under four spatial types to investigate the temporal dynamics of the above spatial effect. The results indicate that the CEE of neighboring cities and that of local cities are more likely to be of the same type, which is consistent with the agglomeration characteristics reported in the earlier section of this study. In addition, under different spatial types, the degree of spatial impact over 3 years is generally higher than that over 1 year (Table 5). The results for the other spatial types are similar. These results imply that there is spatial spillover in CEE.

Table 5 Comparison results of spatial effects

The significance of the difference between the spatial and nonspatial Markov chain transition probability matrices is examined with the Q statistic (Table 6). The spatial effect of CEE passes the significance test in all time periods. This conclusion is consistent with other studies analyzing provincial data (Zhou et al. 2019). The Q value reflects the difference between the two kinds of transfer probability matrices and can therefore measure the degree of spatial spillover. It tends to increase, implying a strengthening spillover effect from neighboring cities.
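The article does not spell out the formula of the Q statistic. A form often used in the spatial Markov literature is the likelihood-ratio statistic Q = −2 Σl Σi Σj nij(l) · ln[mij / mij(l)], where nij(l) is the number of transitions from club i to club j among cities whose neighbors are in class l, mij(l) is the corresponding conditional probability, and mij is the pooled (nonspatial) probability. The sketch below assumes this form; the function name `q_statistic` and the toy counts are hypothetical. Under the null hypothesis of no spatial effect, Q is compared with a chi-square distribution whose nominal degrees of freedom are k(k − 1)², usually reduced when some cells are empty.

```python
import math


def q_statistic(counts):
    """Likelihood-ratio Q statistic from conditional transition counts.

    counts[l][i][j]: transitions from club i to club j when the neighborhood
    is in class l. The formula is an assumed common form, not confirmed by
    the article. Returns (Q, nominal degrees of freedom).
    """
    k = len(counts)
    # Pooled (nonspatial) counts: sum over all neighbor classes.
    pooled = [[sum(counts[l][i][j] for l in range(k)) for j in range(k)]
              for i in range(k)]
    log_ratio_sum = 0.0
    for l in range(k):
        for i in range(k):
            row_cond = sum(counts[l][i])
            row_pool = sum(pooled[i])
            for j in range(k):
                n = counts[l][i][j]
                if n == 0:
                    continue  # empty cells contribute nothing
                m_pool = pooled[i][j] / row_pool
                m_cond = n / row_cond
                log_ratio_sum += n * math.log(m_pool / m_cond)
    return -2.0 * log_ratio_sum, k * (k - 1) ** 2


# Hypothetical two-club examples.
same = [[[5, 5], [2, 8]], [[5, 5], [2, 8]]]     # identical conditional matrices
differ = [[[9, 1], [5, 5]], [[1, 9], [5, 5]]]   # neighbors change the outcome
q_same, dof = q_statistic(same)
q_differ, _ = q_statistic(differ)
```

When the conditional matrices coincide with the pooled matrix, Q is zero; the more the neighborhood class changes the transition probabilities, the larger Q becomes.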

Table 6 The significance test results of the differences between spatial and nonspatial Markov transfer probabilities

From the above analysis, we conclude that spatial factors affect the club convergence of CEE in Chinese cities through spatial agglomeration and spillover. China’s urban CEE shows an obvious spatial correlation in the form of “high-high agglomeration” and “low-low agglomeration.” In addition, neighboring cities with high-level CEE have a promotion effect, whereas inefficient neighbors have an inhibition effect. Both effects can be summarized as a spatial spillover effect. Owing to their geographical proximity, convenient transportation, and frequent communication, neighboring cities are more likely to share carbon emissions reduction technology and management experience with each other. Additionally, the reduction policies of surrounding cities are a potentially useful reference for a city. The positive spillover effect often occurs in high-level agglomeration areas, and the “drag effect” usually occurs in low-level agglomeration areas.

However, the degree of solidification of clubs with intermediate CEE is relatively low. On the one hand, most cities with intermediate CEE are scattered at the junction of high-level and low-level agglomeration areas. They face a more complex spatial effect: they may be positively affected by high-level neighboring cities and may also be hindered by low-level neighboring cities, which results in unstable efficiency. On the other hand, the emissions reduction technology and management level of these cities are close to the national average. They may therefore either enter the high-level clubs or fall into the low-level clubs.

Conclusions and policy recommendation

Using newly compiled data for 262 Chinese cities from 2003 to 2016, we apply the super-efficient SBM model to analyze the CEE of Chinese cities. We then adopt the extended Markov and spatial Markov approaches to explore the club convergence characteristics and their spatial influence factors.

The results show that the CEE of Chinese cities has improved, especially in the western and northeastern cities. The efficiency of the northwestern cities is low, while that of the central and coastal cities is relatively high. Urban CEE has certain spatial agglomeration characteristics, and there is significant club convergence in China’s urban CEE. The degree of convergence varies significantly between clubs and is usually much higher in cities with high- and low-level CEE. China’s urban CEE shows an obvious spatial correlation in the form of “high-high agglomeration” and “low-low agglomeration.” In addition, the CEE of a city is significantly affected by that of its neighbors. The “drag effect” usually occurs in low-level agglomeration areas, and the positive spillover effect often occurs in high-level agglomeration areas.

Our analysis has some policy implications. To improve the CEE of low-level club cities and realize the coordinated and sustainable development of a low-carbon economy, “cross-border” cooperation and communication between cities in different clubs should be strongly promoted. Technical cooperation among them will help improve CEE, especially for cities in the low-level club. To allow lower-level clubs to receive more positive spillovers from high-level clubs, cities in the high-level efficiency clubs, such as Xuchang and Xiaogan, should be encouraged to communicate more frequently with others. Since there is a “low carbon efficiency trap” problem, the Chinese government should focus on strengthening the mitigation of carbon emissions in low-level areas. The western region can build on the green “Belt and Road” construction to promote the development of green investment and a green financial system and to achieve a win-win outcome between economic development and environmental sustainability. Moreover, establishing a national carbon market is also necessary, as it can reduce the abatement cost of carbon emissions and promote the transformation of regional economic and industrial structures.