1 Introduction

Technological diffusion due to foreign direct investment (FDI) forms a crucial source to sustain economic growth in the developing world (Balasubramanyam et al. 1996; Borensztein et al. 1998; de Mello 1999; Xu 2000). This idea is not new and has its antecedent under new growth theory in which overall productivity is demonstrated to be endogenous, pertinent to external inputs such as FDI (Berthélemy and Démurger 2000; Bilgili et al. 2016; Borensztein et al. 1998; Romer 1990; Su and Liu 2016). Aside from this, FDI maintains extra advantages such as being less volatile, difficult to reverse and subject to minimum political collaterals, as opposed to portfolio equity and debt flows, insulating host countries from unexpected economic and political shocks (Lensink and Morrissey 2006). These premises support proactive stances toward multinationals, particularly among developing countries. In this regard, China is no exception. Hard infrastructure, such as special economic zones, is set up alongside institutional reforms by emphasizing property rights (Long 2005). Due to these efforts, China’s record of hosting FDI is unprecedented. Since the early 1980s, it has absorbed a significant share of the world’s FDI stock and its dominance was further cemented by China’s accession to the World Trade Organization (WTO) in December 2001. To put this into perspective, in 2003, China overtook the USA as the top FDI destination for the first time. As of 2016, FDI stock in China was approximately USD1354.51 billion, accounting for 15% of the world’s FDI stock into the developing world (UNCTAD 2017a).

Against this backdrop, considerable effort has been devoted to understanding the effects of FDI on various dimensions of the Chinese economy. This strand of literature has centered on the growth effect of hosting FDI (Berthélemy and Démurger 2000; Su and Liu 2016; Yalta 2013; Zhao 2012), with a subset focusing on the channels through which it operates (Fu 2004, 2008; Liu et al. 2009; Zhang 2017). Recently, attention has been directed to its potential effects on institutional and environmental aspects (Chen 2017; Elliott et al. 2013; Long et al. 2015; Salim et al. 2017a; Yang et al. 2013). Overall, these efforts have greatly enriched knowledge of FDI and form the basis for relevant policy to guide and manage it.

Surprisingly, the empirical relationship between FDI and domestic investment (DI), which should be equally prioritized by decision-makers, has failed to capture sufficient attention. It is further complicated by contrasting theoretical predictions about this relationship. On the one hand, the crowding-in camp subscribes to the view that FDI-related technological spillovers augment local entrepreneurship and encourage DI. For example, Markusen and Venables (1999) demonstrate that FDI crowds in DI by broadening backward or forward linkages in the host country. Meanwhile, Harrison et al. (2004) argue that FDI alleviates financial constraints in the host country, leaving more funds available to domestic firms. Similarly, Gall et al. (2013) suggest that the efficiency of the local financial market acts as a catalyst to drive a positive FDI–DI nexus. On the other hand, it is possible for FDI to crowd out DI if multinationals gain market share at the expense of their domestic competitors (Aitken and Harrison 1999; Suyanto and Salim 2012). In this scenario, the presence of efficient multinationals intensifies local competition in the goods and factor markets, increasing production costs and eventually driving their domestic rivals out of business (Harrison and McMillan 2003).

Empirically, the relative strength of these competing forces determines the association between FDI and DI. Aside from academic curiosity, understanding the nature of the FDI–DI nexus carries crucial policy implications. This nexus is relevant to macromanagement practices. Recall that DI in China has long been carried out by arms of the government, such as state-owned enterprises (SOEs), to meet mandatory growth targets. Although this practice has recently given way to market forces, it remains crucial to offset unexpected external shocks. The effectiveness of such fiscal stimulus might be constrained by the crowding-out effect exerted by FDI.Footnote 1

The FDI–DI nexus matters to domestic entrepreneurship. The recent “Made in China 2025” campaign strives to move China’s manufacturing sectors up the ladder of the global value chain. The key to a successful campaign lies in the hands of millions of vibrant private enterprises, rather than on the backs of giant SOEs (Lardy 2014; Lin 2012). Despite this, indigenous private firms are frequently discriminated against in the political pecking order.Footnote 2 As opposed to their higher-ranked SOE counterparts, these firms face weak legal protection and restricted access to bank loans. As such, they are more vulnerable when FDI substitutes for DI. In this way, FDI can discourage domestic entrepreneurship. However, when a positive FDI–DI nexus sets in, institutional advantages often allow SOEs to expand their investment ahead of private firms. According to the case studies analyzed by Huang (2003), multinationals, intentionally or not, often form new joint ventures with SOEs to seize emerging market opportunities. This asymmetry is critical to the future trajectory of indigenous private firms, which shoulder most of the responsibility to fulfill the Made in China 2025 promise and to transform the Chinese economy at large (Lardy 2014; Lin 2012).

The FDI–DI nexus serves as a key metric to justify preferential treatment of multinationals. Because FDI is widely considered a growth engine and serves as a de facto criterion for political promotion, local officials in China, particularly those from inland areas, are willing to provide generous tax breaks and discounts on land acquisition to lure foreign investors (Braunstein and Epstein 2002; Chen 2017; Long 2005; Ng and Tuan 2001). Such initiatives have inadvertently fed a locational tournament among local governments, and in the worst-case scenario, FDI is attracted at the expense of environmental degradation (Braunstein and Epstein 2002). These policies are acceptable if there are complementarities between FDI and DI. However, they would not be in the local government’s long-term interest if FDI substituted for DI. If such a negative nexus exists, the local government should scale back preferential treatment and divert resources to support programs that promote the growth of indigenous start-ups and well-managed private enterprises. A similar point was made by Huang (2003), who argues that resources and preferences should be given equally to indigenous private enterprises to enable them to compete on a level playing field with multinationals.

As far as this question is concerned, China offers a suitable context, not only because of its rapid accumulation of FDI and DI over the last decade, but also because of similar FDI and investment policies across the country (Lardy 2014; Lin 2012; Long 2005). Aside from this, the statistical scope of FDI and DI is more consistent within a single-country panel. This is in stark contrast to cross-country studies in which definitions of FDI and DI vary significantly between countries (Benhabib and Spiegel 1994; Reiter and Steensma 2010). Methodologically, these characteristics minimize unobserved fixed effects that often plague cross-country studies on this topical issue (Salim et al. 2017b). Further, the short but vibrant history of hosting FDI and its uneven regional distribution have substantially enriched the within and between variations across Chinese regions (Su and Liu 2016; Zhang 2017). It is possible to take advantage of this opening-up “policy experiment” to identify the empirical relationship between FDI and DI.

This paper contributes to the extant literature in the following ways. First, this research employs a comprehensive city-level panel covering the post-WTO period. To the best of the researchers’ knowledge, this is the only study that investigates the FDI–DI nexus at such a disaggregated level. It differs from most existing studies, which are based either on time-series techniques that assume regional heterogeneity away (Ang 2009; Chen et al. 2017a, b; Kim and Seo 2003; Tang et al. 2008; Van Loo 1977; Xu and Wang 2007), or on panel studies that pool heterogeneous countries together (Adams 2009; Agosin and Machado 2005; Al-Sadig 2013; Borensztein et al. 1998; Farla et al. 2016; Morrissey and Udomkerdmongkol 2012; Wang 2010). Meanwhile, the choice of time span not only avoids any systematic changes ascribed to the WTO accession, but also updates the literature that relies on pre-WTO datasets (Sun 1998; Tang et al. 2008; Xu and Wang 2007). Second, this paper minimizes aggregation bias by dividing 215 cities into several geographical regions. This categorization enables detection of regional heterogeneities in the nexus. Appropriate FDI strategy could then be informed at a local level, making it more compatible with regional development. Third, this research examines how absorptive capacities alter the nexus. This knowledge offers hints about how to maintain a positive FDI–DI association or to prevent FDI from crowding out DI. Finally, the system generalized method of moments (GMM) estimator is used to control for potential dynamic and simultaneity biases. Unlike the conventional setup, this study is not limited to the Sargan/Hansen over-identification and autocorrelation tests, but extends diagnostic checks to instrument subsets using the difference-in-Hansen (D-in-Hansen) test (Hansen 1982).

Previewing the results, FDI is found to have a neutral effect on DI across Chinese cities as a whole, while a heterogeneous nexus emerges when the sample is segmented into subregions. Specifically, FDI strongly crowds in DI in eastern China and, to a lesser extent, central China. However, such a relationship fails to become established in western China. These results collectively call for a region-based FDI strategy. To prevent the crowding-out effect from setting in, support toward indigenous private firms may be necessary. It is further uncovered that the heterogeneous FDI–DI nexus depends on local absorptive capacity, including human capital (HC), financial depth, and institutional quality (INST). In line with the main conclusion, their effects on the FDI–DI nexus also differ across regions.

The rest of the article is organized as follows: Section 2 provides a brief survey on the current state of the FDI–DI nexus literature. Section 3 elaborates on the econometric framework, and Sect. 4 focuses on the data description. Section 5 discusses the empirical results. The conclusion and the relevant policy implications are drawn in Sect. 6.

2 Literature review

2.1 Time-series studies for individual countries

Following the seminal work of Van Loo (1977), who reveals a neutral FDI–DI nexus in Canada, empirical research on this topic has proliferated. In part, this is propelled by the pro-FDI attitude that has dominated the developing world since the mid-1970s, which has considerably enlarged the set of countries available for investigation. The other reason is that longer datasets enable serious statistical analysis to be carried out. For example, although Ang (2009) and Gameli Djokoto et al. (2014) report a crowding-in effect of FDI in Malaysia and Ghana, respectively, Kim and Seo (2003) discover the opposite effect in South Korea. Taken together, these studies suggest that a complementary association tends to prevail among developing countries. By systematically analyzing 47 countries, Qi (2007) reaches the opposite conclusion: in advanced economies, FDI appears to crowd in DI.

Results from China are more consistent. Sun (1998) makes a preliminary attempt and suggests that at least one-third of growth in DI could be directly attributed to FDI injection. Tang et al. (2008) come to a similar conclusion through a more rigorous VAR analysis. Xu and Wang (2007) further consider potential structural change, and they observe that this positive nexus failed to set in until the early 1990s when substantial flows of FDI began to pour into China. More recently, Chen et al. (2017a) inquire into the role of entry modes chosen by multinationals on the FDI–DI nexus. Their empirical results indicate that equity joint ventures (EJVs) tend to foster indigenous firms, while wholly foreign-funded multinationals appear to substitute them instead.

Notwithstanding the valuable insights offered by these works, focusing solely on the time dimension poses two major challenges. One is the limited number of observations. According to Herzer et al. (2008), the commonly used unit root and co-integration tests are sensitive to sample size. The established FDI–DI nexus could be biased due to the short time span. The other is aggregation bias, in which regional heterogeneities are often assumed away. Large countries such as China could be especially prone to this bias. From a policy perspective, country-specific results rarely generalize to other countries, even when those countries share similar economic and institutional characteristics. These two issues have spurred studies using panel methods.

2.2 Panel studies

In one of the earliest attempts, Borensztein et al. (1998) note the crowding-out effect of FDI in a sample of 69 developing countries over the period from 1970 to 1989. This result is sensitive to alternative specifications. According to Mody and Murshid (2005), a robust relationship cannot be uncovered unless the endogeneity of FDI is accounted for. To that end, they instrument FDI with a global pool of funds that is available for developing countries and the geographical distance between the home and host countries. Their instrumental variable (IV) estimation, which uses a sample of developing countries covering the period from 1979 to 1999, suggests that FDI exerts an economically large and statistically significant effect on domestic capital formation. However, feasible IV sets are usually difficult to obtain. In the case of weak instruments, IV estimates tend to be more biased relative to their OLS counterparts (Angrist and Pischke 2009). This challenge has been addressed by exploiting internal IV sets in the spirit of the GMM estimator developed by Arellano and Bond (1991), Arellano and Bover (1995), and Blundell and Bond (1998). To the researchers’ knowledge, Agosin and Machado (2005) are the first to apply the difference GMM estimator to reveal the effects of FDI on DI using a panel of 36 developing countries across Asia, Africa, and Latin America over the period from 1971 to 2000. They find that FDI either exerts no influence over DI in the sample, or partially crowds it out. In a follow-up study, Morrissey and Udomkerdmongkol (2012) improve on the work by Agosin and Machado (2005) by applying the system GMM estimator, in which an additional moment condition is imposed to improve estimation efficiency. Using a panel of 46 developing countries from 1996 to 2009, they find that FDI crowds out DI in the host country.

Despite the efforts by Morrissey and Udomkerdmongkol (2012) to address various shortcomings in Agosin and Machado’s (2005) research, Farla et al. (2016) question the validity of the unfavorable findings against FDI in the host country. Conceptually, they criticize Morrissey and Udomkerdmongkol (2012) for using inappropriate proxies of FDI and DI that introduce downward bias in the estimates. Methodologically, this downward bias is exacerbated by the fact that Morrissey and Udomkerdmongkol (2012) overlook the problem of instrument proliferation in their system GMM estimations. Applying proper modifications to the original Morrissey and Udomkerdmongkol dataset, Farla et al. (2016) discover that FDI crowds in DI in the host country. Further, they demonstrate that the nature of the FDI–DI nexus can be extremely sensitive to model specifications and prone to aggregation bias. In the case of the Morrissey–Udomkerdmongkol dataset, a potential source of aggregation bias can be traced to the mixed collection of developing countries at various stages of economic development. In theory, this practice violates the homogeneity assumption imposed on the coefficients of the lagged dependent variables by the GMM estimator. To mitigate aggregation bias, empirical studies have avoided pooling heterogeneous countries together. For example, Ndikumana and Verick (2008) focus on 38 sub-Saharan African countries for the period from 1970 to 2005 and discover a crowding-in effect in this region. This finding is confirmed by Adams (2009), who examines 42 sub-Saharan African countries over a shorter sample period from 1990 to 2003. Wang (2010) manages aggregation bias by categorizing countries according to their income level. Applying different GMM estimators to a panel of 50 countries over the period from 1970 to 2004, that study demonstrates that the cumulative effect of FDI on DI is neutral in developed countries, but positive and large in developing countries.

Although these panel studies advance knowledge beyond the time-series literature, two voids are yet to be filled within the panel framework. First, a panel study focusing on China remains absent, despite its dominance in hosting FDI. This is unexpected, given that DI has constituted a major driving force for the Chinese economy and has played a crucial role in offsetting external shocks. Second, although difference or system GMM estimators could alleviate the endogeneity bias, most diagnostic tests have been restricted to the Sargan or Hansen over-identification tests and the first- or second-order autocorrelation tests. As Roodman (2009a, b) and Farla et al. (2016) argue, the detailed estimation procedure of the difference and system GMM estimators should be clearly stated, and the validity of subset instruments must be carefully examined. Meanwhile, the GMM estimator is designed for “wide and short” panels in which the cross-sectional dimension is considerably larger than the time span (Arellano and Bover 1995; Blundell and Bond 1998; Bond 2002). Existing GMM studies have mostly failed to satisfy this requirement due to data constraints. By contrast, the panel used here offers an obvious advantage, as it involves over 200 prefectural-level cities observed for less than a decade. Applying the GMM estimator to it is expected to produce more reliable estimates.

3 Models and methodology

To reveal the empirical relationship between FDI and DI, the following model is estimated:

$$ {\text{DI}}_{i,t} = \gamma {\text{DI}}_{i,t - 1} + \beta_{1} {\text{FDI}}_{it} + X'\beta + \alpha_{i} + \delta_{t} + \varepsilon_{it} $$
(1)

Subscripts i and t index city and time, respectively. \( \alpha_{i} \) captures city-specific fixed effects, and \( \delta_{t} \) captures aggregate shocks that are common to all cities. Including time dummies effectively removes contemporaneous correlations and lessens concern about spatial dependence (Çoban and Topcu 2013). DI and FDI denote domestic investment and foreign direct investment, both of which are scaled by the city’s gross domestic product (GDP). Considering that DI is persistent over time, its path dependence is accounted for by incorporating its first-order lag (\( {\text{DI}}_{i,t - 1} \)). Finally, X consists of a set of other conventional determinants of DI.

Following Knight and Ding (2009), real GDP growth (GROWTH) is controlled for first to capture the accelerator effect. It is expected that higher economic growth promotes capital formation. Meanwhile, household saving (SAVING) is used to reflect the availability of credit for DI. Ideally, it should be proxied by bank loans available to domestic investors. However, such series are not available at the city level. A prerequisite for SAVING to qualify as a valid proxy is a low degree of capital mobility across Chinese cities. This means that indigenous investors fund their projects by tapping local financial resources only and have limited access to the capital pools of surrounding cities. To empirically assess this condition, this study follows Zhang et al. (2012) in carrying out the test formulated by Feldstein and Horioka (1980). Both cross-sectional and panel regressions confirm a low degree of capital mobility across Chinese cities, which warrants the validity of SAVING.Footnote 3 Finally, government expenditure (GOV) is included to capture the central planner’s role played by the local government (Lardy 2014). However, the expected sign of GOV is unclear. If the local government spends on improving hard and soft infrastructure that is conducive to the business environment, a positive effect of GOV should appear. In contrast, if the government tends to intervene in the local economy via its arm of SOEs, a negative effect of GOV is expected (Bilgili 2003; Wang 2005).
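The logic of the Feldstein–Horioka check can be illustrated with a minimal sketch. It regresses the investment rate on the saving rate; a slope close to one is usually read as low capital mobility, and a slope close to zero as high mobility. The function name `fh_slope` and the toy ratios are hypothetical and are not drawn from the paper's data, and the actual test in Zhang et al. (2012) involves details that are not reproduced here.

```python
import numpy as np

def fh_slope(saving_rate, investment_rate):
    """OLS slope from regressing investment/GDP on saving/GDP (with intercept)."""
    s = np.asarray(saving_rate, dtype=float)
    inv = np.asarray(investment_rate, dtype=float)
    s_dev = s - s.mean()
    return float(s_dev @ (inv - inv.mean()) / (s_dev @ s_dev))

# Hypothetical city-level ratios, for illustration only.
saving = [0.30, 0.35, 0.40, 0.45, 0.50]
investment = [0.10 + 0.8 * s for s in saving]  # constructed so the slope is 0.8
slope = fh_slope(saving, investment)
```

In this constructed example the slope is high by design, which is the pattern that would support using local saving as a proxy for locally available credit.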

Taking these variables into account, the fully fledged model is given as:

$$ {\text{DI}}_{i,t} = \gamma {\text{DI}}_{i,t - 1} + \beta_{1} {\text{FDI}}_{i,t} + \beta_{2} {\text{GROWTH}}_{i,t} + \beta_{3} {\text{SAVING}}_{i,t} + \beta_{4} {\text{GOV}}_{i,t} + \alpha_{i} + \delta_{t} + \varepsilon_{i,t} $$
(2)

The primary interest of this study concerns the magnitude of \( \beta_{1} \), which represents the short-term effects of FDI on DI. Empirically, a statistically insignificant \( \beta_{1} \) implies a neutral FDI–DI nexus. In contrast, a positive and statistically significant \( \beta_{1} \) indicates a complementary nexus, or that FDI crowds in DI. Meanwhile, a negative and statistically significant \( \beta_{1} \) indicates a substituting nexus, or that FDI crowds out DI.
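The reading of \( \beta_{1} \) described above amounts to a simple decision rule, sketched below. The function name `classify_nexus` and the default two-sided 5% normal critical value are assumptions made for illustration; the text does not state which significance level is used at this point.

```python
def classify_nexus(beta1, std_err, critical_value=1.96):
    """Label the FDI-DI nexus from the FDI coefficient and its standard error.

    critical_value=1.96 assumes a two-sided 5% test under normality
    (an illustrative choice, not taken from the paper).
    """
    if std_err <= 0:
        raise ValueError("std_err must be positive")
    z = beta1 / std_err
    if abs(z) < critical_value:
        return "neutral"
    return "crowding in" if beta1 > 0 else "crowding out"
```

For instance, a coefficient of 0.5 with a standard error of 0.1 would be labeled "crowding in", whereas a coefficient of 0.05 with the same standard error would be labeled "neutral".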

This study begins by estimating Eq. (2) without the lagged dependent variable using fixed effect (FE) and random effect (RE) estimators. Although both approaches reduce the omitted variable bias, they fail to address the endogeneity of FDI and other controls, and they cannot handle the dynamic bias that arises once the lagged dependent term is included. To overcome these issues simultaneously, the GMM estimator is adopted. It exploits internal instruments to identify the causal effect of FDI on DI.

Taking the first difference of Eq. (1) removes the city-specific fixed effect (\( \alpha_{i} \)), as shown in:

$$ \Delta {\text{DI}}_{it} = \gamma \Delta {\text{DI}}_{i,t - 1} + \beta_{1} \Delta {\text{FDI}}_{it} + \Delta X'\beta + \Delta \delta_{t} + \Delta \varepsilon_{it} $$
(3)

where \( \Delta \) denotes the first difference operator.
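A small numerical check makes the removal of the fixed effect concrete. The sketch below simulates a toy panel (all sizes and coefficients are hypothetical) and confirms that first differences are identical whether or not a time-invariant city effect is added to the outcome.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 6                      # hypothetical panel: 4 cities, 6 years
alpha = rng.normal(size=(N, 1))  # city fixed effects, constant over time
x = rng.normal(size=(N, T))      # a time-varying regressor
eps = rng.normal(size=(N, T))    # idiosyncratic errors

y_with_fe = alpha + 0.5 * x + eps
y_without_fe = 0.5 * x + eps

# Differencing along the time axis cancels anything constant within a city.
d_with = np.diff(y_with_fe, axis=1)
d_without = np.diff(y_without_fe, axis=1)
fe_removed = np.allclose(d_with, d_without)
```

Differencing also costs one time period per city, which is why the differenced panel here has five columns rather than six.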

As such, one source of endogeneity is eliminated. However, the remaining parameters cannot be consistently estimated by OLS for two reasons:

  1. the endogeneity of regressors such as FDI; and

  2. the correlation between the new error term (\( \Delta \varepsilon_{it} \)) and \( \Delta {\text{DI}}_{i,t - 1} \).
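The second reason can be made explicit with a short derivation. As a simplification for illustration, suppose that \( \varepsilon_{it} \) is serially uncorrelated with constant variance \( \sigma_{\varepsilon}^{2} \), that it is uncorrelated with earlier values of DI, and that the other regressors dated t − 1 are uncorrelated with \( \varepsilon_{i,t-1} \). Because \( {\text{DI}}_{i,t-1} \) contains \( \varepsilon_{i,t-1} \) with a unit coefficient, it follows that:

$$ {\text{Cov}}(\Delta {\text{DI}}_{i,t - 1} ,\Delta \varepsilon_{it} ) = {\text{Cov}}({\text{DI}}_{i,t - 1} - {\text{DI}}_{i,t - 2} ,\;\varepsilon_{it} - \varepsilon_{i,t - 1} ) = - {\text{Cov}}({\text{DI}}_{i,t - 1} ,\varepsilon_{i,t - 1} ) = - \sigma_{\varepsilon}^{2} \ne 0 $$

The lagged differenced dependent variable is therefore correlated with the differenced error by construction, even when every other regressor is exogenous.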

In the spirit of Arellano and Bond (1991) and Arellano and Bover (1995), lagged explanatory variables in levels (Z) form valid instruments under two prerequisites: \( \varepsilon_{it} \) shows no serial correlation, and explanatory variables exhibit weak exogeneity.

Assuming these two conditions are met, difference GMM estimators exploit the following moment condition:

$$ E[Z_{i} '\Delta \varepsilon_{i} ] = 0 $$
(4)

where \( \Delta \varepsilon_{i} = (\Delta \varepsilon_{i3} ,\Delta \varepsilon_{i4} , \ldots ,\Delta \varepsilon_{iT} )^{{\prime }} \) for each Chinese city i = 1, 2, …, N. In general, the asymptotically efficient GMM estimation based on this set of moment conditions minimizes the criterion given below:

$$ J_{N} = \left( {\frac{1}{N}\sum\limits_{i = 1}^{N} {\Delta \varepsilon_{i} '} Z_{i} } \right)W_{N} \left( {\frac{1}{N}\sum\limits_{i = 1}^{N} {Z_{i} '\Delta \varepsilon_{i} } } \right) $$
(5)

\( W_{N} \) is the weighting matrix defined in Eq. (6):

$$ W_{N} = \left[ {\frac{1}{N}\sum\limits_{i = 1}^{N} {(Z_{i} 'HZ_{i} )} } \right]^{ - 1} $$
(6)

According to Bond (2002), H is a (T − 2) × (T − 2) square matrix with 2s on the main diagonal, − 1s on the first off-diagonals, and zeros elsewhere. Notice that \( W_{N} \) does not depend on any estimated parameters. Using this weighting matrix is equivalent to assuming homoscedasticity of the error term. In reality, this assumption is often violated, and the empirical weighting matrix is frequently estimated using the equation:

$$ W_{1N} = \left[ {\frac{1}{N}\sum\limits_{i = 1}^{N} {(Z_{i} '\widehat{{\Delta \varepsilon_{i} }}\widehat{{\Delta \varepsilon_{i}^{{\prime }} }}Z_{i} } )} \right]^{ - 1} $$
(7)

where the estimates of the first-differenced residuals \( \widehat{{\Delta \varepsilon_{i} }} \) are obtained from a preliminary consistent estimator. Estimation that involves this additional step is known as the two-step difference GMM estimator.
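The building blocks of the criterion can be sketched numerically. The code below constructs the matrix H described by Bond (2002), the one-step and two-step weighting matrices, and the quadratic form \( J_{N} \). It inverts the averaged moment matrix in both cases, as is standard for a GMM weighting matrix. All function names, the random instrument matrices, and the placeholder residuals are hypothetical; this is an illustration of the algebra, not the estimation routine used in the study.

```python
import numpy as np

def build_h(n_diff):
    """Tridiagonal matrix: 2 on the diagonal, -1 on the first off-diagonals."""
    return 2 * np.eye(n_diff) - np.eye(n_diff, k=1) - np.eye(n_diff, k=-1)

def one_step_weight(z_list):
    """Inverse of the averaged Z_i' H Z_i; needs no estimated residuals."""
    h = build_h(z_list[0].shape[0])
    avg = sum(z.T @ h @ z for z in z_list) / len(z_list)
    return np.linalg.inv(avg)

def two_step_weight(z_list, resid_list):
    """Inverse of the averaged Z_i' e_i e_i' Z_i, with e_i from a first-step fit."""
    avg = sum(z.T @ np.outer(e, e) @ z
              for z, e in zip(z_list, resid_list)) / len(z_list)
    return np.linalg.inv(avg)

def j_criterion(z_list, resid_list, weight):
    """Quadratic form in the averaged sample moments Z_i' e_i."""
    g = sum(z.T @ e for z, e in zip(z_list, resid_list)) / len(z_list)
    return float(g @ weight @ g)

# Hypothetical inputs: 50 cities, 4 differenced periods, 3 instruments.
rng = np.random.default_rng(1)
z_list = [rng.normal(size=(4, 3)) for _ in range(50)]
resid_list = [rng.normal(size=4) for _ in range(50)]  # stand-ins for first-step residuals

w_one = one_step_weight(z_list)
w_two = two_step_weight(z_list, resid_list)
j_value = j_criterion(z_list, resid_list, w_two)
```

The matrix H equals the covariance structure of first-differenced errors that are independent with unit variance, which is why the one-step weighting matrix corresponds to the homoscedastic case.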

Although it is conceptually simple to understand, difference GMM is subject to a major caveat when the explanatory variables are persistent over time. This property would weaken instruments and lead to serious finite sample biases. To tackle this problem, Arellano and Bover (1995) and Blundell and Bond (1998) propose an additional moment condition for the equation in levels, which can be expressed as:

$$ E[\Delta Z_{i} '(\alpha_{i} + \varepsilon_{i} )] = 0 $$
(8)

This moment condition works under the assumption that the first differences of the independent variables (\( \Delta Z_{i} \)) are uncorrelated with city-fixed effects (\( \alpha_{i} \)). As long as this is satisfied, the instruments for the equation in levels are the lagged differences of explanatory variables. This combines with the previous setup defined in Eq. (4) to form the system GMM estimator.

This study applies the system GMM estimator, as most variables exhibit strong time persistence.Footnote 4 Specifically, the two-step variant is preferred, as homoscedasticity is unlikely to hold across the more than 200 Chinese cities in the sample. Within this framework, three additional refinements are made to strengthen the estimation. First, although the two-step system GMM is asymptotically efficient in theory, in practice its standard errors are biased downward (Roodman 2009a, b; Windmeijer 2005a, b). To mitigate this concern, this research follows the correction procedure developed by Windmeijer (2005a, b). Second, although using internal instruments substantially saves effort in searching for and experimenting with external IVs, it is not without cost. The instrument count easily grows large and could overfit endogenous variables, leading to a biased estimator. In particular, a high instrument count weakens the over-identification tests that decide the reliability of GMM estimations. To prevent instrument proliferation, the lag lengths available for instrumentation are limited to two, and instrument sets are simultaneously collapsed into smaller ones, as suggested by Roodman (2009a). Finally, going beyond the common practice of reporting only the Hansen test, in which the overall validity of the instrument set is examined, D-in-Hansen tests are carried out to examine the validity of instrument sets for each endogenous variable (Hansen 1982). In this study, the variable of primary interest (FDI), GROWTH, and the lagged dependent term (\( {\text{DI}}_{i,t - 1} \)) are treated as endogenous and must be instrumented, while SAVING and GOV are assumed to be exogenous.Footnote 5 Following the established literature, time dummies are included as external IVs to improve estimation performance (Çoban and Topcu 2013).
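To see why capping lags and collapsing matter, the sketch below counts the level instruments available to a single endogenous variable in the differenced equation. It assumes that instruments start at the second lag and that "limited to two" means at most two lags per period; both are illustrative readings rather than a statement of the exact software options used. The count ignores the level-equation instruments and the time dummies, and the function name `instrument_count` is hypothetical.

```python
def instrument_count(n_periods, max_lags=None, collapse=False):
    """Count level instruments for one endogenous variable in the differenced equation.

    For period t = 3, ..., T the usable instruments are lags 2 and deeper,
    so t - 2 lags exist before any cap is applied.
    """
    available = []
    for t in range(3, n_periods + 1):
        lags = t - 2
        if max_lags is not None:
            lags = min(lags, max_lags)
        available.append(lags)
    # Collapsing keeps one column per lag distance instead of one per period and lag.
    return max(available) if collapse else sum(available)

T = 9  # nine annual observations, matching a 2003 to 2011 sample
full = instrument_count(T)                                          # 28
capped = instrument_count(T, max_lags=2)                            # 13
capped_collapsed = instrument_count(T, max_lags=2, collapse=True)   # 2
```

Under these assumptions, one endogenous variable alone would generate 28 instruments without restrictions, 13 with the lag cap, and 2 once the set is also collapsed, which shows how quickly the count can grow when several variables are instrumented.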

4 Data

This study exploits cross-city variations over time to identify the causal effect of FDI on DI in China. It offers three distinctive advantages over conventional cross-country panels. First, omitted variable bias is likely to be minimized due to similar institutional and legal frameworks across Chinese cities. For example, the central government retains the ultimate authority to draft, publish, and revise policy on foreign capital. This is manifested by the closely watched Catalogue of Industries for Guiding Foreign Investment, which is a periodically updated guideline for multinationals that does not make any geographical differentiation (Ministry of Commerce 2017). Local variations, although present, are kept on a marginal scale, owing to the bottom-up approval procedure (Chen 2017; Long 2005). In contrast, motivations and efforts to draw FDI are heterogeneous across countries, which tends to confound the empirical FDI–DI nexus.

Second, definitions of variables and the procedures to comply with them are more likely to be consistent within a single-country panel. Cross-country studies suffer tremendously in this regard. Take FDI as an example. Most countries follow the UNCTAD requirement that cross-border capital flows be classified as FDI if a single investor owns 10 percent or more of the ordinary shares or voting power of an enterprise in the host country (UNCTAD 2017b). However, the Chinese authorities stipulate that the threshold share to qualify as FDI is 25%. Similar issues arise with gross fixed capital formation, whose sub-components differ markedly between developed and developing countries (Benhabib and Spiegel 1994). The city-level panel is largely free from this complication, as local statistical bureaus are required to act on behalf of their central counterpart in collecting, processing, and delivering data. Since 2002, the National Bureau of Statistics of China (NBSC) has unified the statistical scope for FDI and DI, rendering this dataset more consistent across Chinese cities (NBSC 2002).

Finally, in this case, the number of observations expands dramatically. Up to 2011, when this sample ends, China had 269 prefectural-level cities, far more than the number of countries covered by existing cross-country panels. Meanwhile, the large number of cities facilitates regional analysis, offering a rare opportunity to probe the heterogeneous FDI–DI nexus across geography. The number of cities considered in this study is 215.Footnote 6 This sample covers the period from 2003 to 2011, during which several key variables are made publicly available. Notably, this coincides with the time when NBSC began to unify the statistical procedures used to compile the variables analyzed here.

This dataset is principally obtained from two sources: China Urban Statistics Yearbook (CUSY) and China Statistical Yearbook for Regional Economy (CSYRE). Both are compiled and maintained by NBSC, which guarantees a reasonable level of data quality. Although it may not eliminate manipulations due to pursuing politically desirable targets, such practice has gradually ceased alongside the dramatic expansion of the Chinese economy (Owyang and Shell 2017). Meanwhile, controlling the aggregate time effect, as indicated in the empirical model, helps alleviate data inflation, which tends to be systematic across almost all Chinese cities.

Table 1 summarizes the data construction procedures and their respective sources. Both FDI and DI are normalized by the city’s GDP to partial out the scale effect. NBSC does not report investment by ownership at a disaggregated city level. Therefore, DI is estimated by subtracting FDI from gross fixed capital formation (GFCF), on the basis that the overwhelming majority of FDI into China is green-field investment (Davies 2013). Notably, this practice is not feasible when FDI is dominated by mergers and acquisitions (M&As). In contrast to green-field FDI, M&As transfer existing indigenous assets to foreign hands without creating new production facilities. Subtracting this type of FDI from GFCF therefore runs into double-counting issues and biases the estimated DI downward.Footnote 7
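The construction of the two ratios can be written out as a short sketch. It assumes that GFCF, FDI, and GDP are already expressed in the same currency and price basis, which the text does not spell out here; the function name `investment_ratios` and the numbers are hypothetical.

```python
def investment_ratios(gfcf, fdi, gdp):
    """Return (FDI/GDP, DI/GDP), with DI taken as GFCF minus FDI.

    Assumes all three inputs share the same currency and units.
    """
    if gdp <= 0:
        raise ValueError("gdp must be positive")
    di = gfcf - fdi
    return fdi / gdp, di / gdp

# Hypothetical city-year values, for illustration only.
fdi_share, di_share = investment_ratios(gfcf=540.0, fdi=24.0, gdp=1000.0)
```

If part of the recorded FDI consisted of M&As, the subtraction would remove capital that did not add to GFCF, which is the downward bias in DI noted above.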

Table 1 Variable constructions and sources

Table 2 presents summary statistics of the variables included in this analysis. FDI, as a share of GDP, averages 2.4%, and its variation is considerably large. Consistent with the expectations of this study, DI on average accounts for over 50% of the city’s GDP, suggesting its prominent role in the local economy. Because the sample focuses on urban China, real GDP growth is higher, sitting at nearly 16% (Su and Liu 2016).Footnote 8

Table 2 Summary statistics

Further, to avoid aggregation bias, the 215 cities have been divided into eastern, central, and western groups following the geographical classifications of their respective provinces.Footnote 9 This conventional practice rests on the premise that the three macroregions possess distinctive comparative advantages: the eastern provinces are endowed with, and have maintained, a sophisticated industrial base, whereas their central and western counterparts enjoy advantages in agricultural and mineral resources. The trajectory of opening up also differs considerably between them. Since 1979, immediately after the onset of national reform, FDI has been rooted in eastern China. However, until recently, it remained sparse in western provinces (Qi 2007; Su and Liu 2016; Zhang 2001a, b). Both stylized facts are informative about the FDI–DI nexus. According to Dunning (1980, 1992), multinationals commonly integrate their internal competitive edges with locational advantages to maximize returns. This means that multinationals self-select into these regions and, as a consequence, may display contrasting motivations and deliver varying degrees of commitment to the local economy.

Based on regional samples, Table 2 further reveals several preliminary results. First, as expected, the FDI-to-GDP ratio is considerably higher in the eastern sample than in its central and western counterparts. The reverse is true when the DI-to-GDP ratio is considered. Together, they offer anecdotal evidence of a negative association between FDI and DI, supporting the crowding-out hypothesis. Second, non-eastern cities together outperform their eastern counterparts in growth records. This is promising in the sense that income inequality is gradually narrowing across regions. Third, government expenditure remains important to non-eastern cities, indicating that market reform should be deepened to allow the private sector to flourish. Finally, eastern households tend to save less than their inland counterparts. One possible explanation is that buoyant housing and stock markets have diverted household saving away from commercial banks in eastern cities. The China Household Financial Survey 2012 (CHFS 2012) confirms this, noting that most stock market participants and commercial housing investors are from eastern China (Gan et al. 2014).

5 Empirical results

5.1 Full sample results

Table 3 reports the full sample results. Columns (1) and (2) report RE and FE estimations, respectively; the Hausman test suggests that the latter should be preferred. Both show a positive but statistically insignificant association between FDI and DI, indicating that a neutral FDI–DI nexus prevails in China. This result remains qualitatively the same after including dynamic terms and simultaneously controlling for the endogeneity of FDI and other controls, as shown in Column (3) by a system GMM estimation. This finding contradicts Sun (1998), Tang et al. (2008), and Xu and Wang (2007), who have consistently obtained a positive association between them. The mixed results could stem from different sample periods. Their studies exclusively focused on pre-WTO periods, whereas this study covers the post-WTO period from 2003 onward. According to Chen (2011) and Davies (2013), accession to the WTO was a landmark in liberalizing the FDI regime, after which market-seeking multinationals have been pouring into China. As opposed to efficiency-seeking multinationals, they commonly possess fewer industrial linkages with indigenous enterprises. Simultaneously, market-seeking multinationals intend to protect their tangible and intangible assets from leaking outside, further preventing their positive spillovers from being assimilated locally (Javorcik 2004). This is the primary reason why the positive FDI–DI nexus is neutralized over the post-WTO period. By covering the post-WTO period, Chen et al. (2017a) obtain a conclusion in line with this paper.

Table 3 Full sample results on the FDI–DI nexus

In terms of controls, real economic growth is shown to foster DI, providing evidence of the accelerator effect. A similar positive effect is exerted by local government expenditure, suggesting that local public spending is conducive to DI by improving hard and soft infrastructure. In this regard, the fiscal decentralization introduced from 1994 seems optimal (Zhang and Zou 1998). However, SAVING fails to register any significant effect on the dependent variable. Controlling for endogeneity does not fundamentally alter these findings, although changes in magnitude and significance are present. Finally, according to the system GMM estimation in Column (3), DI is persistent over time, with an estimated coefficient of 0.708 that is statistically significant at the 5% level. This implies that the estimated model is dynamically stable.Footnote 10
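
As a worked illustration of what this persistence estimate implies, consider only the autoregressive part of the model and take the point estimate \( \hat{\gamma} = 0.708 \) as given. This is a back-of-the-envelope sketch under that simplifying assumption, not an additional result of the paper:

$$ |\hat{\gamma}| = 0.708 < 1, \qquad \frac{1}{1 - \hat{\gamma}} = \frac{1}{0.292} \approx 3.42, \qquad \frac{\ln 0.5}{\ln \hat{\gamma}} = \frac{\ln 0.5}{\ln 0.708} \approx 2.0 $$

Because the coefficient lies inside the unit interval, the effect of a one-off shock to DI decays geometrically, with roughly half of it gone after about two years, and the long-run response to a permanent change in a regressor is about 3.4 times its short-run coefficient.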

The appropriateness of GMM estimation rests on a battery of diagnostic tests reported in the lower part of Table 3. In line with existing literature, the Hansen J over-identification test for overall instrument validity and first- and second-order autocorrelation tests were performed. The results confirm that the specification is free from serial correlation, and the chosen IV set identifies the exogenous component of endogenous variables. Departing from conventional practice, the instrument subsets are also examined separately. The D-in-Hansen test (levels) suggests that the lagged variables in levels form valid instruments, while the D-in-Hansen (DIt−1), D-in-Hansen (GROWTH), and D-in-Hansen (FDI) tests confirm that the instrument sets for the lagged dependent variable, real economic growth, and FDI are valid. Further, the exogeneity assumption of GOV, SAVING, and time dummies is verified by D-in-Hansen tests (IV), indicating that they are valid external IVs. As a final remark, the total instrument count is only 18, owing to the efforts in limiting lag lengths available for instrumentation and further collapsing the instrument set. As such, the results of the previous tests are unlikely to be weakened by instrument proliferation (Roodman 2009a, b).

5.2 Subsample results

Full sample results assume regional heterogeneity away and may suffer from aggregation bias. Locational advantages are diverse across regions and have drawn multinationals with distinct motivations and varying commitments to the local economy. Further, given the different paces of reform and opening up, the empirical FDI–DI nexus is expected to differ along geographical dimensions. To assess this regional heterogeneity, this research disaggregates the full sample into eastern, central, and western subsamples, which comprise 87, 99, and 29 cities, respectively. Table 4 reports a two-step system GMM estimation using them.Footnote 11 Consistent with expectations, FDI exerts heterogeneous effects on DI across the three subsamples. For eastern cities, on average, a 1% increase in FDI raises DI by 1.74%, with everything else equal. A similar pattern is uncovered in central China, although the estimated coefficient is reduced to 1.66 and is only marginally significant at the 10% level. In contrast, FDI tends to substitute for DI among western cities of China, but the effect is not significant at any conventional level. These findings are consistent with Sun (1998), who obtained a significantly positive FDI–DI association across 10 coastal provinces of China.

Table 4 Subsample results

Two potential explanations are offered to reconcile the heterogeneity. One is related to entry modes chosen by foreign investors. In the context of China, five entry modes are available to foreign investors, among which EJVs and wholly foreign-funded enterprises (WFFEs) overwhelmingly dominate.Footnote 12 They interact differently with the host economy, given their varying degrees of control, resource commitment, and risk exposure (Blomström et al. 2001). For example, EJVs create an ideal platform for local Chinese partners to learn best practice from their foreign counterparts. Thus, they are conducive to domestic entrepreneurship (Filatotchev et al. 2007). In contrast, the lack of Chinese involvement in WFFEs may cause domestic investors to fail to capture positive spillovers. Chen et al. (2017a, b) confirm these predictions by exploring quarterly variations over the period from 1995 to 2014. Coincidentally, the entry modes chosen by foreign investors have been divided across the regions. Eastern China has been the base of millions of EJVs since the 1980s, while WFFEs have recently been gathering momentum among inland provinces owing to renewed initiatives driven by the WTO accession (Chen et al. 2017a, b). This supports the crowding-in effect of FDI revealed in eastern cities, but an ambiguous one throughout inland cities.

The other explanation stems from the sectoral composition of FDI across regions. UNCTAD (2001) maintains the view that production linkages between foreign and domestic firms appear much broader in the manufacturing sector than in the primary sector. Over the past three decades, foreign investors with efficiency-seeking purposes have been lured into coastal China to tap into the cost advantages available there (Long 2005). Channels such as labor turnover and demonstration effects then set in to foster DI in a pronounced way (Chen et al. 2017a, b; Huang and Sharif 2009; Lin et al. 2009). In contrast, western China, with rich natural endowment but a lack of hard infrastructure, has drawn resource-seeking multinationals. However, this has been on a limited scale, owing to government restrictions. Their presence tends to create an enclave economy that makes little contribution to local development. The findings are supported by Adams (2009), who suggests that concentration in resource extraction industries is the primary reason for the crowding-out effect of FDI.

Heterogeneity across regions is also displayed by the other controls. GROWTH and GOV remain positive and statistically significant only in the central region. SAVING is significantly positive among inland cities, suggesting that DI is predominantly funded by tapping local savings. It further implies that financial development (FD) lags behind there. In passing, the coefficient of DIt−1 is statistically significant and falls between 0.406 and 0.668, suggesting that models (1)–(3) remain stable. The models are correctly specified, and the instrument strategy is feasible, as confirmed by a series of diagnostic tests reported in the lower part of Table 4. Although the number of cross-sectional units is reduced due to sample segmentation, it remains larger than the respective instrument count in each subsample, lessening the concern about instrument proliferation.

5.3 Further analysis of local absorptive capacity

Section 5.2 reveals that the FDI–DI nexus varies across regions but, from a policy perspective, fails to address how it could be sharpened. This issue is essential to regional development strategy. For western China, the priority is still how to tap multinationals with the aim of replicating the economic takeoff that blanketed eastern China for two decades. This analysis suggests that FDI has produced undesirable effects across western cities. The questions are: How to reverse it and make FDI more compatible with DI there? How to maintain and strengthen the positive FDI–DI nexus throughout coastal and central cities? These issues are probed by investigating the influence of local conditions on the nexus. To that end, three types of absorptive capacity (HC, FD, and INST) are interacted with FDI. They are chosen following the relevant FDI–DI and FDI-growth literature (Al-Sadig 2013; Alfaro et al. 2004; Borensztein et al. 1998; Farla et al. 2016; Hermes and Lensink 2003; Morrissey and Udomkerdmongkol 2012). “Appendix A” provides detailed constructions of these variables.

Equation (9) presents the augmented empirical model:

$$ \text{DI}_{i,t} = \gamma \text{DI}_{i,t-1} + \beta_{1} \text{FDI}_{i,t} + \boldsymbol{Z}^{\prime} \beta + \alpha_{i} + \delta_{t} + \theta \text{ASC}_{i,t}^{j} + \eta (\text{ASC}_{i,t}^{j} * \text{FDI}_{i,t}) + \varepsilon_{i,t} \qquad (9) $$

where \( {\text{ASC}}_{i,t}^{j} \) refers to one of the abovementioned local absorptive capacities j at city i in year t, and \( {\text{ASC}}_{i,t}^{j} *{\text{FDI}}_{i,t} \) captures the interactive effect between it and FDI. Given the strong correlations among FDI, ASC, and their interaction term by construction, estimating Eq. (9) directly will be subject to strong multicollinearity. The approach suggested by Azman-Saini et al. (2010) is therefore used. \( {\text{ASC}}_{i,t}^{j} *{\text{FDI}}_{i,t} \) is first regressed on \( {\text{FDI}}_{i,t} \) and \( {\text{ASC}}_{i,t}^{j} \). Next, the residuals from this first-stage regression are stored and used to represent the interaction term. By construction, these residuals are orthogonal to both FDI and \( {\text{ASC}}_{i,t}^{j} \). The remaining controls are as defined previously. The estimation strategy remains the same except that the newly included ASC and their interactions with FDI are treated as endogenous.
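
The two-step orthogonalization described above can be sketched as follows. This is a minimal illustration on simulated data, assuming a first-stage OLS regression with a constant; the helper names (`ols_residuals`, `dot`) and the simulated series are hypothetical and are not taken from the paper or from Azman-Saini et al. (2010).

```python
import random


def ols_residuals(y, columns):
    """Residuals from an OLS regression of y on a constant plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the normal equations (X'X) b = X'y as an augmented k x (k+1) matrix.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[best] = A[best], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    coef = [A[r][k] / A[r][r] for r in range(k)]
    return [y[i] - sum(b * x for b, x in zip(coef, X[i])) for i in range(n)]


def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


random.seed(0)
fdi = [random.uniform(0.0, 0.1) for _ in range(200)]  # simulated FDI/GDP
asc = [random.uniform(0.0, 1.0) for _ in range(200)]  # simulated absorptive capacity

# Step 1: regress the raw interaction on FDI and ASC.
raw_interaction = [f * a for f, a in zip(fdi, asc)]
# Step 2: keep the residuals as the orthogonalized interaction term.
orth_interaction = ols_residuals(raw_interaction, [fdi, asc])
```

In this sketch, `dot(orth_interaction, fdi)` and `dot(orth_interaction, asc)` are zero up to rounding error, so the residual-based interaction term no longer overlaps with the two main effects when all three enter the second-stage model.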

Table 5 reports the two-step system GMM estimation. The focus is on the coefficients of FDI, ASC, and their interaction terms, while all controls except \( {\text{DI}}_{i,t - 1} \) are omitted to conserve space.Footnote 13 In general, it was found that interactions are mainly significant in eastern and inland cities. HC is shown to weaken the positive association between FDI and DI among eastern cities. This contradicts Borensztein et al. (1998), who conclude that attaining a minimum HC (measured by average schooling years) is required to harness the crowding-in effect of FDI. Although this result is surprising, it is not entirely indefensible. Recall that HC is captured by the share of university students, who seek to place themselves better in the labor market. Job offers from multinationals comfortably fit this aspiration. On the other side, as noted by Ge (2006) and Hsu and Jaw (2015), multinationals in China usually pay a larger wage premium than local enterprises to lure experienced and better-educated workers. As such, both pull and push sides enable multinationals to hoard skilled workers who strengthen their competitive edge over local rivals. Following this line of reasoning, readily available HC facilitates FDI to crowd out DI in eastern China. In contrast, this significant result vanishes in central China and turns into a positive, although statistically insignificant, association in western China. These findings are supported by Al-Sadig (2013), who also shows that HC fosters a positive FDI–DI nexus among low-income countries only.

Table 5 The role of local absorptive capacity on the FDI–DI nexus, by region

Column (2) further reveals that a developed financial market also removes the complementary FDI–DI nexus in eastern cities. This again contradicts the findings of Alfaro et al. (2004) and Hermes and Lensink (2003), who consistently identify that FD is crucial to capture FDI-related spillovers across countries. However, this finding appears in line with Huang’s (2003) proposal of an alternative institutional-based approach to explain China’s remarkable success in hosting FDI. In his book-length discussion, by carrying out intensive surveys and industrial-level case studies, Huang (2003) postulates that the state-ruled banking sector is generally reluctant to lend to private enterprises, forcing them to seek external finance from more expensive sources. By recognizing this institutional deficiency, many foreign partners exploit it by undervaluing the contribution of their indigenous counterparts in newly formed joint ventures. Because many financially vulnerable indigenous firms are in desperate need of immediate capital injections, they reluctantly agree to unfair terms and conditions of transactions, amounting to a de facto fire sale of their assets. This claim is examined by Havrylchyk and Poncet (2007, 1676) who conclude that “private enterprises often seek a foreign investor because they are excluded from the banking sector in their province.”

Using firm-level data, Héricourt and Poncet (2009) argue that FDI provides an alternative funding source to many cash-strapped private firms in China’s state-ruled financial system. Following the prediction of Huang (2003), an improvement in the financial system will deter the entry of multinationals while fostering investment by indigenous private firms, leading to a negative association between FDI and DI. The present findings provide city-level support, in line with the provincial and firm-level evidence from Havrylchyk and Poncet (2007) and Héricourt and Poncet (2009). Given the previous findings, Column (3) is expected to show that institutional improvements help FDI substitute for DI, as both HC and FD form part of the overall institutional profile. Intuitively, less red tape and a lighter bureaucratic burden alongside institutional improvements make multinationals operate more efficiently, strengthening their competitiveness in an alien business environment.

As a final remark, these findings pass a battery of diagnostic tests reported in the lower part of Table 5. The first-differenced error terms display negative and statistically significant first-order serial correlation, but no second-order serial correlation. Hansen J and a series of D-in-Hansen tests confirm that the instrumentation strategy is valid for identifying the exogenous component of endogenous variables. Finally, the coefficient of DIt−1 is statistically significant and falls between 0.415 and 0.991, indicating that all specifications are dynamically stable.

6 Conclusion and policy implications

Although China is one of the world’s leading host countries, the nature of its FDI–DI nexus has largely remained under-explored. This study attempts to fill this void by exploiting a city-level panel covering the post-WTO period from 2003 to 2011. Using the system GMM estimator, a neutral relationship was found to prevail between foreign and domestic investment across all 215 cities. A heterogeneous nexus emerges when the sample is segmented into regional samples according to geography. FDI strongly crowds in DI in eastern China and, to a lesser extent, in central China, while a negative but statistically insignificant relationship is established among western cities. In addition, the empirical FDI–DI nexus has been shown to have the potential to be altered by local absorptive capacities, including HC, financial development, and institutional quality. Overall, improvement in these local conditions makes FDI substitute for DI. This unexpected finding has been ascribed to specific institutions of China including, but not limited to, labor-market targeting and the state-ruled financial system.

Based on these results, the following policy recommendations are offered to Chinese policymakers. First, a one-size-fits-all FDI strategy is no longer appropriate in contemporary China. Local governments must draw on their locational advantages to attract appropriate multinationals that are compatible with local conditions and development. Second, the accession to the WTO has substantially deepened market reforms, creating an increasingly pro-business environment for multinationals in China. Undoubtedly, this has strengthened their competitiveness. However, institutional barriers, mainly in terms of obtaining bank loans and seeking fair legal representation, continue to prevent indigenous private firms from flourishing. Therefore, it is recommended that resources and preferences be equally given to indigenous private enterprises so that they can compete with multinationals on a level playing field. Finally, although this study is the first to examine the FDI–DI nexus at a city level in China and reveals strong regional heterogeneity, it assumes the nexus to be linear. Further research is expected to relax this assumption by considering nonlinear approaches, such as threshold or regime-switching models (Bilgili et al. 2016).