1 Introduction

Investments in transportation infrastructure are key to promoting economic growth and spreading growth geographically by lowering trade costs (e.g., Donaldson 2018), inducing firm growth (e.g., Lu 2020), increasing competition (e.g., Asturias et al. 2019), and allowing workers to commute (e.g., Asher and Novosad 2020). Yet the effect that transportation infrastructure has on wages is potentially ambiguous as connecting markets causes many things to change simultaneously: workers may work in other markets (raising local wages), firms may attract workers from new markets where wages are lower (lowering local wages), and local production may increase or decrease based on changes in output competition across markets.

In this paper, we study an aspect of how wages are affected by transportation infrastructure that has received less attention: to what extent does transportation infrastructure affect the ability of firms to exercise monopsony power in local labor markets? In principle, easier labor commutes may increase the mobility of a labor force that is otherwise captive to the local labor market. This is especially important in rural areas that are otherwise unconnected by roads. We address this question in the context of India’s Golden Quadrilateral (GQ) expressway expansion initiative. The Golden Quadrilateral initiative is of interest for several reasons. First, it is one of the largest highway expansions in the world. Second, in developing countries monopsony power may be particularly strong. India has traditionally been known for having spatially segmented markets where the potential for monopsony power is high (see, for example, Brooks et al. 2021; Binswanger and Rosenzweig 1984; Braverman and Stiglitz 1982, 1986). Finally, the GQ was built quite rapidly, expanding from only 5 to 95 percent complete between 2001 and 2006.

We focus on labor “markdowns” and their impact on labor’s share of aggregate income. A markdown is the ratio of the value of the marginal product of labor to the wage that is above and beyond what is explained by a markup in the output market. As an example, suppose a firm uses oil and labor as inputs. We observe that the value of the marginal product of oil exceeds its input price by 10%, while the value of the marginal product of labor exceeds the wage by 32%. By assuming that the market for materials (oil, in this example) is competitive, we can apply the methods of Brooks et al. (2021) to disentangle the firm’s markup on its output from the markdown on labor. In this example, since oil is competitive, the 10% gap between the value of the marginal product of oil and its price is equal to the markup on its output. Then, the markdown on its labor is simply 20% (i.e., \(1.32/1.1-1=0.2\)). Applying these methods to detailed geographic data, we find substantial pre-existing labor markdowns in the data. We then show that for firms near the newly constructed expressway, average markdowns are reduced substantially.
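The arithmetic of this example can be sketched in a few lines. The snippet is only an illustration, not the estimation procedure: the function name `markdown_from_gaps` and its arguments are hypothetical, and it assumes, as in the text, that the materials input is competitively supplied so that its gap equals the output markup.

```python
def markdown_from_gaps(labor_gap: float, materials_gap: float) -> float:
    """Net labor markdown implied by two input 'gaps'.

    Each gap is the proportional excess of an input's value of marginal
    product over its price (0.32 means 32%). Assumption: the materials
    input is competitive, so its gap equals the output markup.
    """
    labor_wedge = 1.0 + labor_gap    # labor-based ratio (markup and markdown combined)
    markup = 1.0 + materials_gap     # output markup, read off the competitive input
    return labor_wedge / markup - 1.0


# Oil-and-labor example from the text: 32% labor gap, 10% oil gap.
example_markdown = markdown_from_gaps(0.32, 0.10)
print(f"{example_markdown:.2f}")  # prints 0.20
```

Dividing the two wedges, rather than subtracting the gaps, is what makes the markdown a ratio of two markup measures.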

Figure 1 illustrates the pattern for three different measures of the markdown. It shows the time path of labor markdowns by tercile of how close a firm’s district is to the GQ, which is all but completed in 2006. The paths of the three terciles show roughly identical increases until 2006, at which point the markdowns in the most remote locations continue to rise, while the two terciles closest to the expressway flatten off.

Fig. 1 Impact of highway on markdowns by proximity tercile. Note: This graph shows the time path of labor markdowns by tercile of how close a firm’s district is to the Golden Quadrilateral. The reported markdown is the weighted average of firms’ markdowns, where the weight is the product of the firms’ average labor compensation and the survey-provided sampling weight. DLW, CD and CRS are the three methods we use to measure markdowns, and their construction is explained in Sect. 3.3. The vertical line indicates the year 2006, when the GQ is mostly completed.
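The weighting described in the figure note can be sketched as follows. This is a minimal illustration under our own naming: `weighted_average_markdown` and the tuple layout are hypothetical, and the numbers in the example are made up rather than taken from the data.

```python
from typing import Iterable, Tuple


def weighted_average_markdown(firms: Iterable[Tuple[float, float, float]]) -> float:
    """Weighted mean of firm markdowns.

    Each firm is (markdown, average_labor_compensation, sampling_weight);
    the weight is the product of the last two, as in the figure note.
    """
    numerator = 0.0
    denominator = 0.0
    for markdown, labor_compensation, sampling_weight in firms:
        weight = labor_compensation * sampling_weight
        numerator += weight * markdown
        denominator += weight
    if denominator == 0.0:
        raise ValueError("total weight is zero")
    return numerator / denominator


# Two hypothetical firms: weights are 100*2 = 200 and 300*1 = 300.
average = weighted_average_markdown([(1.0, 100.0, 2.0), (1.5, 300.0, 1.0)])
print(round(average, 2))  # prints 1.3
```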

We delve into the causes of the lower markdowns for connected districts. First, we show that even though the impact of the GQ connection on markups is perhaps negative, it cannot account for the lower markdown. Instead we measure relative increases in labor compensation and labor’s share among connected firms. While we cannot isolate the precise way in which these lower markdowns are manifested, we rule out several possible explanations. In particular, we do not find strong evidence that the highway-induced reduction in labor markdowns is correlated with the firm labor share. Neither do we find evidence of increased labor supply elasticity or changes in skill premia in integrated labor markets.

Nevertheless, the markdown patterns we uncover are economically significant. Markdowns induce an aggregate labor’s share that is between 3 and 7 percentage points lower than it would be in the absence of markdowns. The introduction of the GQ reduces markdowns and leads to an increase in the aggregate labor’s share of aggregate output by about 1.8 to 2.3 percentage points.

The rest of this paper is organized as follows. We review the relevant literature in the remainder of the introduction. Section 2 reviews a general model of a firm with monopsony power and uses it to derive our formula for markdowns. In Sect. 3, we give background on the Golden Quadrilateral initiative, our data, and our practical measurement of markups and markdowns. Section 4 then presents the empirical analysis, and Sect. 5 concludes.

1.1 Related Literature

This paper contributes to multiple ongoing areas of research. There is a growing literature on labor monopsony, especially in the USA (Card et al. 2018; Gouin-Bonenfant 2018; Lamadon et al. 2019; Berger et al. 2018; Hershbein et al. 2020). The first three examine the sharing of rents in labor markets with search and matching frictions. Berger et al. (2018) study a similar labor market using mergers for identification. Hershbein et al. (2020) study classical monopsony power, documenting time series patterns in markdowns in the USA, including a sharp increase after 2000. Most directly, we borrow our measure for markdowns on labor from Brooks et al. (2021), which also focuses on labor markdowns in India. This paper is novel in looking at the impact of large-scale infrastructure investment on firms’ monopsony power, however.

Another series of papers has analyzed the impacts of infrastructure investments on firms and labor markets. Most closely related is Asturias et al. (2019), which uses a trade model to quantify the impact of increased competition in the product market from the expansion of the Golden Quadrilateral highway system in India. We utilize their data and complement their findings by empirically assessing the impact of the GQ on labor market competition. Although they find impacts on markups, the techniques developed here are robust to variation in product market markups, and those markups do not affect our estimation of input markdowns. Other work has studied the impact of roads on labor migration. For example, Asher and Novosad (2020) show that the primary impact of a large national rural road construction program connecting villages in India is an occupational move away from agriculture to wage income and suggest that this is driven by labor opportunities from outside the villages. Similarly, Brooks and Donovan (2020) show how bridge infrastructure investment in Nicaragua substantially changed labor patterns by allowing rural workers to access new labor markets. We complement this literature by showing how changes in labor market opportunities and competition for workers affect firms’ ability to mark down wages monopsonistically.Footnote 1

Several other papers have looked at the impact of the National Trunk Highway System, another major highway expansion but in China. Using a before–after approach and region-level data, Faber (2014) finds that it lowered the industrial output and growth of newly connected areas. In contrast, using more continuous variation and firm-level data, Lu (2020) finds that the same project promotes growth of firms in newly connected areas using variations in the timing of highway segment construction. Alder and Kondo (2020) show how the highway planning in China was driven by political economy considerations, and they solve for the optimal highway system devoid of such considerations. An advantage of our India setting is that we do not suffer from the challenge of disentangling multiple reforms (including relaxation of the Hukou system that allows for more migration, and unilateral trade liberalization), which could have a first-order impact on the labor markets and confound interpretation.

2 Model

Following Brooks et al. (2021), this section derives our measure of the markdown and links it to labor’s share of income.

2.1 Monopsonistic Firm’s Problem

We consider the problem of a firm that has market power in the product market, the local labor market, and potentially other input markets. We index the firm and its output by \(n=1,\ldots,N\), its industry by \(i=1,\ldots ,I\), and its location by \(k=1,\ldots ,K\). Firms use two inputs: labor and materials. Importantly, the firm is a price-taker in one input market. In our application, materials will be the input for which the firm is a price-taker.Footnote 2

Letting \(x^M_{nki}\) denote the quantities of materials and \(x^L_{nki}\) denote the labor employed by firm n located in k and operating in industry i, we express the industry-specific production function quite generally as:

$$\begin{aligned} y_{nki} = F_i(x^L_{nki},x^M_{nki};Z_{nki}) \end{aligned}$$
(1)

where \(Z_{nki}\) is a set of firm-level characteristics, including productivity but also any other potential firm or location-specific factors that might affect the level or shape of technology.

For labor input, the firm faces an inverse supply function that depends on the aggregate labor supply \(X^L_{ki}\) in location k and industry i:

$$\begin{aligned} w^L_{ki} = G^L_{i}\left( X^L_{ki} \right) , \end{aligned}$$
(2)

where the aggregate quantity equals, by market clearing, the total input demanded across all \(N_{ki}\) firms in the industry and location:

$$\begin{aligned} X^L_{ki} = \displaystyle \sum _{n=1}^{N_{ki}} x^L_{nki}. \end{aligned}$$
(3)

Again, the supply of materials is perfectly elastic at a given factor price, \(w^M_{ki}\). The most natural interpretation of this assumption is that materials are traded both within country and internationally so that local firms have no market power to exert over their prices.Footnote 3

Likewise, the firm faces an inverse demand for its output, given the output of all other goods, which we denote \(\{y_{jki}\}_{j\ne i}\):

$$\begin{aligned} p_{nki} = H_{i}\left( y_{nki};\{y_{jki}\}_{j\ne i}\right) \end{aligned}$$
(4)

Here, we assume that firms face a single demand function. This would be the case if they sell to a single national market and if there are no transportation costs across locations.Footnote 4

The firm’s profit maximization problem is therefore:

$$\begin{aligned} \displaystyle \max _{\{y_{nki},x^M_{nki},x^L_{nki}\}} p_{nki}y_{nki} - w^M_{ki} x^M_{nki} - w^L_{ki} x^L_{nki} \end{aligned}$$
(5)

subject to:

$$\begin{aligned} y_{nki}&= & {} \,F_i(x^L_{nki},x^M_{nki};Z_{nki})\\ p_{nki}&= & {} \,H_{i}\left( y_{nki};\{y_{jki}\}_{j\ne i}\right) \\ w^L_{ki}&= & {} \,G^L_{i}\left( X^L_{ki} \right) . \end{aligned}$$

The fact that \(p_{nki}\) and \(w^L_{ki}\) are both functions in the constraints emphasizes that firms internalize their effect on both output prices and input prices. In particular, by producing more output, they reduce the price of their own output, and by choosing to use more labor, firms internalize the effect of higher wages.

Using \(\lambda _{nki}\) as the Lagrange multiplier on the production function, the firm’s first-order conditions are:

$$\begin{aligned}&p_{nki} + \frac{\partial p_{nki} }{\partial y_{nki}} y_{nki} = \lambda _{nki}\end{aligned}$$
(6)
$$\begin{aligned}&w^L_{ki} + \frac{\partial w^L_{ki} }{\partial x^L_{nki}} x^L_{nki}= \lambda _{nki} \frac{\partial F_{i}}{\partial x^L_{nki}}\end{aligned}$$
(7)
$$\begin{aligned}&w^M_{ki} = \lambda _{nki} \frac{\partial F_{i}}{\partial x^M_{nki}} \end{aligned}$$
(8)

Notice that Eqs. (6), (7) and (8) can be rewritten, respectively, as:

$$\begin{aligned}&\frac{\lambda _{nki}}{p_{nki}} = 1+ \frac{\partial \log (p_{nki})}{\partial \log (y_{nki})}\end{aligned}$$
(9)
$$\begin{aligned}&\lambda _{nki}\frac{y_{nki} \frac{\partial \log (F_i)}{\partial \log (x^L_{nki})}}{w^L_{ki}x^L_{nki}}= 1+\frac{\partial \log ( w^L_{ki}) }{\partial \log (x^L_{nki})}\end{aligned}$$
(10)
$$\begin{aligned}&\lambda _{nki}\frac{y_{nki} \frac{\partial \log (F_i)}{\partial \log (x^M_{nki})}}{w^M_{ki}x^M_{nki}}= 1 \end{aligned}$$
(11)

2.2 Markups

We define a markup as the ratio of output price to marginal cost. A common measure for markups from de Loecker and Warzynski (2012) is the ratio of the elasticity of an input to its cost share. However, that ratio which we define as \(\mu _{nki}^L\) for labor input is not, in general, equal to the markup:

$$\begin{aligned} \mu _{nki}^L\equiv \frac{\frac{\partial \log (F_i)}{\partial \log (x^L_{nki})}}{\frac{w^L_{ki}x^L_{nki}}{p_{nki}y_{nki}}}. \end{aligned}$$
(12)

In the absence of monopsony power, using any input implies the same measured markup. However, this is no longer true with monopsony power. If the firm is not price-taking in labor, for example, \(\mu _{nki}^L\) could exceed one for two reasons: market power in the output market, or monopsonistic market power in the labor market. Here, our assumption that the material input M is perfectly elastically supplied is helpful, since it provides a way of measuring markups in output prices without being confounded by the presence of monopsonistic market power on other inputs.

Manipulating the equations above, we can solve for the markup \(\mu _{nki}^M\) making use of the price-taking input as follows:

$$\begin{aligned} \frac{ \frac{w^M_{ki}x^M_{nki}}{p_{nki}y_{nki}}}{ \frac{\partial \log (F_i)}{\partial \log (x^M_{nki})}} = \frac{1}{\mu _{nki}^M} =1+ \frac{\partial \log (p_{nki})}{\partial \log (y_{nki})}. \end{aligned}$$
(13)
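The manipulation is brief. As a sketch, using only Eqs. (9) and (11), and reading \(\mu _{nki}^M\) as the materials analogue of the ratio in Eq. (12):

$$\begin{aligned} \lambda _{nki}&= \frac{w^M_{ki}x^M_{nki}}{y_{nki}\, \frac{\partial \log (F_i)}{\partial \log (x^M_{nki})}} \quad \text {(from Eq. (11))}, \\ \frac{\lambda _{nki}}{p_{nki}}&= \frac{ \frac{w^M_{ki}x^M_{nki}}{p_{nki}y_{nki}}}{ \frac{\partial \log (F_i)}{\partial \log (x^M_{nki})}} = 1+ \frac{\partial \log (p_{nki})}{\partial \log (y_{nki})} \quad \text {(dividing by } p_{nki} \text { and using Eq. (9))}. \end{aligned}$$

Because Eq. (8) implies that \(\lambda _{nki}\) equals the marginal cost of output produced with the price-taking input, the left-hand side is marginal cost over price, that is, \(1/\mu _{nki}^M\).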

2.3 Markdowns

Taking the ratio of Eqs. (7) and (8), we can then isolate monopsony power in the market for labor relative to the markup:

$$\begin{aligned} \frac{\mu _{nki}^L}{\mu _{nki}^M} = 1+\frac{\partial \log ( w^L_{ki}) }{\partial \log (x^L_{nki})} \end{aligned}$$
(14)

In other words, the left-hand side is a properly normalized measure of the exercise of classical monopsonistic market power, i.e., what the literature refers to as a “markdown.” This gives us a clear way of measuring the markdown. At times, we refer to the markdown as the “markdown ratio” to emphasize that the markdown is calculated as the ratio of two markup measures.

Substituting in the definition of the aggregate supply \(X^L_{ki} = \sum _{n=1}^{N_{ki}} x^L_{nki}\), and manipulating shows how this equals the ratio of the share of firm \(s_{nki}^L = \frac{w^L_{ki}x^L_{nki}}{\sum _l w^L_{ki}x^L_{lki}}\) to the elasticity of labor supply \(\epsilon _L=\frac{\partial \log ( X^L_{ki}) }{\partial \log (w^L_{ki})}\). Then:

$$\begin{aligned} \frac{\partial \log (w^L_{ki})}{\partial \log (x^L_{nki})} = \frac{1}{\epsilon _L} s^L_{nki} . \end{aligned}$$
(15)

and the markdown equation becomes:

$$\begin{aligned} \frac{\mu _{nki}^L}{\mu _{nki}^M} = 1+\frac{1}{\epsilon _L} s^L_{nki} . \end{aligned}$$
(16)

Thus, a markdown will be high when either the elasticity of labor supply is low or the firm has a large share of the market.Footnote 5
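Equations (15) and (16) can be checked numerically. The sketch below is illustrative only: it assumes an iso-elastic inverse labor supply \(w = X^{1/\epsilon _L}\) and that each firm takes the employment of the other firms in its market as given; the function names and the employment figures are hypothetical.

```python
import math


def markdown_ratio(share: float, supply_elasticity: float) -> float:
    """Eq. (16): mu^L / mu^M = 1 + s^L / epsilon_L."""
    if supply_elasticity <= 0:
        raise ValueError("supply elasticity must be positive")
    return 1.0 + share / supply_elasticity


def wage_elasticity_numeric(own_labor: float, others_labor: float,
                            supply_elasticity: float, step: float = 1e-6) -> float:
    """Central finite difference of log(w) with respect to log(own labor).

    Assumption: w = X ** (1 / epsilon_L) with X = own_labor + others_labor,
    and others_labor held fixed while the firm varies its own employment.
    """
    def log_wage(x: float) -> float:
        return math.log(x + others_labor) / supply_elasticity

    low = own_labor * math.exp(-step)
    high = own_labor * math.exp(step)
    return (log_wage(high) - log_wage(low)) / (2.0 * step)


# Hypothetical market: the firm employs 30 of 100 workers, elasticity of 2.
share, epsilon = 30.0 / 100.0, 2.0
numeric = wage_elasticity_numeric(30.0, 70.0, epsilon)
analytic = share / epsilon                 # right-hand side of Eq. (15)
ratio = markdown_ratio(share, epsilon)     # 1 + 0.3 / 2 = 1.15
```

Because all firms in a location-industry pair face the same wage, the employment share used here coincides with the wage-bill share \(s^L_{nki}\) defined above.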

2.4 Labor’s Share of Income

Again, following Brooks et al. (2021), we can derive the aggregate factor payment share for labor. Labor’s share of value added in this economy is therefore:

$$\begin{aligned} \eta _L = \frac{\displaystyle \sum\nolimits_{i=1}^{I}\displaystyle \sum\nolimits_{k=1}^K \displaystyle \sum\nolimits_{n=1}^{N_{ki}} w^L_{ki} x^L_{nki}}{\displaystyle \sum\nolimits_{i=1}^{I}\displaystyle \sum\nolimits_{k=1}^K \displaystyle \sum\nolimits_{n=1}^{N_{ki}} (p_{nki} y_{nki} - w^M_{ki}x^M_{nki})} . \end{aligned}$$
(17)

Define the labor share of a given firm in the national labor pool as:

$$\begin{aligned} \omega ^L_{nki} = \frac{ w^L_{ki} x^L_{nki}}{\displaystyle \sum\nolimits_{i=1}^{I}\displaystyle \sum\nolimits_{k=1}^K \displaystyle \sum\nolimits_{n=1}^{N_{ki}} w^L_{ki} x^L_{nki}} . \end{aligned}$$
(18)

Then, notice by taking the reciprocal of the labor share, we can derive an expression that depends on firm-level labor shares of the national labor pool, and ratios of input expenditure to revenue:

$$\begin{aligned} \frac{1}{\eta _L} = \displaystyle \sum _{i=1}^{I}\displaystyle \sum _{k=1}^K \displaystyle \sum _{n=1}^{N_{ki}} \frac{p_{nki} y_{nki}}{w^L_{ki} x^L_{nki}} \omega ^L_{nki} - \displaystyle \sum _{i=1}^{I}\displaystyle \sum _{k=1}^K \displaystyle \sum _{n=1}^{N_{ki}} \frac{w^M_{ki} x^M_{nki}}{w^L_{ki} x^L_{nki}} \omega ^L_{nki} . \end{aligned}$$
(19)

Finally, notice that the ratios of input expenditure to revenue appear in the definitions of the markups. That is:

$$\begin{aligned} {\mu ^L_{nki}} \equiv \frac{\theta _{nki}^L }{ \frac{w^L_{ki}x^L_{nki}}{p_{nki}y_{nki}}}, {\mu ^M_{nki}} \equiv \frac{\theta _{nki}^M }{ \frac{w^M_{ki}x^M_{nki}}{p_{nki}y_{nki}}} \end{aligned}$$
(20)

where for any input m,

$$\begin{aligned} \theta _{nki}^m \equiv \frac{\partial \log (F_i)}{\partial \log (x^m_{nki})}. \end{aligned}$$
(21)

These imply that:

$$\begin{aligned} \frac{p_{nki}y_{nki}}{w^L_{ki}x^L_{nki}} = \frac{\mu ^L_{nki}}{\theta _{nki}^L}, \frac{p_{nki}y_{nki}}{w^M_{ki}x^M_{nki}} = \frac{\mu ^M_{nki}}{\theta _{nki}^M} \implies \frac{w^M_{ki} x^M_{nki}}{w^L_{ki} x^L_{nki}} = \frac{\mu ^L_{nki} \theta _{nki}^M}{\mu ^M_{nki} \theta _{nki}^L} . \end{aligned}$$

Finally, this can be substituted into Eq. (19) to get:

$$\begin{aligned} \frac{1}{\eta _L} = \displaystyle \sum _{i=1}^{I}\displaystyle \sum _{k=1}^K \displaystyle \sum _{n=1}^{N_{ki}} \left[ \frac{\mu ^L_{nki}}{\mu ^M_{nki}}\frac{\mu ^M_{nki} - \theta _{nki}^M}{\theta _{nki}^L } \omega ^L_{nki} \right] . \end{aligned}$$
(22)

We have only rearranged definitions, to put labor’s share into a form where we can easily isolate the role of markdowns (and markups) by constructing and applying counterfactual series of \(\mu ^L_{nki}\) and \(\mu ^M_{nki}\) into the above formula. In particular, looking at Eq. (16) and setting \(s_{nki}^L=0\), we construct a counterfactual \(\frac{\mu ^L_{nki}}{\mu ^M_{nki}}=1\) that gives labor’s share when monopsony power has been eliminated.
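The decomposition in Eq. (22) and the counterfactual can be sketched as below. This is an illustration under stated assumptions, not the procedure applied to the data: the `Firm` record, the function names, and the two example firms are hypothetical, and the counterfactual holds \(\mu ^M_{nki}\), the elasticities, and the weights \(\omega ^L_{nki}\) fixed while setting the markdown ratio to one.

```python
from typing import List, NamedTuple, Optional


class Firm(NamedTuple):
    revenue: float         # p * y
    labor_bill: float      # w^L * x^L
    materials_bill: float  # w^M * x^M
    theta_l: float         # output elasticity of labor
    theta_m: float         # output elasticity of materials


def labor_share_direct(firms: List[Firm]) -> float:
    """Eq. (17): total labor payments over total value added."""
    labor = sum(f.labor_bill for f in firms)
    value_added = sum(f.revenue - f.materials_bill for f in firms)
    return labor / value_added


def labor_share_eq22(firms: List[Firm],
                     markdown_override: Optional[float] = None) -> float:
    """Eq. (22); optionally replace every mu^L/mu^M by a counterfactual value."""
    total_labor = sum(f.labor_bill for f in firms)
    inverse_share = 0.0
    for f in firms:
        mu_l = f.theta_l / (f.labor_bill / f.revenue)
        mu_m = f.theta_m / (f.materials_bill / f.revenue)
        ratio = mu_l / mu_m if markdown_override is None else markdown_override
        omega = f.labor_bill / total_labor
        inverse_share += ratio * (mu_m - f.theta_m) / f.theta_l * omega
    return 1.0 / inverse_share


firms = [Firm(100.0, 20.0, 50.0, 0.30, 0.60),
         Firm(200.0, 30.0, 120.0, 0.25, 0.65)]
actual = labor_share_eq22(firms)                      # equals Eq. (17)
no_monopsony = labor_share_eq22(firms, markdown_override=1.0)
```

In this made-up example both firms have markdown ratios above one, so removing them raises labor's share.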

3 Policy Background, Data, and Measures

This section explains the policy initiative, data, and various measures for markdowns that we utilize.

3.1 The Golden Quadrilateral Initiative

The Golden Quadrilateral is a highway system in India that is so named because it connects the largest metropolitan areas across India: Delhi in the north, Calcutta in the east, Chennai in the south, and Mumbai in the west in a circuit. The Quadrilateral spans 5,846 km, making it the largest highway system in India, and among the largest in the world.

The initiative was implemented by the National Highways Authority of India. It connected the four urban areas with an expressway with four to six lanes across for the first time. Prior to its construction, no expressway connected the cities: only two percent of national highways were four lanes, over a quarter of national highways were of “poor” road quality, and about a quarter of the roads were considered congested (World Bank 2002). This system was intended to greatly reduce travel times. For example, the Delhi–Gurgaon Expressway, which is a part of Golden Quadrilateral Highway Project, has reduced the traveling time between Gurgaon and Delhi from 60 min to approximately 20 min.

The project was announced in 1999, and construction started in 2001. The original estimated cost was 600 billion rupees (about $12.8 billion in 2001), but the project was completed significantly under budget at 250 billion rupees ($5.3 billion in 2001). Nevertheless, the total cost was sizable; it would have constituted 1 percent of GDP in 2001.

Asturias et al. (2019) geocoded the 127 stretches that comprise the highway. They show that it was built quite rapidly; while only 5% was complete in 2001, 95% was complete by 2006, the original target date.Footnote 6 A second stage that involved a cross across the four corners was only ten percent complete by 2006. Hence, we follow Asturias et al. (2019) and focus on the original quadrilateral, using 2006 as the key date. Figure 2 reproduces their Fig. 1 and shows the Golden Quadrilateral (GQ) completion on top of the previous (non-expressway) road network.

Fig. 2 Road network in India and the Golden Quadrilateral (GQ)

Another important policy that impacted labor markets around the same time is the National Rural Employment Guarantee Act of 2005 (NREGA), which guaranteed 100 days a year of government-sponsored manual labor employment to any adult in rural areas willing to work. As this introduced an alternative source of employment, it is possible for this to have reduced monopsony power. However, since it is guaranteed in rural areas, it is likely to have impacted those areas far from the GQ, rather than those closer to the GQ. Nevertheless, we consider NREGA in our robustness analyses.

3.2 Data

We utilize two datasets. The road and geospatial data come from the previously mentioned Asturias et al. (2019). Their data are based on geospatial data for all the National Highways of India supplied by ML Infomap, which they augment using information provided by the NHAI on the completion dates of the 127 stretches.

We link these data to the panel version of India’s Annual Survey of Industries, which is collected by the Ministry of Statistics and Programme Implementation. These data are establishment level, so we have information on the actual location of production. Although not completely representative, the coverage is broad, containing all large plants (greater than 50 employees) and a sample of smaller plants that depends on the industry and the number of plants within that industry and state. The approximate number of establishments contained in the sample varies from 23,000 to 44,000 over the years 1999 to 2011. We focus on manufacturing and use the narrow 4-digit industry classification. We utilize measures of output (the value of gross output), material expenditures (the total value of domestic and imported items purchased for production), labor payments (the sum of wage, bonus, and contribution to provident and other funds), and capital (the value of fixed assets, net of depreciation). Labor payment data are used to construct each firm’s share of the labor market at the district and 4-digit industry level.

Geographically, the finest data we have are at the district level. There are over 600 districts in India, and the highway data measures the shortest straight-line distance between the highway and the most populous city in the district.Footnote 7 We then link these data to the firm district to yield estimates of firm distance to the highway. We start by dropping the areas within 50 km of the four major urban centers. We then construct terciles of districts based on their proximity to the completed expressway. Figure 3 shows the three terciles; red are closest (1st tercile), whereas yellow are furthest (3rd tercile). The tercile boundaries around the highway are large. The first tercile includes districts within 55 km of the expressway, while the second includes districts within 191 km. Clearly, these cannot be interpreted directly as commuting zones. However, in a spatial model of partially integrated labor markets, the spatial spillovers can extend far beyond the first-order effects on directly connected labor markets.Footnote 8,Footnote 9 By its nature, distance to the highway will be geographically clustered. To assuage concerns, we drop areas within 50 km of the four metro centers, and we also show that our results are robust to separately controlling for coastal areas. Finally, we show that our results are robust to the use of distance to the minimum distance straight-line connection between metro centers, an identification robustness procedure used by Asturias et al. (2019) and Alder and Kondo (2020).
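The sample construction described above can be sketched as a simple classification rule. This is a hedged illustration: `assign_tercile` is a hypothetical helper, the cutoffs of 55 km and 191 km are the tercile boundaries reported in the text (in practice they are computed from the distribution of district distances), and the per-district distance to the nearest metro center is an assumed input.

```python
from typing import Optional, Tuple


def assign_tercile(distance_to_gq_km: float,
                   distance_to_metro_km: float,
                   cutoffs_km: Tuple[float, float] = (55.0, 191.0),
                   metro_buffer_km: float = 50.0) -> Optional[int]:
    """Return the proximity tercile (1 = closest), or None if dropped.

    Districts within the buffer around the four metro centers are excluded,
    mirroring the sample restriction described in the text.
    """
    if distance_to_metro_km <= metro_buffer_km:
        return None
    first_cutoff, second_cutoff = cutoffs_km
    if distance_to_gq_km <= first_cutoff:
        return 1
    if distance_to_gq_km <= second_cutoff:
        return 2
    return 3


# Hypothetical districts: (distance to GQ, distance to nearest metro center).
terciles = [assign_tercile(30.0, 200.0),   # close to the GQ, far from a metro
            assign_tercile(120.0, 300.0),
            assign_tercile(400.0, 500.0),
            assign_tercile(10.0, 20.0)]    # inside the metro buffer, dropped
print(terciles)  # prints [1, 2, 3, None]
```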

Fig. 3 Indian districts by GQ proximity terciles. Note: This graph plots the terciles of districts based on their proximity to the completed expressway. The first tercile includes districts within 55 km of the expressway, while the second includes districts within 191 km. The targeted areas are districts within a 50 km radius from the center of the four “corners” of the Golden Quadrilateral: Delhi, Calcutta, Chennai, and Mumbai.

3.3 Measuring Markdowns and Markups

We measure markdowns according to Eq. (16), which requires measuring markups. This can be done in various ways, which we explore in this subsection. Following Brooks et al. (2021), we consider three alternative approaches to measuring markups, all of which they report as giving comparable results.

Using the formula in Eq. (13) requires an output elasticity, defined as \(\theta _{i,t}^{M}=\frac{\partial \log (F_i)}{\partial \log (x^M_{nkit})}\). The standard approach, used by de Loecker and Warzynski (2012) (referred to hereafter as DLW), is to estimate the production function by applying the methods of Ackerberg et al. (2015). They estimate translog production functions, which can then be used to easily solve for the output elasticities. Although most common, this approach has some important shortcomings, especially when used in conjunction with DLW to estimate markups. The main limitation is that the production function is only identified for the case of either a value-added production function or a gross output production function in which materials are Leontief (see Ackerberg et al. (2015) and also Gandhi et al. (2020) for a full explanation). Either of these special cases precludes the estimation of the elasticity of output with respect to materials, the precise parameter necessary to apply the de Loecker and Warzynski (2012) formula.Footnote 10 Since this is the standard way of estimating markups (e.g., de Loecker and Warzynski 2012; Edmond et al. 2015; de Loecker et al. 2016; Brooks et al. 2020), we present this as one measure, but we allow for several alternatives. We label this markup method “DLW,” since it most closely follows their implementation.

Our second method is to estimate markups as the gross profit margin. This is a valid method as long as the production function is constant returns to scale, and the firm is price-taking in its inputs (i.e., there is no monopsony power). Again, the second condition is problematic given our focus on monopsony.Footnote 11 The precise formula we use is:

$$\begin{aligned} \mu ^M_{nkit}=\frac{\mathrm{{sales}}}{\mathrm{{costs}}}=\frac{py}{RK+wL+qM}. \end{aligned}$$
(23)

where we can measure sales (py), labor payments (wL), and materials expenditures (qM) directly from the data. For capital, we have the stock of capital (K) rather than the payments to capital (RK). The key therefore is to differentiate payments to capital from profits that stem from markups/market power. Notice that the reason this measure of markups is less appropriate in the presence of markdowns is because it attributes all profits (in excess of returns to capital) to markups (higher revenues per unit of output), while some actually would come from markdowns (lower costs per unit of output).

We discipline the return to capital using the cost of capital measured in the data using \(R=r+\delta\). We look at the return on corporate bonds, which yields a value of \(r=0.08\). We assume a standard depreciation rate of \(\delta =0.05\), which gives \(R=0.13\) for India. This yields an average markup of 1.21. We label this second markup measure as “CRS,” which stands for the constant returns to scale assumption.
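A minimal sketch of the CRS calculation in Eq. (23), using the values of \(r\) and \(\delta\) given above. The function name `crs_markup` and the firm figures in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def crs_markup(sales: float, capital_stock: float, labor_payments: float,
               materials: float, r: float = 0.08, delta: float = 0.05) -> float:
    """Eq. (23): sales over total costs, with capital costs imputed as (r + delta) * K."""
    rental_rate = r + delta                       # R = 0.13 with the defaults
    costs = rental_rate * capital_stock + labor_payments + materials
    if costs <= 0:
        raise ValueError("total costs must be positive")
    return sales / costs


# Hypothetical firm: costs = 0.13 * 200 + 24 + 75 = 125, so the markup is 150 / 125.
markup = crs_markup(sales=150.0, capital_stock=200.0,
                    labor_payments=24.0, materials=75.0)
print(round(markup, 2))  # prints 1.2
```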

Our third method uses the markup formula in Eq. (13), but instead of estimating \(\theta _{i,t}^{M}\), it simply assumes that the production function is Cobb–Douglas with respect to materials, i.e., \(\theta _{i,t}^{M}= \theta ^M\). This is a strong assumption on functional form, but it is internally consistent and is therefore our preferred method. This measure of the markup is merely the inverse of the share of materials in revenue, scaled by a scalar, the elasticity of output with respect to materials, \(\theta ^M\). This scalar cannot be identified from the factor payment share to materials in the presence of market power. Hence, we need to simply assign a value. Specifically, we choose \(\theta ^M\) so that the average level of these markups equals the average measured using the CRS method.Footnote 12 Naturally, the qualitative patterns are independent of this scaling, but the magnitudes depend on it. We refer to this third markup measure as “CD”, which stands for Cobb–Douglas. In each case, markups are clearly measured with substantial error. We therefore winsorize 3 percent in each tail of the distribution within each 2-digit industry in each year.
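The CD scaling and the winsorization can be sketched as follows. Several details here are assumptions rather than the procedure used in the paper: the function names are hypothetical, the percentile rule in `winsorize` is a simple order-statistic version, the order of scaling and winsorizing is not specified in the text, and the example applies the rule to a single pooled list rather than within industry-year cells.

```python
from typing import List, Tuple


def cd_markups(materials_shares: List[float],
               target_mean: float) -> Tuple[float, List[float]]:
    """CD markups theta_M / share, with theta_M set so their mean equals target_mean."""
    inverse_shares = [1.0 / s for s in materials_shares]
    theta_m = target_mean * len(inverse_shares) / sum(inverse_shares)
    return theta_m, [theta_m * x for x in inverse_shares]


def winsorize(values: List[float], fraction: float = 0.03) -> List[float]:
    """Clamp values below/above the given tail fraction to the tail cutoffs."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(fraction * n)                 # observations clamped in each tail
    low, high = ordered[k], ordered[n - 1 - k]
    return [min(max(v, low), high) for v in values]


# Hypothetical materials shares of revenue, scaled to a target mean markup.
theta_m, markups = cd_markups([0.50, 0.60, 0.75], target_mean=1.21)
mean_markup = sum(markups) / len(markups)
clamped = winsorize([float(v) for v in range(100)])
```

Since every markup is multiplied by the same \(\theta ^M\), the ranking of firms is unaffected by the choice of target.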

In addition, we can measure the labor-based markup \(\mu _{nkit}^L\). We measure the labor-based markup again using the CD approach, assuming a constant \(\theta ^L\). However, since we lack a solid target for markdowns (analogous to the markup target used to assign \(\theta ^M\)), \(\theta ^L\) cannot be identified in the same way. Instead, we calibrate this elasticity by using a structural prediction of the model. Namely, Eq. (16) says that if the firm has no market power, then the ratio \(\mu ^L_{nkit}/\mu ^M_{nkit}\) is equal to one. We choose the value of \(\theta ^L\) to satisfy this equation.Footnote 13 We again winsorize the 3% tails of the distribution in each 2-digit industry in each year. Notice in the CD case that the markdown becomes materials payments divided by labor payments multiplied by a constant equaling the ratio \(\theta ^L/\theta ^M\).
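The last observation can be verified directly. In this sketch the elasticities are passed in as given numbers; how \(\theta ^L\) is calibrated is not reproduced here, and the function names and figures are hypothetical.

```python
def cd_markdown_ratio(materials_payments: float, labor_payments: float,
                      theta_l: float, theta_m: float) -> float:
    """CD markdown ratio: (theta_L / theta_M) * (materials payments / labor payments)."""
    return (theta_l / theta_m) * (materials_payments / labor_payments)


def markdown_from_markups(revenue: float, materials_payments: float,
                          labor_payments: float, theta_l: float,
                          theta_m: float) -> float:
    """Same object built as mu^L / mu^M from the two revenue shares."""
    mu_l = theta_l / (labor_payments / revenue)
    mu_m = theta_m / (materials_payments / revenue)
    return mu_l / mu_m


# Hypothetical firm: revenue 100, materials 50, labor 20.
short_form = cd_markdown_ratio(50.0, 20.0, theta_l=0.30, theta_m=0.60)
long_form = markdown_from_markups(100.0, 50.0, 20.0, theta_l=0.30, theta_m=0.60)
print(round(short_form, 2))  # prints 1.25
```

Revenue cancels in the ratio, which is why the CD markdown depends only on the two payment series and the constant \(\theta ^L/\theta ^M\).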

Table 1 gives summary statistics for the most important measures coming from this analysis, including implied values of markups and markdowns. The first three columns give summary statistics based on sampling weighted data using the survey-provided sampling weights. Average markups range from 1.07 (in the CD formula) to 1.42 (in the DLW formula). The average values of the markdowns are 1.01 across all three formulas. The average firm is located 119 km from the completed GQ highway and has a labor market share of 0.10 in the district-industry.

Table 1 Summary statistics

The last three columns give alternative summary statistics that additionally weight by labor-compensation, which lead to somewhat larger markdowns.Footnote 14 The regressions we report utilize survey-provided sampling weights alone. Theory gives us no direction in choosing between raw (sample-weighted) regressions and labor-compensation weighted, since these should be identical. Empirically, the magnitudes of coefficients are very similar, but standard errors are larger and so we lose significance when weighting by labor compensation.Footnote 15

Examining these summary statistics, we must acknowledge that the distribution of (normalized) measured markdowns is not consistent with the distribution implied by the theory along a key dimension. In particular, the median markdown ratio is well below one, while the formula in Eq. (16) implies markdown ratios weakly greater than one. However, we are not distressed by this apparent conflict, since it would follow naturally from measurement error, which is clearly present in the data. Since markdown ratios have the markup in the denominator, classical measurement error in the denominator would lead to non-classical measurement error in markdowns. It would impart a rightward skew to the distribution, which we see, with the mean well above the median.
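The skew argument can be illustrated with a small simulation. Everything here is an assumption made for illustration: the true markdown ratio is set to one for every firm, the true markup is 1.2, the labor-based measure is observed without error, and the materials-based markup carries additive normal noise with a standard deviation of 0.15. The final rescaling by the mean is only a stand-in for the normalization, which is not reproduced here.

```python
import random
import statistics
from typing import List


def simulate_markdown_ratios(n: int = 100_000, true_markup: float = 1.2,
                             noise_sd: float = 0.15, seed: int = 0) -> List[float]:
    """Measured mu^L / mu^M when only the denominator is measured with error."""
    rng = random.Random(seed)
    labor_markup = true_markup            # true markdown ratio of one
    return [labor_markup / (true_markup + rng.gauss(0.0, noise_sd))
            for _ in range(n)]


ratios = simulate_markdown_ratios()
mean_ratio = sum(ratios) / len(ratios)
median_ratio = statistics.median(ratios)

# Rescaling so that the mean equals one pushes the median below one.
rescaled_median = median_ratio / mean_ratio
print(mean_ratio > median_ratio, rescaled_median < 1.0)  # prints True True
```

The mean exceeds the median because the reciprocal is a convex function of the noisy denominator, which is the Jensen's inequality point.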

However, by Jensen’s inequality, the average markdown in the raw distribution may then be overstated, so that our rescaling factor may be too large. When we normalize \(\theta ^L\) using the regression constant as above, the normalization may push too many firms below a markdown ratio of one. Again, this scaling has no effect on qualitative patterns or on the implications for the impact of labor market power on labor’s share.Footnote 16 It affects only the level of markdown ratios and, therefore, the magnitude of the estimated regression coefficients.Footnote 17
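The measurement-error argument can be illustrated with a small simulation. This is only a sketch under our own assumptions: every firm has a true markdown ratio of one, the error in the measured markup is multiplicative and lognormal (an assumption made so that the denominator stays positive), and the markup level and noise scale are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

true_markup = 1.2   # hypothetical common markup
true_ratio = 1.0    # no monopsony power: mu^L / mu^M = 1 for every firm

# Assumed measurement error: mean-zero in logs, entering the denominator only.
noise = rng.normal(loc=0.0, scale=0.3, size=n)
measured_markup = true_markup * np.exp(noise)
measured_ratio = true_ratio * true_markup / measured_markup

mean_ratio = measured_ratio.mean()          # pushed above one (Jensen's inequality)
median_ratio = np.median(measured_ratio)    # stays close to the true value of one

# Rescaling so that the mean hits a target of one pushes the median below one.
rescaled = measured_ratio / mean_ratio
median_rescaled = np.median(rescaled)
```

Under these assumptions the distribution is right-skewed with the mean above the median, and normalizing on the mean moves the median below one even though no firm has a true ratio below one.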

The validity and robustness of our markdown measures is an important concern. We can evaluate the validity of our measures on multiple fronts. First, we can examine areas where market power is unlikely. In the four major metropolitan areas at the corners of the GQ, which we drop from our analyses, we do not measure positive markdowns on average, and, while there is still considerable variation across firms, the standard deviation is a third less than outside of these urban centers. Second, our markdown measurement relies on the assumption of Cobb–Douglas production functions, or at least a log-linear local approximation to the production functions around the relevant production points for the small changes we measure. Looking more closely, we can examine the markdown of electricity, which is unlikely to be monopsonistic. The implied markdown is remarkably stable (or, equivalently, nonexistent once properly scaled), reflecting the fact that the average electricity share of expenditures is stable, fluctuating between 13 and 15 percent. Thus, Cobb–Douglas seems reasonable. Third, using the same methods, Brooks et al. (2021) find that the various measures are robust to estimating assumptions, and the results in Fig. 1 and the remaining analysis confirm this.Footnote 18 Finally, we will also see that our markdowns are correlated with firm labor market share, consistent with the exercise of monopsony power.

4 Empirical Analysis

4.1 Main Results

In this section, we present our empirical results. Our principal test is to evaluate how connection to the expressway impacts labor market markdowns. We therefore estimate the following equation:

$$\begin{aligned} \frac{\mu _{nkit}^L}{\mu _{nkit}^M}= & {} \chi _n+\chi _t+\alpha \ln (py_{nkit})+ \beta _1*\chi _{tercile=1}*\chi _{t>2006}\nonumber \\&+\beta _2*\chi _{tercile=2}*\chi _{t>2006}+u_{nkit} \end{aligned}$$
(24)

where \(\chi _n\) and \(\chi _t\) are firm and time fixed effects; \(\chi _{tercile=1}\) and \(\chi _{tercile=2}\) are indicators for whether the firm is located in the closest or the middle tercile of locations by distance from the expressway; and \(\chi _{t>2006}\) is an indicator for whether the date is after the “completion” (i.e., 95%) date. We also control for the (log) level of output, \(py_{nkit}\), since the ability to exercise monopsony power may be greater when firms are larger. We exclude those firms located within 50 km of the four major metropolitan “corners” of the Golden Quadrilateral to focus on newly connected areas (although this does not have important impacts on our results). Since the omitted group is the group that is furthest from the GQ, the estimates of interest are \(\beta _1\) and \(\beta _2\), the coefficients on the double interactions, as they are interpreted as the impact of connection relative to those firms that remain unconnected. (Since they would be subsumed by the plant- and year-specific fixed effects, we do not include the tercile indicators or completion indicator as direct effects.)
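To make the structure of Eq. (24) concrete, the following sketch estimates the same specification on synthetic data using dummy-variable least squares. All numbers (sample size, true coefficients, noise) are hypothetical, and the sketch omits the sampling weights and district-clustered standard errors used in the actual estimation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_firms, years = 300, np.arange(2000, 2012)
tercile = rng.integers(1, 4, size=n_firms)   # 1 = closest, 3 = farthest (omitted group)
firm_fe = rng.normal(0.0, 0.2, size=n_firms)
year_fe = rng.normal(0.0, 0.05, size=years.size)
alpha_true, beta1_true, beta2_true = 0.03, -0.15, -0.12   # hypothetical values

rows = []
for i in range(n_firms):
    for t, year in enumerate(years):
        post = float(year > 2006)
        log_py = rng.normal(5.0, 1.0)
        d1 = float(tercile[i] == 1) * post   # tercile-1 x post-2006 interaction
        d2 = float(tercile[i] == 2) * post   # tercile-2 x post-2006 interaction
        y = (1.1 + firm_fe[i] + year_fe[t] + alpha_true * log_py
             + beta1_true * d1 + beta2_true * d2 + rng.normal(0.0, 0.1))
        rows.append((i, t, log_py, d1, d2, y))

data = np.array(rows)
firm_id, year_id = data[:, 0].astype(int), data[:, 1].astype(int)

# Firm dummies, year dummies (first year dropped), then the three regressors.
firm_d = np.eye(n_firms)[firm_id]
year_d = np.eye(years.size)[year_id][:, 1:]
X = np.column_stack([firm_d, year_d, data[:, 2:5]])
coef, *_ = np.linalg.lstsq(X, data[:, 5], rcond=None)
alpha_hat, beta1_hat, beta2_hat = coef[-3:]
```

The tercile and post-2006 indicators do not enter on their own here because they are linear combinations of the firm and year dummies, which mirrors the parenthetical remark in the text.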

The estimates are presented in Table 2. (Standard errors are clustered at the district level throughout the results we present.) As anticipated by theory, firm size is substantially associated with an ability to exercise monopsony power. More to the point, however, we see that relative to the unconnected firms in the third tercile, the firms in the first and second terciles show smaller markdowns after connection, and the \(\beta _1\) and \(\beta _2\) estimates are significant at the 1 percent level. The estimates for the first and second terciles are quantitatively similar and, indeed, statistically indistinguishable. Moreover, the estimates are very similar across the three alternative measures of markdowns. The estimates are also economically significant: recalling that the average markdown was about 1.15 across these specifications, we cannot reject that connection to the GQ fully eliminates the exercise of monopsony power.

Table 2 Impact of highway on markdown

4.2 Robustness

Next, we study the robustness of these results to additional controls or specifications. The results of these exercises are given in the appendix.

First, while we cannot allow for district-specific year-to-year variation, in Table 6 we add state-time dummies as controls which allow for year-to-year variation to differ at a subnational level. For some states and years, these state-specific time patterns are significant, but our results are quite robust, which indicates that the identification is driven by within-state, across-district variation.

Second, Fig. 3 shows that many coastal areas are close to the GQ, and perhaps coastal districts had a differential time pattern. We allow for this by creating a dummy for coastal districts and interacting it with time dummies, as shown in Table 7. The time trend is insignificant, and the results are virtually unchanged.

Third, we have noted NREGA, which was enacted around the same time and may have disproportionately impacted rural districts relative to more urban districts. In Table 8, we allow for two district-level controls: the number of job cards per capita, which measures the extensive margin or breadth of the program, and the total per capita labor expenditures, which measures the intensive margin or depth of the program. The former is ultimately significant, and associated with a larger markdown, but it does not impact our result for the impact of the GQ.

Finally, to account for potentially endogenous placement of roads, we also construct terciles using the distance to a straight-line, minimum-distance connection between the metro centers, and our results are again robust. These results are given in Table 9. In sum, the results are quite robust, and this is true for each table we present.

To get a better sense of the year-to-year identification, we start by grouping the first two terciles together (since their magnitudes were similar). We then interact this group with every year in the sample (instead of relying on the post-2006 indicator). That is, we estimate:

$$\begin{aligned} \frac{\mu _{nkit}^L}{\mu _{nkit}^M}=\chi _n+\chi _t+\alpha \ln (py_{nkit})+\sum _t \beta _t*\chi _{tercile=1,2}*\chi _{t}+u_{nkit} \end{aligned}$$
(25)

The estimates of \(\beta _t\) over time, along with their 95% confidence intervals, are presented visually in Fig. 4. The omitted year is the completion year of 2006, which is therefore normalized to zero. We see that prior to 2006, the estimates are not statistically different from zero. After 2006, however, the coefficients show a strong break, becoming negative and significantly so. Indeed, using a Chow test, we reject a single linear trend in the data at the 5% level, i.e., a structural break in 2006 exists in the data.

Fig. 4

Using a Chow test, we reject a single linear trend in the data at the 5% level, i.e., a structural break in 2006 exists in the data
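For readers unfamiliar with the Chow test, the sketch below shows the textbook F statistic for a break in a linear trend, applied to a hypothetical coefficient path. The text does not spell out how the test is implemented in the paper, so the two-segment linear specification, the break date handling, and the simulated series are all our assumptions.

```python
import numpy as np

def rss_linear(x, y):
    """Residual sum of squares from an OLS line y = a + b * x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def chow_f(x, y, break_at):
    """Chow F statistic for a change in intercept and slope after `break_at`."""
    pre, post = x <= break_at, x > break_at
    k = 2  # parameters per segment: intercept and slope
    rss_pooled = rss_linear(x, y)
    rss_split = rss_linear(x[pre], y[pre]) + rss_linear(x[post], y[post])
    dof = x.size - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)

# Hypothetical coefficient path: flat through 2006, lower level afterwards.
years = np.arange(2000, 2012, dtype=float)
rng = np.random.default_rng(2)
beta_t = np.where(years > 2006, -0.15, 0.0) + rng.normal(0.0, 0.01, years.size)

# Years are centered on 2006 for numerical stability.
f_stat = chow_f(years - 2006.0, beta_t, break_at=0.0)
```

A large value of `f_stat` relative to the F critical value with (2, n - 4) degrees of freedom leads to rejecting a single linear trend in favor of a break.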

We therefore find strong evidence for the decline of markdowns after the road is connected. We now try to understand this finding in more depth.

One possible explanation for the declining markdown measure is that it fell because markups rose. Recall that the markdown is the ratio of two measures of “markups”: the labor-measured markup over the true markup. One reason that the markdown could fall is that the true markup rises. Another reason to look at markups is that the markdown reflects the ratio of materials payments to labor payments. A decrease in materials payments, due to access to cheaper inputs that the expressway opened up together with a low elasticity of substitution toward inputs, for example, could appear as a decrease in markdowns. This would also show up as an increase in markups.

For both of these reasons, we examine the impact of the expressway on the markups themselves. That is, we run regressions analogous to those in Table 2, but where the dependent variable is the markup rather than the markdown.

The results are presented in Table 3. We find mixed evidence regarding the impact of highway on markups. The coefficients on the double interactions are small and statistically insignificant when using the CD and DLW measures. For the CRS estimates, the magnitude is larger (at least five times as large), and the impact on the first and second terciles is statistically significant. Recall, however, that a problem with the CRS measure of markups is that the gross profit margin can incorporate profitability that comes from markdowns. Hence, these larger results are not inconsistent with the results that markdowns fell in these areas. In any case, the result that connection reduces markups is consistent with the idea that the expressway lowered not only travel costs but trade costs as well, and induced more competition into these regions. Indeed, this is the central argument in Asturias et al. (2019).

Table 3 Impact of highway on markup

Figure 5 is the analog to Fig. 4 for markups. The trends are less stark, and, except in the case of the CRS measure, not statistically different from zero after the highway.

Fig. 5

Impact of highway proximity on markups over time. Notes This graph plots the estimates of the effect of highway on markup over time, along with their 95% confidence intervals. The vertical dashed line indicates the year of highway completion

In sum, there is some evidence that markups fell as a result of the GQ, but this evidence is weaker than that for markdowns. In any case, a drop in markups cannot be a contributing factor to the observed decline in markdowns as the markup is in the denominator of the markdown.

4.3 Labor Compensation

We now explore payments to labor more closely. In particular, we continue to combine the first two terciles into a single group, and we examine other dependent variables impacted by the expressway connection: labor compensation, labor’s share, and the shares of labor in the market at either the state or district level. The results are shown in Table 4.

Table 4 Impact of highway on labor payment measures

Focusing on column (1), it is not surprising that (log) labor compensation is strongly correlated with size. Although the coefficient may seem low relative to advanced economies, it is comparable to labor’s payment share as shown in Table 1. Again, more to the point, the coefficient on the interaction term shows that labor compensation increases by 11 (log) percentage points for those newly connected firms. Moreover, in column (2), we see that labor’s share increases by 5 percentage points. Hence, we have direct evidence that labor compensation and labor’s share are directly impacted.Footnote 19

Columns (3) and (4) examine how connection to the expressway impacts the market share in the labor market. We find a significant increase in the market share of connected firms. The market share increases by 0.5 percentage points in the state-industry and 0.8 percentage points in the district-industry. Recall Eq. (16): in theory markdowns are increasing in labor market share and decreasing in the elasticity of labor supply. Therefore, the drop in markdowns of connected firms cannot be explained by market share in the labor market.

Finally, in Fig. 6 we check for the possibility of pre-existing trends in labor’s share and labor compensation. We find no evidence at all of a pre-trend in labor’s share as shown in the left panel. The right panel, however, exhibits three years that are significantly below zero. Nevertheless, applying a linear trend and using a Chow test for structural break in trend, we reject a straight line, i.e., we reject the lack of a structural break in trend in 2006. Again, the upward pretrend would not be inconsistent with the gradual construction of the expressway, of course, and so our results are best interpreted as a before-and-after comparison, indicated by the horizontal lines, which are the average pre- and post-2006 coefficients.

Fig. 6

Applying a linear trend and using a Chow test for structural break in trend, we reject a straight line, i.e., the lack of a structural break in trend in 2006

4.4 Effect on Elasticity of Labor Supply

Next, we examine whether the effective elasticity of labor supply changed in response to the expressway connection. We have a fixed geographic sense of a labor market, but the expressway may have changed the effective pool from which workers can be drawn. In that sense, the labor supply elasticity is not necessarily picking up the elasticity of an individual worker or even a fixed set of workers, but the elasticity of supply that comes from neighboring areas responding to wages, or, in the case of a markdown, the ability of workers to find employment in neighboring areas if wages were suppressed.

To operationalize this, we start with Eq. (16) as motivation for the following regression equation:

$$\begin{aligned} \frac{\mu _{nkit}^L}{\mu _{nkit}^M}= & {}\, \chi _n+\chi _t+\alpha \ln (py_{nkit})+\gamma _1*s^L_{nkit}+\gamma _2*\chi _{tercile=1,2}*s^L_{nkit}\nonumber \\&+\gamma _3*\chi _{tercile=1,2}*\chi _{t>2006}+\gamma _4*\chi _{tercile=1,2}* \chi _{t>2006}*s^L_{nkit}+u_{nkit} \end{aligned}$$
(26)

where \(s^L_{nkit}\) is again the firm’s share in the labor market. Here, the triple interaction coefficient \(\gamma _4\) is of particular interest as it reflects the post-access change in the inverse labor supply elasticity. If the elasticity of labor supply increased, we would expect this coefficient to be negative.

In estimating a different version of (16), however, Brooks et al. (2021) note that the firm’s labor compensation shows up in the denominator of the markdown and the numerator of the labor market share. Hence, any measurement error will cause a spurious correlation. To avoid this, they use a two-stage regression, which we also adopt, instrumenting for labor market share by using the firm’s share in the local product market. Intuitively, the more output a firm produces, the more labor it should hire, but this should not directly affect the ratio of materials and labor payments (i.e., the markdown). (In the case that factor ratio increases with the scale of operations, we can add log output as a separate control in the regression.)
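The logic of the instrument can be seen in a stylized simulation. The sketch below is not the paper's specification: the variables are standardized, the true slope and all variances are hypothetical, and a common error term is simply added to the measured labor share and subtracted from the measured markdown to mimic labor compensation appearing in the numerator of one and the denominator of the other.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
slope_true = 0.3   # hypothetical effect of labor market share on the markdown

z = rng.normal(size=n)                 # instrument: product market share
share_true = z + rng.normal(size=n)    # true labor market share, partly driven by z
markdown_true = slope_true * share_true + rng.normal(size=n)

# Error in measured labor compensation raises the measured share
# and lowers the measured markdown at the same time.
m = rng.normal(size=n)
share_obs = share_true + m
markdown_obs = markdown_true - m

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

ols = slope(share_obs, markdown_obs)   # contaminated by the shared error

# Two stages: project the observed share on z, then regress on fitted values.
first_stage = slope(z, share_obs)
share_hat = share_obs.mean() + first_stage * (z - z.mean())
iv = slope(share_hat, markdown_obs)
```

In this setup the OLS slope is biased downward (here even negative), while the two-stage estimate recovers a value close to `slope_true`, because the instrument is unrelated to the measurement error.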

Table 5 shows the second-stage results of these equations. (The first stage is quite strong and is included as Table 10 in the Appendix.) Column (1) measures the labor market share at the state level, while column (2) measures it at the district level. The estimated coefficient on the firm’s labor market share of 0.319 in column (1) implies a labor supply elasticity of about 3.1 at the state level before 2006, whereas the coefficient of 0.092 at the district level implies an elasticity of about 10.9 at the district level. Both estimates are strongly statistically significant, and the larger elasticity at the district level is consistent with more integration in the labor market across districts than across states.Footnote 20

Nevertheless, our primary focus is the interaction terms with post-2006 and with those firms in proximity to the expressway. These coefficients are puzzling, however. The triple interaction coefficient, \(\gamma _4\) above, is insignificant at the state level and the district level. Instead, we see that firms’ markdowns fell in connected areas after the expressway was completed, but that decline happens uniformly across all firms rather than disproportionately for firms that have more labor market power. Thus, the observed decline in markdowns does not have the direct interpretation of an increase in labor supply elasticity that comes from increased labor market integration.

Table 5 Relationship between markdown and labor market share

We have also examined whether the decline in markdown is instead a spurious artifact of a decline in input prices that comes from connection. A decline in the prices of materials, together with a low elasticity of substitution with respect to materials, could lower the ratio of material expenditures to labor compensation, which we might falsely interpret as a decline in markdown. We indeed find that connection leads to significantly lower input prices, and the measured markdowns can be significantly decreasing in materials prices, but this does not undermine our measured impact of the highway connection on markdowns as shown in Table 11 of the appendix.

In sum, the decrease in markdowns is not easily explained with the existing theory as a decline in monopsony power coming from either lower labor market shares or increased labor supply elasticity. It remains an open question.

4.5 Aggregate Labor Share

We last examine the impact of monopsony power and the change in monopsony power that comes from the expressway on aggregate labor’s share in the manufacturing sector. Following Eq. (4), we calculate the actual and counterfactual labor’s share in the data in every year. We construct two counterfactuals: labor’s share in the absence of all markdowns driven by market power (which is higher than observed) and labor’s share in the absence of the highway (which is lower than observed). The latter counterfactual is constructed by subtracting the coefficients in the CD case of Table 2 from the firm-specific markdowns of those firms located closer to the expressway.

The results are below in Fig. 7. The solid line shows the actual, observed movements in labor’s share in manufacturing. The dashed line above it shows the counterfactual labor’s share in the absence of market-power driven markdowns. The overall impact of monopsony is sizable, especially early in the sample, confirming the results of Brooks et al. (2021). It ranges from a high of 7 percentage points (in 1999) to a low of 3 percentage points (in 2009).Footnote 21

Monopsony can therefore explain a substantial amount of why labor’s share is low in Indian manufacturing. Starting in 2006, we add the dash–dot line, which shows the impact of the expressway by plotting what the counterfactual labor’s share would have been in the absence of the highway. Labor’s share in this counterfactual world is smaller, indicating that the expressway increased observed labor’s share by between 1.8 (2007) and 2.3 (2011) percentage points. The impact of the expressway is smaller than the overall impact of monopsony for three reasons. First, the expressway does not completely eliminate monopsony power for firms closer to it. Second, not all firms are impacted by the expressway. Third, the impact occurs at a time when monopsony power is less important for labor’s share.

Fig. 7

Aggregate labor’s share in manufacturing: observed and counterfactual. Notes This graph plots the time path of the observed aggregate labor’s share (solid line), the implied labor’s share in a counterfactual economy without estimated markdowns (dashed line), and the implied labor’s share in a counterfactual economy without the expressway (dash–dotted line)

In sum, our empirical results show a sizable impact for the expressway both in decreasing markdowns and increasing labor’s share.

5 Conclusion

This paper presents some of the first evidence available on the role of infrastructure in affecting firm markdowns by evaluating the impact of India’s Golden Quadrilateral expressway expansion on the monopsony power in the labor market. Firms in districts closer to the highway exhibit significantly lower markdowns than firms in more remote areas. The impact of highway on markdowns is substantial: the average markdowns are effectively eliminated for firms within close proximity to the highway. These lower markdowns have raised labor’s share by about 2 percentage points.

The causes of the lower markdowns for connected firms remain a puzzle. We show that the impact of highway on markdowns cannot be explained by a decrease in markups. We also find no evidence that labor supply elasticities increase when districts are connected to the highway. This brings into question whether the decrease in markdown is driven by increased integration across labor markets.

We must leave to future research the goal of identifying clear mechanisms. Nevertheless, the results in this paper are important in showing that highway infrastructure investment reduces monopsony power and promotes competition in the marketplace.