1.1 Introduction

Agricultural production and productivity are location-specific: factors such as soil conditions, physical infrastructure, and weather events play an important role. Location also matters for farmers’ decisions on inputs and outputs and on the adoption of new crop varieties and other research-based technology. The adoption of agricultural research, recognised as a primary source of agricultural productivity change in many countries, tends to be location-specific, and so neighbourhood influence is suspected to play a role in the research-productivity nexus. However, this spatial pattern has been ignored when estimating the determinants of agricultural productivity. In fact, no study has yet undertaken a spatial analysis of the research-productivity relationship in any developing country. The key concern is that, if a spatial dimension exists, previous estimates of the impact of agricultural research on productivity that ignore spatial patterns could be biased (Anselin 1988).

This study is one of the first efforts to bring attention to spatial or geographic issues when investigating the impact of agricultural research on productivity in developing countries. It aims to fill this gap in the literature by incorporating spatial effects in the productivity determinant model using provincial-level data covering 76 provinces of Thailand for the years 2004, 2006, 2008, 2010, and 2012. The main objective is to test for the existence of a spatial pattern in the research-productivity relationship for the case of Thai rice production. The rice sector of Thailand was chosen as a case study because rice is the dominant crop in Thai agriculture and regional differences in rice varieties and farming practices are well observed. Provinces in nearby areas often share similar inputs, types of rice varieties, and infrastructure (transportation and irrigation systems). It is possible that rice productivity in one province is related to that in nearby provinces and that site-specific factors influence the determinants of rice productivity. The literature also shows that the benefits of R&D are spatially selective and tend to concentrate in certain areas (Capello and Lenzi 2015). Therefore, the spatial issue deserves serious attention and warrants a spatial analysis.

The paper consists of six sections. This section provides an introduction and the motivation for the study. Section 1.2 briefly reviews how previous studies analysed the links between agricultural research and productivity and highlights studies applying spatial approaches. Section 1.3 specifies the models, estimation techniques, and testing strategies; it begins with the standard OLS specifications, which are used to perform the diagnostic tests for the presence of a spatial pattern, and then presents the spatial specifications. Section 1.4 describes the sources and definitions of the data and variables used in the estimation models. Section 1.5 interprets the regression results, with emphasis on whether any spatial pattern exists and on the implications for agricultural productivity. Section 1.6 concludes.

1.2 Literature Review

Agricultural productivity and its link with agricultural research have long been studied, and a number of empirical studies confirm that agricultural research has a positive and significant impact on productivity (Evenson 2001; APO 2001). Numerous studies associate productivity growth with technical change attributed to agricultural research (Ruttan 1987; Evenson and Pray 1991; Fan and Pardey 1997; Evenson 2001; Kelvin et al. 2005). Most studies have focused on the role of public research, since research investment is primarily a public sector activity and the influence of private research on productivity is mostly unknown. The majority of these studies employ national- and subnational-level data using econometric models and techniques such as OLS, seemingly unrelated regression, error correction modelling, and panel data regression. The overwhelming conclusion of this empirical research is that investment in public research and extension has been a primary source of agricultural productivity change in many countries (Evenson 2001). A similar conclusion applies to Thai agriculture, where recent studies found that agricultural research (public, private, and foreign) together with infrastructure and climate factors plays a crucial role in stimulating productivity growth (Suphannachart and Warr 2011, 2012; Suphannachart 2016). Suphannachart (2013), focusing on the rice sector, shows that public investment in rice R&D and the adoption of high-yielding rice varieties have positive and significant impacts on rice productivity. However, these previous studies do not take into account the role of spatial effects in the linkage between productivity and its determinants.

Spatial econometrics has been widely applied in economic research in which location and neighbourhood influence play an important role (Anselin et al. 2004; Baylis et al. 2011; GeoDa Center 2016). Baylis et al. (2011), in particular, review the empirical literature applying spatial econometric methods for panel data in agricultural economics, with an emphasis on the effect of climate change on agriculture. Their study also highlights the important role of location and the application of spatial techniques in many research topics of agricultural economics, in which land is immobile, weather events affect farm decisions, policies are set by regional political boundaries, and information is often regionally explicit. There are also various examples of studies applying spatial models in finance and risk management, production and land economics, development economics, and environmental economics. Capello and Lenzi (2015) is one study that supports the existence of a spatial dimension in the knowledge-innovation nexus and shows that, when the source of innovation is formal (R&D), the benefits are spatially selective and tend to concentrate in certain areas. However, the application of spatial econometric methods in the analysis of productivity and technical change is still limited.

1.3 Methodology and Estimation Techniques

The productivity determinant model is constructed on the production function framework, in which TFP growth is identified as a shift in the production function representing technical change. TFP is measured as the part of rice output growth not explained by the growth of measured factor inputs, using the Solow-type growth accounting method. Under this method, output growth is decomposed into the growth rate of the efficiency level and the growth rates of primary factor inputs, weighted by their cost shares. The TFP measurement follows the method employed in Suphannachart and Warr (2011, 2012). The potential determinants of rice TFP are factors that mainly affect technological change, such as seed technology and expenditures on research and development, as in Suphannachart (2013). Specifically, the model includes rice TFP as the dependent variable and rice research budget, high-yielding rice varieties adoption, irrigation, rainfall, and weather conditions as explanatory variables. The extension to the previous studies is the incorporation of spatial relations using provincial-level data.

As the objective of this study is to test the existence of a spatial pattern in the research-productivity nexus, three estimation methods are employed consecutively. The first estimation method applied to the TFP determinant model is pooled OLS. The second method is panel data techniques (fixed effects and random effects). The third method is spatial regression techniques (spatial lag and spatial error).

The estimation equation is as follows:

$$ \mathbf{P}=\mathbf{X}\boldsymbol{\upbeta } +{\alpha}_i+t+\boldsymbol{\upvarepsilon} $$
(1.1)

where P is a vector of the log of the dependent variable (i.e. productivity or TFP index at provincial level); X is a matrix of the logs of the explanatory variables, including rice research budget (R), actual adoption of high-yielding or modern rice varieties (HYV), amount of rainfall (Rain), irrigated area (I), and weather-related and natural factors (W); α_i is the province-specific fixed effect; and t denotes time dummies.

Equation (1.1) is first estimated by pooled OLS, with the fixed effect lumped into the error term. Ignoring spatial effects, pooled OLS is inconsistent when omitted variable bias is a problem, that is, when the fixed effect is correlated with the explanatory variables; otherwise it is unbiased though inefficient. The model is then estimated using panel data techniques, namely fixed effects and random effects models. The Hausman test is used to determine whether fixed effects (FE) or random effects (RE) is more suitable. The null hypothesis of the Hausman test is that the coefficients of the FE model are the same as those of the RE model. If the null hypothesis is rejected, the fixed effect is correlated with the explanatory variables; hence, omitted variable bias is a problem and the FE model is preferred. However, the estimation results of the OLS specifications are not the focus of this study. The purpose here is to perform the diagnostic tests for the presence of spatial dependence in the error terms of the OLS regressions. If there is a sign of spatial dependence in the OLS residuals, then OLS is inappropriate, and the above equation is extended to incorporate the neighbourhood influence or spatial effects.
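The Hausman comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's own code: it assumes NumPy, the function name `hausman_statistic` is hypothetical, and the coefficient vectors and covariance matrices in the example are made-up numbers. The statistic is the quadratic form of the difference between the FE and RE coefficients, weighted by the inverse of the difference of their covariance matrices.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, v_fe, v_re):
    """Hausman statistic (b_fe - b_re)' [V_fe - V_re]^{-1} (b_fe - b_re).

    b_fe, b_re: coefficient vectors on the regressors common to both models.
    v_fe, v_re: their estimated covariance matrices.
    Returns (statistic, degrees_of_freedom).
    """
    b_fe = np.asarray(b_fe, dtype=float)
    b_re = np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    diff = b_fe - b_re
    stat = float(diff @ np.linalg.solve(v_diff, diff))
    return stat, diff.size

# Hypothetical inputs, for illustration only.
stat, dof = hausman_statistic(
    b_fe=[0.5, 0.2],
    b_re=[0.4, 0.1],
    v_fe=np.diag([0.02, 0.02]),
    v_re=np.diag([0.01, 0.01]),
)
```

Under the null hypothesis the statistic is compared with a chi-squared critical value with `dof` degrees of freedom; a large value favours the FE model.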

In the standard linear regression model, there are two types of spatial effects, spatial dependence and spatial heterogeneity, which can be incorporated in two ways (Anselin 1999). First, spatial dependence is included as an additional regressor in the form of a spatially lagged dependent variable; this is called a spatial lag model. It is appropriate when the focus of interest is the assessment of the existence and strength of spatial interaction. Second, spatial heterogeneity is incorporated in the error structure; this is called a spatial error model. This model is appropriate when the concern is to correct for the potential bias of spatial autocorrelation arising from the use of spatial data that vary with location and are not homogeneous throughout the data set. It is typical to undertake the spatial analysis using both models since they are related. The spatial heterogeneity in the error structure can be considered an underlying reason behind the spatial lag model, although the spatial dependence can be observed more clearly. This study employs both spatial models.

In the spatial lag model, TFP in one province is assumed to interact spatially with, or depend on, TFP in neighbouring provinces. In other words, the model captures neighbourhood spillover effects and hence takes the following form:

$$ \mathbf{P}=\rho \mathbf{WP}+\mathbf{X}\boldsymbol{\upbeta } +\boldsymbol{\upvarepsilon} $$
(1.2)

where P and X are the dependent and explanatory variables of the OLS specifications, ρ is the spatial dependence parameter, and W is an n × n standardised spatial weight matrix (where n is the number of observations). In this study, W is a 380 × 380 matrix, as the data include 76 provinces for 5 years. The underlying binary contiguity matrix is symmetric, although row standardisation does not in general preserve this symmetry.

The spatial weight matrix, W, represents the pattern of potential spatial interaction or dependence. It indicates whether any pair of observations are neighbours. For example, if province i and province j are neighbours, then w_ij = 1; otherwise w_ij = 0. In this study, two provinces are considered neighbours if they share a common border (contiguity basis).

For ease of interpretation, the spatial weight matrix is typically standardised so that every row of the matrix sums to 1 (i.e. ∑_j w_ij = 1). That is, all neighbours of a province are given equal weight, and hence all provinces are equally influenced by their neighbours. It is also important to note that the elements of the weight matrix are non-stochastic and exogenous to the model.
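The construction of a row-standardised contiguity matrix can be sketched as follows. This is a minimal illustration assuming NumPy; the three-province adjacency pattern and the function names are hypothetical. Building the pooled matrix as a block-diagonal Kronecker product, so that provinces are neighbours only within the same year, is an assumption made here for illustration, since the text does not state how the 380 × 380 matrix was assembled.

```python
import numpy as np

def row_standardise(adjacency):
    """Divide each row of a binary contiguity matrix by its row sum."""
    a = np.asarray(adjacency, dtype=float)
    row_sums = a.sum(axis=1, keepdims=True)
    # A unit with no neighbours keeps an all-zero row instead of dividing by zero.
    return np.divide(a, row_sums, out=np.zeros_like(a), where=row_sums > 0)

def panel_weights(w_cross, n_periods):
    """Stack a cross-sectional W into a block-diagonal matrix (assumed layout:
    observations ordered period by period, neighbours only within a period)."""
    return np.kron(np.eye(n_periods), w_cross)

# Hypothetical example: three provinces in a row (1-2 and 2-3 share borders).
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
W = row_standardise(adjacency)       # rows: [0, 1, 0], [0.5, 0, 0.5], [0, 1, 0]
W_panel = panel_weights(W, 2)        # 6 x 6 matrix for two periods
# W[0, 1] is 1.0 but W[1, 0] is 0.5: row standardisation need not keep symmetry.
```

With 76 provinces and 5 years, the same two calls would give a 380 × 380 matrix under this assumed layout.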

In the spatial error model, the data collected in each province are assumed to be heterogeneous, as every location has some degree of uniqueness relative to other locations. That is, the nature of spatial data can induce spatial dependency, and hence the error term is spatially correlated. The model takes the following form:

$$ \mathbf{P}=\mathbf{X}\boldsymbol{\upbeta } +\boldsymbol{\upvarepsilon}; \kern0.5em \boldsymbol{\upvarepsilon} =\lambda \mathbf{W}\boldsymbol{\upvarepsilon } +\mathbf{u} $$
(1.3)

where P and X are the dependent and explanatory variables of the OLS specifications, λ is the spatial error parameter, and u is an error term that satisfies the classical assumption of being independent and identically distributed (i.i.d.) with constant variance σ². W is the spatial weight matrix.
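The error structure in Eq. (1.3) implies the reduced form ε = (I − λW)⁻¹u, so a spatially correlated disturbance can be generated from an i.i.d. draw. The sketch below, assuming NumPy and using a hypothetical function name and toy weight matrix, illustrates this by solving the linear system rather than forming the inverse.

```python
import numpy as np

def spatial_error_disturbance(u, W, lam):
    """Return eps solving eps = lam * W @ eps + u, i.e. (I - lam W) eps = u."""
    u = np.asarray(u, dtype=float)
    W = np.asarray(W, dtype=float)
    return np.linalg.solve(np.eye(u.size) - lam * W, u)

# Toy row-standardised weights for three units in a row (illustrative only).
W_toy = np.array([[0.0, 1.0, 0.0],
                  [0.5, 0.0, 0.5],
                  [0.0, 1.0, 0.0]])
u_toy = np.array([1.0, 2.0, 3.0])
eps_toy = spatial_error_disturbance(u_toy, W_toy, lam=0.4)
```

When λ = 0 the disturbance reduces to u, which is the case in which OLS on Eq. (1.3) is efficient.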

For the estimation technique, the maximum likelihood estimation (MLE) is used. The reason for this is that in the spatial lag model, OLS is biased and inconsistent due to the endogeneity problem, whereas, in the spatial error model, OLS is unbiased but inefficient due to the spatial autocorrelation in the error term.
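One common way to carry out maximum likelihood estimation of the spatial lag model is to concentrate β and σ² out of the likelihood and search over ρ alone. The sketch below illustrates that idea under stated assumptions: it uses NumPy, a simple grid search instead of a numerical optimiser, and a hypothetical function name; it is not the routine used in the study and reports no standard errors.

```python
import numpy as np

def spatial_lag_ml(y, X, W, grid=None):
    """Grid-search ML for y = rho * W y + X beta + eps (concentrated likelihood).

    Returns (rho_hat, beta_hat).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    n = y.size
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 381)  # candidate values of rho

    # Auxiliary OLS regressions of y and W y on X.
    Wy = W @ y
    b0, *_ = np.linalg.lstsq(X, y, rcond=None)
    bL, *_ = np.linalg.lstsq(X, Wy, rcond=None)
    e0 = y - X @ b0
    eL = Wy - X @ bL

    # ln|I - rho W| = sum_i ln|1 - rho * omega_i|, with omega_i the eigenvalues of W.
    eigvals = np.linalg.eigvals(W)

    best_rho, best_ll = grid[0], -np.inf
    for rho in grid:
        e = e0 - rho * eL
        sigma2 = e @ e / n
        log_det = np.sum(np.log(np.abs(1.0 - rho * eigvals)))
        ll = log_det - 0.5 * n * np.log(sigma2)  # constants dropped
        if ll > best_ll:
            best_rho, best_ll = rho, ll

    beta_hat = b0 - best_rho * bL
    return float(best_rho), beta_hat
```

The log-determinant term is what distinguishes this estimator from OLS on Eq. (1.2): it accounts for the simultaneity of P and WP that makes OLS biased and inconsistent.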

To test for the existence of a spatial pattern, two tests are conducted (Anselin 1988). A diagnostic test for the presence of spatial dependence in the OLS residuals is conducted first using Moran’s I statistic (MI). The null hypothesis of this test is the absence of spatial dependence. If the null hypothesis is rejected, there is a sign of a spatial pattern, which needs to be investigated further using the spatial regression models. The second test, the Lagrange multiplier (LM) test, is then conducted on the spatial models. This tests the significance of the spatial parameters. The null hypotheses are ρ = 0 under the spatial lag model and λ = 0 under the spatial error model. Under the null hypothesis, the test statistic has a chi-squared distribution with one degree of freedom. If the test statistic is greater than the critical value, the null hypothesis is rejected. Significant spatial parameters confirm the existence of spatial effects or neighbourhood influence in the TFP determinant model.
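The Moran’s I statistic for regression residuals mentioned above can be sketched as follows, assuming NumPy; the function name and the four-unit example are hypothetical. The sketch computes only the statistic I = (n / S0) · e′We / e′e, where S0 is the sum of all weights; the moments needed for inference are not shown.

```python
import numpy as np

def morans_i(residuals, W):
    """Moran's I for a residual vector e and spatial weight matrix W."""
    e = np.asarray(residuals, dtype=float)
    W = np.asarray(W, dtype=float)
    s0 = W.sum()  # equals n when every row of W sums to 1
    return (e.size / s0) * (e @ W @ e) / (e @ e)

# Four units in a row with row-standardised contiguity weights (illustrative).
W4 = np.array([[0.0, 1.0, 0.0, 0.0],
               [0.5, 0.0, 0.5, 0.0],
               [0.0, 0.5, 0.0, 0.5],
               [0.0, 0.0, 1.0, 0.0]])
e4 = np.array([1.0, 1.0, -1.0, -1.0])  # similar residuals sit next to each other
mi = morans_i(e4, W4)                  # positive value: 0.5
```

A positive value indicates that neighbouring residuals tend to have the same sign, which is the pattern the diagnostic test is designed to detect.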

1.4 Variables and Data Description

TFP growth is estimated as the residual part of output growth that cannot be explained by the combined growth of primary inputs. The primary conventional factor inputs used in this study are land, labour, and capital. Aggregate input growth is the weighted average of the growth of each input, where the weights are their varying cost shares.
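The growth accounting residual described above can be sketched as follows. This is a minimal illustration assuming NumPy, log-difference growth rates, and cost shares averaged over adjacent periods (a Törnqvist-style choice that is an assumption here, not a detail stated in the text); the function name and the numbers are hypothetical.

```python
import numpy as np

def tfp_growth(output, inputs, cost_shares):
    """Output growth minus cost-share-weighted input growth.

    output:      shape (T,)   output levels
    inputs:      shape (T, K) input levels (e.g. land, labour, capital)
    cost_shares: shape (T, K) cost shares, each row summing to 1
    Returns shape (T-1,) TFP growth rates in log differences.
    """
    output = np.asarray(output, dtype=float)
    inputs = np.asarray(inputs, dtype=float)
    cost_shares = np.asarray(cost_shares, dtype=float)

    d_ln_y = np.diff(np.log(output))
    d_ln_x = np.diff(np.log(inputs), axis=0)
    # Assumed weighting: average of the shares in the two adjacent periods.
    avg_shares = 0.5 * (cost_shares[1:] + cost_shares[:-1])
    return d_ln_y - np.sum(avg_shares * d_ln_x, axis=1)

# Hypothetical numbers: output rises 10% while all inputs are unchanged.
g = tfp_growth(
    output=[100.0, 110.0],
    inputs=[[10.0, 20.0, 30.0], [10.0, 20.0, 30.0]],
    cost_shares=[[0.5, 0.3, 0.2], [0.5, 0.3, 0.2]],
)
```

In this example all of the output growth is attributed to TFP because measured inputs did not change.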

The TFP determinant model employed in this study incorporates factors that mainly affect technological change, such as seed technology and expenditures on rice research. Other relevant economic and non-economic factors, such as infrastructure, rainfall, and natural factors, are also included to explain the residual TFP. Specifically, the agricultural research budget is used to represent a major source of technical change that raises productivity. An increase in the rice research budget is expected to raise TFP. Only national public research is considered because rice research in Thailand has long been conducted by the public sector at national level and there are data limitations on other funding sources of research and extension. Seed technology is also included, using the adoption of high-yielding rice varieties, as it plays a crucial role in determining rice productivity (Evenson and Gollin 2003). Adoption is measured as the share of rice planted area under these varieties. The amount of rainfall is included as water is a crucial factor for rice growing. Irrigation, measured as the proportion of irrigated area, represents an infrastructure factor that can raise rice productivity. The natural factor, measured as the proportion of rice harvested area to total rice planted area, is also included. It represents weather and natural shocks such as drought, flooding, rice disease, and insect or pest epidemics. Good conditions, such as fewer occurrences of drought, flooding, or pest epidemics, should raise TFP. This natural factor proxy has commonly been used in earlier studies, for example, Setboonsarng and Evenson (1991), Pochanukul (1992), and Suphannachart and Warr (2012).

The output and input data are pooled cross-section and time-series data at provincial level, covering 76 provinces of Thailand for the years 2004, 2006, 2008, 2010, and 2012. Altogether, the data contain 380 repeated observations on the same individuals (76 provinces) at different points in time (5 years). The data are mainly taken from the Office of Agricultural Economics, and some data series are drawn from Khunbanthao and Suphannachart (2016). All data are available at provincial level except research expenditure, which is available only at national level. Definitions and sources of the data used in this study are summarised in Table 1.1. All variables are transformed to logarithmic form, and summary statistics of the variables used in the TFP determinant model are shown in Table 1.2.

Table 1.1 Summary of the data used in TFP measurement and TFP determinants
Table 1.2 Summary statistics of variables in the TFP determinant model

1.5 Results and Discussion

This section reports the estimation results focusing on the main question of whether there exists any spatial pattern when estimating the relationship between TFP and its determinants, particularly technology factors (research budget and high-yielding rice varieties adoption). The results using the three-step estimation techniques explained earlier are shown in Table 1.3.

Table 1.3 Estimation results of the TFP determinant model (Dependent variable is TFP: lnP)

The TFP determinant model is first estimated by pooled OLS. The estimates of every coefficient from pooled OLS conform to prior expectations. However, the estimates are likely to be inconsistent since the unobserved fixed effect is expected to be correlated with the explanatory variables. Accordingly, the coefficients cannot yet be interpreted from this estimation. The correlation between the unobserved heterogeneity and the explanatory variables is confirmed by the Hausman test, which suggests that the fixed effects (FE) model is suitable. This means that the FE coefficients are statistically different from those of the random effects (RE) model, and hence omitted variable bias is an important problem. However, if there is any spatial pattern or spatial interaction in the dependent variable, the FE estimates are also inconsistent.

Therefore, the diagnostic test for the presence of spatial dependence in OLS residuals is conducted on both the pooled OLS and FE estimations. Moran’s I statistic is used to test the null hypothesis of the absence of spatial dependence. The null hypothesis is rejected at the 5% level of significance for the pooled OLS but cannot be rejected for the FE specification. Hence the presence of spatial dependence is confirmed only for the pooled OLS estimation. The pooled OLS estimation is inappropriate, and its estimates are reported in Table 1.3 but not interpreted. The FE estimation is appropriate and its results are interpreted below. However, as the purpose of this study is to test the spatial relationship between research and productivity, the pooled OLS model, which is shown to exhibit spatial patterns, is extended to incorporate spatial parameters. In this case spatial specifications are more appropriate, and the spatial lag and spatial error models, specified in Eqs. (1.2) and (1.3), respectively, are estimated by the maximum likelihood method.

To further test the significance of the spatial lag (ρ) and the spatial error (λ) parameters, the Lagrange multiplier test is conducted. The null hypotheses are ρ = 0 and λ = 0, and the test statistics follow a chi-squared distribution with one degree of freedom. As shown in Table 1.3, only the spatial lag parameter is statistically significant: its null hypothesis is rejected at the 5% level of significance. The significance of the spatial lag parameter confirms that the neighbourhood influence is important. In particular, there is spatial dependence between rice productivity in neighbouring provinces. However, there is no evidence of spatial heterogeneity across the spatial data, as the spatial error parameter is not statistically significant.
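For readers who wish to reproduce this kind of test, the standard (non-robust) LM statistics for a spatial lag and a spatial error alternative can be computed from OLS residuals as sketched below. This is an illustration assuming NumPy and the textbook formulas associated with Anselin (1988), with a hypothetical function name; it is not necessarily the exact variant reported in Table 1.3.

```python
import numpy as np

CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-squared with one degree of freedom

def lm_spatial_tests(y, X, W):
    """Return (lm_lag, lm_error) computed from the OLS fit of y on X."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    n = y.size

    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    s2 = e @ e / n                       # ML estimate of the error variance
    T = np.trace(W.T @ W + W @ W)

    # LM test of lambda = 0 (spatial error alternative).
    lm_error = (e @ W @ e / s2) ** 2 / T

    # LM test of rho = 0 (spatial lag alternative).
    WXb = W @ (X @ b)
    g, *_ = np.linalg.lstsq(X, WXb, rcond=None)
    m_wxb = WXb - X @ g                  # M (W X b), with M = I - X (X'X)^{-1} X'
    denom = (m_wxb @ m_wxb) / s2 + T
    lm_lag = (e @ W @ y / s2) ** 2 / denom

    return float(lm_lag), float(lm_error)
```

Each statistic is compared with `CHI2_1DF_5PCT`; a value above it rejects the corresponding null hypothesis at the 5% level.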

In terms of the spatial dependence represented in the spatial lag model, the result can be interpreted directly: TFP in one province is significantly associated with TFP in neighbouring provinces, given the spatial relationship specified by the weight matrix. It is typical to observe that if one province has a certain level of productivity, its neighbours are highly likely to have a similar level. Therefore, neighbourhood influence is significant. For example, a province located near a major research station or a dam tends to benefit from the discovery of new rice varieties and from irrigation projects. This benefit may spill over to neighbouring provinces and may also affect rice productivity and its determinants. Rice TFP in a province that is located near research centres and dams, and that receives a similar amount of rainfall, is thus associated with TFP in nearby provinces.

Regarding the interpretation of the coefficient estimates, the FE and the spatial lag models, which are shown to be appropriate, confirm a significant relationship between TFP and both the technology factor, represented by actual adoption of high-yielding rice varieties, and the infrastructure factor, represented by irrigated area. The magnitude of the technological impact is small and similar for the FE and the spatial lag estimations. Specifically, a 1% increase in the proportion of planted area under output-increasing rice varieties leads to a 0.03% (under the FE) and a 0.04% (under the spatial lag) increase in the TFP index (output produced per unit of total inputs used), respectively. Despite the small magnitude of the research impact on productivity (measured in terms of the adoption of rice varieties developed by rice research stations), it conforms to economic intuition and supports the results of previous studies that agricultural and rice research can raise agricultural and rice productivity (Suphannachart and Warr 2011; Suphannachart 2013; Khunbanthao and Suphannachart 2016).

The irrigation coefficients differ considerably between the FE and the spatial lag results. Under the FE model, a 1% increase in the irrigated area results in a 0.38% increase in TFP, while under the spatial lag model, the same change in irrigation raises TFP by only 0.05%. The rice research budget variable is positive and significant only in the FE specification. This is probably because the research budget data are available only at national level, and so their location-specific impacts cannot be observed in the spatial lag model. Rainfall has a negative and significant impact only in the spatial lag model, which makes sense because the climatic factor is better observed in the location-specific model, in which neighbouring provinces tend to receive similar amounts of rainfall.

1.6 Conclusion

This study is one of the first efforts to conduct a spatial analysis of the impact of agricultural research on productivity at subnational level in developing countries. It attempts to find out whether there is any significant spatial pattern when estimating the TFP determinant model using provincial-level data for the case of Thai rice production. The data cover 76 provinces of Thailand for the years 2004, 2006, 2008, 2010, and 2012. The analysis begins with the standard OLS specifications in order to test for the presence of spatial structure in the error terms. The estimation then proceeds to simple spatial econometric models, as the empirical results confirm the presence of spatial structure in the error components. The significance of the spatial dependence is confirmed using the spatial lag model, suggesting that TFP in one province is significantly associated with TFP in neighbouring provinces. However, there is no statistical evidence of spatial heterogeneity across the spatial data as represented in the spatial error model.

The estimation results of both the OLS and spatial regressions show that the impact of agricultural research (measured as high-yielding rice varieties adoption) on productivity (measured as rice TFP) is statistically significant though small in magnitude. This is consistent with prior economic intuition and the results of previous studies. Overall, the spatial estimation results confirm that rice TFP in Thailand tends to concentrate in particular areas where neighbourhood influence plays an important role. Therefore, when estimating the determinants of rice TFP for the case of Thailand, or any other case where regional differentials are evident, the productivity level in one area tends to be related to productivity in neighbouring areas, and this pattern should be incorporated in the estimation model. The significance of spatial correlation among provinces also implies that productivity-enhancing policy should be developed for groups of provinces in close proximity or for larger regional bases, rather than for various small areas.