1 Innovation and wages: a literature review

In labor economics, various explanations have been provided to account for wage differentials from the point of view of labor demand (employers) and for wage differentials between “similar” workers (workers with comparable skills). Some of these theoretical explanations have been supported by empirical findings. In this section, we will briefly focus on both in order to introduce and motivate our approach.

1.1 Innovation and wages: what is the link?

Different streams of theoretical explanations have been recognized to account for wage differentials: unobserved heterogeneity, compensating differentials, efficiency wage theory, rent sharing and other theories mainly related to sociological considerations.

The first explanation relaxes the assumption of uniformity among workers relying on the idea that high-wage firms employ more productive workers: technologies respond in a different way to workers’ ability. Wage differentials could be explained by unmeasured abilities not accounted for in ordinary econometric estimations. Workers with more abilities receive higher wages due to their higher productivity, causing wage differentials between equally skilled workers (Arbache 2001).

Another branch of the literature accounts for “compensating differentials”, meaning that the “monetary wage overstates (understates) the returns to work because it ignores extra costs (benefits) imposed by working conditions” (Groshen 1991). The main idea is that higher wages are required by more difficult job conditions in terms of safety, injury rates and an unusually high risk of lay-off. However, little empirical evidence supports this argument.

A further explanation is related to efficiency wage theory, which posits a causal relationship between the wage level and a worker’s on-the-job productivity. Employers are willing to pay wages above the so-called market-clearing wage in order to capture the resulting increment in productivity. The basic idea is that worker productivity depends on the wage received, so that a higher wage gives the worker a stronger incentive to be productive. Furthermore, in the model of Shapiro and Stiglitz (1984), an increase in the wage decreases a worker’s incentive to shirk, boosting productivity and lowering direct monitoring expenses. In this sense, the shirking version of the efficiency wage explanation predicts that wage differentials depend on differences in monitoring costs across firms and industries: higher monitoring costs lead to higher wages.

The rent sharing models account for the economic rents faced by firms using advanced technology equipment. Then, wages might be higher in plants applying this equipment because workers are able to capture some of the rents associated with the use of these machines or with the introduction of innovation (Dunne and Schmitz 1995).

Insider/outsider, bargaining, turnover cost and sociological models also try to account for wage differentials from different points of view. According to the sociological model, high-wage firms are willing to pay higher wages to all their workers because of norms, feelings of loyalty and fairness considerations.

Firm size is important to explain the relationship between technology and wage as underlined by Dunne and Schmitz (1995). The main idea is that the probability of a firm adopting advanced technology and the skill of the firm’s workforce are both increasing functions of the firm size due to the high costs of technological capital. According to this framework, wages are size premiums including components reflecting worker skills and advanced technology.

1.2 Innovation and wages: empirical contributions

Some studies have tried to investigate empirically which of the models reviewed above best fits the data. The unobserved heterogeneity hypothesis is supported by Freguaglia and Menezes-Filho (2007), who find that the unmeasured skills of individuals account for 90 % of inter-industry wage differentials in Brazil. Studies of this kind estimate the wage equation at the individual level, exploiting household surveys matched with labor surveys. Krueger and Summers (1987, 1988) find support for the efficiency-wage explanation of inter-industry wage differentials. Also from a firm-level perspective, Casavola et al. (1996) try to quantify the impact of innovation on earnings and employment by skill level, running cross-sectional regressions for each year on a sample of 20,000 Italian firms per year. Their proxy for technology is quite rough: the share of intangible capital in total capital relative to the industry average. After controlling for worker and firm covariates, they find a 2–6 % increase in wages for each professional group due to the technology measure. However, the cross-sectional regression framework does not allow them to control for unobserved heterogeneity, so the estimates are likely to be biased upward. Doms et al. (1997) exploit a dataset with information on technology use and workers’ characteristics to detect the impact of the adoption of new technologies on the structure of the workforce at the plant level. Their main conclusion is that, at the plant level, the correlation between technology use and worker wages is primarily due to the fact that plants with high-wage workforces are more likely to adopt new technologies (Doms et al. 1997, p. 255). The measure of technology employed is based on the type of production machinery used at factory level, increasing the level of automation. The positive relationship found for technical, clerical and sales workers is not verified for managerial and professional wages.

According to Entorf and Kramarz (1998), this technology-wage premium should be the result of workers with higher unobserved abilities being more likely to use advanced technologies. Comparing cross-section and individual fixed-effect estimates, they show that computer-based new technologies (NT) are used by abler workers, who learn and become more productive as they gain experience with these technologies. In terms of wage differentials, the introduction of computer-based NT contributes to a small increase in wages, but this effect vanishes when a fixed-effect estimator is applied. Laaksonen and Vainiomäki (2001) classify manufacturing industries into four technology levels and study the effect of technology on establishment-level wages. The technology wage premiums are estimated separately for non-manual and manual workers using wage equations with control variables for plant and workforce characteristics over the period 1974–93. The major weakness of the study lies in the technology measure adopted. Despite this, the attempt to control for the exit and entry of firms in the unbalanced dataset is interesting.Footnote 1 A methodologically interesting study is the one conducted by Ester Martínez-Ros on Spanish manufacturing firms. Martínez-Ros (2001) explicitly adopts the rent-sharing model discussed above and tests whether it fits Spanish firm-level data for the period 1990–1994, following the hypothesis tested by Van Reenen (1994) and considering rents generated by technological innovations. Bargaining between labour market agents aims at obtaining an increasing share of the returns to innovation; the ultimate effect is an increase in the level of pay. Martínez-Ros distinguishes between process and product innovation and, exploiting panel data techniques, tests for an impact of technology on wages.
From a firm-level perspective, Tan and Batra (1997), for example, analyze wage differentials not accounted for by workforce characteristics, collective bargaining or market power in Colombia, Mexico and Taiwan. According to the authors, these wage differentials result from firms’ technology-generating activities. Building on the idea of technological capabilities developed in the innovation literature by Bell and Pavitt (1997), they distinguish between production capacity and technological capability. Methodologically, one of the main problems faced by these studies is related to data. As underlined by Abowd et al. (1999), it is important to have a single longitudinal dataset on firms and workers that makes it possible to identify both firm and individual effects. For a large sample of French workers and establishments, Abowd et al. (1999) find that both high-wage workers and high-wage firms explain inter-industry wage differentials, but that individual effects are a significant component of the variation in total annual compensation, even more so than firm effects.

More recently, from a sectoral point of view, Pianta and Tancioni (2008) estimate the impact of innovation on aggregate wages, where innovation is proxied by innovation expenditure, the share of innovative firms and innovative turnover. Wages tend to grow faster in sectors characterized by higher innovation expenditure. Focusing on European industries, they distinguish different patterns of technologies; in particular, the contrasting employment effects of new products and new processes affect wage growth in different ways. Industries with high sales of new products push wages up; on the contrary, sectors dominated by the adoption of new processes depress wages. They do not distinguish among professional groups.

1.3 The approach

A burgeoning literature has attempted to investigate the innovation–wages link for developed countries (Vivarelli 2014); however, increasing attention is being paid to innovation processes in Latin America (Crespi et al. 2014). From this point of view, Acemoglu and Autor (2011) have enriched the skill-biased vision of technology with a task-based framework. In fact, most studies for Europe and the US focus on polarization rather than skill-biased technological change. From Acemoglu and Autor (2011), we keep the idea of focusing on professions rather than skills. Following the analysis of Pianta and Nascia (2008), we aim to distinguish between different patterns of technologies, investigating their role on wages by professional group.

In this study, based on a new large dataset of Chilean firms (Encuesta Longitudinal de Empresas), we investigate whether firms introducing some specific type of innovation pay higher wages for different professional categories, in a developing country.

Compared to previous studies on the subject, we consider not only manufacturing firms, but also services, agriculture, mining and construction sectors. We go beyond the traditional skill-biased effect of innovation and, rather than focusing on skilled/unskilled wage, we analyze wages across professional categories (Managers, Clerks, Skilled Manual Workers, Unskilled Manual Workers). Most empirical studies in Chile have tried to explain the relationship between innovation and productivity (Vergara 2010; Benavente 2002, 2005; Alvarez et al. 2010), the determinants of innovation (Lauterbach 2009), the mechanisms of technological absorption, the effects on employment levels (Crespi and Tacsir 2011) or the changes in the product mix of firms (Navarro 2010).

To the best of our knowledge, there is a lack of empirical work on innovation and wages for Chile. This study intends to provide fresh evidence on this topic which has been little debated for Chile.

The rest of the paper is organized as follows. Section 2 describes the data and the empirical methodology applied; Sect. 3 presents the results of a multivariate analysis and Sect. 4 panel estimations. Finally, Sect. 5 concludes.

2 Data and empirical methodology

2.1 Data

The dataset used for the analysis is based on the I and II Firm Longitudinal Survey (ELE) conducted by the Chilean Ministry of Economy and available on the website of the Institute.Footnote 2

In order to build a panel, we merged the I and II Longitudinal Surveys on the basis of the firm identification number. The I Longitudinal Survey is based on a sample of 10,213 firms and the II Longitudinal Survey on a sample of 7,062 firms. Using the firm identification number present in both waves, we construct a balanced panel of 2,667 firms for the two available years (2007 and 2009). Both surveys are composed of five different sections including data on: finance and accountability, marketing, characteristics of the ownership, employment, innovation and ICT technologies. We selected the variables relevant to our research question, namely those related to general firm characteristics, employment, innovation and ICT technologies.Footnote 3 Only a few studies have so far used the ELE database and none has exploited the panel dimension, owing to the very recent publication (2012) of the II Firm Longitudinal Survey. In our panel, large and mature firms seem to be overrepresented compared to the actual population of Chilean firms. Most firms are located in the Metropolitan Region of the capital, Santiago. Finally, considering the sectoral composition of our dataset, 23 % of firms belong to the wholesale and retail sector. Only 13 % of sampled firms are in manufacturing, whereas hotels and restaurants (10 %) seem to be overrepresented compared to the population mean (5 %). The mining sector represents only 5 % of our sample and 1 % of the population, although Chile is one of the main producers and exporters of copper. In Table 1 we report some descriptive statistics of employment in our panel.Footnote 4

Table 1 Descriptive statistics

In terms of the type of innovation introduced, 13 % of firms focus on marketing innovations, 23 % on management innovations, 23 % on process and services innovations and 26 % on product innovations. On average, 25 % of sampled firms seem to introduce only one type of innovation.

2.2 Empirical methodology

The analysis is carried out in two different steps. Firstly we perform a factor and cluster analysis in order to group observations detecting different clusters. Secondly, we apply panel data model techniques to investigate causality between innovation and wages by professional group.

2.3 Factor analysis

Factor analysis is an exploratory technique that does not distinguish between dependent and independent variables; instead, it extracts factors on the basis of the communalities (shared variance) among variables. In factor analysis the researcher assumes an underlying causal model and aims to find a few common factors that linearly reconstruct the original variables. The factor loadings are computed using the squared multiple correlations as estimates of the communality. We perform an orthogonal varimax rotation of the factor loadings, which assumes independence among factors and maximizes the variance of the squared loadings within factors. Finally, we create factor scores using both the regression (Thomson) and the Bartlett scoring methods, where factor scores are the coordinates of the original variables, x, in the space of the factors. Because the difference between the two solutions is small, we retain Bartlett’s factor scores, which are slightly larger and help to better define clusters. In fact, regression-scored factors have the smallest mean squared error from the true factors but they may be biased.Footnote 5 We perform some post-estimation checks to assess the validity of the retained factors, namely the Kaiser–Meyer–Olkin measure of sampling adequacy and Bartlett’s test of sphericity, which tests the null hypothesis that the variables are uncorrelated in the population.Footnote 6
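For reference, the two post-estimation checks can be sketched in their usual textbook form. The notation is introduced here for illustration only and does not come from the survey documentation: n is the number of observations, p the number of variables, R the sample correlation matrix with elements \( r_{ij} \), and \( a_{ij} \) the partial correlations between variables i and j given all the others. Bartlett’s statistic is

$$ \chi^{2} = - \left( n - 1 - \frac{2p + 5}{6} \right) \ln \left| R \right| $$

which is approximately chi-squared distributed with \( p(p - 1)/2 \) degrees of freedom under the null hypothesis that R is an identity matrix. The Kaiser–Meyer–Olkin measure is

$$ KMO = \frac{\sum_{i \ne j} r_{ij}^{2}}{\sum_{i \ne j} r_{ij}^{2} + \sum_{i \ne j} a_{ij}^{2}} $$

Values of KMO close to 1 indicate that partial correlations are small relative to simple correlations, so that the variables share enough common variance for a factor model to be meaningful.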

2.4 Cluster analysis

From a conceptual point of view, cluster analysis is also descriptive, a-theoretical and non-inferential. The intuition behind this technique is to uncover the structure of the data by placing similar observations in the same groups. To achieve this, we use the factors retained in the previous section as clustering variables. As a similarity measure we apply Gower’s coefficient, the most appropriate in the case of a mix of binary and continuous data.
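To make the similarity measure concrete, the following is a minimal sketch of Gower’s coefficient for records that mix binary and continuous fields. It assumes simple matching for binary fields (Gower’s asymmetric variant would ignore joint absences) and range-scaled absolute differences for continuous fields; the function name, the three example firms and their values are hypothetical and are not taken from the ELE data.

```python
from typing import Sequence


def gower_similarity(a: Sequence[float], b: Sequence[float],
                     is_binary: Sequence[bool],
                     ranges: Sequence[float]) -> float:
    """Gower similarity between two records with binary and continuous fields.

    Binary fields score 1 on a match and 0 otherwise (simple matching is an
    assumption made for this sketch). Continuous fields score
    1 - |a_k - b_k| / range_k, where range_k is the sample range of field k.
    The overall similarity is the unweighted mean of the field scores.
    """
    scores = []
    for x, y, binary, rng in zip(a, b, is_binary, ranges):
        if binary:
            scores.append(1.0 if x == y else 0.0)
        elif rng == 0:
            scores.append(1.0)  # a constant field cannot separate records
        else:
            scores.append(1.0 - abs(x - y) / rng)
    return sum(scores) / len(scores)


# Hypothetical records: (innovation dummy, factor score 1, factor score 2)
firm_a = (1, 0.8, -0.2)
firm_b = (1, 0.3, 0.3)
firm_c = (0, -1.2, 0.8)
is_binary = (True, False, False)
ranges = (1.0, 2.0, 1.0)  # assumed max - min of each field in the sample

print(gower_similarity(firm_a, firm_b, is_binary, ranges))  # similar firms
print(gower_similarity(firm_a, firm_c, is_binary, ranges))  # dissimilar firms
```

A dissimilarity suitable for clustering is obtained as one minus the similarity.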

In order to determine clusters, we follow a two-step approach: first a hierarchical procedure to detect the number of existing groups, and then a non-hierarchical clustering method, which has the advantage of reassigning observations until maximum homogeneity within clusters is achieved (Hair 2010). The hierarchical procedure facilitates the assessment of groups in our sample, as it is carried out in a stepwise fashion through an agglomerative method. Furthermore, the hierarchical approach makes it possible to evaluate the selected groups graphically through a dendrogram. As the hierarchical clustering algorithm, we use Ward’s linkage, which merges, at each step, the pair of clusters that produces the smallest increase in the total within-cluster sum of squares across all variables. The disadvantages of this method are that it is sensitive to outliers, which can create clusters with only a few observations, and that it tends to produce clusters of similar size. To account for these disadvantages, we first performed the average linkage method, which is less affected by extreme values. The Calinski–Harabasz pseudo-F stopping-rule index helps to identify the correct number of groups in the sample. Then, we perform a non-hierarchical clustering procedure based on the k-means method, again with Gower’s measure. The non-hierarchical procedure assigns objects to clusters given the number of clusters and, optionally, some starting points. We perform the analysis both with specific cluster seeds and without them (random selection performed in STATA).
The advantage of the k-means algorithm is that it divides the data into the number of clusters detected in the first, hierarchical analysis and then iteratively reassigns observations to clusters until the distance between observations in the same cluster is minimized and the distance between clusters is maximized. According to de Jong and Marsili (2006), the k-means method with randomly selected starting points is quite weak compared to one with k selected starting points. For this reason, we employ the centroids of the initial hierarchical solution (k = 4) as starting points. This procedure is strongly recommended by Milligan and Sokol (1980) and Punj and Stewart (1983). Finally, as post-estimation checks, we perform MANOVA tests in order to assess the validity of the clustering variables and cluster stability.
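The second step of this procedure can be illustrated with a minimal sketch: k-means started from given centroids (standing in for those of a hierarchical solution), followed by the Calinski–Harabasz pseudo-F. For brevity the sketch uses squared Euclidean distance rather than Gower’s measure, two clusters rather than four, and made-up points; all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumption; the paper uses Gower's measure)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty collection of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def seeded_kmeans(points: Sequence[Point], seeds: Sequence[Point],
                  max_iter: int = 100) -> List[int]:
    """k-means whose starting centres are supplied instead of drawn at random."""
    centers = list(seeds)
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an emptied cluster keeps its previous centre
                centers[k] = centroid(members)
    return labels


def calinski_harabasz(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Pseudo-F: (between SS / (g - 1)) / (within SS / (n - g))."""
    n, groups = len(points), sorted(set(labels))
    overall = centroid(points)
    between = within = 0.0
    for k in groups:
        members = [p for p, lab in zip(points, labels) if lab == k]
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (len(groups) - 1)) / (within / (n - len(groups)))


# Hypothetical factor scores forming two well-separated groups
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
start = [(1.0, 1.0), (9.0, 9.0)]  # stand-in for hierarchical centroids

assignments = seeded_kmeans(data, start)
print(assignments)
print(calinski_harabasz(data, assignments))
```

Larger pseudo-F values indicate more distinct clustering, which is why the index can serve as a stopping rule when comparing solutions with different numbers of groups.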

2.5 Panel data techniques

The adoption of new technologies and the introduction of new products and processes affect wages by professional categories. Building on the hypothesis of rent sharing between employer and employees, innovating firms could realize some extra profits due to the introduction of new products in the market.

The wage-setting model follows the insider–outsider approach, where incumbent workers are protected by labor turnover costs and by the specific skills that they acquire in the production process. The idea is that insiders have an interest in maximizing rents. Following the standard wage equation (Layard et al. 1991), wages depend on inside factors such as the firm’s activity and other firm characteristics (X), the bargaining power of the union (s) and the wage that workers face outside the firm (\( \overline{w} \)); to these we add the firm’s innovative activity (I).Footnote 7 The wage determination is therefore expressed as follows:

$$ w = w(\overline{w} , s, X, I) $$
(1)

Following the Schumpeterian approach, rents are considered as the reward for the first commercialization of an invention. As a measure of technology, we use the cluster identification factor developed in the previous section. This makes it possible to disentangle the type of technology introduced and the strategy behind it, along with other firm characteristics.

2.6 Identification strategy

In order to check the wage–innovation relation, we adopt the following specification:

$$ Log(w_{it} ) = \alpha + \beta^{'} \mathbf{CLUSTER}_{it} + \gamma^{'} X_{it} + \delta^{'} s_{it} + \vartheta^{'} SKILL_{it} + \rho FEMALE_{it} + \tau Productivity_{it} + \varepsilon_{it} $$
(2)

where \( w_{it} \) is the average annual wage at firm level, expressed in logarithms and constructed by dividing annual labour costs by the average number of employees in each firm for each year;Footnote 8 \( CLUSTER_{it} \) is a set of dummies capturing the firm’s innovation strategy according to the results of the cluster analysis (Product Innovators, Cost Strategy Innovators, Non Innovators); \( X_{it} \) refers to firm characteristics such as location, sector, size and size squared, age, percentage of export and ownership (private national/private foreign/public); \( s_{it} \) is the percentage of union affiliation among workers; \( SKILL_{it} \) is the share of skilled workers (differentiated by educational level, i.e. university, secondary, basic education, no education); \( Productivity_{it} \) is a measure of labor productivity; \( FEMALE_{it} \) is the share of female workers in total workers; and \( \varepsilon_{it} \) is a random term composed of heterogeneous effects, \( \mu_{it} \), and a standard mixed error term, \( v_{it} \).

2.7 Pooling the time dimension and model selection

In order to check the wage–innovation relation, we first estimate a pooled cross-section model, allowing for heteroscedasticity. The advantage of the pooled model is that it exploits both the cross-sectional and the temporal dimension of our dataset. Unfortunately, firm-specific effects omitted from the regression specification end up in the error term and, if they are correlated with the regressors, could bias our estimates. Furthermore, errors may not be independent from one period to the next, i.e. they might be serially correlated, and they could also be correlated across firms. In this sense, heteroscedasticity and autocorrelation could be a serious problem for our pooled model. Hence, the relevance of sector-specific effects is tested to evaluate the viability of pooling the cross-sectional dimensions. This is done through a two-step procedure: first the pooled model is tested against the random effects (RE) model and then, if the RE specification is preferred, the RE model is tested against the fixed effects (FE) model. In the first step, we apply the Breusch–Pagan LM statistic, testing whether the variance of the individual effects in the error term is zero. In the second, we use the Hausman test to evaluate the orthogonality of the individual effects to the regressors, which is the condition for applying RE rather than FE. In the random effects model, the individual effects (\( \alpha_{i} \)) are considered as part of the error term \( (v_{it} = \alpha_{i} + u_{it} ) \), assuming no correlation between the individual effects and the regressors for each firm and each year: \( E(\alpha_{i} ) = E(u_{it} ) = 0 \) and \( E\left( {\alpha_{i} X_{it} } \right) = E\left( {u_{it} X_{it} } \right) = 0 \).
On the contrary, the FE model assumes that the section-specific effects on the dependent variables can be captured by heterogeneous constant terms only, in other words by dummies operating as intercept shifters of the linear relations (Pianta and Tancioni 2008, p. 13).
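For reference, the two test statistics used in this procedure can be written in their standard balanced-panel form. The notation is introduced here only as a sketch: N is the number of firms, T the number of periods, \( \hat{e}_{it} \) the pooled OLS residuals and k the number of time-varying regressors.

$$ LM = \frac{NT}{2(T - 1)} \left[ \frac{\sum_{i = 1}^{N} \left( \sum_{t = 1}^{T} \hat{e}_{it} \right)^{2}}{\sum_{i = 1}^{N} \sum_{t = 1}^{T} \hat{e}_{it}^{2}} - 1 \right]^{2} $$

$$ H = \left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right)^{'} \left[ \widehat{Var}(\hat{\beta}_{FE}) - \widehat{Var}(\hat{\beta}_{RE}) \right]^{-1} \left( \hat{\beta}_{FE} - \hat{\beta}_{RE} \right) $$

Under their respective null hypotheses, LM is asymptotically chi-squared with one degree of freedom and H is asymptotically chi-squared with k degrees of freedom. A large LM speaks against pooling, and a large H speaks against the random effects specification.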

2.8 Endogeneity control

Finally, endogenous regressors and measurement errors can violate the orthogonality between regressors and errors, making the OLS estimator biased and inconsistent. We assume that endogeneity, if any, affects innovation and productivity; thus, we instrument the innovation measure and productivity with past values of both variables, adopting an Instrumental Variables (IV) estimator in place of OLS.
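The logic of the IV correction can be seen in a deliberately simplified sketch with one endogenous regressor and one instrument (the just-identified case), where the IV slope reduces to a ratio of covariances. The numbers are constructed by hand so that the true slope is 2 and the error is correlated with the regressor but not with the instrument; the variable names, including the idea that z is a lagged value, are illustrative assumptions and do not reproduce the estimation actually carried out.

```python
from typing import List, Sequence


def demean(values: Sequence[float]) -> List[float]:
    """Deviations from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ols_slope(y: Sequence[float], x: Sequence[float]) -> float:
    """OLS slope of y on x: cov(x, y) / var(x)."""
    yd, xd = demean(y), demean(x)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


def iv_slope(y: Sequence[float], x: Sequence[float],
             z: Sequence[float]) -> float:
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    yd, xd, zd = demean(y), demean(x), demean(z)
    return sum(a * b for a, b in zip(zd, yd)) / sum(a * b for a, b in zip(zd, xd))


# Hypothetical data: z plays the role of a lagged value of x
z = [1.0, 2.0, 3.0, 4.0]
shock = [1.0, -1.0, -1.0, 1.0]   # uncorrelated with z by construction
x = [zi + ui for zi, ui in zip(z, shock)]                  # endogenous regressor
y = [1.0 + 2.0 * xi + 3.0 * ui for xi, ui in zip(x, shock)]  # true slope is 2

print(ols_slope(y, x))    # biased upward because the error moves with x
print(iv_slope(y, x, z))  # recovers the true slope in this constructed case
```

With several endogenous regressors and controls, the same idea is implemented through two-stage least squares rather than a single covariance ratio.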

3 Exploring innovative patterns: results

The aim of this section is to identify whether a variety of innovative patterns is present in the sample and to group firms according to their “innovative” behavior. Even though the dataset contains information on the specific kind of innovation introduced, the clustering technique makes it possible to check how innovation is related to other firm characteristics and whether there is a clustering tendency in the sample. First, we perform a factor analysis and then a cluster analysis to group observations.

3.1 Factor analysis

The variables usually employed to build taxonomies of innovation are related to innovative output, innovative input and sources of innovation (Pavitt 1984; Archibugi et al. 1991; Arvanitis and Hollenstein 1998). The section on innovation and technology available in our survey is not as detailed as the one available in innovation surveys such as the Community Innovation Surveys (CIS) developed for European countries. We choose the variables directly related to the introduction of, and motivation for, different types of innovation. More in detail, we have five dichotomous variables according to the type of innovative output realized in 2007 (product, process, service, management or marketing innovation).

Applying the Kaiser criterion, which suggests retaining any component with an eigenvalue greater than 1, we retain two factors.Footnote 9 Furthermore, the scree plot, which plots the eigenvalues associated with each component, graphically confirms our decision (Figs. 1, 2).

Fig. 1 Scree plot of eigenvalues after factor

Fig. 2 Factor loadings

Due to the small number of variables and the multicollinearity among them, the two factors together account for the whole of the variance: the first factor accounts for 80 % of total variance and the second for 20 %. Finally, the interpretability criterion requires verifying that the retained components are meaningful in terms of the constructs under investigation. In this sense, the first factor seems to capture product innovation strategies: it has a strong positive score on product innovation, service innovation and product innovation strategies. The second factor has a positive score on process and management innovation and on cost strategy performance. The two factors thus seem to define two alternative innovation strategies, product and process innovation, which is a well-established finding in the innovation literature. For each retained factor, we have at least three variables with a significant loading (>0.50). Finally, we created factor scores using the regression (Thomson) and Bartlett scoring methods.

Then, on the basis of the retained factors, we perform a hierarchical clustering procedure on our sample, selecting four groups, as shown by the dendrogram in Fig. 3. We retain the four-cluster solution because it satisfies both cluster stopping rules (the Calinski–Harabasz pseudo-F stopping rule and the Duda–Hart index). All four MANOVA tests (Wilks’ lambda, Lawley–Hotelling trace, Pillai’s trace, Roy’s largest root) reject the null hypothesis that the four clusters have equal means (Figs. 3, 4, 5).

Fig. 3 Dendrogram for cluster analysis

Fig. 4 Distributions of firms by clusters

Fig. 5 Distribution of firms by clusters (adjusted by sampling weights)

The four clusters retained are composed of 1,505, 368, 443 and 349 firms, respectively. Applying sample weights, the overall picture is similar, although the first group becomes larger and the second, third and fourth groups correspondingly smaller.

3.2 Non innovators

The first cluster groups relatively young firms (13 years) compared to the other clusters. The amount of investment in innovative activities (innovative inputs) is comparatively small: the only component of innovative investment is ICT expenditure (0.02 % of total investments). R&D is negligible, as is the number of innovations realized. Surprisingly, these firms are large in terms of sales ($234,000–1,165,992), they export roughly 3 % of sales and most of their investments are “standard investments” in machines and equipment. For this large group of firms, we do not detect a cost or product strategy. This cluster, which includes over half of the sample (1,505 firms), has been labelled “Non Innovators”. On average, no firms declare that they use innovation subsidies, which supports the “non innovator” label.Footnote 10

3.3 Cost strategy innovators

In the second cluster we find the oldest firms in the sample. Their innovative effort is mostly devoted to cost saving. They export more than 7 % of their sales and their income is between 801 and 2,400 UF ($40,000–100,000). We label this cluster “Cost Strategy Innovators”. In terms of innovative inputs, 82 % are standard investments, but 5 % still go to innovative activities, in particular ICT technologies (4 %) and R&D (only 1 %). In terms of innovative output, half of them have some type of process/product quality certification. Finally, 5 % seem to participate in innovation support programs, in this case financing process innovation.

3.4 Product strategy innovators

Finally, the third and fourth clusters display the same kind of innovative strategy, namely product strategy. The third cluster groups smaller firms in terms of number of employees (206) and level of sales. Its firms are younger than those in the fourth cluster and have a lower level of investment in innovation (4 %). In the last cluster, on the contrary, we find the highest level of investment in innovative activities, even if the R&D component is still very low (0.01 %). It is characterized by high levels of both product and process innovation, and product strategy is the main motivation behind innovative activities. Since they share the same innovative strategy, we merge the third and fourth clusters and label the resulting group “Product Strategy Innovators”.

Due to the multisectoral composition of our survey, it seems interesting to analyze the sectoral distribution of firms in the three clusters (Fig. 6).

Fig. 6 Sectoral distribution of firms by cluster

In absolute terms, wholesale and retail trade and manufacturing are the most represented sectors in our sample. Unfortunately, the survey does not provide a more detailed classification that would allow us to decompose ISIC Rev.3.1 at the two-digit level. However, the presence of services and agriculture makes it possible to extend the analysis beyond manufacturing.

If we only consider the specific strategy and the type of innovation introduced, firms appear to be almost equally distributed across sectors. In absolute terms, the “Non Innovators” group is the largest one.

4 Wages and innovations in Chilean firms: results

As stated in Sect. 2, we first estimate a pooled model and then test the null hypothesis of absence of section-specific random effects. Table 3 provides the estimates of average wage premiums for the different professional groups. We use the pooled model \( (y_{it} = \alpha + \beta x_{it} + \varepsilon_{it} ) \), corrected for heteroscedasticity through the robust variance option adjusting for within-cluster correlation. Table 2 describes the variables included in the model.

Table 2 List of variables and description
Table 3 Pooled model by professional group

In terms of product and cost strategy innovators, the introduction of new products seems to have a positive impact on wages for all professional groups except unskilled manual workers. Product innovator firms pay, on average and ceteris paribus, 13.58 %Footnote 11 more than non-innovator firms. The relationship remains significant for each professional category except unskilled manual workers: the average annual wage in product innovating firms is 15.78 % higher than in non-innovating firms for managers, 11.24 % higher for clerks and 14.54 % higher for skilled manual workers. We do not find significant coefficients for unskilled manual workers.

For firms classified as “cost strategy innovators” we still find a positive relation on average: the average firm annual wage per worker is 13.71 % higher than in non-innovators. Looking at this relationship by professional group, the only significant coefficient is for clerks, whose average annual wage is 9.76 % higher in “cost strategy innovators” than in “non-innovators”. Conceptually, the introduction of new products seems to be an explanatory variable for higher wages, at least for managers, clerks and skilled workers. Cost strategy innovations, such as process innovations, also seem to push wages upward, at least for clerks.
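For readers who want to move between regression coefficients and percentage premiums, the standard conversion for a dummy variable in a log-linear equation is sketched below. It rests on two assumptions that the text above does not confirm: that the dependent variable is the logarithm of the wage, and that the reported percentages use the exact transformation \( 100\,(e^{\beta} - 1) \). The function names are hypothetical.

```python
import math

def pct_premium(coef):
    """Exact percentage effect of a dummy regressor in a log-linear equation.

    Assumes the dependent variable is log(wage); this is an assumption,
    not something stated in the text.
    """
    return 100.0 * (math.exp(coef) - 1.0)

def coef_from_pct(pct):
    """Inverse mapping: the log-point coefficient implied by a percentage premium."""
    return math.log(1.0 + pct / 100.0)

# Round trip on one of the reported premiums
implied_coef = coef_from_pct(13.58)
recovered_pct = pct_premium(implied_coef)
```

For small coefficients the log-point value and the percentage are close, but the gap widens for larger effects, so the two should not be used interchangeably.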

Labour productivity is a strong determinant of workers’ compensation, especially for skilled and unskilled workers. For union affiliation, a proxy for bargaining power, we also find a positive coefficient for all professional categories: the higher the share of unionized workers in the firm, the higher the negotiated wage.

As expected, education is an important determinant of wages. For managers and clerks, the share of workers with a university degree pushes wages upward; for skilled and unskilled manual workers, secondary and primary education are strongly significant.

On average and ceteris paribus, the share of female workers decreases the firm average wage by 24.22 %. This holds for each professional category except unskilled workers, and most of all for managers (−40.12 %).

Firm age has a non-linear effect on the wages paid to each professional group, following an inverted U-shaped relationship.

4.1 Testing the viability of the pooled model

As stated in Sect. 2, we need to test the validity of the pooled model by applying the Breusch–Pagan LM statistic, which tests whether the variance of the individual effects in the error term is zero. In all cases, except for unskilled manual workers, we reject the null of \( Var\left( u \right) = 0 \), suggesting that the pooled estimation is not appropriate. We therefore estimate fixed and random effects models, relaxing the pooled-model assumption that \( \alpha \) is constant across firms, under which only the error term captures differences among firms and over time.

At this stage we perform both fixed and random effects estimations, and we use the Hausman test to evaluate the viability of the random effects estimates. For the fixed effects model (within estimator), we allow a different effect for each firm (\( y_{it} = \mu + \alpha_{i} + \beta x_{it} + \varepsilon_{it} \)). We replace the CLUSTER variable (Product Strategy Innovator, Cost Strategy Innovator and Non-Innovator) with three dummies for the kind of innovation introduced (Marketing-Management-Product), because the strategy proxied in the pooled model shows little variability over the short time dimension of our panel and would be differenced out in the fixed effects model.

The Hausman test, implemented to determine the validity of the random effects model, rejects in all cases the null hypothesis that the random effects are independent of the explanatory variables (see footnote 12). The standard recommendation in this case is to apply the fixed effects model, because an inappropriate use of the random effects model yields biased estimates (Table 4).
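The testing sequence described here can be illustrated with a short sketch. It assumes a balanced panel with observations ordered firm by firm, uses simulated data, and introduces hypothetical function names; it is not the code used to produce Table 4. The Hausman function takes fixed and random effects estimates as inputs rather than computing the random effects estimator itself.

```python
import numpy as np

def breusch_pagan_lm(resid, n_firms, T):
    """Breusch-Pagan LM statistic for H0: Var(u) = 0.

    resid: pooled OLS residuals of a balanced panel, ordered firm by firm.
    Under H0 the statistic is asymptotically chi-squared with 1 d.o.f.
    """
    e = np.asarray(resid, dtype=float).reshape(n_firms, T)
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return n_firms * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

def within_estimator(y, X, n_firms, T):
    """Fixed effects (within) estimator: demean by firm, then run OLS.

    X holds only time-varying regressors (no constant); anything constant
    within a firm is wiped out by the demeaning.
    """
    y = np.asarray(y, dtype=float).reshape(n_firms, T)
    X = np.asarray(X, dtype=float).reshape(n_firms, T, -1)
    yd = (y - y.mean(axis=1, keepdims=True)).ravel()
    Xd = (X - X.mean(axis=1, keepdims=True)).reshape(n_firms * T, -1)
    beta, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    return beta

def hausman(b_fe, V_fe, b_re, V_re):
    """Hausman statistic (b_fe - b_re)' [V_fe - V_re]^(-1) (b_fe - b_re)."""
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    return float(d @ np.linalg.pinv(V_diff) @ d)

# Simulated panel in which the firm effect is correlated with x,
# so pooled OLS is biased while the within estimator is not.
rng = np.random.default_rng(1)
n_firms, T = 500, 2
alpha = 2.0 * rng.normal(size=n_firms)
x = 0.5 * np.repeat(alpha, T) + rng.normal(size=n_firms * T)
y = 0.5 * x + np.repeat(alpha, T) + rng.normal(size=n_firms * T)

Xp = np.column_stack([np.ones_like(x), x])
b_pooled, *_ = np.linalg.lstsq(Xp, y, rcond=None)
lm = breusch_pagan_lm(y - Xp @ b_pooled, n_firms, T)
b_fe = within_estimator(y, x.reshape(-1, 1), n_firms, T)
```

The demeaning step also shows why a regressor with little within-firm variation, such as the CLUSTER variable, cannot be identified in the fixed effects model.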

Table 4 Fixed effects by professional group

Controlling for firm-specific, time-invariant characteristics, the fixed effects model detects a positive impact of marketing innovation on the average wage, especially for managers and clerks: the wage premium associated with the introduction of marketing innovation at firm level is earned by high-skilled workers. Other types of innovation do not affect the average wage by professional group.

Further controls are needed in order to check for endogeneity of innovation and productivity.

4.2 Controlling for endogeneity

In the previous two subsections we treated innovation as exogenous, introducing it as an independent regressor in the wage equation and assuming no violation of the zero conditional mean assumption \( [E\left( {u|x} \right) = 0] \). In the literature, however, the introduction of innovations is not always considered an exogenous process, because wages may in turn influence innovation. Following Martínez-Ros (2001), we therefore instrument innovations with past innovations (\( I_{it - 1 } \)). It is reasonable to think that firms which introduced innovations in the past are more likely to innovate in the future as well (see footnote 13). A good instrument should have two properties: first, it should be highly correlated with the endogenous variable; second, it should not be correlated with the error term of the wage equation, so that it affects the dependent variable only through the endogenous variable. In this sense, past innovations could be a good predictor of the current level of innovation, while past innovations with two lags are not an explanatory variable of the current wage.

On the basis of these considerations, we estimate our wage Eq. (2) for 2009 only, instrumenting the introduction of product, marketing and management innovations with past innovations (2007). In particular, we use the two-stage least squares (2SLS) estimator, because the number of instruments coincides with the number of parameters, and we use the robust option for standard errors.
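A minimal sketch of the 2SLS procedure with heteroscedasticity-robust standard errors is given below. The data are simulated and the innovation variable is a continuous stand-in for the innovation dummies, so the example illustrates the mechanics only; the function name and all variable names are hypothetical.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """2SLS with heteroscedasticity-robust standard errors.

    X: regressors, including the endogenous one(s) and a constant.
    Z: instruments, including the exogenous regressors and a constant.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # First stage: project every column of X on the instrument set
    first_stage, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage
    # Second stage: regress y on the fitted regressors
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    # Residuals must be computed with the original X, not the fitted X
    resid = y - X @ beta
    A_inv = np.linalg.inv(X_hat.T @ X_hat)
    meat = (X_hat * resid[:, None] ** 2).T @ X_hat
    V = A_inv @ meat @ A_inv
    return beta, np.sqrt(np.diag(V))

# Simulated cross-section: past innovation shifts current innovation,
# and the wage error is correlated with current innovation.
rng = np.random.default_rng(2)
n = 5000
past_innov = rng.binomial(1, 0.5, size=n).astype(float)  # instrument
v = rng.normal(size=n)
innov = 0.8 * past_innov + v        # continuous stand-in for innovation
u = 0.7 * v + rng.normal(size=n)    # source of endogeneity
log_wage = 1.0 + 0.3 * innov + u

X = np.column_stack([np.ones(n), innov])
Z = np.column_stack([np.ones(n), past_innov])
b_iv, se_iv = two_stage_least_squares(log_wage, X, Z)
b_ols, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
```

In this simulation OLS overstates the innovation coefficient because of the correlated error, while the instrumented estimate stays close to the true value of 0.3.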

Estimating the Hayashi (2000) C statistic, also known as the difference-in-Sargan statistic, we do not reject the null of exogenous regressors. Unlike Martínez-Ros (2001), we find that all three types of innovation considered are exogenous in our wage equations.

Another source of endogeneity can come from productivity, especially if we adopt the efficiency wage framework. According to this paradigm, higher wages lead to higher productivity, causing a problem of reverse causality in our estimations. In this case too, we use a lagged measure of productivity to instrument its current level. Here we do reject the null of exogeneity of the instrumented variable. Given the small dimension of our panel, a viable solution could be to re-estimate the equation in growth rates.

4.3 Growth rates, sample selection and shares

As a further step in the analysis, and in order to reduce endogeneity problems, we re-estimate the same model in growth rates, interacting the technology variable with the shares of managers, skilled and unskilled workers. Here we follow Acemoglu and Autor (2011), regressing the rate of change of the wage on a set of variables that includes proxies of technological change and the shares of professional groups. In particular, following Bogliacino et al. (2012), we interact the two innovation strategies identified in the cluster analysis (Product and Cost Innovators) with the initial shares of managers, skilled and unskilled workers. Unlike Acemoglu and Autor (2011) and Bogliacino et al. (2012), we do not introduce a time dummy because, after computing growth rates, only one time period remains (Table 5).
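The construction of the growth-rate regression with interaction terms can be sketched as follows. Several choices are assumptions made for the example and are not taken from the text: the growth rate is measured as a log difference, clerks are the omitted share, non-innovators are the base category, and the data are simulated without noise so that the coefficients can be recovered exactly. The function name `growth_rate_design` is hypothetical.

```python
import numpy as np

def growth_rate_design(innov, shares):
    """Design matrix: constant, innovation dummies, initial shares, interactions.

    innov:  (n, m) innovation-strategy dummies (base category omitted).
    shares: (n, g) initial shares of professional groups (one group omitted).
    Interaction columns are ordered strategy by strategy:
    innov_0 x share_0, innov_0 x share_1, ..., innov_1 x share_0, ...
    """
    innov = np.asarray(innov, dtype=float)
    shares = np.asarray(shares, dtype=float)
    n = innov.shape[0]
    inter = (innov[:, :, None] * shares[:, None, :]).reshape(n, -1)
    return np.column_stack([np.ones(n), innov, shares, inter])

# Simulated firms, purely illustrative
rng = np.random.default_rng(3)
n = 400
cluster = rng.integers(0, 3, size=n)  # 0 = non-innovator (base), 1 = product, 2 = cost
innov = np.column_stack([cluster == 1, cluster == 2]).astype(float)
# Shares of managers, skilled and unskilled workers; the clerks' share is dropped
shares = rng.dirichlet(np.ones(4), size=n)[:, :3]

D = growth_rate_design(innov, shares)
beta_true = np.linspace(-0.2, 0.2, D.shape[1])
w07 = rng.uniform(1.0, 5.0, size=n)
w09 = w07 * np.exp(D @ beta_true)      # noiseless wages, for checking only
growth = np.log(w09) - np.log(w07)     # log-difference growth rate, 2007 to 2009
beta_hat, *_ = np.linalg.lstsq(D, growth, rcond=None)
```

With only two waves, differencing leaves a single cross-section of growth rates, which is why no time dummy can be included.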

Table 5 Growth rates

The significance of product and cost strategies is almost lost when we consider the rate of change of wages and the impact of the technologies interacted with the shares of managers, skilled and unskilled workers. The total change in wages registered during 2007–2009 is influenced by the share of managers, the most skilled category in our sample. In this sense, a tendency toward upskilling is detected in the sample. Furthermore, the only impact registered is that of management innovation, which reduces wages for unskilled workers. As in previous models, education and the change in productivity strongly influence the change in wages. The picture is the same when we introduce in the regression three dummies for marketing, management and product innovations.

5 Summary of findings and conclusions

This study has aimed to investigate the links between innovation strategies and wage distribution across professional categories. Based on a sample of 2,667 Chilean firms, we first identified, by means of a factor analysis, different innovative patterns in terms of the introduction of innovations, the type of innovation and the motivation for innovating. We identified three main clusters of firms with respect to the main innovative patterns: product innovators, process innovators with cost-cutting strategies and non-innovators. This subdivision does not reflect a sectoral distribution of firms per cluster. In this sense, we support recent findings in the innovation literature on innovative firms across sectors (de Jong and Marsili 2006) rather than on “innovative sectors”.

Secondly, we analyzed the existence of a causal relation between innovation patterns and the wages paid to each professional group. On average, we found a positive and significant coefficient for product innovations and for process innovations in terms of their impact on the average firm wage, compared with non-innovating firms. At firm level, product innovations seem to push up the wages of managers, clerks and skilled workers. This is also true for process innovation: on average, the annual wage is higher in process innovating firms than in non-innovating firms.

This general framework seems to confirm the theory of a positive impact of innovations, in particular product innovations, on wages for all professional categories except unskilled manual workers. Salerno et al. (2008) found a technology wage premium of 12.07 % for Brazilian firms that innovate and differentiate products. Using a sample of Spanish firms, Martínez-Ros (2001) found a wage premium of 7 % when firms innovate in process and of 20 % when firms innovate in both activities (process and product innovations). Doms et al. (1997) found a technological wage premium of 8 % for production, clerical, technical and sales workers after controlling for worker characteristics. In contrast to our estimates, they did not find a positive relation for managerial and professional wages.

Our results are broadly in line with previous findings in terms of the magnitude of the coefficients. However, when we implement a fixed effects strategy to control for firm heterogeneity and to assess the consistency of the relation between wages and innovation, we find a positive impact of marketing innovation for managers and clerks, which turns negative for unskilled workers. Furthermore, marketing innovation seems to capture all other kinds of innovation strategies.

Several challenges and caveats are related to the implementation and contribution of this study. First, the survey is new, and the entrepreneurs filling in the questionnaire may have misunderstood what counts as an innovation (see footnote 14). This could create a measurement error in our “technology measure”. Second, compared with previous studies, we extend our analysis to services, agriculture and mining; however, the lack of more specific sector decompositions makes it difficult to control for sectoral and industrial specificities, above all in the manufacturing sector.

Furthermore, the short time span of our panel did not allow us to perform a dynamic panel data estimation with the lagged wage as an explanatory variable, as is usually done in the literature.

Future research will attempt to take advantage of the information available in the Chilean Innovation Survey and the Annual Industrial Survey (ENIA).