Introduction

It is widely accepted that a country’s capacity to generate wealth and achieve high levels of well-being is closely linked to its capacity to generate knowledge. Knowledge is the basis for innovation and an essential requirement for increasing productivity in modern societies. In the EU a great deal of the generation and transmission of knowledge falls to Higher Education Institutions (HEIs). HEIs account for around 23.7 % of all R&D expenditure and generate about 64.3 % of all scientific publications and 2.9 % of all patents. HEIs produce knowledge through research, they disseminate it by training graduates and postgraduates and by publishing the results of the research, and they transfer it via collaboration agreements with companies and institutions.

The role of HEIs in today’s knowledge society and their contribution to regional socioeconomic development has been highlighted and quantified in the literature (e.g. Schubert and Kroll 2014; Pastor et al. 2015a, b). A number of studies have recently evaluated their performance in a national or international context. This proliferation of studies has been promoted by several factors. Firstly, in a context of fiscal consolidation, HEIs are considered by taxpayers as large consumers of public funds, forcing HEIs to demonstrate to society that they are making a proper and efficient use of the public funds. Secondly, financial constraints and increasing competition among HEIs to obtain public funds make it necessary to ascertain whether HEIs are getting an optimal research output with the financial and human resources that they use.

Nevertheless, although the literature on the efficiency of HEIs is large (e.g. Johnes 2006; Bonaccorsi and Daraio 2007; Johnes 1988; Glass et al. 1995; Flegg et al. 2004; Kempkes and Pohl 2010; Nazarko and Šaparauskas 2014; Worthington and Lee 2008; Kuah and Wong 2011; Guccio et al. 2016), most studies are devoted to analyzing the overall efficiency of HEIs without specifically examining their performance in terms of research output.

Moreover, even when studies analyze specifically the research output of HEIs (e.g. Johnes 1988 or Abbott and Doucouliagos 2004), they do not take into account the heterogeneity among them. In particular, there are significant differences among HEIs in terms of research output quality and the disciplinary composition of institutions, normally known as subject mix or specialization. In order to obtain a rigorous assessment of the performance of HEIs, we need to take into account the heterogeneity and possible impact that these differences may have in explaining their different performance.

Regarding quality, as some authors have asserted (Abbott and Doucouliagos 2003), focusing on outputs without considering their quality might bias the performance indicators of HEIs in favor of those institutions that provide low quality output. Similarly, since there are differences across fields of science (FOS) in terms of research productivity, quality of research and cost structure, these differences need to be taken into account when comparing measures of quantitative performance of HEIs; otherwise we may draw spurious or imperfect conclusions regarding the relative performance of certain institutions (Sarrico and Dyson 2004; Sarrico et al. 2009).

To our knowledge there is no article devoted to the analysis of differences in the research output of HEIs that jointly controls for differences in quality and specialization in different FOS and, besides, that quantifies how much of the measured inefficiencies are in fact merely the result of differences in quality or specialization.

The aim of this study is to analyse what determines the differences in scientific output per researcher in the HEIs of EU countries. To this end we develop a methodology that specifically considers the quality of scientific output from universities and their different specialisation according to field of science and technology (FOS). This methodology can be used to break down the differences in scientific output per researcher among the HEIs of each country in terms of (a) differences in efficiency within each field, (b) differences in FOS specialisation of the HEIs in each country, (c) differences in quality and (d) differences in allocation of resources per researcher.

The study is organised as follows. Following this introduction, Sect. 2 reviews the problems of measuring university activity, compiles some proposals from the literature, reviews the main existing problems and presents the proposal for a research output indicator. Section 3 describes the data used. Section 4 examines the importance of HEIs in EU research activity, evaluates the differences in scientific output among the EU countries and demonstrates the importance of approaching the problem in a disaggregated way in the different fields of science. Section 5 describes the methodology used. Section 6 very briefly presents some of the results obtained on the different components of inefficiency. The study ends with the main conclusions in Sect. 7.

The research output of the HEI

Researchers who analyse HEI research output face several problems (de Groot et al. 1991; Johnes 1996, 1988; Salas 2012; Abbott and Doucouliagos 2004; Pastor et al. 2015a, b). First, universities undertake various missions simultaneously (teaching, research and technological transfer). Second, the productive processes of the missions of HEIs are multiproduct. Hence, for example, HEIs produce various teaching outputs at the same time (graduates, postgraduates, etc.) or various research outputs at the same time (publications, patents, etc.). Third, even when the outputs to consider have been defined, not all of them have the same quality, so some measure of quality is needed in order to avoid wrong conclusions. Finally, the level and the quality of the outputs are very different across FOS.

There is a fairly general consensus that universities’ teaching output can be reasonably measured by the number of graduates or number of students.Footnote 1 Similarly, the most frequently used research outputs in the literature are publications, citations and, to a lesser degree, patents (Pastor et al. 2015a, b).

The problem arises when we want to analyse universities’ research output using only one indicator, either publications or patents, since by doing so we do not take into account the multiproduct nature of HEIs, and therefore ignore the results of a significant part of their research activity.

Figure 1 shows the different orientation of research activity in the HEIsFootnote 2 of the EU-28 countries. The two lines in the figure represent the arithmetic average of publications and patents per researcher for the 28 member states of the EU and delimit four quadrants. The figure shows the coexistence of different university systems in the EU-28: systems oriented to the production of patents, such as those of France and Germany, located in quadrant I, alongside systems with a much stronger orientation to producing publications, such as those of Sweden and Cyprus, located in quadrant IV. However, the most striking revelation is that within the EU there are university systems that stand out for their excellence in both types of research output (quadrant II) and others with poor results in the two indicators (quadrant III). The first group, made up of Ireland, Belgium and the Netherlands, stands out for excellent performance in both indicators. At the opposite extreme are university systems from countries in Eastern Europe such as Slovakia, Latvia, Lithuania and Bulgaria, with modest patent and publication outputs.

Fig. 1 Patents versus citable documents by R&D personnel. Annual average 2008–2010.

Source: SCImago Journal & Country Rank, Eurostat and own elaboration

The choice of number of publications as an indicator of representative output of HEIs’ research activity (and therefore excluding patents) is problematic only if there are considerable differences in specialisations across the university systems in different countries. Some university systems may specialise in the social sciences and humanities fields of science (FOS), whose main output is publications and where patents are practically non-existent. Others, by contrast, specialise in technical FOS with a much higher tendency to patent. The question we pose is whether or not the activity of publishing implies that patenting is relinquished and vice versa; in other words, whether the two outputs are negatively or positively correlated.

Some authors consider that patenting supplants scientific publishing, that is, that patenting implies that publishing is relinquished and vice versa. This is what some authors call the “substitution effect” (Klitkou and Gulbrandsen 2010). The explanation may be that the patenting process often involves a delay in publication, making it more difficult to publish a scientific paper. In turn, Crespi et al. (2011) state that if academic inventors become too involved in patenting activity, they may become distracted from (or devote less time to) other activities and focus mainly on the production of new knowledge that is patentable and from which some financial return can be extracted.

On the other hand there are authors who consider that a “reinforcement effect” (Klitkou and Gulbrandsen 2010) takes place between the two research activities of publishing and patenting, in other words, a situation in which research activity generates patents that translate into publications and/or publications that generate patents. This may occur in any direction since patenting can open up new scientific opportunities, lead to new ideas, create scientific networks, etc. And, alternatively, patents may result from these opportunities and networks.

Most of the empirical evidence supports the theory of the “reinforcement effect” suggesting that when a university produces one of the outputs (patents or publications), it may be likely to produce the other output as well. Stephan et al. (2007) examine the question of patenting for the US case finding patents to be positively and significantly related to the number of publications. This finding is robust to the choice of instruments and method of estimation. Carayol (2007) presents an empirical study on the patenting activities of the faculty members of the University Louis Pasteur revealing that publishing and patenting are positively related. Breschi et al. (2007) investigate the scientific productivity of 299 Italian academic inventors and match them with an equal number of non-patenting researchers. Their results do not support the idea of trade-off between patenting and publishing, instead they support a strong and positive relationship between patenting and publishing even in basic science. Azoulay et al. (2009) find that both the flow and the stock of scientists’ patents are positively related to subsequent publication rates. Moreover, this increase in output does not come at the expense of the quality of the published research. They disentangle correlation from causality in the assessment of the effect of patenting. This paper shows that patent holders differ from other researchers on many observable characteristics. More accomplished researchers are much more likely to patent, and controlling for the stock of past publications, scientists with a recent good run are also more likely to patent. Similarly, Buenstorf (2009) analyses the invention disclosure, licensing, and spin-off activities of Max Planck Institute directors finding that inventing does not adversely affect research output. Agrawal and Henderson (2002) estimated fixed-effect regressions of the effect of patenting in a 15-year panel of 236 scientists in two MIT departments. 
They found that patenting did not affect publishing rates. Fabrizio and DiMinin (2008) match 166 academic patentees with an equivalent number of non-patenting scientists finding a statistically positive effect of researchers’ patent stocks on their publication counts.

When analysing universities’ research output, selecting merely one of the various research outputs (e.g. publications) would not constitute an important problem if, as the literature suggests, there were a positive and mutually reinforcing relationship between the two activities (publishing and patenting). Figure 2 shows that the two leading research outputs of the HEIs of the EU-28 kept pace with each other over the period 2000–2010: patents multiplied by 1.84 and publications by 1.89. The similar evolution of patents and documents indicates that the substitution effect does not exist, but rather that there is a reinforcement effect between the activities of publishing and patenting. Therefore, as in other papers in the literature, the number of publications has been selected as a representative indicator of the volume of research output from European universities.

Fig. 2 Evolution of scientific output and patent applications. EU-28. 2000–2010, 2000 = 100.

Source: SCImago Journal & Country Rank, Eurostat and own elaboration

Another important problem that researchers need to consider is that the level and the quality of the outputs vary greatly among the FOS. Many studies do not consider this fact, obtaining results that are biased by the specialization or subject mix of HEIs.

Some authors have considered explicitly the FOS specialization in the assessment of the performance of HEIs. Thus, Filippini and Lepori (2007) highlight the importance of taking into account the FOS specialization of universities when making financial planning and establishing financing mechanisms. According to these authors, many countries only consider the cost per student, which is not accurate as there are huge specialization differences between universities and the cost per student is greatly influenced by this specialization or subject mix. As the authors say, considering specialization is an interesting topic for further research, but at the same time difficult due to lack of data. Lepori (2007) stresses the importance of FOS specialization of HEIs and establishes various types of HEIs. According to this author, the internal differentiation has allowed Switzerland the creation of universities mainly specialized in technical and scientific fields which can compete on a world basis. Abbott and Doucouliagos (2004) analyze the research output of the Australian university research centers in economics concluding that taking into account the specialization is very important. Their results show that the research output per capita in the research centers was much greater than that of teaching departments. Nevertheless, that difference in productivity disappears when the different specialization is taken into account, showing that specialization is very important when analyzing efficiency. Johnes and Johnes (2009) considered the FOS specialization in a context of cost efficiency analysis. They use parametric control variables by specialization of HEIs to avoid bias and to control for differences in producing graduates in science and non-science, also finding that specialization is important. Finally, Thanassoulis et al. 
(2011) analyze the performance of the HEIs in the UK taking into account the difference in specialization of HEIs by constructing different groups of HEIs, concluding that subject mix is important for overall productivity.

Our paper adds to this literature since, to our knowledge, there is no article devoted to the analysis of differences in the research output of HEIs that jointly controls for differences in quality and specialization in different FOS and, besides, quantifies how much of the measured inefficiencies are in fact merely the result of differences in quality or specialization.

Data

The data correspond to 28 European university systems for the period 2008–2012. As a measure of output we use the number of citable documents by country and by field of science. There are two main databases that provide information on the research output: The Web of Science (WoS) and Scopus.

The WoS database, produced by Thomson Reuters, includes more than 12,000 international journals and is managed commercially by the Institute for Scientific Information (ISI). Although it compiles information from 23 million documents and 3300 publishers from 71 countries, it is dominated by journals in English and journals in the hard sciences. As a result, publications written in languages other than English or in other fields of knowledge such as the social sciences are underrepresented.

Scopus includes 22,000 journals and 55 million documents since 1996. Nowadays, Scopus is the most serious competitor to the WoS. The geographical source of the titles of scientific journals is varied: it covers information from journals in 97 countries and English language journals are not overrepresented since 60 % are not based in the United States (US). This database is a serious alternative to the well-established Web of Science database, mainly because it is open access, it has a larger range of sources, it includes journals in languages other than English and it assesses the quality of citations (Falagas et al. 2008).

Researchers can freely access the following research output information by country and year: number of documents, number of citable documents, number of citations, citations per document, etc.Footnote 3 The information is also disaggregated by research area. This disaggregation is necessary in our study because of our aim to analyse the differences in output per researcher controlling for specialisation. To this end we created a correspondence between the research areas used by publications (SCIMAGO) and the fields of science (FOS) used by Eurostat for both patents and for R&D expenditure and personnel (Table 1).

Table 1 Correspondence between research areas (SCIMAGO) and fields of science (FOS)

In the case of input variables, we consider the intramural R&D expenditure (current and capital expenditure) and the full-time equivalent R&D personnel (researchers and other) by sector and by country.Footnote 4 This information is available from Eurostat (Statistics on research and development) for every HEI in each country and disaggregated by FOS.

Table 2 presents the information for the average of the period 2008–2012 for each of the EU-28 countries. The country with the highest scientific output is the UK (157,501 citable documents), representing 17.4 % of total EU output, followed by Germany (150,652 documents), France (111,261 documents), Italy (87,515 documents) and Spain (79,255 documents).

Table 2 Research indicators by country

In terms of quality, measured by the number of citations per document, the countries with the highest quality production are Denmark, the Netherlands, Sweden, Belgium, Ireland, Finland, Austria and the UK, all of which have more than 5 citations per citable document. At the opposite extreme are Romania and Lithuania, with fewer than 2 citations per document.

The facts

The term “knowledge-based economy” stems from the wide recognition of the place of knowledge and technology in modern economies. These societies are characterised by their intensive use of knowledge not only in practically every sphere of daily life but also in production activities. Practically all their activities are based on knowledge and on knowledge management. In European countries HEIs play a key role in this area. In HEIs knowledge is created through R&D activities, disseminated through their teaching activities and the publication of their research results, most of the time with guaranteed free access, and transferred by means of collaboration agreements with companies.

HEIs are key actors in the knowledge society and are essential for achieving greater levels of sustainable well-being. An extensive literature demonstrates the importance of universities in the socio-economic development of their economies.Footnote 5 Governments, aware of these benefits, devote considerable resources to their public universities. Precisely for this reason they demand a better use of these resources and more and better results in all their activities, but especially in R&D. The empirical evidence shows that universities in some countries have better R&D results than others, even when they use fewer resources (e.g. Pastor et al. 2015a, b). Before going on to explore the causes of this varied performance across European countries, we first consider it useful to review some of the typical features of research activity in their university systems.

We begin by analysing the importance of universities in research activity. Eurostat considers four large sectors of execution in expenditure on R&D activities: Higher Education, Government, Business enterprise sector and Private non-profit sector. Figure 3 shows that the HEIs of the EU-28 account for almost a quarter of R&D expenditure (23.4 %) and are, following companies (63.5 %), the second most important agent in R&D activities. In some countries HEIs account for more than half the total amount of financial resources devoted to R&D. This is the case of Cyprus or Lithuania, where expenditure on R&D in HEIs represents 57.3 and 54.7 % of total R&D expenditure, respectively.

Fig. 3 Distribution of R&D expenditure by sectors of performance. EU-28 countries. 2013.

Source: Eurostat

Obviously, the more resources devoted to research in universities, the greater the research output will be. Figure 4 shows the relationship between the resources in public R&D agents (universities, public research centres and hospitals) and one of the most important research outputs: the number of publications. Note that the EU countries with the greatest weight in terms of R&D expenditure by HEI also have the greatest weight in terms of publications. However, Fig. 4 also reveals a very important fact: research output does not depend exclusively on the resources used. Some countries are getting more value for the money allocated to R&D than others. That is the case of some small countries like Bulgaria, Romania, Croatia, Cyprus, Slovenia, Hungary, Greece and Portugal. The weight of these countries in terms of publications is more than twice their weights in terms of R&D expenditure. On the opposite side are the largest EU countries, Germany and France, where the weight in terms of R&D expenditure is higher than in terms of publications.

Fig. 4 Scientific output versus R&D expenditure. EU countries.

Figure 5 shows the scientific output related to R&D personnel, confirming the heterogeneity across countries. As can be seen, there are important differences in output per capita among the EU countries (e.g. the scientific output per capita in Cyprus is 6.8 times that of Latvia).

Fig. 5 Scientific output related to R&D personnel. EU countries.

Source: SCImago Journal & Country Rank and Eurostat

The next question we analyse is whether there are differences in the specialisations of European university systems. Figure 6 reveals important differences in specialisation in the fields of science (FOS). For example, the specialisation of Estonia in Humanities is 2.6 times the EU average and 8 times that of Luxembourg. Similarly, the UK is overspecialised in Social Sciences and Humanities: its specialisation in Social Sciences is 60 % higher than the EU average and in Humanities 70 % higher. In contrast, Germany is underspecialised in Humanities: 40 % lower than the EU average. The Netherlands and the Nordic countries (Sweden and Denmark) show a strong specialisation in Medical and Health Sciences.

Fig. 6 Distribution of scientific output by field of science. EU countries.

Source: SCImago Journal & Country Rank and own elaboration

Any differences in specialisations in the university systems will only explain the differences in output per capita among the HEIs in European countries if there are also different outputs per capita between the various FOS. Figure 7 represents the number of citable documents per R&D personnel. It reveals important differences in productivity among the FOS. The productivity of FOS3 (Medical sciences) is 1.58 citable documents per R&D personnel, 14 times higher than FOS6 (Humanities). Similarly, the productivity of FOS1 (Natural sciences) is 0.95 citable documents per R&D personnel, 8.4 times higher than FOS6.
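The composition effect described above can be illustrated with a short calculation. The sketch below is only an illustration: the field productivities are the EU figures quoted in the text (with FOS6 derived from the statement that FOS3 is 14 times FOS6), while the two university systems and their personnel shares are hypothetical and the function name is ours. It shows that two systems with identical productivity within each field can still differ widely in aggregate output per researcher purely because of their subject mix.

```python
# Citable documents per R&D personnel by field (EU figures quoted in the text).
productivity = {
    "FOS3 Medical sciences": 1.58,
    "FOS1 Natural sciences": 0.95,
    "FOS6 Humanities": 1.58 / 14,  # derived: the text states FOS3 is 14 times FOS6
}


def output_per_researcher(personnel_shares, field_productivity):
    """Aggregate output per researcher as a share-weighted sum of field productivities."""
    assert abs(sum(personnel_shares.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(share * field_productivity[field]
               for field, share in personnel_shares.items())


# Hypothetical personnel shares (assumptions for illustration, not data from the study).
system_a = {"FOS3 Medical sciences": 0.6, "FOS1 Natural sciences": 0.3, "FOS6 Humanities": 0.1}
system_b = {"FOS3 Medical sciences": 0.1, "FOS1 Natural sciences": 0.3, "FOS6 Humanities": 0.6}

out_a = output_per_researcher(system_a, productivity)
out_b = output_per_researcher(system_b, productivity)

# Both systems are equally productive within every field, yet the
# aggregate figures differ because of specialisation alone.
print(f"System A: {out_a:.3f} documents per researcher")
print(f"System B: {out_b:.3f} documents per researcher")
```

In this hypothetical case the whole gap between the two systems is a composition effect, which is why an aggregate comparison that ignores specialisation would wrongly read it as inefficiency.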

Fig. 7 Scientific output related to R&D personnel by field of science. EU countries. a FOS1 Natural sciences. b FOS2 Engineering and technology. c FOS3 Medical and health sciences. d FOS4 Agricultural sciences. e FOS5 Social sciences. f FOS6 Humanities.

Source: SCImago Journal & Country Rank, Eurostat and own elaboration

As well as the FOS specialisation, another of the reasons that may explain the differences in per capita output in EU countries’ HEIs is the difference in per capita resources. Countries whose researchers have more resources for research activity will obtain greater output. Figure 8 represents the R&D expenditure per R&D personnel and reveals important differences in R&D expenditure per capita. Note, for example, that the R&D per capita in Sweden is 2.2 times higher than the EU average and 25 times higher than in Bulgaria. In general one group of countries allocates far more resources than the average: Sweden, Austria, Netherlands, Denmark and Germany. The R&D expenditure per capita of these countries is more than 40 % higher than the EU average. In contrast, in countries like Bulgaria, Romania, Croatia, Slovakia, Latvia, Lithuania, Hungary, Greece, Poland, Portugal, Slovenia and Estonia the R&D per capita is 40 % lower than the average.

Fig. 8 R&D expenditure per R&D personnel. EU countries.

Source: SCImago Journal & Country Rank, Eurostat and own elaboration

In summary, we find considerable differences in output per capita (citable documents per R&D researcher) among the HEIs of EU countries. The evidence indicates that there are four possible factors causing these differences among the HEIs of the EU countries: differences in field of science specialisation, differences in efficiencies within FOS, differences of quality and differences in R&D expenditure per capita.

We will analyse the extent to which differences in terms of specialisation, output quality, efficiency within the scientific fields and R&D expenditure per capita explain the differences in the research output among the HEIs in the EU.

Methodology

We need a methodology that identifies the determinants of HEI research output. Specifically, we want to know to what extent differences in terms of R&D expenditures, output quality, field of science specialisation and technical inefficiencies explain the differences in the research output and productivity among the EU HEIs.

The concept of technical efficiency refers to the optimal use of resources. The first author to introduce this measure of efficiency was Farrell (1957). He proposed defining the technical efficiency as the radial increment that can be performed on the outputs of a company (in our case, a HEI) given a vector of inputs.Footnote 6

The indicator of technical efficiency (θ) is illustrated in Fig. 9, which represents a simple case with one output (Y), one input (X) and four HEIs: A, B, C and D. The technical efficiency measure (θ) is the ratio between the potential output (\(Y^{*}\)) and the actual output (\(Y\)). The technical efficiency indicator of HEI D is therefore the ratio \(\theta_{D} = Y_{D}^{*}/Y_{D}\). As can be seen in the figure, HEI D is technically inefficient (\(\theta_{D} > 1\)), since it is possible to increase its output with the same amount of input. However, A, B and C are technically efficient (\(\theta_{A} = \theta_{B} = \theta_{C} = 1\)) because it is not possible for them to increase the level of output given the level of inputs used. Data envelopment analysis (DEA) is a linear programming methodology that measures the efficiency of multiple decision-making units when the production process, as in the case of HEIs, presents a structure of multiple inputs and outputs (Charnes et al. 1978).
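As a purely numerical illustration of this ratio (the values are hypothetical and are not read from Fig. 9), suppose that HEI D produces 80 documents while the frontier output attainable with its inputs is 100 documents:

$$\theta_{D} = \frac{Y_{D}^{*}}{Y_{D}} = \frac{100}{80} = 1.25$$

Under these assumed values D could raise its output by 25 % without using more inputs, whereas a HEI located on the frontier has potential output equal to actual output and therefore a score of 1.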

Fig. 9 Output-oriented technical efficiency

We develop a multi-step methodology based on a DEA non-parametric methodology.Footnote 7 This step by step DEA-based methodology allows us to decompose total inefficiency into the composition (or specialisation) effect and the effect due to inefficiency within each sector. This methodology will allow us to analyse the universities’ research output in terms of differences in the output quality within each specific FOS, differences in intra-field inefficiency (inefficiencies of HEIs within each specific field), and differences in specialisation (the effect due to their FOS specialisation).

The usefulness of this approach is that it allows us to incorporate the particular nature of HEI research activity into the analysis. FOS are characterised by different propensities to publish as the data suggest (Fig. 7). These differences in the characteristics of the FOS may influence the aggregated results. For this reason, instead of directly considering the aggregate research output of HEIs, we consider the output of each FOS. From this standpoint the approach allows us to distinguish two different effects: a composition effect due to specialisation and another component that we will call intra-field inefficiency, which is associated with a deficient use of resources allocated to each particular FOS. In order to properly measure the maximum achievable output of HEIs in each country, and their global inefficiency, the analysis should include both effects. Intra-field efficiency is due to a more or less efficient use of productive factors within each FOS, and the composition effect depends on being specialised in the FOS that are more (or less) productive. According to this second component, it would seem as if a HEI could improve its efficiency simply by increasing the weight of the FOS that tends to look more productive in terms of the efficiency indicator. Actually, if there is substantial heterogeneity across FOS and this fact is not taken into account a recommendation such as “close down fields with lower publications-per-scientist ratios” would seem to make sense indeed. Nevertheless, when the heterogeneity across FOS is taken into account, as our method allows us to do, things are more complex. The part of the apparently bad results due to the particular specialization can be taken into account, showing that the research system is not as bad as it would seem otherwise. If countries consider that those “low productivity” FOS are important they should keep researching on them and this should not penalize them.

In order to illustrate our 5-step methodology, let us assume that there are R countries and N fields of science (FOS), and that \((X_{i1}^{n}, \ldots, X_{iM}^{n})\) is the vector of M inputs that the HEIs of country i use in FOS n for the production of \(Y_{i}^{n}\).

STEP 1: Research output quantitative inefficiency by scientific field

First we consider efficiency in terms of number of documents by FOS to evaluate by how much each country could increase the number of documents in each FOS without using more resources and personnel. The research output quantitative inefficiency of the HEIs of country i in FOS n (\(\theta_{i}^{n}\)) will be obtained by solving the following standard DEA problem:

$$\begin{aligned} & {\text{Max}}\; \theta_{i}^{n} \\ & {\text{s.t.}} \\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} Y_{r}^{n} \ge Y_{i}^{n} \theta_{i}^{n} \\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} X_{rm}^{n} \le X_{im}^{n} \quad m = 1, \ldots ,M \\ & \lambda_{r} \ge 0\quad r = 1, \ldots ,R \\ \end{aligned}$$
(1)

\(\theta_{i}^{n}\) is the efficiency score of the HEIs of country i in scientific field n. It represents the factor by which the HEIs of country i could expand their output in scientific field n without increasing the input vector (in our case R&D expenditure and R&D personnel). A higher score implies more inefficiency, and a value of 1, the minimum value, means that country i is efficient in field n, as it lies on the frontier.

Using this efficiency score of the HEIs of country i in each of the six fields of science considered (\(\theta_{i}^{n}\)), we are able to calculate the potential output of the countries in each FOS (\(\hat{Y}_{i}^{n}\)), that is, the maximum output that the countries’ HEIs could achieve in each FOS if they were efficient in each one of their N FOS:

$$\hat{Y}_{i}^{n} = Y_{i}^{n} \theta_{i}^{n}$$
(2)

STEP 2: Research output inefficiency by scientific field including the quality of the output (pure inefficiency)

The previous research output inefficiency of the HEIs of country i in FOS n (\(\theta_{i}^{n}\)) does not consider the quality of the output. However, failing to consider quality would imply penalising those HEIs that consume more inputs not because they are more inefficient, but because the output they produce is of a higher quality. If this aspect is not taken into account, we would be interpreting as inefficiency what is actually a higher consumption of resources to produce a higher quality output.

The number of citations per document is the most commonly used indicator by researchers in order to take into account the quality of research. Other indicators are the impact factor (IF), the percentage of publications in journals in the first quartile (Q1), the SCImago Journal Rank (SJR), the Eigenfactor score, the h-index and the nh3 index (Abbott and Doucouliagos 2003; Pastor et al. 2015a, b). All these indicators are based on the analysis of the citations received by documents and all of them attempt, via a normalisation technique, to improve information on the number of citations, to compensate for the variability of the citation culture in different fields (CWTS 2009; SCImago 2012a, b; Vieira et al. 2009).

The use of citations as an indicator of research quality and impact is based on the assumption that the citation of a document represents recognition of its interest and usefulness in the construction of new knowledge (González-Albo et al. 2012). Although citation-based indicators have certain limitations, widely described in the literature (Rey 2009; Moed 2005), their use is currently accepted as indicators of research influence. We use the number of citations per document (CD) as an indicator of scientific output quality.

The research output inefficiency of the HEIs of country i in FOS n that controls for the quality of output (\(\theta_{\text{Qi}}^{n}\)) will be obtained by adding a further constraint to the problem of STEP 1:

$${\text{Max}}\;\theta_{\text{Qi}}^{n}$$
(3)
$$\begin{aligned} & {\text{s}} . {\text{t}} .\\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} Y_{r}^{n} \ge Y_{i}^{n} \theta_{\text{Qi}}^{n} \\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} X_{\text{rm}}^{n} \le X_{\text{im}}^{n} \quad m = 1, \ldots ,M \\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} {\text{CD}}_{r}^{n} \ge {\text{CD}}_{i}^{n} \\ & \lambda_{r} \ge 0\quad r = 1, \ldots ,R \\ \end{aligned}$$

where \(\theta_{\text{Qi}}^{n}\) is the efficiency score of the HEIs of country i in the scientific field n that controls for quality, and represents the potential increase that the HEIs of country i could achieve in the output of the scientific field n without increasing the input vector and while maintaining the same quality of research output (citations per document).
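
The effect of the extra constraint in problem (3) can be sketched on toy data as follows. The snippet is a hedged illustration, not the authors' code: the names `solve_square`, `max_lp` and `dea_score`, the three units and their citations-per-document values are all hypothetical, and the linear programme is again solved by brute-force enumeration of basic solutions (an assumption made to keep the example dependency-free). The quality constraint is passed to the solver in "less than or equal" form by changing its sign.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if A is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]


def max_lp(c, G, h):
    """Maximise c.x subject to G x <= h, x >= 0 by enumerating basic solutions."""
    k, n = len(G), len(c)
    cols = [[G[i][j] for i in range(k)] for j in range(n)]
    cols += [[1 if i == j else 0 for i in range(k)] for j in range(k)]  # slacks
    best = None
    for basis in combinations(range(n + k), k):
        A = [[cols[j][i] for j in basis] for i in range(k)]
        sol = solve_square(A, h)
        if sol is None or any(v < 0 for v in sol):
            continue
        value = sum(Fraction(c[j]) * v for j, v in zip(basis, sol) if j < n)
        if best is None or value > best:
            best = value
    return best


def dea_score(i, X, Y, CD=None):
    """Output-oriented DEA score of unit i; pass CD to add the quality constraint."""
    R, M = len(X), len(X[0])
    c = [Fraction(Y[r]) / Fraction(Y[i]) for r in range(R)]
    G = [[X[r][m] for r in range(R)] for m in range(M)]
    h = [X[i][m] for m in range(M)]
    if CD is not None:
        # sum_r lambda_r CD_r >= CD_i, rewritten as -sum_r lambda_r CD_r <= -CD_i
        G.append([-CD[r] for r in range(R)])
        h.append(-CD[i])
    return max_lp(c, G, h)


# Hypothetical data: two inputs, documents, and citations per document.
X = [(10, 5), (5, 10), (10, 10)]
Y = [100, 100, 100]
CD = [10, 2, 9]

theta = dea_score(2, X, Y)        # quantity only
theta_q = dea_score(2, X, Y, CD)  # controlling for quality
```

In this example the quantity-only reference mix for the third unit has too few citations per document to satisfy the added constraint, so the quality-controlled score (7/6) is lower than the quantity-only score (4/3). More generally, adding a constraint to a maximisation problem cannot raise its optimum.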

As in STEP 1 we can calculate the potential output of each field of science n controlling for quality \(\left( {\hat{Y}_{\text{Qi}}^{n} } \right)\), in other words, the maximum output that could be achieved in each FOS if the HEIs of each country i were efficient, controlling for quality. To do this we use the efficiency score of the HEIs of country i in the scientific field n that controls for quality, \(\theta_{\text{Qi}}^{n}\):

$$\hat{Y}_{\text{Qi}}^{n} = Y_{i}^{n} \theta_{\text{Qi}}^{n}$$
(4)

STEP 3: Scientific field efficient aggregate research output

Using the results of STEP 1 and STEP 2, we can estimate the efficient aggregate research output of the HEIs of each country (i.e., the aggregated output assuming that all HEIs are efficient in each scientific field). We will calculate both the aggregated output in terms of the number of documents \(\left( {\hat{Y}_{i} } \right)\) and the aggregate output controlling for quality \(\left( {\hat{Y}_{\text{Qi}} } \right)\):

$$\hat{Y}_{i} = \mathop \sum \limits_{n = 1}^{N} \hat{Y}_{i}^{n} = \mathop \sum \limits_{n = 1}^{N} Y_{i}^{n} \theta_{i}^{n}$$
(5)
$$\hat{Y}_{\text{Qi}} = \mathop \sum \limits_{n = 1}^{N} \hat{Y}_{\text{Qi}}^{n} = \mathop \sum \limits_{n = 1}^{N} Y_{i}^{n} \theta_{\text{Qi}}^{n}$$
(6)
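
Equations (2), (4), (5) and (6) amount to scaling each field's actual output by its score and summing over fields. The short Python sketch below shows this on made-up numbers; the three field names, the document counts and the scores are hypothetical and are not taken from the data used in this paper.

```python
from fractions import Fraction

# Hypothetical data for one country: actual documents per field of science and
# the field-level scores from STEP 1 (theta) and STEP 2 (theta_q).
actual = {"natural": 400, "engineering": 300, "humanities": 100}
theta = {
    "natural": Fraction(5, 4),
    "engineering": Fraction(1),
    "humanities": Fraction(3, 2),
}
theta_q = {
    "natural": Fraction(11, 10),
    "engineering": Fraction(1),
    "humanities": Fraction(3, 2),
}


def potential_output(actual, scores):
    """Scale each field by its score (Eq. 2 or 4) and sum over fields (Eq. 5 or 6)."""
    return sum(actual[field] * scores[field] for field in actual)


y_hat = potential_output(actual, theta)      # 500 + 300 + 150 = 950
y_hat_q = potential_output(actual, theta_q)  # 440 + 300 + 150 = 890
```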

However, being efficient in each scientific field does not guarantee appearing as efficient in aggregated scientific output, since there is still another effect associated with the composition of production by field of science. In other words, scoring as efficient in aggregate production requires being efficient in each FOS (i.e., intra-field efficiency), but it also depends on the FOS specialisation (i.e., the composition effect).

STEP 4: Composition effect

In this step we estimate the composition effect (\(\theta_{i}^{\text{CE}}\)) that would exist even with no technical inefficiency within any scientific field:

$${\text{Max}}\;\theta_{i}^{\text{CE}}$$
(7)
$$\begin{aligned} & {\text{s}} . {\text{t}} .\\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} \hat{Y}_{r} \ge \hat{Y}_{i} \theta_{i}^{\text{CE}} \\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} X_{\text{rm}} \le X_{\text{im}} \quad m = 1, \ldots ,M \\ & \lambda_{r} \ge 0\quad r = 1, \ldots ,R \\ \end{aligned}$$

\(\theta_{i}^{\text{CE}}\) is the efficiency score of the HEIs of country i and represents the potential increase that the HEIs of country i could achieve in their aggregate research output without increasing the input vector and assuming that they are also achieving the maximum output (given the quantity of inputs) in each scientific field. Therefore, this composition term captures the impact on output associated with the particular scientific composition/specialisation of the HEIs of each country.
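
To see how a composition effect can arise even when every field is run efficiently, consider the following simplified Python sketch. It assumes a single aggregate input instead of the two used in this paper, in which case problem (7) reduces to comparing each country's aggregate productivity with the best one observed. The two fields, their frontier productivities and the two countries' input allocations are hypothetical.

```python
from fractions import Fraction

# Frontier productivity (documents per unit of input) in two hypothetical fields.
frontier = {"field_1": 10, "field_2": 2}

# Input allocated to each field by two hypothetical countries, both assumed to be
# fully efficient within every field.
allocation = {
    "A": {"field_1": 8, "field_2": 2},
    "B": {"field_1": 2, "field_2": 8},
}

total_input = {c: sum(a.values()) for c, a in allocation.items()}
y_hat = {
    c: sum(frontier[f] * x for f, x in a.items()) for c, a in allocation.items()
}

# Single input, constant returns: the score is the best aggregate productivity
# divided by the country's own aggregate productivity.
productivity = {c: Fraction(y_hat[c], total_input[c]) for c in allocation}
best = max(productivity.values())
theta_ce = {c: best / productivity[c] for c in allocation}
```

Country B obtains a score of 7/3 although it is efficient within each field, purely because it devotes most of its input to the field with the lower publication rate.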

From the results of STEP 3 and the composition score we can calculate both the aggregated potential output of the HEIs of each country without adjusting for quality \(\left( {\hat{Y}_{i}^{*} } \right)\) and the potential output controlling for quality \(\left( {\hat{Y}_{\text{Qi}}^{ * } } \right)\), that is, the maximum aggregated output that each country i could achieve without using more inputs if its HEIs had a suitable composition (specialisation by scientific fields). For the quality-controlled case, \(\theta_{\text{Qi}}^{\text{CE}}\) denotes the composition score obtained analogously from problem (7) with \(\hat{Y}_{\text{Qi}}\) in place of \(\hat{Y}_{i}\).

$$\hat{Y}_{i}^{ * } = \hat{Y}_{i} \;\theta_{i}^{\text{CE}}$$
(8)
$$\hat{Y}_{\text{Qi}}^{ * } = \hat{Y}_{\text{Qi}} \theta_{\text{Qi}}^{\text{CE}}$$
(9)

STEP 5: Global research output inefficiency

The global research inefficiency score in terms of quantity of documents without adjusting for quality is \(\theta_{i}\). It can be obtained as the ratio between the maximum attainable output \(\hat{Y}_{i}^{*}\) and the actual output \(Y_{i}\):

$$\theta_{i} = \frac{{\hat{Y}_{i} \theta_{i}^{\text{CE}} }}{{Y_{i} }} = \frac{{\hat{Y}_{i}^{ * } }}{{Y_{i} }}$$
(10)

or by solving the following problem:

$${\text{Max}} \theta_{i}$$
(11)
$$\begin{aligned} & {\text{S}} . {\text{t}} .\\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} \hat{Y}_{r} \ge Y_{i} \theta_{i} \\ & \mathop \sum \limits_{r = 1}^{R} \lambda_{r} X_{\text{rm}} \le X_{\text{im}} \quad m = 1, \ldots ,M \\ & \lambda_{r} \ge 0\quad r = 1, \ldots ,R \\ \end{aligned}$$

Note that part of the potential improvement in terms of number of documents shown by this score might be associated with a decrease in their quality.

We can express this global quantitative inefficiency score (\(\theta_{i}\)) as the product of two factors:

$$\theta_{i} = \frac{{\hat{Y}_{i}^{ * } }}{{Y_{i} }} = \frac{{\hat{Y}_{i}^{ * } }}{{\hat{Y}_{\text{Qi}}^{ * } }} \cdot \frac{{\hat{Y}_{\text{Qi}}^{ * } }}{{Y_{i} }} = {\text{QE}}_{i} \cdot \theta_{i}^{\text{PE}}$$
(12)

The first factor is the quality effect \(\left( {{\text{QE}}_{i} = \hat{Y}_{i}^{ * } /\hat{Y}_{\text{Qi}}^{ * } } \right)\), which represents the quality bias in the global quantitative inefficiency indicator that arises from considering only the quantity of documents and not their quality. If \({\text{QE}}_{i} > 1\), the quantitative indicator shows more inefficiency than the quality-controlled one (\(\theta_{i} > \theta_{i}^{\text{PE}}\)), so it is penalising that country because it has a higher quality output that is not taken into account. The second factor is the global pure inefficiency score (\(\theta_{i}^{\text{PE}}\)). Because it controls for quality, it is a more suitable indicator of efficiency: it measures how much the scientific output of HEIs in each country can increase without raising inputs or reducing quality.

In turn, we can decompose the global pure inefficiency score into two additional components according to the following expression:

$$\theta_{i} = \frac{{\hat{Y}_{i}^{*} }}{{Y_{i} }} = \frac{{\hat{Y}_{i}^{*} }}{{\hat{Y}_{\text{Qi}}^{*} }} \cdot \frac{{\hat{Y}_{\text{Qi}}^{*} }}{{Y_{i} }} = \frac{{\hat{Y}_{i}^{*} }}{{\hat{Y}_{\text{Qi}}^{*} }} \cdot \frac{{\hat{Y}_{\text{Qi}}^{*} }}{{\hat{Y}_{\text{Qi}} }} \cdot \frac{{\hat{Y}_{\text{Qi}} }}{{Y_{i} }} = {\text{QE}}_{i} \cdot \theta_{i}^{\text{PE}} = {\text{QE}}_{i} \cdot \theta_{i}^{\text{CE}} \cdot \theta_{i}^{\text{IE}}$$
(13)

The composition effect (\(\theta_{i}^{\text{CE}}\)) represents the impact of the field of science composition/specialisation on the measured global pure inefficiency score. The intra-field factor (\(\theta_{i}^{\text{IE}}\)) indicates the aggregate intra-field inefficiency. This intra-field inefficiency has the advantage of controlling for each country's particular specialization by FOS, which makes comparisons across countries feasible without penalizing those more oriented to FOS with lower publication rates. Each country may still consider such a FOS mix appropriate despite its lower publication rates relative to the inputs used.
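
The chain of ratios in Eq. (13) can be checked numerically with a few lines of Python. The sketch below uses made-up aggregate figures for a single country; the function name `decompose` and all four input values are hypothetical.

```python
from fractions import Fraction


def decompose(y, y_hat_star, y_hat_q, y_hat_q_star):
    """Return (theta, QE, CE, IE) as defined by the ratios in Eq. (13)."""
    theta = Fraction(y_hat_star, y)          # global quantitative inefficiency
    qe = Fraction(y_hat_star, y_hat_q_star)  # quality effect
    ce = Fraction(y_hat_q_star, y_hat_q)     # composition effect
    ie = Fraction(y_hat_q, y)                # intra-field inefficiency
    return theta, qe, ce, ie


# Hypothetical document counts: actual output and the three potential outputs.
theta, qe, ce, ie = decompose(
    y=1000, y_hat_star=1300, y_hat_q=1150, y_hat_q_star=1200
)
pure = ce * ie  # global pure inefficiency score
```

With these numbers the global score of 1.30 splits into a quality effect of 13/12, a composition effect of 24/23 and an intra-field inefficiency of 23/20, whose product is again 1.30.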

Results

Table 3 presents the results of the different indicators. Column 1 shows the results of the global quantitative inefficiency score. On average, given the actual use of inputs and without taking quality into account, the research output (number of publications) of the HEIs in the EU could increase by around 20 % if the inefficiencies were removed.

Table 3 Global inefficiency and its components

In some countries output could be increased by a factor of 2 or more (Latvia, Luxembourg, Lithuania, Malta, Slovakia). The United Kingdom is the only efficient country, that is, the only one whose HEIs produce the maximum number of publications given the inputs used. Sweden (1.01) and Germany (1.05) are also in the group of most efficient countries (low inefficiency scores).

A more suitable indicator of the countries’ real degree of efficiency is the one that also controls for the quality of scientific output. The second column presents the quality effect and the third the results of efficiency controlled for quality. The results indicate that output (number of publications controlled for quality) could increase by 18 % for the EU countries as a whole if all inefficiencies were removed. Controlling for quality does not significantly alter the results in most countries. As can be seen, the quality effect is very limited except in cases like the Netherlands and Denmark, where controlling for quality significantly improves their performance.

Columns 4 and 5 show the two components of that global inefficiency. Most of the inefficiency comes from inefficiencies within each specific field, while the effect associated with the composition is much less significant. Hence, for the EU-27 as a whole, the composition effect is only 2.2 %, whereas intra-field inefficiency is 15.4 %. In other words, the composition effect represents 12.3 % of global pure inefficiency while intra-field inefficiencies represent the remaining 87.6 %. Therefore, taking quality into account and allowing for differences in specialization across fields of science reduces the measured global inefficiency (from 20 to 15.4 %). Both are important issues to be considered when evaluating research inefficiencies. Nevertheless, the potential increase of the research output of the HEIs in the EU is still quite substantial (15.4 %) even after controlling for quality and specialization.
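
As a rough consistency check, the multiplicative decomposition of Eq. (13) can be applied to the rounded EU-27 figures quoted above. This is back-of-the-envelope arithmetic on rounded values, not a recomputation from Table 3:

$$\theta^{\text{PE}} = \theta^{\text{CE}} \cdot \theta^{\text{IE}} \approx 1.022 \times 1.154 \approx 1.179$$

which is in line with the potential increase of about 18 % reported for the quality-controlled indicator.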

Figure 10 represents the magnitude of global quantitative inefficiency across EU countries, namely, the percentage increase of the research output of each country’s HEIs, and its sources. According to these results Latvia is the most inefficient country: its research output could be increased by 225.9 %. In contrast, the UK is the most efficient country, since it has a favourable specialisation and appears as efficient in all the FOS.

Fig. 10 Scientific research inefficiencies: quality effect, composition effect and intra-field inefficiency.

Source: SCImago Journal & Country Rank, Eurostat and own elaboration

Although the quality effect tends to be small for most of the countries, it is relevant in some countries with high quality output such as Denmark and the Netherlands (in the latter country two thirds of its apparent inefficiency vanishes after taking quality into account). The composition or specialization effect of most of the countries is fairly moderate in general. Nevertheless, it is more relevant for countries such as Luxembourg, the Baltic republics, Finland, Portugal, Denmark, Greece or the Netherlands. The absolute size of this effect in these countries is greater than total pure inefficiency in relatively efficient countries such as Germany. As a percentage of total inefficiency it appears as fairly relevant in countries such as Germany (where it represents 66 % of total inefficiency), Luxembourg (35.3 %), Portugal (29.1 %) or Greece (24.2 %).

In summary, major differences can be seen in the efficiency levels of the EU countries’ HEIs and their components. Figure 6 reported important differences in output per capita of HEIs and posed the question of whether these differences were due to a different type of specialization across FOS, intra-field inefficiencies, differences in output quality or differences in the quantity of resources per capita. The results of the exercises performed allow us to advance in responding to this question.

Figure 11 shows that the countries whose HEIs devote more resources to R&D per capita also have higher scientific output per capita (real situation). There is a positive and significant relationship between the two variables in the EU countries. On the other hand, the figure shows that the widespread heterogeneity in output per capita is not only explained by the amount of resources used, since some countries obtain a much higher output per capita than others with the same resources per capita. For example, Slovenia has a similar level of scientific output per capita to Denmark, while its R&D expenditure per R&D personnel is one third that of Denmark. Similarly, the UK has a similar output per capita to Italy while using a much lower (23 %) amount of per capita expenditure. Indeed, the differences in R&D expenditure per capita explain little more than one third of the differences in output per capita. So are the huge differences in efficiency levels underlying the differences in output per capita?

If we considered that countries are efficient within each field of study in which they work (intra-field effect), all the countries would see an increase in their level of output per capita, taking the United Kingdom as the reference unit. Countries such as Latvia, Malta and Lithuania could double their scientific output if they were efficient in their fields of study. Other countries would significantly increase their scientific output, such as Finland (+54 %), Austria (+40 %), Czech Republic (+39 %), Spain (+35 %) or Italy (+29 %).

Figure 11 also shows the effect that removing all inefficiencies would have on output per capita, also considering the quality effect and the specialisation effect (optimal situation). The blue dots represent maximum output per capita corrected for quality once inefficiencies have been removed. Logically, all the countries improve again, particularly the most inefficient ones. In this case countries like the Netherlands and Denmark would see an increase of 14 and 11 % in their output due to the quality effect of their scientific output. However, there is still considerable dispersion in the levels of output per capita in HEIs.

Fig. 11 Maximum scientific output versus R&D expenditure.

Source: SCImago Journal & Country Rank, Eurostat and own elaboration

Figure 12 represents the deviation coefficient of the output per capita levels of the EU countries’ HEIs, and of the outputs per capita once the different types of inefficiencies have been removed. If we removed the effect of quality, specialisation and the intra-field inefficiencies, the deviation coefficient would decrease by 16.5 %, from 0.468 to 0.391, mainly because of the intra-field inefficiencies. This is a non-negligible change. Nevertheless, most of the heterogeneity in research output per capita would still remain. This indicates the key role that differences in the amount of resources per capita play in output per capita within the EU.
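
The size of this reduction can be reproduced from the two coefficients quoted above. The Python sketch below also shows one common way to compute such a dispersion measure, under the assumption that the "deviation coefficient" is the coefficient of variation (standard deviation divided by the mean) and that the population standard deviation is used; neither detail is stated in the text, and the three per capita values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (an assumed definition)."""
    return statistics.pstdev(values) / statistics.mean(values)


def relative_change(before, after):
    """Proportional change from `before` to `after`."""
    return (after - before) / before


# Hypothetical output-per-capita figures for three countries.
cv = coefficient_of_variation([2.0, 4.0, 6.0])

# Coefficients reported in the text before and after removing inefficiencies.
change = relative_change(0.468, 0.391)
print(f"{change:.1%}")  # -16.5%
```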

Fig. 12 Dispersion of the research output per capita.

Source: Own elaboration

Conclusions

This study has analysed the research output of the EU’s HEIs and has explored the determinants of the differences among them.

To this end a 5-step approach was designed to explicitly consider the quality of the universities’ scientific output and their specialisation in terms of fields of science (FOS). This methodology allows us to decompose the differences in scientific output per researcher among countries in terms of differences in efficiency within each field (intra-field efficiency), differences in the FOS specialisations of HEIs in each country (composition efficiency), a quality effect and differences in R&D expenditure per researcher.

Our results indicate that, on average, given the actual resources used, the scientific output of HEIs could apparently increase by around 20 % in the EU if all the inefficiencies were removed. Nevertheless, the margin for improvement is somewhat smaller taking into account quality and specialization. Part of that apparent inefficiency is linked to the particular specialization by field of research adopted by each EU country and to differences in quality. Therefore, only an increase of around 15 % would be feasible without lowering quality or changing FOS specialization or using more inputs. However, it would still be a substantial increase.

The margins for improvement vary greatly across countries, and our results uncover large differences between them in this respect. Inefficiency is a particular problem in countries like Latvia, Luxembourg, Lithuania, Malta and Slovakia, but it is much lower in countries like the United Kingdom, Sweden or Germany, where research is carried out more efficiently.

When research output is controlled for the quality of scientific output, one of its key aspects, the results hold in general. However, the impact is considerable in some cases such as Denmark or the Netherlands. The Netherlands rises from 12th to 4th position in the efficiency ranking after taking into account the quality of output.

Most of the global inefficiency estimated is intra-field (87.6 % of total inefficiency), while the composition effects, linked to the specialisation in terms of the different fields of science, are generally lower (12.4 % of the total). On the other hand, the magnitude of the latter effect in some countries is higher than the total inefficiency of others.

Relative inefficiency has a direct impact on the differences in research productivity among countries. One sixth of the heterogeneity in research output per capita would be due to the specialisation effect and the intra-field inefficiencies. Removing both intra-field inefficiencies and specialisation effects would lower the deviation coefficient of research output per capita from around 0.47 to around 0.39.

All in all, the results confirm the importance of intangible aspects as determinants of the research productivity of the European HEIs. There are substantial efficiency differences in research activity across countries. The results suggest that there is a wide margin for the EU to substantially increase research output, by up to 15 %, without having to assign additional resources, lower the quality or change the field of science. This would require improvements in efficiency, especially in countries that are further away from best practices.

These results highlight the need to maximise the quality, effectiveness and impact of both EU and national expenditure on Research and Innovation. The best practices among EU countries in this field should be analysed and adopted by the rest when feasible and appropriate. In this sense, the differences in the FOS specialisation considered worthy by each country should be taken into account, since some differences linked to it will always be unavoidable.

Linking the allocation of resources more closely to the achievement of objectives could contribute to an improvement of results. In fact, our results underline the relevance of strengthening the evaluation of research and innovation policies. Without seeking complementarities between, and rationalisation of, instruments at EU and national levels, a more efficient use of resources will be harder to achieve. Nevertheless, the chosen methods of evaluation should take into consideration the different dimensions affecting research outcomes: research resources per capita, research quality and specialization. Otherwise the results could prove quite misleading.

The quest for efficiency is a key aspect of the research system but, in addition, the amount of resources is also important. The results confirm that in the case of the EU countries research output per capita tends to grow with the volume of resources per researcher. A large part of the differences in research output per capita across EU countries is associated with differences in this area and would persist even if all the countries were capable of completely removing their inefficiency.