1 Introduction

Universities are increasingly perceived as key actors in contributing to economic growth and prosperity (Valero and Van Reenen 2016). They generate and transmit knowledge and human capital, provide a community of experimentation, and supply the region with innovation, thus shaping regional competitiveness (Audretsch et al. 2005; Breznitz and Feldman 2012; Leyden and Link 2013; Link and Welsh 2013; Carree et al. 2014; Lehmann 2015; Lehmann and Menter 2016). Universities have therefore long been a focus of governments and policymakers, and particularly so in recent years (Audretsch et al. 2015; Audretsch and Lehmann 2015). Building world-class universities by shifting public resources towards the most promising institutions has become a common phenomenon in many higher education systems (Cremonini et al. 2014).

In 2006, Germany launched such a multi-billion Euro publicly funded Excellence Initiative (ExIn), aimed in part at propelling a handful of universities into the global research elite: “If you want to compete in the research world, you have to have some top universities that play in the first league”, Cornelia Quennet-Thielen, state secretary in Germany’s Federal Ministry of Education and Research (BMBF), points out (see Morgan 2016).

In the past, higher education policy in Germany was characterized by two main features: a disaggregation of responsibility to the state (Länder) level, and a strong desire to remain egalitarian rather than to play in the global elite. While Germany is widely recognized for its long tradition, following the tenets of the philosopher Wilhelm von Humboldt, of the unity of research and teaching, egalitarianism has equally been the watchword of German higher education policy. Competition among universities was not desired; accordingly, tuition has generally been free, admission relatively non-selective, and all universities more or less equally funded. Consequently, only a few German universities were listed among the top universities worldwide in the past. These rankings have been dominated since World War II by universities from the Anglo-Saxon countries and a small number of universities from Scandinavia, Switzerland, or France, and have recently been shaken up by universities from Asia (Audretsch et al. 2015).

Given these challenges, the German government broke with its higher education policy tradition in two ways. First, planned in 2004 and finally launched in 2006, the ExIn aimed at raising the visibility of German universities and at developing project-based, top-level university research able to compete in the global elite league (DFG 2013). Second, the government channeled federal money directly to the universities, which are otherwise under the control of the sixteen states.Footnote 1

A decade later, in January 2016, the federal ExIn program was evaluated by an international expert commission, led by Dieter Imboden, emeritus professor at ETH Zurich. The verdict was ‘very positive’, and the commission recommended that the program be continued (IEKE 2016). Bernd Huber, president of LMU Munich, one of the first universities selected and labelled an ‘Excellence University’, notes that this program “changed the perception of German universities all over the world” and adds that “they [federal and state governments] …really got a bang for their buck” (see Morgan 2016). However, not all experts were so convinced of the ExIn’s success. Ulrich Teichler from the International Center for Higher Education Research (INCHER) at the University of Kassel says that the initiative’s significance has been “overblown”, calling it “peanuts” in financial terms, and suggests that improvements in German research performance began much earlier, with increased publication in English and “smarter” publishing strategies (see Morgan 2016). A recent issue of Forschung & Lehre (i.e. Research & Teaching), a German-language journal on current trends in higher education and science policy published by the German Association of University Professors and Lecturers (Deutscher Hochschulverband, DHV), calls attention to an online petition against the Excellence Initiative, signed by more than 3000 academics and scientists. The opponents of the ExIn argue that the initiative creates pseudo-markets within the higher education sector, stimulates an artificial long-distance competition for public funds, fosters an orientation towards the mainstream, encourages precarious project-based employment in science, and leads to social inequalities among universities (F&L 2016).

With this case study, we aim to shed some light on this discussion. Its objective is to examine the ability of public university policies to strengthen the performance of outstanding, research-intensive universities. Such universities are well suited for analyzing the trade-offs involved, because policy initiatives typically select universities that have both the potential and the motivation to achieve above-average performance in the future. We specifically focus on the three universities selected in the first round of the Excellence Initiative: the Technical University of Munich (TU Munich), the Ludwig-Maximilians-University of Munich (LMU Munich), and the Karlsruhe Institute of Technology (KIT). Relying on both qualitative and quantitative measures, we analyze the selection process and whether and how the ExIn has led to superior results for these universities, in both qualitative (citations) and quantitative (publications) terms, over the period from 1998 through 2012.

In contrast to the overall positive and enthusiastic evaluation of the Imboden commission (see IEKE 2016), our results reveal a more differentiated picture of the first Excellence Initiative. From a qualitative perspective, the ExIn was a success: the positioning of the selected German universities within the world university rankings, i.e. their visibility, has improved over the years. Our quantitative difference-in-differences analyses tell a different story: it was not the funding itself that raised university performance; rather, the announcement of potential government funding triggered diverging performance paths among German universities. Evaluations of higher education policies should therefore be based on period-of-time rather than point-of-time data in order to capture the full effects of an initiative and derive adequate policy recommendations.

The remainder of this study is structured as follows. In the next section we provide a brief overview of the literature on policy evaluation and introduce the concept of ‘picking the winners’ as the policy approach underlying the ExIn. We also briefly describe the ExIn and its stated goals. Section three describes the dataset and our methodological approach. In section four we first evaluate the program with qualitative measures; in the second subsection we present and discuss the results of our difference-in-differences estimations. A final section concludes.

2 University policies and the challenge of unbiased evaluation

Higher education policies like the ExIn have to ensure both the effective and the efficient spending of public money, which is not trivial (Powell et al. 2012). On the one hand, policymakers have to decide what money should be spent on, i.e. the effectiveness of measures. On the other hand, policymakers have to ensure that universities spend public money efficiently, i.e. incentivize universities to better coordinate and motivate research activities in order to avoid duplication of effort. The following sections discuss the difficulties of implementing and evaluating higher education policies, explain the mechanisms behind such policies, and detail the design of the Excellence Initiative.

2.1 Efficiency of universities

Plenty of initiatives have attempted to capture the benefits associated with universities and university policies. However, despite good intentions and the investment of large sums of public funds, the empirical evidence on the impact of public university initiatives is disappointing or still nascent (Audretsch and Lehmann 2005; Mason and Brown 2013; Autio and Rannikko 2016). Although there is growing experience in how to design research centers and initiate strategic research cooperations, little is known about whether such interventions and policies actually work beyond the statement that “universities (still) matter” (Mowery and Sampat 2005). Since evidence-based university policies require substantiation to further support this kind of policy intervention, solid evidence is particularly important where decisions involve trade-offs across alternatives (Autio and Rannikko 2016).

Given scarce resources on the one hand and increased competition on the other, selective university policies trade off against more inclusive ones, such as supporting disadvantaged universities. Evidence is needed to verify that university policies are fit for their purpose. Yet not all university policy initiatives track the performance of their subjects systematically enough to support impact evaluation, and some specify several objectives ex ante so that at least one of them can always be declared a success. Such challenges make it difficult for policy evaluations to contain selection biases and create the risk of sampling on the outcome or performance variables.

‘Picking the winners’ and supporting them does not automatically guarantee an effective policy. Irvine and Martin (1984) made the ‘picking the winners’ argument popular, analyzing and discussing this controversial policy in the context of the Margaret Thatcher era. Controversial, because ‘picking the winners’ should arguably be left to market forces and not be the business of government (Martin 2010). Consequently, university policy in the UK during the Thatcher era, and ever since, has sought to induce market competition and to force universities to compete for scarce resources such as students, grants, private funds, and in particular scientists. As a result, universities applying for public funds from the Higher Education Funding Council are evaluated every five years on their publication records and their third mission, i.e. their value for society. These evaluations of university performance have recently been institutionalized in the UK’s Research Excellence Framework (see Smith et al. 2011). In the annually published world university rankings, the Universities of Oxford and Cambridge are regularly among the top 5, followed by Imperial College London and University College London. However, these institutions had dominated the rankings for decades, and their success cannot be traced back to the Thatcher initiative. Despite country-specific differences between the UK and German higher education systems, the UK’s competitive attitude among universities served as a role model for the ExIn.

The partly controversial discussion between phenomenon-based justifications of university policy and skepticism regarding the ability and motivation of policymakers to implement such policies effectively underlines the need for solid empirical evidence on these initiatives. However, assessing and evaluating the effectiveness of public university policy initiatives is challenging, especially because a clear mission statement spelling out the expected effectiveness and efficiency is usually lacking. The stated objectives are mostly vague and imprecise, yet promising and future-oriented. This creates measurement problems for ex post evaluations. It can take years for the desired effects to materialize, and within these time spans other events may shape the results. The desired effects and the impact of a public university policy initiative should nonetheless be expected to be reflected in the data.

An extensive and fruitful literature evaluates higher education policy initiatives at the university level. Data envelopment analysis (DEA) has become the most important analytical tool in this literature (see Johnes 2006; Warning 2004, 2007), more recently complemented by the more robust partial frontier analysis (PFA) (see Wohlrabe et al. 2017). This literature mostly finds a rather narrow distribution of efficiency scores, shaped partly by policy initiatives and partly by regional and thus exogenous factors (see Lehmann and Menter 2016). Gawellek and Sunder (2016) followed this approach and investigated whether participating in the German ExIn program, or preparing an application, affected productivity and efficiency. Applying a dynamic non-parametric approach, they find no substantially positive effect for the ‘winning’ universities that extends beyond the public funding itself. Instead of gaining productivity through increased competition among research-oriented universities, their results indicate that the applicants suffered a drop in efficiency during the contest. The same holds for status effects: whereas losing the excellence status is accompanied by negative effects (e.g. a drop in the number of first-year students), Bruckmeier et al. (2017) find no evidence for a corresponding positive effect.

While efficiency is without doubt an important concept and performance measure, we are not aware of any university proclaiming in its mission and vision statement that efficiency is its most important goal. Nor is efficiency emphasized by policymakers when launching a new public university initiative such as the ExIn.

2.2 The ‘picking the winner’ tournament

The ‘picking the winner’ strategy is heavily criticized on at least two grounds. First, policymakers are not perfectly informed, so markets are better at selecting promising and efficient organizations. Second, asymmetric information between policymakers and organizations induces moral hazard and adverse selection problems. Predicting the success of universities is difficult, even for professionals, and by favoring some universities over others the government may unwittingly crowd out viable alternatives or reduce the efforts of others.

If markets are not sufficiently perfect, theory suggests creating contests as a substitute for (perfect) markets. Such contests, or tournaments, replace the lack of competition inherent in organizations, create quasi-markets, and may attenuate the problems of asymmetric and private information. Initially developed as an optimal labor contract theory to explain promotions and large jumps in wages, tournament theory has since become a prominent strand of the economic literature explaining how and why individuals or organizations compete for promotions (Lazear and Rosen 1981; Nalebuff and Stiglitz 1983; Kräkel and Sliwka 2004). ‘Picking the winner’ in a tournament thus replaces the missing market mechanism for selecting the best universities in two ways: first, through self-selection, in that only those universities participate in the contest whose expected gains exceed the sunk costs; second, by inducing a competition that encourages effort, with an expected positive effect on output and future performance (Lazear and Rosen 1981).

‘Picking the winners’ can thus be intuitively explained by tournament theory (see Gürtler and Kräkel 2010; Imhof and Kräkel 2014). Just as workers compete with one another for promotion, universities compete with one another for promotion to the status of an excellence university. The intensity of competition among the universities strongly depends on the nature of the prize. Prizes are determined in advance, and winning depends on the relative rather than the absolute performance of the applying university. In the context of the ExIn, the prize structure was fixed in advance and announced by the German government. Whether a university is promoted, i.e. picked as a winner, depends on the relative quality of its proposal, and only the best universities are picked; not all high-quality applications are selected. The effort spent by the participants and the type of competition are also endogenously determined by the spread between the winners’ and the losers’ prizes. Effort and competition are highest in ‘winner-takes-all’ contests, often leading to adverse effects such as rat races (Frank and Cook 1996; Backes-Gellner and Pull 2013).

If the spread is highly compressed, participants are less motivated to engage in extraordinary effort. Thus, an optimal spread exists, in that more effort is not necessarily better: too small a spread results in too little effort, while a larger spread elicits more effort but also requires higher levels of compensation to attract players to the tournament (Lazear 2011, p. 155). In the context of the ExIn, the spread between the funding lines was significant: Graduate Schools were funded with one million Euro per year, Clusters of Excellence with 6.5 million Euro per year, and Institutional Strategies, i.e. Excellence Universities, with 21 million Euro per year. Thus, although the ExIn was not a ‘winner-takes-all’ contest in which a single winner receives all the funding, universities were still strongly incentivized to apply for promotion to excellence university status.
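To make these comparative statics concrete, the following minimal sketch computes the symmetric equilibrium effort of the textbook Lazear and Rosen (1981) model for different prize spreads. The functional forms (quadratic effort cost, normal noise) and all parameter values are assumptions chosen for illustration only; this is not the model underlying this paper.

```python
# Minimal sketch of the symmetric Lazear-Rosen (1981) tournament equilibrium.
# Assumptions (illustration only): two identical players, quadratic effort
# cost c(e) = cost_scale * e^2 / 2, i.i.d. normal output noise with sd noise_sd.
import math

def equilibrium_effort(spread, noise_sd, cost_scale=1.0):
    """Return the symmetric equilibrium effort e* solving the first-order
    condition c'(e*) = spread * g(0), where g is the density of the noise
    difference (normal with sd = noise_sd * sqrt(2))."""
    g0 = 1.0 / (noise_sd * math.sqrt(2.0) * math.sqrt(2.0 * math.pi))
    return spread * g0 / cost_scale

# Effort rises linearly in the prize spread: a compressed spread elicits
# little effort, a wide spread more. Spreads loosely echo the ExIn funding
# lines (in million Euro per year); noise_sd = 5.0 is an arbitrary choice.
for spread in (1.0, 6.5, 21.0):
    print(f"spread = {spread:5.1f} -> effort = {equilibrium_effort(spread, 5.0):.3f}")
```

The linear link between spread and effort is what drives the trade-off above: compressing the spread saves prize money but dampens effort, while widening it raises effort at the cost of higher compensation.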

2.3 The excellence initiative as a beauty contest

In the following, we present the ExIn in the light of the basic assumptions and predictions of tournament theory (see Lazear 2011). In 2004, the Federal and State Governments in Germany started to develop the idea of an Excellence Initiative and finally passed the first Excellence Initiative in June 2005, with the aim of promoting top-level research and improving the quality of German universities and research institutions, i.e. making Germany a more attractive, internationally competitive research location (see DFG 2013). Three funding lines were identified ex ante to achieve these objectives: Graduate Schools to promote young scientists and researchers, Clusters of Excellence to promote top-level research, and Institutional Strategies to develop project-based, top-level university research.

The contest was jointly organized by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) and the German Council of Science and Humanities (Wissenschaftsrat, WR). The Federal and State Governments provided a total of 1.9 billion € to fund the successful projects until the end of 2012. The competition in the first phase consisted of two tournament rounds held in 2005/2006 and 2006/2007, each involving a preliminary and a final round.

The first round consisted of draft proposals submitted to the DFG. The DFG Head Office was responsible for examining the draft proposals to ensure that they complied with the formal requirements. Their scientific merit was then evaluated by review panels, appointed according to their qualification and specialist knowledge of the field of the proposal. The DFG Head Office took care to avoid any possible conflicts of interest; the review panels therefore included numerous scientists and academics from abroad. The reviewers carefully considered each proposal and gave their assessment, which formed the basis for the subsequent decision-making stage. An ‘Expert Commission’, appointed by the DFG, then assessed the recommendations made by the review panels (ensuring the quality of this decision-making stage) and drew up a short-list for the ‘Joint Commission’. The ‘Joint Commission’ was made up of members of the ‘Expert Commission’ and the ‘Strategic Commission’, an international body of scientists and academics from a variety of disciplines whose members were appointed by the German Science Council. The ‘Joint Commission’ finally decided which universities were invited to submit full proposals.

In the second round of the procedure, detailed proposals had to be submitted to the DFG. These full proposals were evaluated by international review panels and then short-listed by the ‘Expert Commission’. The ‘Joint Commission’ then compiled a list of funding recommendations on which the ‘Grants Committee’ of the ExIn based its final decision. The ‘Grants Committee’ consisted of members of the ‘Joint Commission’ and representatives of the German Federal and State Ministries of Education and Research; scientists and academics held the majority of votes. Finally, the funding decisions were publicly announced by the German Federal Minister of Education and Research.

In response to the first call, a total of 319 draft proposals were submitted by 74 universities, distributed across the three funding lines as follows: 135 for Graduate Schools, 157 for Clusters of Excellence, and 27 for Institutional Strategies to promote top-level university research. In January 2006, the ‘Joint Commission’ invited 36 universities to submit full proposals, based on reviews conducted by 20 international expert panels. The full proposals were distributed as follows: 39 for Graduate Schools, 41 for Clusters of Excellence, and 10 for Institutional Strategies. In October 2006, after all 90 proposals had been evaluated and discussed by international review panels and the ‘Joint Commission’ of the German Council of Science and Humanities and the DFG, the ExIn ‘Grants Committee’ selected 38 projects from 22 universities for funding: 18 Graduate Schools, 17 Clusters of Excellence, and 3 Institutional Strategies. A prerequisite for the third funding line (Institutional Strategies to promote top-level research, i.e. ‘Excellence Universities’) was that at least one Cluster of Excellence and at least one Graduate School had been selected for funding at the respective university. Three ‘Excellence Universities’ were finally nominated: the Technical University of Munich (TU Munich), the Ludwig-Maximilians-University of Munich (LMU Munich), and the Karlsruhe Institute of Technology (KIT).

The ExIn thereby provided funding for Institutional Strategies aimed at developing top-level university research in Germany and increasing its competitiveness at the international level. The funding covered all measures that allowed universities to develop and expand their areas of international excellence, to establish themselves as leading institutions in international competition, to make a significant long-term contribution to strengthening science and research in Germany, and to increase the visibility of current research excellence.

In January 2016, the International Expert Commission on the Excellence Initiative (IEKE), led by Dieter Imboden, delivered its evaluation of the whole program (2006–2012) to the federal and state governments. The results indicated that the ExIn was a success: according to the commission, it made the German university system more dynamic and improved the international competitiveness of German universities. The commission highlighted the “impressive qualitative performance regarding publications” (IEKE 2016, p. 5) and recommended continuing the ExIn, i.e. the funding of selected universities.

3 Dataset and methodology

Besides qualitative measures such as global university rankings, this study relies on quantitative measures, which require both a solid dataset and a rigorous methodological approach (see Cunningham et al. 2017). The following sections describe our main variables of interest, provide an overview of the differences between excellence and non-excellence universities, and explain our estimation techniques.

3.1 Dataset

Our primary goal is to analyze the effects of the first Excellence Initiative launched in 2006. In order to assess differences in academic performance induced by governmental funding, our dataset comprises all public universities in Germany. The staggered funding of the ExIn divided the German higher education system into three groups of universities: Excellence Universities with augmented funding, i.e. winners of the funding line Institutional Strategies (hereafter: first tier universities); Non-Excellence Universities with funding, i.e. winners of the funding lines Graduate Schools, Clusters of Excellence, or both (hereafter: second tier universities); and Non-Excellence Universities without any ExIn funding (hereafter: third tier universities). We split our dataset accordingly and compare (1) first with third tier universities, (2) second with third tier universities, and (3) first with second tier universities. Promoted universities, i.e. first tier universities in (1) and (3) and second tier universities in (2), are classified as the ‘treatment group’, while the respective other group serves as the control (non-treatment) group. Our main variables of interest are measures of academic performance: citations, as a measure of the quality of conducted research, and publications, as a measure of its quantity. We rely on data provided by the German Federal Statistical Office and the Thomson Institute and include publications in academic journals and their citations as well as further university characteristics such as students, research fellows, public funds (third-party funds from the German Research Foundation), and private funds (third-party funds from industry) over a 15-year period from 1998 to 2012. This period covers eight years before the treatment in 2006 and the six years after it.
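The following schematic sketch illustrates how such a panel could be split into the three tiers and how treatment status is coded; all file and column names are hypothetical, as the paper's actual data preparation is not published.

```python
# Schematic sketch of the tier split and treatment coding described above.
# File and column names are hypothetical assumptions.
import pandas as pd

df = pd.read_csv("universities_1998_2012.csv")  # one row per university-year

def tier(row):
    if row["won_institutional_strategy"]:            # 'Excellence Universities'
        return 1
    if row["won_graduate_school"] or row["won_cluster"]:
        return 2                                     # funded, but not promoted
    return 3                                         # no ExIn funding at all

df["tier"] = df.apply(tier, axis=1)

# Treatment dummy: one from the funding year 2006 onwards for promoted
# universities, zero otherwise (mirroring the DiD setup in Section 3.3).
df["treated"] = ((df["tier"] < 3) & (df["year"] >= 2006)).astype(int)

# The three pairwise comparison samples used throughout the analysis.
sample_1_vs_3 = df[df["tier"].isin([1, 3])]
sample_2_vs_3 = df[df["tier"].isin([2, 3])]
sample_1_vs_2 = df[df["tier"].isin([1, 2])]
```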

3.2 Descriptive statistics

Based on difference-in-differences estimations, we analyze how the staggered funding of the first Excellence Initiative has shaped university performance. Table 1 compares Excellence Universities with all other German public universities to provide first insights into the differences within the German higher education landscape. The rows of Table 1 compare the mean values of the main variables of interest for the treatment group (Excellence Universities) and the non-treatment group (all other German public universities), with statistics tabulated separately for the pre- and post-treatment period. For each period, the third column tabulates the difference of the mean values; the last column shows the difference-of-the-difference of the mean values between the pre- and post-treatment period. To account for differences across the main academic fields, we distinguish between publications, citations, and students in the natural sciences (SCI), the social sciences (SSCI), and the arts in our descriptive analysis (see Audretsch et al. 2004). Public and private funds are reported in 1000 Euro; all other figures reflect absolute values.

Table 1 Descriptive statistics (difference in difference estimation)
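As a hypothetical illustration of the logic behind Table 1, the snippet below (continuing the sketch above) computes group means before and after 2006 and their difference-of-the-difference for one assumed outcome column.

```python
# Hypothetical illustration of the structure of Table 1: group means before
# and after 2006 and their difference-of-the-difference, for one outcome.
# Continues the sketch above; 'citations_sci' is an assumed column name.
df["post"] = (df["year"] >= 2006).astype(int)
df["group"] = df["tier"].map({1: "treatment", 2: "control", 3: "control"})

means = df.groupby(["group", "post"])["citations_sci"].mean().unstack("post")
means["diff"] = means[1] - means[0]     # post-period mean minus pre-period mean
did = means.loc["treatment", "diff"] - means.loc["control", "diff"]
print(means)
print(f"difference-of-the-difference: {did:.2f}")
```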

In the pre-treatment period, the treatment group is more than twice the size of the control group, as measured by the number of full-time research fellow equivalents, and has three times as many students enrolled in the natural sciences. The selected ‘Excellence Universities’ are thus focused on the natural sciences and show a strong scientific record in this field, as measured by the number of citations and publications; here the difference is statistically significant at the 1% level. Differences in the social sciences and arts, in either the number of students or scientific output, remain small and not significantly different from zero. At first glance, the additional funds were invested in research fellows: while the number of research fellows remained rather constant for the control group in the post-treatment period (a slight increase of about 10%), it increased by about 30% for the treatment group.Footnote 2 This increase is also reflected in the drop of the student-to-faculty ratio of treated universities, our proxy for teaching intensity. While the number of articles published in the natural sciences increased in the post-treatment period, the number of citations declined.Footnote 3 Compared to the natural sciences, publications and citations in the social sciences and arts remain of minor importance.

Splitting our dataset into the three groups of universities, i.e. first, second, and third tier universities, offers further insights into the effects of governmental funding through the ExIn on university performance. Figure 1 illustrates the described increase in the number of research fellows for first tier universities, but also an increase for second tier universities. Funded universities thus spent a large portion of the additional funding on new employees. Third tier universities were not able to keep up with promoted universities, as indicated by their only moderate increase in the number of research fellows over time.

Fig. 1 Development of the number of research fellows

The additional research fellows, however, decisively influence the efficiency scores of promoted universities. As shown by the quality of research, i.e. the number of citations per research fellow, depicted in Fig. 2, the scores of all subgroups of German universities drop, especially those of first tier universities.Footnote 4 The quantity of research, i.e. the number of publications per research fellow, reveals a different picture (see Fig. 3). Especially second tier universities were able to drastically increase their number of publications per research fellow; a main driver of this increase is the core objective of the funding line Clusters of Excellence, with its expressed goal of promoting top-level research. Whereas third tier universities were also able to improve their scores, first tier universities again experienced a drop in their publication-to-research fellow ratio.

Fig. 2 Development of research quality (research quality is measured by the number of citations per research fellow)

Fig. 3 Development of research quantity (research quantity is measured by the number of publications per research fellow)

A straightforward next step is to compare these findings with the main international competitors of the German university system. Figure 4 provides a snapshot of the three selected ‘Excellence Universities’, comparing them with the main ‘competitors’ from the US (Stanford, MIT, Berkeley, and Michigan), the UK (Oxbridge), and Switzerland (ETH Zürich).

Fig. 4 Annual budgets and number of students for selected universities (see also the final report of the International Expert Commission to Evaluate the Excellence Initiative, managed by the Institute for Innovation and Technology (iit), IEKE (2016): http://www.gwk-bonn.de/fileadmin/Papers/Imboden-Bericht-2016.pdf)

Apart from Stanford and MIT, which have the highest budgets, the universities do not differ dramatically, with the exception of Karlsruhe, which has the lowest budget of all ten. The budget-to-student ratio of the three ‘Excellence Universities’ is lower than those of ‘Oxbridge’, ETH, Stanford, and MIT, but in line with those of Berkeley and Michigan. To overcome the drawbacks of an egalitarian system, the ExIn provided additional funds to the three universities, depicted in total by the right bar (Excellence Initiative). Although these grants may appear rather small compared to the universities’ total budgets, they constituted a remarkable increase for Karlsruhe.

3.3 Methodology and estimation techniques

Given our panel data, the most straightforward method to estimate the effects of receiving governmental funding through the ExIn is a fixed effects model. In order to assess whether the ExIn affected university performance, we start from the following naïve fixed effects model, as implemented in point-of-time examinations, and extend it below:

$$ Y_{rt} = \beta_{r} + \beta_{t} + \beta_{treat} Treatment_{rt} + X\gamma + \varepsilon_{rt} $$
(1)

where \( Y_{rt} \) is the number of citations per research fellow of university r at time t. We use this measure as a proxy for the quality of research within a university, one of the most important indicators within the Excellence Initiative. Individual fixed effects \( \beta_{r} \) account for university-specific heterogeneity, and \( \beta_{t} \) controls for common output growth across universities. Treatment is modelled as a dummy variable that takes the value of one at t and all future periods t + s if a university receives funding through the ExIn, and zero otherwise. We further include control variables, covered in the vector X, that are assumed to affect scientific performance independently of the governmental funding effect. This vector includes the intensity of third-party funding activities (public and private) as well as teaching intensity, measured by the student-to-faculty ratio. As teaching can be stimulating but also distracting for research activities, we include both a linear and a squared term of the student-to-research fellow ratio (see Wood 1990; Edgar and Geare 2013). As usual, \( \varepsilon_{rt} \) represents the error term. In order to obtain clear-cut treatment groups, we exclude all universities that lost their excellence status during the program (2006–2012).

Following Bertrand et al. (2004), \( \beta_{treat} \) measures the desired treatment effect of changing from a ‘normal’ to a ‘promoted’ university. Due to a possible positive serial correlation in a university’s output, we expect downward-biased standard errors. Hence, we employ standard errors clustered at the university level, which control for both autocorrelation and heteroscedasticity in the error term. As we cannot exclude the possibility that more than one university from a specific region is selected, we also report regionally clustered standard errors. As we will see later, clustering at either the regional or the university level does not change the results qualitatively.
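A minimal sketch of how Eq. (1) could be estimated with university-clustered standard errors is given below, using the Python package linearmodels; the authors do not name their software, so this choice is an assumption, and all variable names are hypothetical, continuing the earlier sketches.

```python
# Sketch of Eq. (1) with the linearmodels package (software choice assumed;
# variable names hypothetical, continuing the panel 'df' built above).
from linearmodels.panel import PanelOLS

panel = df.set_index(["university", "year"])   # entity-time MultiIndex
panel["sfr"] = panel["students"] / panel["research_fellows"]
panel["sfr_sq"] = panel["sfr"] ** 2            # linear + squared teaching term

mod = PanelOLS.from_formula(
    "citations_per_fellow ~ treated + public_funds + private_funds"
    " + sfr + sfr_sq + EntityEffects + TimeEffects",
    data=panel,
)
# University-level clustering controls for autocorrelation and
# heteroscedasticity (Bertrand et al. 2004); regional clustering would
# instead pass e.g. clusters=panel["region"] (a hypothetical column).
res = mod.fit(cov_type="clustered", cluster_entity=True)
print(res.params["treated"])                   # estimate of beta_treat
```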

To obtain consistent estimates of the treatment effect, the crucial assumption is the parallel trend assumption. This assumption is violated if universities’ output would not have followed a common trend over the observed time period had the Excellence Initiative never been implemented.

In general, we should be cautious in identifying the treatment effect based on Eq. (1). Our major point is that universities can reasonably be assumed to have increased their endeavors to become an excellence university before the initiative started, which may have affected their output positively or negatively. Hence, if the treatment coefficient \( \beta_{treat} \) is contaminated by such a preparation effect, or by other strategic effects, we should observe an effect on a university’s output in the period before the treatment occurs. Finally, we cannot exclude the possibility of reverse causality between \( Y_{rt} \) and the control variables.

Testing the parallel trend assumption directly is hard, as we cannot observe the counterfactual development of a university’s output. Nevertheless, we include time effects in our fixed effects model before and after the actual treatment to test informally whether the parallel trend assumption holds. For instance, a significant time effect prior to the treatment would indicate a spurious correlation caused by anticipation effects, generally known as Ashenfelter’s (1978) dip. By including year effects, we also control for dynamic output effects of universities.

As a direct response to Ashenfelter’s (1978) dip and in line with our focus on period-of-time examinations, we extend the naïve regression equation above into a dynamic equation that controls for both potential anticipation effects and dynamic output effects:

$$ Y_{rt} = \beta_{r} + \beta_{t} + \mathop \sum \limits_{q = 1}^{4} \delta_{t - q} Treatment_{rt - q} + \beta_{treat} Treatment_{rt} + \mathop \sum \limits_{q = 1}^{4} \delta_{t + q} Treatment_{rt + q} + X\gamma + \varepsilon_{rt} , $$
(2)

where \( Treatment_{rt - q} \) is equal to one in period t − q and zero otherwise. The additional coefficients on \( Treatment_{rt \pm q} \) can hence be interpreted as pre- and post-treatment effects of the Excellence Initiative and indicate whether our treatment effect can be interpreted causally. Significantly negative or positive pre-treatment effects would signal that our treatment effect is not causal (see Autor 2003; Autor et al. 2007); conversely, if we obtain no significant pre-treatment effects, we can interpret the treatment effect of the Excellence Initiative in a causal manner.
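The following sketch (continuing the hypothetical setup above) illustrates how the lead and lag dummies of Eq. (2) could be constructed and estimated as an event study; the variable names, the sample choice, and the single common treatment year (2006) are simplifying assumptions.

```python
# Event-study sketch of Eq. (2): lead and lag dummies around the common 2006
# treatment year for promoted (tier 1) universities, continuing the setup
# above. Significant pre-treatment coefficients would flag anticipation
# effects (Ashenfelter's dip). Names and sample choice are hypothetical.
names = []
for q in range(-4, 5):                      # q = -4 ... +4; q = 0 is beta_treat
    name = f"treat_m{-q}" if q < 0 else f"treat_p{q}"
    panel[name] = ((panel["tier"] == 1) &
                   (panel.index.get_level_values("year") == 2006 + q)).astype(int)
    names.append(name)

mod_dyn = PanelOLS.from_formula(
    "citations_per_fellow ~ " + " + ".join(names)
    + " + sfr + sfr_sq + EntityEffects + TimeEffects",
    data=panel,
)
res_dyn = mod_dyn.fit(cov_type="clustered", cluster_entity=True)
print(res_dyn.params[names])                # the delta_{t-q} ... delta_{t+q} path
```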

4 Results and discussion

We evaluate the first Excellence Initiative by applying both a qualitative and a quantitative approach. For the qualitative perspective, we consider the world university rankings of Quacquarelli Symonds (QS) and Times Higher Education (THE) (see Figs. 5, 6).Footnote 5 Both rankings reveal an almost steady improvement for all three ‘Excellence Universities’ selected in the first Excellence Initiative. These rankings reinforce the statement of Bernd Huber, president of LMU Munich, that the ExIn increased the visibility of German universities and generated an additional competitive edge. The Technical University of Munich (TU Munich) and the Ludwig-Maximilians-University of Munich (LMU Munich) thereby hold a special position, as they were promoted and funded again in the third Excellence Initiative, which started in 2010 and was launched in 2012.

Fig. 5 QS World University Rankings (based on six performance indicators, with weightings in parentheses: 1. Academic reputation (40%), 2. Employer reputation (10%), 3. Student-to-faculty ratio (20%), 4. Citations per faculty (20%), 5. International faculty ratio (5%), 6. International student ratio (5%))

Fig. 6 THE World University Rankings (based on five performance indicators, with weightings in parentheses: 1. Teaching (30%), 2. Research (30%), 3. Citations (30%), 4. Industry income (2.5%), 5. International outlook (7.5%))

Our fixed effects estimations, i.e. our quantitative approach, reveal a different picture of the first Excellence Initiative (see Tables 2, 3). Comparing universities funded through the ExIn (either through Institutional Strategies or through Graduate Schools and/or Clusters of Excellence) with non-funded universities, the treatment effect, i.e. the effect of ExIn funding on university performance as measured by citations per research fellow, is negative in all model specifications. This implies that being funded through the ExIn was associated with decreased research performance, which is, at first glance, contrary to the conclusions drawn from world university rankings, which base their scores mainly on research performance but also on research grants and funding.Footnote 6 Beyond the influence of higher education policies, the quality of research is positively shaped by teaching activities. Research productivity and intellectual contributions are generally moderated by the scientist’s inherent task bundle, ranging from networking and collaboration to teaching, mentoring, and commercialization activities (see Kyvik 2013). Our results suggest that the relationship between research performance and teaching intensity has an inverted U-shape, indicating that student interaction is stimulating for scientific research up to a certain point. White et al. (2012) support this view, arguing that teaching responsibilities shape the time and effort available for research activities.

Table 2 Fixed effects estimations of pre-treatment and post-treatment effects of 1st round of excellence initiative (comparison of first with third tier universities)
Table 3 Fixed effects estimations of pre-treatment and post-treatment effects of 1st round of excellence initiative (comparison of second with third tier universities)

The announcement or anticipation effect prior to the actual treatment, as indicated by \( \delta_{t-2} \), is significant and positive for the comparison of first with third tier universities, and positive yet insignificant for the comparison of second with third tier universities. Including the quantity of research, proxied by publications per research fellow, as an alternative measure of university research performance confirms the robustness of these results (see Appendix Tables 6, 7).

The results of the comparison of first with second tier universities are less straightforward (see Table 4). Neither the treatment nor the announcement effect differs statistically significantly from zero, and the expected positive sign of \( \delta_{t-2} \) cannot be confirmed. When we again include the quantity of research as an alternative measure, the expected signs, i.e. a negative treatment effect and a positive announcement effect, do emerge (see Appendix Table 8). To investigate the underlying mechanisms more thoroughly, we created a subsample as a robustness test, comparing Excellence Universities (winners of the funding line Institutional Strategies in the first round of competition) with to-be Excellence Universities (winners of the funding line Institutional Strategies in the second round of competition) (see Table 5). All to-be Excellence Universities of the second round had received funding in the first round, either through Graduate Schools or Clusters of Excellence. These additional estimations confirm our initial assumptions: a negative treatment effect and a positive announcement effect (albeit not statistically significant at the 10% level). The results are robust to our alternative measure, i.e. the quantity of research (see Appendix Table 9). We conclude that both a negative treatment effect and a positive announcement effect exist and are induced by the ExIn funding framework; only the effect sizes differ across the university groups, as indicated by the heterogeneous significance levels.

Table 4 Fixed effects estimations of pre-treatment and post-treatment effects of 1st round of excellence initiative (comparison of first with second tier universities)
Table 5 Fixed effects estimations of pre-treatment and post-treatment effects of 1st round of excellence initiative (comparison of Excellence Universities (of the first round of competition) with to-be Excellence Universities (of the second round of competition))

Conflating the qualitative and quantitative perspectives unearths the actual aggregate effect of the examined higher education policy initiative: it is not the ExIn funding itself, but the announcement of the launch of the initiative, with its associated funds and benefits for selected universities, that triggers university research performance and leads to diverging performance paths among universities. Universities that self-select into applying for such an initiative and participate in the contest show superior research performance and highly functioning departments prior to the potential funding period. The creation of a strong cultural ethos in the course of the application thus ultimately influences research performance outcomes (see Edgar and Geare 2013). The funding itself merely rewards previous out-performance and hence does not create sufficient further stimuli, resulting in a non-positive treatment effect of the ExIn. We can therefore support the argument of Gawellek and Sunder (2016) that applicants of the ExIn suffered a drop in efficiency during the contest.

Despite the robustness of our empirical results, this research is subject to a number of limitations. The most pressing issue is that we only investigated the first Excellence Initiative. Meanwhile, in the course of the second and third Excellence Initiative, further universities have been selected and promoted as excellence universities within the respective funding lines. As already mentioned, some universities, like the LMU Munich and the TU Munich, were selected again in the third round of the Excellence Initiative; others, like the RWTH Aachen, the University of Konstanz, or the Free University of Berlin, were promoted in both the second and the third round. A more sophisticated approach should consider these interdependencies and the associated volatile phases of being an ‘excellence’ university in one year and just a ‘normal’ university in another. Future studies should also try to capture the entire application process: applying for the ExIn is extremely time consuming, leaving less time for research and publishing. Thus, the three excellence universities of the first ExIn faced disadvantages compared to universities that did not apply.Footnote 7 As not only Institutional Strategies but also the two other funding lines of the ExIn, namely Graduate Schools and Clusters of Excellence, have influenced the performance of the respective universities, future studies should try to capture the interdependencies of the three funding lines and their respective impact on university performance (see Wollersheim et al. 2015). Propensity score matching might be one way to achieve a more fine-grained comparability of treated and non-treated universities. Finally, other political interventions or idiosyncratic regional effects might have impacted university performance and thus potentially triggered the observed diverging performance paths.

5 Conclusion

An extensive and fruitful strand of literature has been established on the assessment of university performance and the evaluation of corresponding higher education policy initiatives (Astin 2012; Fabel et al. 2008; Hazelkorn 2015). Previous studies have based their empirics mainly on point-of-time rather than period-of-time data, thus neglecting performance differentials of universities before and after the implementation of the respective policies. Beyond quantitative research, academic rankings of the world’s top universities have attracted increasing attention, also among policymakers, who interpret those rankings as indicators of the ability to create and sustain competitive advantages within the global higher education system, without considering the potential biases of such rankings (see Le and Tang 2015).

This study tackles this issue. Relying on both qualitative and quantitative measures, we evaluate the effects of the first Excellence Initiative in Germany. Whereas the world university rankings, our qualitative approach, reveal a positive effect of the ExIn, our fixed effects estimations, i.e. our quantitative approach, unearth a per se negative treatment effect but a positive announcement effect. Accordingly, it was universities’ endeavors to get promoted and funded that triggered university excellence and the associated research outcomes, not the actual funding itself, which served only as an incentive to exert effort prior to the actual treatment. This leads to several policy implications.

First, ‘picking the winners’ as a policy approach might not always produce the desired outcome that additional money leads to additional output, i.e., in the context of universities, to increased research performance. Policymakers should carefully select and promote universities while considering the effects on both promoted and rejected applicants. Establishing world-class universities at the expense of all other universities might decrease the well-known positive externalities of the higher education sector overall and thus a country’s competitive edge. Second, higher education policies should always comprise the entire higher education system in order to sustainably promote the university environment while still setting incentives for outstanding performance, i.e. establishing elite universities while giving under-performing universities the opportunity to catch up. Third, a sole focus on academic rankings might initiate, according to Martin (2017), a worrying development towards centralized top-down management, increased bureaucratic procedures, prescribed formulas for teaching, and scientific research driven by performance targets. Policymakers should prevent universities from being solely concerned with improving their academic ranking instead of fulfilling their fundamental tasks of teaching, research, and its commercialization, i.e. they should avoid rat races among universities. Additional performance criteria are thus needed to capture the complex interdependencies within academia.

Further research should specifically focus on how the formulation of targets of higher education policies relates to the respective outcomes. Only well-framed and broadly accepted policy initiatives are likely to produce the desired outcomes and enable a rigorous target/performance comparison, i.e. facilitate the evaluation of the respective programs. Moreover, future research should examine the entire Excellence Initiative, with its three phases and three funding lines from 2006 to 2017, i.e. take a more holistic approach covering the aggregate effects of the ExIn. While this research focused on the German university system and the performance differentials of treated and non-treated German universities, future studies should compare the performance paths of promoted German universities with those of funded universities in other countries with comparable initiatives, i.e. investigate the efficiency and effectiveness of German university policies in an international context while considering different institutional forms, governance mechanisms, and endowments. Following Leyden and Menter (2018), the cross-fertilization of basic and applied research and the impact of various funding sources on the performance of funded universities should also be further examined.