Introduction

For a number of years New Zealand universities have produced lists of articles published in scholarly journals, chapters in books, monographs and complete books, conference presentations and other ‘outputs’ that indicate the breadth of the research enterprise. At the individual researcher level these outputs are used to compile ‘evidence portfolios’ as part of the Performance-Based Research Fund (PBRF) scheme, through which universities and other participating tertiary educational institutes receive a proportion of their funding.

In this scheme, individual researchers compiled a portfolio listing up to 30 research outputs over a 5-year census period within a number of categories (e.g., journal articles, chapters in books, research books, published conference papers). From these each researcher selected four outputs and, for each, described the value of the research to the discipline and commented on its likely impact, often by referring to the impact factor of the journal in which the research was published, the number of times the article had been cited (if published early enough in the 5-year census period), or favourable reviews of the work. There has been a perception of an implicit preference for articles published in peer-reviewed journals, and for other outputs that have been invited, selected or otherwise demonstrated by some external process to be of quality. In addition to research outputs, each researcher listed and commented on activities across a range of categories that either enhanced the researcher’s standing within the research discipline (‘peer esteem’) or contributed to the ‘research environment’.

Assessment panels for a range of disciplines examined each portfolio, assigning a weighted ranking to an assessment of the quality of the research outputs (70%), ‘peer esteem’ (15%) and ‘contribution to the research environment’ (15%). The assessment of research outputs was based on the number of outputs and a judgement of their quality, informed by the researcher’s self-reporting described above and by panelists’ knowledge of the standing of journals, book publishers and professional conferences in their respective disciplines. On this basis, quality scores for each researcher were determined and converted to an individual ranking of A, B, C or R (the last designating no research activity). The scores were then weighted and aggregated to give a quality score for each ‘subject area’. Weightings were assigned to the six quality categories of staff: the agreed funding weights of “A” (5), “B” (3), “C” (1), “C(NE)” (1), “R” (0) and “R(NE)” (0), where NE indicates a ‘new researcher’ and R indicates ‘research inactive’, were multiplied by 2 to give an enhanced weighting. These weightings were summed and then divided by the number of staff in a research area. This resulted in a rating scale of 0–10, where 10 would be the research score for a School, department or other academic unit if all its researchers were graded A.
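To make the arithmetic of this aggregation concrete, the following minimal sketch (in Python) computes a subject-area score on the 0–10 scale from a set of entirely hypothetical staff gradings; the function name and example numbers are illustrative only and are not drawn from the PBRF documentation.

```python
# Illustrative sketch of the PBRF subject-area quality score described above.
# The grade counts below are hypothetical; the funding weights (A=5, B=3,
# C=1, C(NE)=1, R=0, R(NE)=0) and the doubling to a 0-10 scale follow the
# description in the text.

FUNDING_WEIGHTS = {"A": 5, "B": 3, "C": 1, "C(NE)": 1, "R": 0, "R(NE)": 0}

def subject_quality_score(grade_counts):
    """Return the 0-10 quality score for a subject area.

    grade_counts maps each quality category to the number of staff
    assigned that grade.
    """
    total_staff = sum(grade_counts.values())
    if total_staff == 0:
        return 0.0
    weighted_sum = sum(2 * FUNDING_WEIGHTS[grade] * n      # weights doubled
                       for grade, n in grade_counts.items())
    return weighted_sum / total_staff                      # averaged per staff member

# Hypothetical department: 4 A's, 10 B's, 6 C's, 2 new researchers, 3 inactive.
example = {"A": 4, "B": 10, "C": 6, "C(NE)": 2, "R": 3, "R(NE)": 0}
print(round(subject_quality_score(example), 2))   # 4.64 on the 0-10 scale
```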

The results of Schools, departments or other academic units are compared in ‘league tables’ published after each PBRF round (Tertiary Education Commission 2004, 2007). Coupled with information collected from universities and other participating educational organizations about the number of research degree completions and the amount of external research income generated, an overall research ranking is assigned to each institution, and this determines the research funding component of the Government funding provided.

While there has been some analysis of various aspects of the PBRF (e.g., Smart 2007, 2009), little use seems to have been made of the aggregated data on research outputs to quantify other reported changes in research culture consequent on the introduction of the PBRF. This paper uses bibliographic information from compilations of staff and student research at Victoria University of Wellington from 2000 to 2007 (Staff and Student Research 2000–2007) and from compilations on university websites (Massey University 2000–2006; University of Waikato 2004–2008; University of Otago 2004–2006) to demonstrate that since the introduction of the PBRF (1) all subjects are becoming ‘science-like’, with an emphasis on the production of journal articles, and the PBRF is influencing the type of research outputs produced; (2) collaboration, as measured by co-authorship, is increasing; and (3) despite the rhetoric about the PBRF system enhancing quality, it effectively remains a “publish or perish” numbers game.

Becoming ‘science-like’

Butler (2006) identifies four ‘bands’ of subject fields compiled by the Australian Department of Education, Science and Training (DEST). For Band 1, the quantitative sciences, Butler infers that bibliometric indicators, particularly those involving citation analysis, are normally robust; for Band 2, the more descriptive sciences, “bibliometric analysis has something useful to say about performance…”; for Band 3 more caution is needed in the use of bibliometric measures; and for Band 4 she alleges “the use of standard bibliometric measures cannot be supported at all”. These bands and the corresponding Schools within Victoria University are shown in Table 1.

Table 1 DEST bands and fields and representative Schools at Victoria University of Wellington

Butler’s (2006) concerns about citation analysis and similar measures, echoed by Corbyn (2008), may well be valid; nevertheless, Fig. 1 strongly suggests that subjects’ PBRF scores are correlated with the number of research articles published per staff member: the more articles, the higher the score. The highest-scoring subjects are the quantitative sciences, so the trend in Fig. 1 can be read as showing the extent to which subjects are becoming ‘science-like’.

Fig. 1
figure 1

Variation of subject PBRF score for representative subjects of DEST Bands 1–4 and the number of articles published per staff member for the period 2004–2006. The trend line shown, which is for the Band 4 subjects (crosses), is close to the lines that could be drawn between the two subjects representing Band 1 (filled diamonds) and the two subjects representing Band 3 (open triangles). The PBRF ‘quality scores’ are from Tertiary Education Commission (2007)

Figure 2 shows the trend in the number of articles produced, aggregated over the representative subjects (Table 1).

Fig. 2
figure 2

A Variation of the number of articles published for subjects that are representative of DEST Bands 1–4, relative to publications in 2000, for the period 2000–2007. The base numbers of articles (i.e., in 2000) to which subsequent years are compared are: DEST Band 1, 86; DEST Band 2, 55; DEST Band 3, 34; DEST Band 4, 121. B Variation of the number of academic staff members in the discipline areas of Table 1 relative to the numbers in 2000, for the period 2000–2007, compiled from the annual Calendars of Victoria University

While a general upward trend in the number of publications could be explained simply by an increasing number of academic staff (the trends for which are shown in the lower part of the figure), the peak and subsequent decline are clearly not attributable to such a cause. The most pronounced increase occurs in 2005, in advance of the partial PBRF round in 2006, and the peaks for Bands 1, 2 and 4 in that year suggest strategies were in place to increase the number of outputs available for inclusion in the 2006 round. The small peak in 2003 for Band 4 and the rising trend for Band 1 at that time suggest some strategies were also in place, albeit belatedly, to enhance output generation for the first round, whereas the trends for Bands 2 and 3 to 2003 suggest a lack of strategic engagement with that round. Whether these ‘strategies’ represent motivation for improvement by individual researchers or a wider School- or Faculty-based activity is not known. All bands show a decrease in the publication of research articles in 2006, suggesting that any deliberate effort to increase the number of research outputs reported in the evidence portfolios for the 2006 round has not been sustained since then.
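For clarity, the relative values plotted in Figs. 2, 3 and 4 are simple ratios of each year’s output count to the corresponding count in 2000. A minimal sketch of that normalisation is given below; the 2000 value (86) is the Band 1 article baseline quoted in the Fig. 2 caption, but the counts for later years are invented for illustration.

```python
# Normalising annual output counts to the 2000 baseline, as used for the
# relative values in Figs. 2-4. Only the 2000 figure (86, the Band 1 article
# baseline) comes from the text; the later counts are hypothetical.

def relative_to_base_year(counts_by_year, base_year=2000):
    """Return each year's count divided by the base-year count."""
    base = counts_by_year[base_year]
    return {year: count / base for year, count in counts_by_year.items()}

articles_band1 = {2000: 86, 2001: 90, 2002: 95, 2003: 102, 2004: 110, 2005: 128}
print(relative_to_base_year(articles_band1))
# e.g. 2005 -> 128/86, i.e. about 1.49 times the 2000 output
```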

The number of conference papers (comprising those published in proceedings and those presented orally or as posters) increased from 2000 to 2005 for all bands and, with the exception of Band 1 between 2006 and 2007, decreased thereafter (Fig. 3). The peaks for all bands in 2005 again suggest strategies were in place to increase the number of outputs available for inclusion in the 2006 round. The growth in activity is greater for Bands 2 and 4 than for Band 1, probably reflecting Band 1’s traditionally greater emphasis on reporting research in refereed journal articles rather than presenting it at conferences; the large increase for Band 1 between 2006 and 2007 may, however, represent a change to more comprehensive reporting of conference-based outputs in these journal-oriented disciplines.

Fig. 3
figure 3

Variation of number of conference presentations, posters and conference papers in published proceedings for subjects that are representative of DEST Bands 1–4, relative to those in 2000. The base numbers of conference presentations (i.e., in 2000) to which subsequent years are compared are: DEST Band 1, 37; DEST Band 2, 24; DEST Band 3, 35; DEST Band 4, 177. The base numbers of conference papers published in formal proceedings (i.e., in 2000) to which subsequent years are compared are: DEST Band 1, 9; DEST Band 2, 15; DEST Band 3, 20; DEST Band 4, 43

The ratio of conference papers published in proceedings to those presented as oral or poster papers (Fig. 4) varies significantly through the time period. For the science subjects of Band 1 there is a clear drive towards papers in conference proceedings in the run-up to both the 2003 and 2006 PBRF rounds, but the ratio subsequently reverts to levels typical of Band 2. Band 3 shows a consistently higher ratio of papers in conference proceedings to other forms of conference presentation. Bands 2, 3 and 4 all show a recent shift in favour of conference papers published in proceedings. This could reflect a perception among researchers that research reported in conference proceedings is more highly regarded than research that is not; equally, it might reflect conference organizers’ belief that publication of proceedings adds to the prestige of their conference.

Fig. 4
figure 4

Ratio of conference papers published in proceedings (PROC) relative to those presented (PRES) (including poster papers) for subjects representative of DEST Bands 1–4, relative to those in 2000

The ratio of journal articles to conference papers fluctuates widely through the period (Fig. 5). For the science subjects journal articles are consistently the output of choice, particularly in the run-up to the PBRF rounds; this pattern is very clear for Band 1, while Band 2 shows a similar but more subdued behaviour. For Band 4 the ratio is essentially constant, and there is also comparatively minor variation for Band 3 apart from the anomalous peak in 2004. The overall conclusion from these curves is that the PBRF environment appears to encourage journal article production over conference presentations. There is anecdotal evidence that those charged with fostering research in the University support this trend.

Fig. 5
figure 5

Ratio of articles published to conference papers presented and published for subjects representative of DEST Bands 1–4. Higher values of the ratio indicate a dominance of articles over conference presentations

The myth of quality

New Zealand’s Tertiary Education Commission (2007, p. 62) reported an increase in the quality scores between the first round of the PBRF in 2003 and the second ‘partial’ round in 2006:

Overall, the results show that the quality score for the tertiary education sector has increased from 2.59 in 2003 to 2.96 (FTE-weighted) in 2006. This represents a 14.3% improvement in measured research quality. It would, however, be erroneous to suggest that research quality has improved by this precise magnitude. Nor is the quality score the only measure of research quality.

The report elaborates a little later (p. 63):

… whatever the actual improvement in average research quality, there can be little doubt that there has been an increase in research quality and in the quantity of research output since 2003. [italics added]

The previous section certainly makes clear that the number of research outputs, particularly articles in journals, has increased and that this is related to the PBRF quality scores. Whether the PBRF ‘quality scores’ are real measures of research quality is less certain. Bibliometric approaches that purport to be indicators of research quality have their origins in Garfield’s (1972) work on citations, and find their most recent expression in the changes proposed for the United Kingdom’s research assessment exercise (Adams 2008). Countering an earlier finding by Doyle and Arthurs (2000), Taylor and Dillon (2007) found that highly cited articles are generally those published in more highly ranked journals. Thus, the proportion of an individual researcher’s or a business school’s published papers that appear in journals of higher rank might be considered a measure of research quality.

The percentages of publications in journals ranked by the Australian Business Deans Council (ABDC) (2008) on a four-point scale—A* (highest), A, B, C (lowest)—can be determined for the business schools at Waikato University and Victoria University from 2004 to 2008, and Otago University from 2004 to 2006 (Table 2).

Table 2 Percentage of peer-reviewed journal articles from business schools/faculties at Victoria University, University of Waikato and University of Otago, in journals graded by the Australian Business Deans Council, 2004–2008

From a peak in the proportion of papers published in A*-rated journals in 2005, Victoria University shows a marked decrease, a trend not seen for Waikato and Otago Universities. By comparison, the trends in the percentages of peer-reviewed articles in A-rated journals are similar for Waikato and Victoria. From peaks in 2006 and 2005, respectively, the percentages of peer-reviewed articles in B-rated journals for Waikato and Victoria trend towards comparable values. From similar percentages of peer-reviewed articles in C-rated journals in 2006, Victoria has shown a continued decrease, whereas Waikato began to increase from 2007. For the percentage of articles in journals not included on the ABDC list, Victoria shows a marked decline from 2004 to 2005, a lesser decline from 2005 to 2007, and then a marked increase from 2007 to 2008. Over the same period the percentage of such articles increases steadily for Waikato University, while the percentage for Otago University remains essentially constant.

It is easier to envisage the overall effect of the trends of Table 2 if the percentages of articles in each category of ABDC journal rank are used to calculate a weighted value (J) as shown in the equation below.

$$ J = [\%\text{A}^*] + 0.8\,[\%\text{A}] + 0.6\,[\%\text{B}] + 0.4\,[\%\text{C}] + 0.2\,[\%\text{U}] $$
(1)

This means that, expressing the percentages in Eq. 1 as proportions of the total, if all the publications were ranked A* the weighted value J would be 1.0; if all were ranked A, J would be 0.8; if all were ranked B, 0.6; if all were ranked C, 0.4; and if all were un-ranked (i.e., U), 0.2. It might have been anticipated that, if the PBRF system were a driver of research quality, the weighted value would increase over time. Although there are fluctuations, the plot shows an overall decline for all three universities after 2005, i.e., an increasing proportion of articles in lower-ranked journals (Fig. 6).
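As a concrete illustration of Eq. 1, the minimal sketch below computes J for one hypothetical year of data (the category shares are invented, not taken from Table 2), treating the percentages as fractions of the total so that J lies between 0.2 and 1.0.

```python
# Illustrative calculation of the weighted journal-rank value J of Eq. 1.
# Category shares are expressed as fractions of all articles (so they sum
# to 1); the weights 1.0, 0.8, 0.6, 0.4 and 0.2 follow the equation above.

RANK_WEIGHTS = {"A*": 1.0, "A": 0.8, "B": 0.6, "C": 0.4, "U": 0.2}

def weighted_rank_value(shares):
    """Return J for a dict of ABDC-category shares (fractions summing to 1)."""
    return sum(RANK_WEIGHTS[rank] * share for rank, share in shares.items())

# Hypothetical year: 10% A*, 30% A, 25% B, 20% C, 15% un-ranked.
example = {"A*": 0.10, "A": 0.30, "B": 0.25, "C": 0.20, "U": 0.15}
print(round(weighted_rank_value(example), 2))   # 0.6
```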

Fig. 6
figure 6

Variation of the weighted value J (Eq. 1: J = [%A*] + 0.8 [%A] + 0.6 [%B] + 0.4 [%C] + 0.2 [%U])

The high variability in the proportions of A*-, A-, B- and C-ranked journals from year to year provides no long-term evidence that researchers are writing for more highly ranked journals. Indeed, there is an overall trend towards higher percentages of publications being in C-ranked and un-ranked journals, suggesting that the ABDC rankings are having less impact over time. Coupling this observation with the slight overall increase in the number of publications over the period for the business schools at Victoria University and the University of Waikato (Table 3) leads to the inference that one consequence of the implementation of the PBRF may have been simply to increase the number of publications in journals without real regard to journal quality.

Table 3 Number of peer-reviewed journal articles from business schools/faculties at Victoria University, University of Waikato and University of Otago, 2004–2008

Anderson and Tressler (2008) noted that while there is a consistency across measures for the lowly ranked economics departments in New Zealand tertiary institutions participating in the PBRF,

the scheme used to measure output does matter to some departments, especially for Auckland, Victoria and Waikato. Crudely put, what is at play here is the matter of quantity versus quality. It is apparent that Waikato [economics] researchers are the most active in New Zealand, but based on our weighting schemes, especially those based largely on citation counts, many of their publications carry little weight in the overall calculation. On the other hand, under certain weighting schemes, researchers at Auckland and Victoria can be characterized as low volume, but high quality producers.

In a comparison of citations per full-time-equivalent researcher with the quality of research as determined by the PBRF for all disciplines, Smart (2007) similarly found “a positive association between the quality of research and the academic impact of research”, but noted that “the strength of this relationship varies between the subject panels and between 2003 and 2006”. In contrast, Smith (2008) found good correlations between citation counts and the total number of research outputs across all disciplines within New Zealand universities. Despite this, overseas experience with citation counting as a measure of research quality is likely to be watched with interest in New Zealand (Smart 2009).

Collaborative behaviour

Researchers often assert informally that one of the consequences of the PBRF is the loss of collegiality in the academic environment. One measure of research collaboration is the extent of co-authorship, for which a weighted value for collaborative authorship (A) can be obtained as described in Eq. 2:

$$ A = \frac{N_1 + 2N_2 + 3N_3 + 4N_4}{N_1 + N_2 + N_3 + N_4} $$
(2)

where $N_1$ is the number of articles with one author; $N_2$, the number with two authors; $N_3$, the number with three authors; and $N_4$, the number with four or more authors.
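A minimal sketch of Eq. 2, using invented article counts rather than the Victoria University data, is:

```python
# Illustrative calculation of the weighted co-authorship value A of Eq. 2.
# n1..n4 are the numbers of articles with one, two, three, and four or more
# authors respectively; the counts used in the example are hypothetical.

def coauthorship_value(n1, n2, n3, n4):
    """Return the weighted average number of authors per article (Eq. 2)."""
    total_articles = n1 + n2 + n3 + n4
    if total_articles == 0:
        raise ValueError("no articles")
    return (n1 + 2 * n2 + 3 * n3 + 4 * n4) / total_articles

# Hypothetical subject: 10 single-author, 15 two-author, 8 three-author,
# and 7 articles with four or more authors.
print(round(coauthorship_value(10, 15, 8, 7), 2))   # 2.3
```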

Figure 7 shows that co-authorship by researchers in all subject bands has generally increased during implementation of the PBRF.

Fig. 7
figure 7

Weighted average number of authors per article published for subjects representative of DEST Bands 1–4, Victoria University of Wellington, 2000–2006

Of course, co-authorship has been common for many years in chemistry and physics, and more recently publications in biological sciences have followed suit. In these disciplines co-authorship is generally driven by both a team-research culture and, more pragmatically, the need to share equipment and resources. Figure 7 shows that an average of three authors per paper is characteristic of Band 1, with lower levels of co-authorship in other sciences and the social sciences (Bands 2 and 3). Even the humanities (Band 4) show a slight increase towards higher levels of co-authorship during the period. That the lines for the Bands do not cross suggests, nevertheless, that traditional patterns of co-authorship are being maintained—and not being unduly influenced by the PBRF, at least at Victoria University.

Patterns of co-authorship vary between universities. As an example, Fig. 8 shows the variation of co-authorship in the ‘management’ discipline at Victoria University (which for this compilation comprises the Victoria Management School and the School of Government) and at Massey University (which for this compilation comprises the departments of Human Resource Management, Management and Enterprise Development, Management and International Business, and Management at Palmerston North and Wellington). The Massey trend shows a significant increase in co-authorship around the time of the 2003 PBRF round, but this was not sustained into the preparation for the 2006 round.

Fig. 8
figure 8

Variation of weighted average number of authors per article in management disciplines at Massey University and Victoria University of Wellington, 2000–2006. (At Massey University 64 academic staff-members are listed in management disciplines on its website; the 2008 Victoria University Calendar lists 35 staff-members at the Victoria Management School. The data sets are small for this analysis, for example, in 2006, 42 peer-reviewed papers were published by the Massey staff, and 21 were published by the Victoria staff)

The corresponding curves for accounting (which at both universities includes business and commercial law) are shown in Fig. 9. In this case the overall trends with time are similar, but of particular interest is that there is consistently more co-authorship at Massey than at Victoria throughout the period. While it may be that accounting staff are simply more professionally gregarious at Massey than at Victoria, it seems more likely that co-authorship among Massey business staff was actively encouraged to bolster its number of PBRF outputs. Mutual co-authorship of papers, especially with colleagues in the same institution, is an obvious and easy method of enhancing the number of research outputs, and one of which accrediting agencies have become aware. As an example, EFMD (2009, Table 2, p. 46) suggests reporting, for each research output, the number of contributing faculty members, together with the number of items jointly produced by two or more faculty members and the number produced through cross-institutional collaboration.

Fig. 9
figure 9

Variation of weighted average number of authors per article in accountancy and associated law disciplines at Massey University and Victoria University of Wellington, 2000–2006. (At Massey University 29 academic staff-members are listed in accounting and business law on its website; the 2008 Victoria University Calendar lists 33 staff-members at the School of Accounting and Commercial Law. The data sets are small for this analysis, for example, in 2006, 12 peer-reviewed papers were published by the Massey staff, and 19 were published by the Victoria staff)

This sort of ‘game-playing’—albeit minor in this instance—is not unexpected, nor without precedent. Professor Charles Oppenheim, for example, is cited by Clark (2009) as saying that in the citation-counting climate of the United Kingdom’s research assessment exercise there is evidence of “citation clubs, pressure to publish in highly cited journals, a reluctance to publish with junior staff, reluctance to publish anything that is blue-sky, speculative or interdisciplinary”, an echo of an earlier comment by Barker (2007).

Conclusion

Arguments for the introduction of New Zealand’s PBRF included the claim that the quality of research would increase. After one full round and a subsequent partial round of the process, even the Tertiary Education Commission is guarded about the scheme’s success in this regard. What is clear, however, is that the scheme has stimulated the production of more ‘research outputs’, and that the disciplines that score highest on the scheme’s measures are those that produce more journal articles. Moreover, there is a widely held perception that journal articles are the most highly valued research output, contributing most to personal PBRF scores and thereby strongly influencing academic-unit and institutional scores. This perception is supported here by the finding that the trends over time in the number of conference papers presented and published in conference proceedings are strongly mediated by the timing of the PBRF rounds.

The introduction of quality ratings of business journals, as determined by the Australian Business Deans Council, has had little influence on the publishing behaviour of staff at Victoria, Waikato and Otago Universities. While there are fluctuations in the proportions of articles in the various grades of journal over the 2004–2008 period, the overall tendency is towards more articles in journals of lower grades, i.e., the quantity of articles appears to be overriding considerations of the quality of the journal. In this respect the New Zealand PBRF experience is coming to mirror Moed’s (2008) analysis of the United Kingdom’s Research Assessment Exercise.

It is also apparent that co-authorship, while generally increasing slowly in all disciplines, may on occasion be actively encouraged, in business disciplines at least, simply as a means of increasing the number of research outputs. Currently the rules of the PBRF do not assign a per-author share of a research output, meaning that a co-authored paper ‘counts’ in full for every author. Thus co-authorship is an expedient means of potentially increasing the quality score of the authors, their Schools or departments, and their universities.

Although not the specific focus of this paper, it became apparent during its compilation that a considerable amount of time and effort is being expended on monitoring research outputs, not only for PBRF purposes but also for accreditation and similar audit and evaluation trails. Piercy (2000) argued that “for business schools to devote their efforts and resources to improving their RAE scores is likely to be damaging and counter-productive.” After a few more years it may be possible to determine whether his prophecy holds true for New Zealand business schools.