Introduction

There is an increasing interest in comparing the efficiency of national health care systems rather than just health care expenditure. The present work purports to provide such a comparison between the hospitals of two neighboring countries, Germany and Switzerland. It seeks to answer the question of whether a given bundle of hospital services can be provided with fewer resources in the German federal state of Saxony compared to Switzerland, and whether findings are robust when attempts are made to take institutional differences into account. The method used to assess relative performance is DEA (data envelopment analysis). First, the institutional background of the two hospital sectors is described, followed by a characterization of the DEA applied and the two data sets used. The following section contains a preliminary test for efficiency by juxtaposing each country’s decision-making units (observations) to a joint reference set. It continues by restricting the sample to those units that can be projected on the other country’s efficiency frontier. Also, a test of robustness is performed at this stage. Next, differences between the countries with respect to the stringency and payment for hospital services are tested by modifying the DEA. The final section presents concluding remarks.

The analysis of hospital efficiency in a given country has a certain tradition. One of the first applications was presented by Banker et al. [1], who compared estimated efficiency using DEA and a parametric translog cost function. Färe et al. [10] not only addressed efficiency of Swedish hospitals, but also measured changes in productivity by adapting the Malmquist index to DEA. A vast number of studies have been presented for the United States hospital sector (e.g., Burgess and Wilson [3], Ferrier and Valdmanis [12], with extensive citations). Dalmau-Matarrodona and Puig-Junoy [9] analyzed the effects of market structure on hospital efficiency in Spain using data from Catalonia; Linna and Hakkinen [16] and Linna [17] estimated efficiency of Finnish hospitals, comparing DEA with a wide range of parametric alternatives. Linna [18] used similar methodology to test for the productivity effects of a reform in Finnish health care finance. Bjørn et al. [2] analyzed the effect of a change of financing regime on the efficiency of Norwegian hospitals. Steinmann and Zweifel [28, 29] related efficiency of Swiss hospitals to regional differences in hospital financing and ownership.

By way of contrast, international comparisons of hospital efficiency are rare. On the output side, one reason is the differences in patient and treatment classification systems that impede comparability of outputs. In addition, the absence of quality measurement, already constituting a problem in within-country comparisons, detracts even more from international comparisons of hospital outputs [23]. On the input side, differences in labor law, e.g., weekly working hours, and the question of how to transform national currencies and price levels, complicate the analysis. However, Mobley and Magnussen [19] and Magnussen and Mobley [20] compared the relative efficiency of regulated public Norwegian hospitals with that of Californian hospitals, which operate in a largely unregulated, competitive environment, and found systematic differences.

Of course, the ensuing analysis suffers from the same limitations as those noted in the previous paragraph, although the sensitivity of results to inputs whose measurement depends on the exchange rate will be tested in the section entitled “Efficiency comparison restricted to comparable observations”. However, being based on DEA, which seeks to establish relationships between inputs and outputs, the comparison of performance neglects possible differences in quality. In the context of the present international study, this point needs to be borne in mind always.

The hospital sectors of Germany and Switzerland

The macro perspective

Germany is partitioned into 16 federal states, of which five acceded to the Federal Republic of Germany in the process of reunification in 1990 from former East Germany. Saxony is one of these five states, whose health care system was completely different from and incompatible with that of the 11 old states. This fact, together with considerable obsolescence in East Germany, led to the political decision to entirely restructure the health care system of the new states. The modernization of the hospital sector constituted a major challenge, as not only were replacement investments overdue but the state of technology was also lagging far behind. To finance the process of hospital restructuring, a program was put in action based on a special law (Gesundheitsstrukturgesetz, GSG, art. 14) [13], according to which the equivalent of some US$ 10.6 billion will be allocated to the five states of former East Germany through the end of 2004 (for an overview, see Table 1).

Table 1 Hospital sectors of Saxony and Switzerland

The hospital sector in Germany (and also in Saxony) is characterized by a hierarchical ordering. Hospitals for ordinary care, constituting the lowest level, must have at least two specialties. In Saxony, these have to be a medical and a surgical department which may be supplemented by a gynecological and/or a pediatric division. Hospitals for intermediate care provide services at an already advanced level, featuring all the major diagnostic as well as therapeutic facilities. In Saxony these hospitals contain surgical, medical, gynecological, ophthalmologic, otolaryngology, orthopedic, pediatric, and urology departments. To meet regional demand, they may be supplemented with dermatology, neurology, and psychiatry departments. Hospitals for advanced care offer the full range of treatments available, using the newest medical technology. They also engage in medical research and education. In addition to these three levels of hospitals, there is a fourth group of hospitals that offer specialized care, e.g., for cardiac patients (Sächsisches Staatsministerium für Soziales, Gesundheit, Jugend und Familie) [24, 25].

In Saxony, a state with large rural areas, the prime objective in restructuring the health care system, and especially the hospital sector, was to provide services of different levels according to population density. The outcome of this “location–allocation assignment” problem is a large number of hospitals for ordinary care, comprising 300–400 beds. This differs markedly from the structure prevailing in Germany as a whole, where units with up to 200 beds are more common (Table 1).

Germany has a dual system of hospital finance. Investments are the responsibility of the confederation, while operating costs are covered by payments from public and private insurance, mainly through per diems. Case-based payments and fee-for-service items make up 22–23% of hospital revenue. State governments impose strict hospital planning but consult the regional hospital associations and health insurers.

In Switzerland, the 26 cantons (member states) are responsible for assuring the provision of health care services, in particular in the hospital domain. However, this does not imply that hospital finance comes from cantonal sources only. On the contrary, current hospital expenditure is financed by social health insurers, resulting in a dual system of hospital finance (Table 1).

Hospitals are not distinguished according to a hierarchical level as in Germany, although cantons seek to put a degree of division of labor in place. In particular, hospitals are not required to have a minimum number of specialties to qualify for a certain functional status; the main distinguishing feature is whether or not psychiatric and geriatric care is offered. The new federal law on social health insurance of 1994 (effective 1996) requires cantons to specify lists of hospitals that are admitted to provide treatment to individuals insured by one of some 100 competing sick funds. However, the criteria used for inclusion in these lists vary between cantons, encouraging heterogeneity between hospitals.

This lack of criteria also translates into an absence of stated criteria for ongoing hospital planning in most cantons beyond the objective of appropriate provision at the regional level. In many instances, even rather rural cantons have facilities that would be deemed appropriate for advanced care; however, the great majority of patients requiring advanced care can reach a teaching hospital within 2 hours. Conversely, hospitals with more than 50 and fewer than 100 beds dominate the picture; in the Swiss sample (see the section entitled “Description of the two data sets”), hospitals with fewer than 200 beds make up 78% of all units.

Contrary to Germany, the financing of hospital investment lies entirely with the cantons, with federal involvement only in support of teaching. With regard to operating costs, the law of 1994 obliges insurers to contribute at most 50% of the cost accruing in the public ward, while cantons have to cover the residual cost. Modes of financing differ. In the majority of cantons, hospitals are paid per diem; some are experimenting with prospective per-case payment, and others (notably the canton of Vaud) have introduced a global budget to which insurers contribute although it is under the control of the state.

The goals of decision makers

In this section, we briefly describe binding restrictions, optional targets, and incentives of decision makers in the German and Swiss hospital sector. The parties involved are public agencies, public and private health insurers, hospital management, and patients.

Legal restrictions in Germany and Switzerland are, to some extent, different. They express differing social preferences and give rise to differing incentives for decision makers, especially for hospital managers.

In Germany, the Hospital Finance Act (KHG, art. 1) [15] mentions appropriateness of provision and acceptable per diems as objectives, while the National Ordinance on Hospital Rates (BPflV) emphasizes stabilization of contribution rates to public health insurance and performance of hospital comparisons. Cost efficiency thus is one of the prevalent official objectives.

The decision situation of hospital managers in Saxony can be described as follows. The number of beds in each department is fixed in the process of hospital planning. Since shifts between departments are not permitted, the number of beds at the department level as well as the hospital level amounts to a nondiscretionary quantity. Of the total stock, private beds account for a minimal share, as a mere 2% of Saxony’s population has private insurance. Negotiations with the public agency and the association of public health insurers result in annual budgets, composed of the per diem and the number of patient days. Therefore, management has a degree of discretion over patient days. This degree of discretion is limited by the fact that the number of cases is negotiated, too, which implicitly defines a targeted length of stay. If the budget target is exceeded, the value of the per diem is reduced. Conversely, if expenditure falls short of the budget, the hospital can keep only part of the difference. This means that hospital managers have a clear incentive to meet the approved number of patient days. If below budget, extending length of stay is difficult because this variable is closely monitored by the public agency and health insurers. Increasing the number of cases makes sense only if low-cost patients can be attracted. If successful, this policy tends to enhance cost efficiency.

At this point, it should be noted that the Saxon data refer to target rather than actual quantities. However, this difference may not be as important as it seems at first. Firstly, the arguments of the preceding paragraph suggest that hospital managers have a strong interest to meet targets in terms of patient days, length of stay, and therefore number of cases, at least as long as targets are within reach. Secondly, these targets probably remain within reach because they result from a negotiation process that starts from realized quantities in the previous period and involves comparisons with other, similar units.

Patient choice is hardly reflected in hospital performance, since according to the Social Security Code (SGB V) [26], admitting physicians may only choose among the two hospitals closest to residence. Otherwise, they must seek the consent of the health insurer. For treatment outside the state of residence, the patient has to pay any difference in cost. This makes patient migration a rare phenomenon, with only 2% of all patients residing in Saxony receiving cross-border care. The bulk of patient movement within Saxony is between hospitals of different hierarchical level, reflecting medical reasons. Judging from a study of patients with cardiovascular disease (A. Karmann, G. Dittrich and J. Vaillant, unpublished paper “How much coordination in patient careers? A macro analysis of repeated admissions in the hospital sector of Saxony”, circulated at Dresden Technical University of Saxony, 2003), migrations between hospitals of the same hierarchical level amount to only 4% of all migrations. In sum, a patient’s choice of hospital is de facto quite limited, providing little incentive for quality competition.

In Switzerland, the new federal law of 1994 stipulates that health insurers may not cover more than 50% of current expenditure of public wards in publicly owned hospitals (excluding costs of excess capacity, investment outlays, and teaching and research). The remainder of expenditure and investment outlay must therefore be covered by communities forming regional hospital associations and by the cantons. This involvement of cantons in the financing of hospitals creates an incentive on their part to control cost. However, this incentive is undermined by their ability to shift the burden of hospital deficits to health insurers through high fees (which create leeway for cost increases). Cantons can act in this way because fee negotiations involve their cantonal hospital association, and in the case of failing negotiations, they serve as the ultimate arbiter. In all, cantons have limited interest in achieving cost efficiency in the hospital sector. This conclusion is little affected by the fact that the new federal law not only confirms the authority of cantons to engage in hospital planning but also introduces an obligation to this effect.

Hospital efficiency is attained through a favorable quality–cost ratio, brought about by public planning or competition. The new federal law states effectiveness, appropriateness, and efficiency as objectives, while explicitly mentioning only hospital planning as a means to achieve these ends. On the planning side, hospital associations have limited incentive to resist hospital physicians in their quest to increase quality of treatment through investments because the canton shares in the investment outlay. Cantons are in a similar situation since higher quality attracts patients from other cantons, who are made to contribute through substantially higher fees; an estimate for 1994 puts the share of patients crossing cantonal borders for treatment at 15% [8], roughly sevenfold the figure for Saxony which in addition includes migrations within the federal state. With regard to cost, increases in current expenditure triggered by cantons’ investment decisions are borne up to one-half by social health insurers. Insurers, doing business nationwide and regulated to set largely uniform premiums, in fact make their members who reside outside of the canton in question contribute to some extent to hospital costs engendered by that canton.

On the competition side, hospitals again put emphasis on quality because the insured have free choice of hospitals within their canton of residence, without any implication for premium paid or (minimal) cost sharing. Moreover, at least 22% of the population (compared to 2% in Saxony) have supplementary health insurance granting mainly hotel-type amenities and choice of hospital beyond the canton of residence (Federal Office for Social Insurance, p. 143) [11].

Finally, the new federal law stipulates a premium subsidy for residents designed to limit the fraction of income that must be paid for health insurance premiums. This serves to reduce pressure by voters to limit premium increases and indirectly the surge of hospital cost. Moreover, these subsidies are financed in the guise of matching grants. Cantons are obliged to augment federal contributions by at least 50%. But they can forego up to 50% of these contributions, thus reducing their own burden. Thus, implementation of the law varies considerably between cantons, whose governments can still let hospital fees increase without having to bear more than a small part of the engendered subsidization cost.

Conclusion 1

In Germany, the hospital remuneration scheme makes patient days the primary target variable. Moreover, the fact that the observations are planned rather than actual quantities is of minor importance. In Switzerland, quality competition is enforced to some extent by patient migration, causing the number of cases to be emphasized as an objective.

This conclusion, along with the other entries of Table 1, suggests that comparability of the two samples may be an issue. To address this problem, the data are purged in several ways that serve to increase the degree of comparability (see “Description of the two data sets”). On the other hand, the standard DEA assumption of a homogeneous universe is explicitly tested in the section “Efficiency comparison restricted to comparable observations”.

Characterization of DEA applied and data sets used

A specific DEA formulation

DEA is a procedure for determining efficient frontiers by maximizing a generalized distance between inputs and outputs [4, 7]. In the case of a hospital, the definition of outputs is not trivial. First, with measurements of the change in health status as the true output lacking, the number of cases treated, grouped into clinical categories to control for health status at admission, serves as a proxy. In this work, five major patient categories are distinguished (see Table 3). Second, the number of patient days is often included among the outputs [3, 10, 17]. However, time spent in hospital amounts to an input required by the hospital and provided by patients. This variable thus appears as an input. It would have been tempting to divide this total up between the five clinical categories; however, the value of a patient day is the (unobserved) opportunity cost of time, which presumably does not vary much between a medical and a surgical patient (for instance), compared with the difference between a member of the labor force and a child. Moreover, increasing the number of inputs causes more observations to be recognized as fully efficient, thus reducing the discriminatory power of DEA for a given sample size. In this way, the number of inputs is limited to six, among them the number of beds, which has the special feature of being considered nondiscretionary. In Germany, this variable is set by hospital planning authorities; in Switzerland, the number of beds is fixed by authorities in some cantons, at least with regard to beds in the public ward.

The fact that one of the inputs is nondiscretionary has implications for the formulation of DEA. Contrary to the conventional DEA formulation, the linear program reads:

$$ \begin{aligned} & \min_{\theta_{l},\,\lambda} \;\; \theta_{l} \\ & \mathbf{X}^{d} \lambda \leqslant \theta_{l} X^{d}_{l} \quad \text{(d: discretionary inputs)} \\ & \mathbf{X}^{n} \lambda \leqslant X^{n}_{l} \quad \text{(n: nondiscretionary inputs)} \\ & \mathbf{Y} \lambda \geqslant Y_{l} \\ & \lambda \geqslant 0 \end{aligned} $$

where

θ_l: efficiency score of observation l under evaluation

X^d: k × o matrix of discretionary inputs, where k is the number of discretionary inputs and o the number of observations (X^d_l is the lth column of this matrix, the vector of discretionary inputs for observation l)

X^n: j × o matrix of nondiscretionary inputs, where j is the number of nondiscretionary inputs and o the number of observations (X^n_l is the lth column of this matrix, the vector of nondiscretionary inputs for observation l)

Y: s × o matrix of outputs, where s is the number of outputs (Y_l is the lth column of this matrix, the output vector for observation l)

λ: o × 1 vector of weights pertaining to observations

This is the input-oriented version of DEA because the objective variable relates to inputs; constant returns to scale are assumed because the vector λ is not constrained except for being nonnegative. This formulation corresponds to a planning view that seeks to guarantee a certain level of provision with hospital services for a minimum use of resources. Also, the assumption of constant returns to scale allows the total inefficiency to be split up into technical and scale inefficiencies. Thus, the minimum value of θ is sought (equivalent to the maximum reduction of all discretionary inputs) that is still compatible with a given production possibility set. This factor is not applied to the nondiscretionary input (number of hospital beds in this case); however, hospital beds continue to enter the determination of the production possibility set.
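The linear program described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the reported results: it assumes NumPy and SciPy are available, the function name `dea_score` is hypothetical, and the toy data are invented. The nondiscretionary constraint is written with the weight vector λ, as in the standard treatment of fixed inputs.

```python
import numpy as np
from scipy.optimize import linprog


def dea_score(l, Xd, Xn, Y):
    """Input-oriented, constant-returns score of observation l (column l)."""
    Xd, Xn, Y = (np.asarray(a, dtype=float) for a in (Xd, Xn, Y))
    o = Xd.shape[1]
    # Decision vector: [theta, lambda_1, ..., lambda_o]; minimise theta.
    c = np.r_[1.0, np.zeros(o)]
    A_ub = np.vstack([
        np.hstack([-Xd[:, [l]], Xd]),                 # Xd @ lam <= theta * Xd_l
        np.hstack([np.zeros((Xn.shape[0], 1)), Xn]),  # Xn @ lam <= Xn_l (no theta)
        np.hstack([np.zeros((Y.shape[0], 1)), -Y]),   # Y @ lam >= Y_l
    ])
    b_ub = np.r_[np.zeros(Xd.shape[0]), Xn[:, l], -Y[:, l]]
    bounds = [(0, None)] * (o + 1)                    # theta >= 0, lambda >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return float(res.fun) if res.success else None


# Toy data (hypothetical): 3 observations, 1 discretionary input,
# 1 nondiscretionary input (beds), 1 output.
Xd = [[2, 4, 4]]
Xn = [[1, 1, 2]]
Y = [[2, 2, 2]]
scores = [dea_score(l, Xd, Xn, Y) for l in range(3)]
print(scores)
```

In this toy example the first observation defines the frontier (score of about 1.0), while the other two could halve their discretionary input (score of about 0.5). The number of beds restricts which combinations of reference units are admissible but is never scaled by θ.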

Description of the two data sets

The German observations refer to Saxony exclusively and cover the years 2000–2002. They were provided by the Saxon Hospital Association. Out of 123 observations, some had missing values for inputs and/or outputs as defined in the previous section (“A specific DEA formulation”) and shown in Tables 2 and 3 and had to be excluded; on the other hand, a unit did not have to be reported in all 3 years to be retained in the sample. Furthermore, only observations coming from units that at least satisfy the criteria for an ordinary care hospital (as stated in the section “The macro perspective”) qualify. For increased comparability with the Swiss data set, the minimum number of beds is 20, which has the consequence of excluding all specialized hospitals that are not part of the standard hierarchical structure. The final sample size is 105 observations.

Table 2 Hospital inputs, by country (expenses on materiel and minor investment in 1995 prices, Swiss francs)

In the case of Switzerland, hospitals report their actual data to cantonal health authorities, who forward them to the Federal Statistical Office. The observation period covers the years 1997–2001. While the office runs a series of plausibility tests in particular with regard to outputs, observations were subjected to additional restrictions for retention in the sample because some characteristics on the input side did not appear credible. First, cases treated per physician have to be nonzero but also less than 1,500 per year for a realistic workload, a limit suggested by inspection of the density distribution. Next, annual labor income (in 1995 prices) has to be more than CHF 30,000 (Swiss francs; some US$ 20,000 at 2002 exchange rates), which corresponds to subsistence level. On the other hand, a hospital would have to employ exclusively senior physicians to report an average labor income in excess of CHF 150,000 (US$ 100,000), which therefore serves as the upper limit. In view of the fact that earlier surveys, compiled by the Swiss Hospital Association (H+), had always assigned hospitals having fewer than 75 beds to one category, the lower limit is put at 20 beds to eliminate reporting errors. With regard to personnel, an observation must have at least three physicians and three nursing staff to qualify. Since every employee is assigned to one of some 70 categories, which facilitates correct categorization, hospitals featuring more than 5% nonspecified personnel are removed from the sample. Observations reporting nonzero geriatric and psychiatric cases are excluded in order to focus on short-term hospitals, in parallel with the German data set. These restrictions jointly cause the sample to be reduced from some 950 observations in short-term hospitals to 251.
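The retention rules for the Swiss sample can be summarized as a single filter. The sketch below is illustrative only: the field names are hypothetical (the layout of the Federal Statistical Office data is not described here), the treatment of boundary values is an assumption, and an observation is assumed to be excluded if it reports either geriatric or psychiatric cases.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical field names, chosen for illustration.
    cases: int
    physicians: int
    nursing_staff: int
    avg_labor_income_chf: float     # annual, in 1995 prices
    beds: int
    share_unspecified_staff: float  # fraction of personnel without a category
    geriatric_cases: int
    psychiatric_cases: int


def is_retained(obs: Observation) -> bool:
    """Apply the retention rules described in the text to one observation."""
    if obs.physicians < 3 or obs.nursing_staff < 3:
        return False
    cases_per_physician = obs.cases / obs.physicians
    return (
        0 < cases_per_physician < 1500
        and 30_000 < obs.avg_labor_income_chf <= 150_000
        and obs.beds >= 20
        and obs.share_unspecified_staff <= 0.05
        and obs.geriatric_cases == 0
        and obs.psychiatric_cases == 0
    )


# Two invented observations: the second fails the 20-bed lower limit.
ok = Observation(3000, 10, 40, 80_000.0, 60, 0.02, 0, 0)
too_small = Observation(600, 3, 8, 80_000.0, 15, 0.02, 0, 0)
print(is_retained(ok), is_retained(too_small))
```

Writing the rules as one predicate makes it easy to see that they are applied jointly, which is why the sample shrinks from some 950 observations to 251.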

With regard to hospital inputs, the result of these restrictions is shown in Table 2. Focusing on mean values first, one notes that German hospitals use much larger quantities of inputs, with the exception of academic staff and possibly nursing input. The same difference is already marked in the case of administrative staff and expenses on materiel. However, it is dwarfed by the difference in patient days and beds, where the German units are almost three times larger than their Swiss counterparts. With regard to beds, this is the likely consequence of the statistical convention that only staffed beds are counted in Switzerland. On the whole, however, one retains the impression that the German units are larger than the Swiss and that they produce their services using fewer resources both in the administrative and curative domain. Turning to the ranges, one observes that even where the mean values are larger for the German units, their variability often remains below that characterizing the Swiss sample. Only in the cases of patient days and beds do size and variability go together. Thus, Swiss hospitals seem to have a large degree of diversity whereas the German sample is much more homogeneous. This homogeneity may be a result of hospital planning and the process of hospital restructuring in East Germany.

Turning to the outputs (Table 3), the number of cases treated in German hospitals is approximately double that of Switzerland in four out of five categories, which again points to larger units. The notable exception is an equal number of surgical cases. Apparently, a German hospital is used for a broader range of purposes than its Swiss counterpart, where surgery has a comparatively prominent place. Once more, the ranges show Swiss hospitals to be characterized by much diversity, which may be the result of strong specialization.

Table 3 Number of cases treated, by category and country

The differences between the two hospital sectors can be highlighted by aggregating outputs and using the German data as the benchmark (normalized at 1.00, see Table 4). First, a Swiss hospital treats only one-half the number of cases of a German unit. If it scaled back inputs in proportion to the total number of cases, its input quantities would have to be a multiple of 0.50 of the German figures. The actual multiples are higher, amounting to 1.00 for academic staff, 0.83 for nursing staff and expenses, and 0.77 for administrative staff. The one exception is the number of patient days, with a multiple of 0.34. This conforms to the differences in incentives noted in conclusion 1, namely, the importance of patient days as a target variable in Germany.
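The comparison of multiples can be made explicit by dividing each input ratio by the output ratio of 0.50, which gives input use per case in a Swiss hospital relative to a German one. The short sketch below uses only the figures quoted in the paragraph above; the variable names are illustrative, and beds are omitted because no multiple is quoted for them in the text.

```python
# Swiss-to-German ratios quoted in the text (German sample normalised to 1.00).
output_ratio = 0.50
input_ratios = {
    "academic staff": 1.00,
    "nursing staff": 0.83,
    "expenses": 0.83,
    "administrative staff": 0.77,
    "patient days": 0.34,
}

# Input use per case, Swiss relative to German: a value above 1 means that
# a Swiss hospital uses more of that input per case treated.
per_case = {name: round(r / output_ratio, 2) for name, r in input_ratios.items()}
for name, value in per_case.items():
    print(f"{name}: {value:.2f}")
```

On these figures, patient days are the only input for which the Swiss hospitals use less per case than the German ones.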

Table 4 Stylized representation of input and output ratios, between countries

Conclusion 2

Both input and output quantities suggest that the hospitals of the German sample are roughly twice as large as their Swiss counterparts. At the same time, they are far more homogeneous, which is remarkable in view of the many exclusion restrictions that had to be imposed on the Swiss sample.

The larger size of German hospitals gives rise to the expectation that the DEA will indicate a larger share of units exhibiting constant and decreasing returns to scale in the German subsample.

Efficiency comparison between the two hospital sectors

The objective of this section is to find out whether German or Swiss hospitals are recognized as relatively efficient if pitted against their counterparts and to see whether they are subject to differing returns to scale. At the beginning, the reference set consists of observations of both countries; later, the efficient frontier is constrained to contain only observations from the other country.

Standard DEA efficiency scores

For a first comparison, the empirical densities of hospitals with regard to their efficiency scores are shown in Fig. 1, based on the standard DEA assumption that all units belong to the same universe. The two densities differ markedly. Neglecting for a moment the fully efficient observations (their cumulation being a consequence of DEA), the Swiss distribution appears to be unimodal whereas its German counterpart seems to have a second mode around a score value of 30%. This feature of the German distribution is puzzling because assuming that the majority of the observations satisfy the output targets set by hospital planning, such a discrepancy would have to reflect widely divergent targets. However, this assessment is conditional on the assumption of constant returns to scale, which means that observations are held against the most productive units, whereas, in fact, German hospitals are not free to choose their scale. Therefore, the degree of inefficiency shown is made up not only of technical but also of scale inefficiency, which emanates from planning rather than management failures.
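The split into technical and scale inefficiency mentioned here follows the standard DEA decomposition, which the text does not spell out. As a sketch, let θ_l^CRS denote the score from the constant-returns program and θ_l^VRS the score from the same program with the additional constraint that the weights λ sum to one (variable returns to scale); this second score and the symbol SE_l are notation introduced here for illustration.

$$ \theta_{l}^{\text{CRS}} = \theta_{l}^{\text{VRS}} \cdot SE_{l}, \qquad SE_{l} = \frac{\theta_{l}^{\text{CRS}}}{\theta_{l}^{\text{VRS}}} \leqslant 1 $$

Here θ_l^VRS captures technical efficiency relative to units of similar size, and the scale efficiency SE_l equals 1 only for units operating at constant returns to scale. A low constant-returns score can therefore reflect an unfavorable, externally imposed scale rather than poor management.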

Fig. 1 Distribution of efficiency scores

Figure 1 also shows that the German sample has a much higher share of fully efficient observations, and that they are concentrated in the upper end of the range. The group characterized by full efficiency consists of 73 observations of which 35 are German (48%) and 38 are Swiss hospitals (52%), representing 33 and 15% of their respective samples.

A first explanation of the difference in efficiency is that the German data cover a more recent period, therefore mirroring a more advanced state of medical technology and possibly management skills. However, a year-by-year DEA shows that the efficiency scores decreased rather than increased in both countries (see Table 5).

Table 5 Mean efficiency scores, by year and country

Another explanation is the fact that the German data are target rather than realized quantities. To the extent that the output targets are difficult to reach, hospital managers have an incentive to meet the approved number of days by attracting less costly cases (see “The goals of decision makers”). This would result in an increased efficiency score as long as DEA is not conditioned on case severity. By contrast, Swiss hospital managers do not face output targets. If this reasoning were relevant, one would expect a higher number of cases treated in Germany for a given population. However, in the year 2000 the number of cases treated per 100,000 inhabitants is only 4% larger for Germany compared to Switzerland, which would account for only a small part of the efficiency gap. A third reason for the gap may be that the Swiss observations, being smaller on average, do not reach the range of constant returns to scale.

Indeed, Table 6 indicates that the shares of German observations exhibiting increasing, constant, and decreasing returns to scale are significantly different (based on a chi-square test) from their Swiss counterparts. For example, if the two countries had the same distribution, the expected number of German observations exhibiting increasing returns to scale would be 43.14, more than triple their actual number of 12. By way of contrast, the expected number of observations with increasing returns is below the actual number in the case of Switzerland. This confirms an expectation based on the observation that German hospitals are clearly larger than Swiss ones (conclusion 2).
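The expected counts behind such a chi-square test follow from the row and column totals of the contingency table: under a common distribution, each expected cell count equals row total times column total divided by the grand total. The sketch below illustrates the computation with invented counts; they are not the values of Table 6, and the function names are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic for a contingency table."""
    exp = expected_counts(table)
    return sum(
        (obs - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for obs, e in zip(obs_row, exp_row)
    )


# Hypothetical counts (NOT the values of Table 6); columns are
# increasing, constant, and decreasing returns to scale.
counts = [
    [10, 30, 60],  # country A
    [90, 70, 40],  # country B
]
print(expected_counts(counts)[0][0], chi_square(counts))
```

With two rows and three columns the statistic has two degrees of freedom, so a value above roughly 5.99 rejects equality of the two distributions at the 5% level.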

Table 6 Number of observations by returns to scale and country

Efficiency comparison restricted to comparable observations

In view of the different institutional constraints facing the two hospital sectors and the disparities noted in Tables 1 and 2, it seems appropriate to test the standard DEA assumption stating that all observations come from one and the same universe. Failing homogeneity, the analysis would have to be confined to those observations that are comparable. One way to achieve comparability is to retain those observations that can be mutually projected on the (Pareto-Koopmans) efficient frontiers. Specifically, the Swiss observations are projected on a reference set that is exclusively composed of German observations, and vice versa, the German observations are projected on a reference set consisting of Swiss observations only. If the projection proves impossible due to a lack of a reference set, an efficiency score cannot be assigned, and the observation is excluded from the comparison. A previous attempt at establishing comparability consists in projecting all units on their own group-specific efficiency frontier and checking for difference in the location of these frontiers [5, 14]. However, this procedure fails to test whether or not the units come from the same universe. For comparability, they should belong to the same universe, and this condition is imposed by the alternative proposed here.

Indeed, a substantial number of Swiss hospitals do not have a reference set defined by German observations: a full 67.3% of them cannot be projected on the German efficient frontier. This failure can be traced to the great variety in their choices of input–output mix (see Tables 2 and 3). It points to a larger degree of specialization among Swiss compared to German hospitals, made possible by lower barriers against cross-border care, as argued in the section “The goals of decision makers”. Conversely, all German observations can be projected on their Swiss counterparts.

As a consequence of mutual projection, efficiency scores can exceed the value of 1. For example, a score of 1.25 implies that all inputs could be increased by 25%, with the observation still remaining efficient. In Fig. 2 (left panel), the share of German hospitals exceeding the unit threshold amounts to 74.3%. The share of Swiss hospitals beyond the unit threshold is only 12.3% (right panel). This confirms the earlier conclusion that the German observations are more efficient on average and have a greater relative share of fully efficient observations.

Fig. 2 Efficiency scores: mutually projectable observations only

Conclusion 3

The German hospitals are more efficient on average than the Swiss. This finding is reinforced when taking into account that two-thirds of the Swiss observations cannot be projected on a German reference set, indicating that the two sets are largely disjoint.

This conclusion implies that if the efficiency scores are interpreted on the basis of the standard DEA assumption of a joint efficiency frontier while there are group-specific frontiers, one measures efficiency differences within a given group. Since maximum efficiency is fixed at 1 in both groups, the one with the greater dispersion tends to exhibit the lower average efficiency score. Conversely, since a common benchmark does not really exist, one fails to measure efficiency differences between groups, which constitute the main research objective in an international comparison.

In view of this difficulty, the procedure adopted in the remainder of this paper is as follows. First, the set of comparable observations is defined once and for all on the basis of the test that leads up to conclusion 3. Next, additional DEAs continue to be performed on the entire sample. Finally, while efficiency scores are calculated for all observations, only those pertaining to the set of comparable observations are retained for presentation and statistical analysis. Specifically, the Swiss observations that could not be projected on the German reference set are excluded at this last step.

In keeping with this procedure, mean efficiency scores are calculated once more (see Table 7). As could be expected, the German observations, which turned out projectable without exception, display the same scores as in Table 5, where the reference set was still combined. By way of contrast, the mean efficiency score of Swiss hospitals drops by no less than 13 percentage points. This change is related to the fact that while the number of Swiss observations is reduced by two-thirds overall, the reduction with regard to the efficient observations is far more marked. Indeed, whereas 38 Swiss observations had been part of the combined efficient frontier, now only 3 observations are recognized as fully efficient in the reduced set. Thus, depriving observations of the possibility of being compared with observations of the same country has a particularly important effect on the Swiss subsample. This effect must result in lowered efficiency scores because the German observations are generally more efficient according to Table 5 and also in the segment of comparable input–output combinations, as evidenced in Table 7. The resulting average differential between the two countries increases from 9 to almost 22 percentage points.

Table 7 Efficiency scores, projectable observations only

This result calls for an explanation. One possibility is that important (technical) inputs and outputs that would have favored Swiss hospitals are lacking. However, the sets of outputs and inputs used here are at least as comprehensive as those of other studies that do not have access to diagnostic information [1, 10, 29]. Another reason may be the fact that patients in Switzerland, in particular those covered by supplementary health insurance, have a larger choice of hospital without being exposed to cost differences. To the extent that inputs are valued by patients as relevant dimensions of quality, Swiss hospitals must provide them, resulting in excess inputs for a given output.

A particular difficulty characterizing international comparisons of performance stems from the fact that for inputs and outputs measured in value terms, different currencies are involved. In the simplest case, with one input in value terms (as in so-called cost DEA with constant returns to scale), calculated efficiency scores depend linearly on the exchange rate chosen. In the present study, only one out of six inputs, expenses for materiel and minor investments (“expenses” in Table 2), is expressed in monetary units. Still, it is conceivable that the comparison between German and Swiss hospitals could be sensitive to the presence of this one input. To test for this possibility, the DEA is repeated with expenses for materiel and minor investments excluded.
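
The linear dependence in the simplest case can be made concrete with a small sketch. It assumes one input (cost) and one output, constant returns to scale, and a unit of one country scored against a reference set of the other country only. All figures, exchange rates, and function names are hypothetical. In this setting the score is the unit's output per unit of cost divided by the best such ratio in the reference set, so rescaling the reference costs by an exchange rate rescales the score by the same factor.

```python
def crs_cost_efficiency(unit, reference):
    """Input-oriented CRS score for one input (cost) and one output.

    With a single input and a single output, the score is the unit's
    output per unit of cost divided by the best such ratio in the
    reference set.
    """
    cost, output = unit
    best_ratio = max(ref_output / ref_cost for ref_cost, ref_output in reference)
    return (output / cost) / best_ratio

def convert(units, rate):
    """Convert costs into the common currency at the given exchange rate."""
    return [(cost * rate, output) for cost, output in units]

# Hypothetical data as (cost, output) pairs; costs are in different currencies.
german_unit = (100.0, 50.0)                       # cost in euros
swiss_reference = [(150.0, 60.0), (200.0, 90.0)]  # cost in Swiss francs

# Score of the German unit against the Swiss frontier at two hypothetical
# exchange rates (euros per Swiss franc).
score_low = crs_cost_efficiency(german_unit, convert(swiss_reference, 0.6))
score_high = crs_cost_efficiency(german_unit, convert(swiss_reference, 0.9))
```

In this sketch the ratio of the two scores equals the ratio of the two exchange rates. With several inputs, of which only one is in value terms, the dependence is weaker, which motivates the test described above.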

The resulting differences in efficiency scores (new minus previous value) are shown in Fig. 3. Since adding to the number of inputs increases the number of variables in the linear program, efficiency scores may increase or remain unaffected; conversely, excluding one input cannot increase efficiency scores [6]. This theoretical expectation is borne out for both countries. Indeed, the new mean value is 3.2 percentage points lower in the case of Germany and 6.2 percentage points lower in the case of Switzerland. The difference in these reductions is statistically significant (Mann-Whitney rank sum test, significance level 1%), and because the Swiss scores fall by more, it confirms the finding that German hospitals are more efficient. However, no less than 64% (Germany) and almost 42% (Switzerland) of the efficiency scores remain unchanged, and changes exceeding 20 percentage points are very rare. In fact, the correlation coefficient between the efficiency scores calculated with and without expenses amounts to 0.96. In view of this stability of results, the choice of exchange rate cannot make much of a difference.
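
The direction of the change follows from the structure of the linear program. As a sketch, assume an input-oriented envelopment formulation with variable returns to scale; the notation (score \theta_o, weights \lambda_j, input index set I) is introduced here for illustration only.

```latex
\begin{aligned}
\theta_o(I) &= \min\Bigl\{\theta :\ \sum_{j}\lambda_j x_{ij} \le \theta\,x_{io}\ (i \in I),\
\sum_{j}\lambda_j y_{rj} \ge y_{ro}\ (r = 1,\dots,s),\
\sum_{j}\lambda_j = 1,\ \lambda_j \ge 0 \Bigr\}, \\
I' \subset I \ &\Longrightarrow\ \theta_o(I') \le \theta_o(I).
\end{aligned}
```

Dropping an input removes one constraint, so every pair (\theta, \lambda) that is feasible with the input set I remains feasible with the smaller set I', and the minimum cannot rise.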

Fig. 3 Effect on efficiency of excluding expenses

Conclusion 4

In the present DEA, calculated efficiency scores depend heavily on the standard homogeneity assumption. On the other hand, they may be considered largely robust against the choice of and changes in the exchange rate.

Testing for influences of institutional factors

In this section, two institutional factors that differ between Germany and Switzerland are tested. One is the fact that hospital planning is less stringent in Switzerland in general and with regard to the number of beds in particular; the other is that the possibility of patient migration makes the number of cases an important performance indicator for hospital management (conclusion 1).

More stringent hospital planning in Germany

Apart from the regulated number of beds in the public ward, Swiss hospitals are free to add beds in the private ward as they see fit. Therefore, they should not be assigned a very much higher efficiency score if the number of beds is introduced as a discretionary input variable in the DEA. By way of contrast, the relaxation of the bed restriction in the German subset should make a marked difference, permitting observations to improve their efficiency score.

As shown in Fig. 4, no observation in either country features a decrease in its efficiency score, consistent with theoretical expectations. However, the effects turn out to be very small, with the German scores increasing by a mere 0.26 percentage points and the Swiss by 0.67 points. While the German figure is smaller, contrary to expectations, the difference is far from statistically significant (Mann-Whitney test, marginal significance level 67%). One possible explanation for this is the fact that the share of fully efficient hospitals was higher to begin with in Germany, and relaxation of a restriction cannot make them more than fully efficient. Indeed, limiting the analysis to the observations that are inefficient initially reveals that the change in efficiency among German hospitals now amounts to 0.39 (rather than 0.26) percentage points, whereas it remains almost the same (0.69 rather than 0.67 points) among Swiss hospitals.
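
For readers unfamiliar with the test, the following sketch shows how a Mann-Whitney comparison of two sets of score changes can be computed. The data are hypothetical and the function names are illustrative. The p-value uses a normal approximation without tie correction, which is a simplification and not necessarily the variant used in this study.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def mann_whitney_u(sample_a, sample_b):
    """U statistic of sample_a and a two-sided p-value (normal approximation).

    The approximation ignores the tie correction and is rough for small samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value

# Hypothetical changes in efficiency scores (percentage points), not the paper's data.
german_changes = [0.0, 0.0, 0.1, 0.4, 0.8]
swiss_changes = [0.0, 0.2, 0.5, 0.9, 1.7]
u_statistic, p_value = mann_whitney_u(german_changes, swiss_changes)
```

A rank-based test is a natural choice here because many score changes are exactly zero and the distribution of the remaining changes is far from normal.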

Fig. 4 Effect on efficiency of making beds a discretionary input

Conclusion 5

Relaxing the constraint on the number of beds has a surprisingly small effect in both countries, suggesting that regulating the number of beds hardly affects hospital efficiency as measured by DEA.

Patient days as an output measure in Germany

The other institutional difference is that in Germany patient days served as a principal output measure until recently (DRG payment was introduced on 1 January 2003). By way of contrast, the possibility of patient migration makes the number of cases an important performance indicator for Swiss hospitals. Therefore, when patient days are switched from inputs to outputs in the DEA, the German observations should be more likely to show an efficiency increase than their Swiss counterparts. In Fig. 5, 50% of Swiss and 48.6% of German hospitals have increased efficiency, a statistically nonsignificant difference. However, the average increase amounts to 6.6 percentage points among German hospitals but only 2.6 percentage points among the Swiss ones, and this difference is statistically significant (Mann-Whitney test, 5% level of significance; see footnote 3).

Fig. 5 Effect on efficiency of defining patient days as an output

Conclusion 6

Based on the fact that patient days relative to cases treated have been a more important performance indicator for German than for Swiss policy, counting patient days among the outputs in DEA should increase German efficiency scores more than the Swiss. This prediction is confirmed.

Concluding remarks

This contribution purports to compare the productive efficiency of a sample of German (Saxon) with a sample of Swiss hospitals using DEA. This comparison is of interest because it pits similarities with regard to culture and language against considerable institutional differences. These differences stem from rather tight hospital planning imposed on Saxony following reunification in 1990 on the one hand and a strongly decentralized Swiss hospital sector, where regulatory authority continues to be mainly vested with and exercised in different degrees by member states (cantons), on the other. Specifically, patient migration is possible in Switzerland, making the number of cases treated an important indicator of success which should be reflected in DEA (conclusion 1). Especially in the Swiss sample, an attempt is made to reduce the impact of possible reporting errors by imposing exclusion restrictions which on the whole serve to increase average size of the hospital. Even then, however, the average size of the German units is found to be roughly double that of their Swiss counterparts, combined with much smaller ranges for inputs and outputs (conclusion 2). Even with patient days included among the inputs, the German hospitals are clearly more efficient than their Swiss counterparts. Yet, the basic DEA assumption that the production possibility sets come from one and the same universe may not be tenable in this international comparison. For a test, the German hospitals are projected on a reference set composed exclusively of Swiss observations, and conversely for the Swiss hospitals. Indeed, two-thirds of the Swiss observations cannot be projected on a reference set formed by German observations, whereas all German observations can be projected. Thus, the two production possibility sets appear to be largely disjoint (conclusion 3).

This finding suggests limiting the ensuing analysis to mutually projectable observations, resulting in even larger efficiency differences in favor of German hospitals. An international comparison may still be affected by inputs or outputs measured in value terms because the choice of exchange rate influences results. Dropping the one input in value terms leaves efficiency scores largely unaffected, suggesting robustness of results (conclusion 4). Finally, two tests for assessing the importance of institutional differences are carried out. First, the number of beds does not really constitute a discretionary input in the case of Germany in view of strict hospital planning. Treating beds as a discretionary quantity should therefore serve to increase German efficiency scores more than Swiss scores. However, this prediction is not confirmed (conclusion 5). Second, since patient days constitute a comparatively more important indicator of success in Germany, the German hospitals are advantaged when patient days are shifted from the input to the output category. While patient days arguably belong to the input side, transferring them to the output side should therefore result in a more marked increase of German efficiency scores. This prediction is confirmed (conclusion 6).

It has become customary to perform a second-stage regression analysis of DEA efficiency scores. However, potential explanatory variables are country-specific; for example, a functional hierarchy of hospitals exists only in Germany, while regional differences in hospital finance are a Swiss idiosyncrasy (see “The macro perspective”). Therefore, all regressors of interest would be collinear with a country dummy variable, precluding detailed analysis of differences between the two countries. For this reason, regression analysis cannot provide additional insight in the present context.

In sum, the application of DEA to hospitals operating in different institutional environments is fraught with great difficulties. However, it proved at least possible to establish comparable subsets, using the feasibility of projecting the observations of one group on a reference set formed by the other as the criterion. In those comparable cases, the efficiency gap between German and Swiss hospitals, measured in technical terms, widens even more. This difference may reflect the fact that patients in Switzerland have a larger choice of hospital without being exposed to cost differences. To the extent that inputs are valued by patients as relevant dimensions of quality, Swiss hospitals must provide them, resulting in excess inputs for a given output and therefore low DEA efficiency. However, to verify this claim one of two conditions would have to be satisfied. One is to have internationally comparable indicators of quality for hospital services. The other is international migration of patients covered by social health insurance who in some way or another share in the additional cost engendered or saved. In this way, international price and quality competition would also be brought to hospital sectors that at present are not even exposed to much domestic competition.