Introduction

The control of efficiency in the provision of health care is of constant concern to policy-makers worldwide, especially given the high level of expenditure reached in most developed countries. Therefore, it is not surprising that, beginning with the seminal study on nursing services by Nunamaker [1], there has been great interest in evaluating the performance of health care organisations. By far the majority of this work has concentrated on hospitals, with primary health care services having received much less attention.Footnote 1 The present study attempts to redress this imbalance with an evaluation of technical efficiency for a sample of primary health care centres. For this purpose, we use data envelopment analysis (DEA) [2]—a nonparametric technique that has been employed widely in the health sector because it can easily handle multiple dimensions of performance and is less vulnerable to the misspecification problems that affect econometric models.Footnote 2

The aim of this paper is to extend the literature on measuring economic efficiency in primary health care by jointly considering two particularly relevant aspects of health care provision: the quality of care and the effect of environmental factors. Both aspects have been identified in the literature as key factors with a significant influence on results. Accordingly, recent research on this subject has considered quality [3] and environmental factors [4] when measuring economic efficiency in primary health care, although the two aspects have so far been considered only separately. This paper constitutes a first approach to combining both issues within the same framework. The methodological approaches employed in this study to deal with both aspects help us minimise potential bias that could affect the results.

We measure quality in health care services through technical indicators that reflect the capacity of medical staff to diagnose and treat medical problems. These indicators are based on control programs established by health care authorities in order to improve the provision of services, and can be understood as particular qualitative outputs of the production process. When resources are constrained, there is an inevitable trade-off between (quality) non-quantitative output and (activity) quantitative output [9]. Therefore, any model specification should take into account the implications of such a trade-off and include both activity and quality indicators among the outputs of the process. We ensure this by incorporating weight restrictions into the DEA models evaluated in this study.

The performance of a health centre can also be influenced by factors beyond its control, especially by differences in the socio-demographic characteristics of the patients that each unit serves [10]. Despite the acknowledged importance of this effect, the few empirical studies that have attempted to incorporate this information performed only second-stage analysis in order to identify potential explanatory factors of inefficient behaviour [11, 12]. In our view, however, the population plays an active part in the process that takes place in health care centres, so this information must be incorporated unequivocally into the construction of efficiency scores. The present study addresses this problem by explicitly including this type of exogenous information using a four-stage model developed by Fried et al. [13] and subsequently enhanced by Cordero et al. [14]. This methodology not only allows the most influential variables to be identified, but also distinguishes different effects for each variable, which could be of great help in interpreting the results.

The rest of the paper is organised as follows. The following section on “Methodology” explains the methodological approaches employed to incorporate the effect of qualitative indicators and environmental factors into the efficiency scores calculated with DEA. “Dataset and variables” describes the characteristics of the dataset and the variables used in our study, while “Results” presents and discusses the main results. Finally, “Conclusions” are drawn in the last section.

Methodology

Although the importance of quality is widely assumed in the health care sector, there have been few attempts to incorporate quality indicators into the performance measurement of primary health care units by means of DEA models. Moreover, the approaches used so far have significant drawbacks. Some studies employ a two-stage estimation procedure by which efficiency scores are estimated from data on inputs and outputs in the first stage, and then those scores are regressed on a set of explanatory variables that include some proxies for quality [15, 16]. This formulation assumes that quality indicators influence the efficiency with which inputs generate outputs, but that they do not influence the transformation process itself, which is difficult to maintain given that the level of quality attained depends crucially on the quantity and form of the resources employed [17]. Another option to include quality indicators in a DEA evaluation is provided by Shimshak and Leonard [18] and Shimshak et al. [19], who suggest the estimation of two different DEA models, one for output indicators based on activities and another for quality outputs. After obtaining initial results from the two separate models, efficient units in a model that have low scores in the other are eliminated. The main shortcoming of this method is that the evaluation of a set of units is incomplete, since some of them are removed according to the criterion selected.

An alternative approach to overcome those limitations is to include quality measures in DEA models as an additional output [20–23]. The problem that arises with this approach is that the number of efficient units increases artificially due to the loss of discriminating power that DEA undergoes when the total number of variables increases and the number of observations remains constant. Moreover, the inaccuracy of this technique in assigning weights to different output indicators can lead to unreasonable results given that the weights provided are frequently inconsistent with prior knowledge of the production process involved. Thus, some units will achieve high scores or will even be identified as efficient in spite of performing poorly in terms of quality, simply because this component is ignored by having a zero weight assigned to it.

In our opinion, it is possible to strike a balance between rigidity and excessive flexibility by including weight restrictions in DEA models. This approach, which has been employed previously in some empirical studies measuring efficiency in the health sector [24, 25], consists of restricting the multipliers by directly imposing a lower bound on the weights for both quantity and quality outputs considered in the analysis.Footnote 3 This procedure eliminates the possibility that units with a poor performance in terms of quality (or quantity) might be considered as fully efficient, and hence the reference units must have a demonstrated good performance on both indicators. By doing so, we attempt to diminish the shortcomings associated with previous approaches developed in the literature, although the use of this method implies the assumption of strong value judgments about the importance of each dimension.Footnote 4

If appropriate specification of the model is important in terms of properly considering the importance of quality, then the development of models that take into account factors that are not under the control of the health care provider and that can be a potential source of inefficiency is no less so. An evaluation of health care facilities should explicitly include this information in the analysis to ensure that the efficiency score finally assigned to the unit truly reflects the portion of the production process for which that centre is itself responsible [28]. In particular, when assessing the efficiency of primary health care units, these variables are represented mainly by the characteristics of the population demanding care [29]. Examples of these factors are poverty rate, crude birth rate, mortality rate, and minority populations. In the latter case, for instance, minority populations are more likely to be poor and to endure poverty-related conditions such as chronic illnesses, inadequate health-related behaviour and stress [30].

Given that it is not feasible to observe the health status of a centre’s target population prior to serving them, these factors are beyond the organisation’s control, and must therefore be given specific treatment in the efficiency analysis. Moreover, our interest in this study is not just to identify the variables that can affect the output of primary care centres, but also to explicitly include them in the calculation of efficiency scores so that those scores can be an appropriate measure of how the units are performing. We would stress that it is important to bear this purpose in mind, since it determines to a large extent the methodological approach used to deal with those exogenous variables, as will be explained in the rest of this section.

Recent years have seen the development of different models to incorporate the effect of so-called exogenous factors, uncontrollable inputs, or environmental variables in the production process when estimating efficiency scores through non-parametric approaches such as DEA [31]. The approach most widely used in the health context is the two-stage approach [32], although it only allows for identifying factors affecting outcomes such as funding issues or geographic location [33]. The main drawback of this approach is that it does not adequately correct the efficiency scores for the effect of the exogenous variables, because it wrongly assumes that the influence of non-controllable inputs is the same across all inputs and outputs of the production process. Therefore, variables that explain the overall inefficiency of a decision making unit (DMU) can be inconsistent or incongruent with those that cause its individual input inefficiencies [34].

An alternative method to calculate corrected efficiency scores was proposed by Fried et al. [35]. This work claims that inefficiency is not neutral among the variables included in the initial evaluation because there can be significant slacks (non-radial inefficiency) in some of them.Footnote 5 From this idea, Fried et al. [13] proposed a four-stage model based on the use of the total slacks obtained from the first DEA stage, which distinguishes the technical inefficiency and the effect of exogenous factors for each variable. This model also uses regression as a mechanism to adjust the inefficiency values obtained with DEA, but in this case applying a system of n Tobit regressions (one for each variable) in the second stage:

$$ T\hat{S}_{j} = f\left( {Z_{j} ,\beta_{j} } \right) + u_{j} \quad j = 1,2, \ldots ,n $$

where \( T\hat{S}_{j} \) represent the total slacks (radial and non-radial) computed in the first stage, \( Z_{j} \) are the exogenous variables, \( \beta_{j} \) the parameters to be estimated, and \( u_{j} \) the error term, which is distributed normally, \( u_{j} \sim N(0,\sigma^{2} ) \). This decomposition can be done through the estimated parameters \( \left( {\hat{\beta }_{j} } \right) \), which can be used to predict new slacks for each variable taking into account the real endowment of exogenous variables of each unit. These coefficients provide information about the negative or positive effect of the uncontrollable inputs on each slack. This effect may vary from one slack to another, and it is possible that some exogenous factors influence some of the slacks but not all. Once the decomposition has been done, in the third stage the original values of the variables are corrected using the predictions obtained in the second stage in order to discount the effect of exogenous factors. According to Fried et al. [13], when an input orientation is considered, the adjustments are calculated by adding to the original value of each input the difference between the maximum predicted slack and the predicted slack: \( X_{\text{adj}} = X + \left[ {\max \left\{ {TS_{{j\;{\text{pred}}}} } \right\} - TS_{{j\;{\text{pred}}}} } \right] \). Finally, by running a DEA with the corrected values of the variables, one obtains new scores that establish exclusively the efficiency level at which each producer operates.
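The third-stage adjustment can be illustrated with a short numerical sketch. The function name `adjust_inputs` and the toy figures are hypothetical and serve only to show the arithmetic of the formula above; they are not taken from the dataset of this study.

```python
import numpy as np

def adjust_inputs(X, predicted_slacks):
    """Third-stage correction for an input-oriented model (illustrative).

    X                : (n_units, n_inputs) observed input quantities
    predicted_slacks : (n_units, n_inputs) slacks predicted in the second stage
    """
    X = np.asarray(X, dtype=float)
    S = np.asarray(predicted_slacks, dtype=float)
    # Largest predicted slack per input = least favourable environment.
    max_pred = S.max(axis=0)
    # Units in a more favourable environment receive a larger upward adjustment.
    return X + (max_pred - S)

# Toy data (assumed): three units, two inputs.
X = np.array([[10.0, 4.0], [12.0, 5.0], [8.0, 3.0]])
pred = np.array([[1.0, 0.5], [3.0, 0.0], [2.0, 1.5]])
X_adj = adjust_inputs(X, pred)
```

In this toy case the unit with the largest predicted slack for a given input keeps its original value for that input, while the remaining units have their inputs increased.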

This four-stage procedure constitutes an attractive alternative to calculate technical efficiency scores in the presence of exogenous factors, since the scores can be interpreted readily as production targets. In fact, it has been used previously in the literature to obtain a measure of managerial inefficiency that controls for the effect of exogenous factors for a set of dialysis facilities in Greece [37]. The main shortcoming of this model is that, in its second stage, the estimated parameters can be biased because the total slacks are computed by taking into account the information for the whole sample and are thus correlated with each other [38]. However, this problem can be overcome by using the enhanced method developed by Cordero et al. [14], which provides unbiased estimations for the parameters using a bootstrap approach in the second stage. To the best of our knowledge, this work represents the first empirical study using this enhanced version of the four-stage model in the context of primary health care.
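To give a rough idea of how a bootstrapped Tobit regression of slacks on exogenous variables might be set up, the following sketch fits a Tobit model censored at zero by maximum likelihood and averages the coefficients over resampled datasets. It is a naive pairs bootstrap written for illustration under our own assumptions; it is not necessarily the procedure of Cordero et al. [14], and the function names and simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_negloglik(params, Z, s):
    """Negative log-likelihood of a Tobit model left-censored at zero."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)          # keeps sigma positive
    mu = Z @ beta
    censored = s <= 0
    ll = np.where(censored,
                  norm.logcdf(-mu / sigma),
                  norm.logpdf((s - mu) / sigma) - log_sigma)
    return -ll.sum()

def fit_tobit(Z, s):
    # Least-squares coefficients as starting values, log(sigma) = 0.
    start = np.r_[np.linalg.lstsq(Z, s, rcond=None)[0], 0.0]
    res = minimize(tobit_negloglik, start, args=(Z, s), method="BFGS")
    return res.x[:-1]

def bootstrap_tobit(Z, s, n_boot=200, seed=0):
    """Mean and standard deviation of Tobit coefficients over resamples."""
    rng = np.random.default_rng(seed)
    n = len(s)
    draws = np.empty((n_boot, Z.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample units with replacement
        draws[b] = fit_tobit(Z[idx], s[idx])
    return draws.mean(axis=0), draws.std(axis=0, ddof=1)

# Simulated example (assumed): one exogenous variable plus a constant.
rng = np.random.default_rng(1)
n = 300
z = rng.normal(size=n)
Z = np.column_stack([np.ones(n), z])
slack = np.maximum(0.5 + 1.0 * z + rng.normal(size=n), 0.0)
beta_mean, beta_se = bootstrap_tobit(Z, slack, n_boot=50)
```

In a four-stage application one such regression would be run for each input slack, and the averaged coefficients would then be used to predict the slacks that enter the third-stage adjustment.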

Dataset and variables

The database used in this application compiles information about the primary health care sector in the Spanish region of Extremadura in the year 2006. Owing to the size of the territory (41,634 square kilometres) and its low population density (26.18 inhabitants per square kilometre), this sector is structured into two fundamental territorial administrative levels of aggregation: Health Areas and Health Zones. The former represent the reference entities for establishing health care objectives and funding needs, and their limits and size are established by the regional government according to organisational criteria. At that time, there were eight Health Areas with one public hospital in each of them. The latter are all organised around a primary care centre (PCC) as the main provider of primary health care services. Those PCCs are the units evaluated in this study. Specifically, our dataset comprises 94 centres, after eliminating some with missing data and others identified as outliers.Footnote 6

We retrieved data from APEX06 [39], an integrated information system for primary care that provides detailed information for each of the above PCCs on multiple variables, including the population covered, human resources, activities, costs, accessibility indicators and socioeconomic indicators. Table 1 reports the descriptive statistics of all the variables used in the analysis together with their role in the productive process of a PCC.

Table 1 Main variables descriptive statistics

As output indicators, we use two sets of variables. The first is the most common in primary health care studies: the number of visits or consultations per primary health care professional (activity-output variables). For each PCC evaluated in the study, the variables FREQUENCYGP, FREQUENCYP, FREQUENCYN, and FREQUENCYU indicate the number of visits or consultations per capita with general practitioners (GPs), paediatricians, nurses, and emergency units, respectively.

The second set of variables is represented by information about quality aspects, which can be defined according to multiple criteria [41]. For example, health managers tend to focus on professional standards and health outcomes, while patients often relate quality to an understanding attitude, communication skills and the comfort of health care facilities. The combination of these three quality dimensions in a single indicator is extremely complex, thus most authors attempting to measure quality usually focus on a single component. In our case, we dismissed the use of variables based on patient satisfaction because the underlying scale used by different customers depends considerably on their prior expectations and therefore is not identical for all units evaluated [42]. Furthermore, it can be argued that those expectations and, consequently, their opinions about facilities and the interpersonal aspects of quality, also depend on their socio-economic characteristics [43], which could imply the existence of a potential bias due to the presence of correlation between output and environmental variables [44]. Therefore, in our study, the measurement of quality involves the use of three (quality-output) variables capturing technical aspects of the production process related to staff characteristics and the accomplishment of certain objectives. In particular, we use information derived directly from available evidence, expert opinions and clinical guidelines [45]. The variable EXPERIENCE is a proxy for the experience of GPs and paediatricians that work in each PCC, measured in days of work during the previous 15 years. HEALTHTARGET captures the extent to which the PCC is able to fulfil some specific health targets. Specifically, it represents an average of the coverage ratios of each of the programs implemented within the PCC portfolio services.Footnote 7 Finally, QUESTIONS indicates the number of affirmative answers to a ten-question questionnaire distributed to the managers of the PCCs. This questionnaire is based on the standards considered in the model of total quality elaborated for the Public Health Service of Extremadura and reports information on the three different categories of qualitative aspects, continuous learning of medical personnel and health management skills.Footnote 8

Unfortunately, we cannot use all the available output variables in DEA because the technique would lose its discrimination power. Moreover, activity-output variables were (unsurprisingly) found to be strongly correlated with one another. The same was found for the quality-output variables, although they were uncorrelated with the activity-output variables. We therefore decided to use principal component analysis (PCA) in order to condense the original set of variables into two components.Footnote 9 This method decomposes original data with correlated values into a new set of uncorrelated (i.e. orthogonal) variables. Depending on the context, these variables are known as principal components, factors, eigenvectors, singular vectors or loadings. Each factor or principal component is a linear combination of the standardised values of the original variables used for the definition of the index. The weight given to each of these variables corresponds to its statistical correlation with the latent dimension that the synthetic index attempts to measure.Footnote 10 The number of factors to be retrieved depends on the correlation of the initial variables. If they are correlated strongly with each other, one factor will be sufficient to explain most of their variance. However, if the correlation is weak, several factors are required in order to explain a significant percentage of their variance. In this case, a set of intermediate indicators is obtained, one for each common factor, and the final synthetic index is calculated as their weighted sum. The proportion of the total variance explained defines the relevance of each factor.
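The construction of a synthetic index from a single principal component can be sketched as follows. This is a simplified illustration that retains only the first component of the correlation matrix and uses its eigenvector as weights; the function name `first_component_index` and the simulated data are assumptions, and the indices used in this study may combine more than one factor.

```python
import numpy as np

def first_component_index(data):
    """Return scores, weights and variance share of the first principal component."""
    data = np.asarray(data, dtype=float)
    # Standardise each original variable (zero mean, unit variance).
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    w = eigvecs[:, 0]
    if w.sum() < 0:                                # fix the arbitrary sign
        w = -w
    scores = z @ w                                 # one index value per unit
    explained = eigvals[0] / eigvals.sum()         # share of total variance
    return scores, w, explained

# Simulated example (assumed): four strongly correlated activity variables.
rng = np.random.default_rng(0)
base = np.linspace(0.0, 1.0, 20)
visits = base[:, None] * np.array([1.0, 2.0, 3.0, 4.0]) + 0.05 * rng.normal(size=(20, 4))
scores, weights, explained = first_component_index(visits)
```

When the original variables are strongly correlated, as in this simulated case, the first component accounts for most of the variance and a single index is a reasonable summary.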

Following this methodological approach, we performed two separate analyses and calculated two synthetic indices for each PCC in the sample. Figure 1 shows the relationships among the initial variables and the new components calculated for each PCC in the sample, the levels of correlation (given in square brackets), and the common factors (F) involved in the definitions of those indices. INDACT is an activity-output index that synthesizes information on quantitative performance of each PCC by combining the number of visits or consultations with each of the types of primary care professional (GPs, paediatricians, nurses and emergency units), while INDQUA compiles information about the experience of the medical personnel, the coverage ratios of the portfolio of services and the affirmative answers to the questionnaire completed by PCC managers.

Fig. 1 Synthetic indices and principal component analysis (PCA). Levels of correlation are given in square brackets. F indicates the common factors involved in the definitions of the indices. INDACT Activity-output index that synthesizes information on quantitative performance of each primary care centre (PCC) by combining the number of visits or consultations with each of the types of primary care professional (GPs, paediatricians, nurses and emergency units). INDQUA Compiles information about the experience of the medical personnel, the coverage ratios of the portfolio of services and the affirmative answers to the questionnaire completed by PCC managers

As input variables, we use data on personnel (labour) and prescriptions, represented by the following four variables: HLAB1, the total number of medical staff, including GPs, paediatricians, nurses, nursing assistants, emergency physicians and emergency nursing assistants; HLAB2, the total number of other technicians such as physiotherapists, dentists and X-ray specialists; NHLAB, the total number of non-medical staff, i.e. administrative staff, porters, veterinarians and social workers; and PHARMA, the number of prescriptions per capita, since these also form part of the budget assigned to each health centre.

Finally, in order to include information about exogenous variables that might affect the activities of the PCCs, we collect a large volume of data on the characteristics of population covered by each unit. This dataset includes demographic, geographic and economic variables.

The compilation of these data is a difficult task, since some information is available only for municipalities and not for health zones. In addition, other desirable indicators such as educational and occupation levels of population or patient health status and wealth simply were not accessible. However, in our opinion, the six variables used in this study provide us with very helpful information (differentiated for each health zone) that can be interpreted as acceptable proxies for those unavailable data. Those variables are the crude birth rate (CBR), the elderly ratio (ER), dependency rate (DR), replacement rate (RR), population density (DENSITY) and the percentage of population employed in agriculture (AGRIEMP). Table 2 lists the Pearson correlation coefficients between these variables and the two synthetic output indices (INDACT and INDQUA). One observes that all six exogenous variables are correlated significantly with both indicators.

Table 2 Pearson correlation coefficients between exogenous and indexed output variables

Results

The empirical study consisted of two phases. In the first, initial efficiency scores were estimated without including the effect of the exogenous variables but explicitly distinguishing the implications between specifying an activity-oriented model or a quality-oriented one. Then, in the second phase, the efficiency scores were computed after incorporating the set of exogenous variables and correcting the scores obtained in the first stage.

DEA without exogenous variables

Since one of the principal objectives of this study was to evaluate the impact of including quality in the measurement of technical efficiency in the primary health care sector, in this first stage of the analysis two separate DEA models were calculated, one based on activity indicators and one on qualitative indicators. Both models include the four input variables described in the previous section. Nevertheless, the synthetic activity index INDACT is the only output included in the first model while the synthetic quality index INDQUA is the only output included in the second. In both cases, the DEA model has an input orientation and assumes variable returns to scale (VRS). Adopting VRS models [49] allows us to accommodate scale effects in the analysis in order to avoid the potential inefficiencies that may arise if the units were forced to assume a non-optimal scale of production.Footnote 11 The input orientation seemed to be the most appropriate option given that the demand for health services cannot be controlled, and regional administration managers can determine only those resources attributed to each PCC to provide those services adequately.
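For readers unfamiliar with the technique, an input-oriented VRS model can be solved as one linear programme per unit. The sketch below is a generic envelopment-form implementation using `scipy.optimize.linprog`; the function name `dea_input_vrs` and the toy data are hypothetical, and it is not the software used to obtain the results reported here.

```python
import numpy as np
from scipy.optimize import linprog

def dea_input_vrs(X, Y):
    """Input-oriented VRS efficiency scores (envelopment form).

    X : (n_units, n_inputs), Y : (n_units, n_outputs)
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        # Decision variables: [theta, lambda_1, ..., lambda_n]; minimise theta.
        c = np.r_[1.0, np.zeros(n)]
        # Inputs: sum_j lambda_j x_ij - theta * x_io <= 0
        A_in = np.hstack([-X[o].reshape(m, 1), X.T])
        # Outputs: -sum_j lambda_j y_rj <= -y_ro
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.r_[np.zeros(m), -Y[o]]
        # VRS convexity constraint: sum_j lambda_j = 1
        A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[o] = res.fun
    return scores

# Toy data (assumed): three units, one input, one output.
X = np.array([[2.0], [4.0], [6.0]])
Y = np.array([[1.0], [3.0], [2.0]])
scores = dea_input_vrs(X, Y)   # third unit could halve its input
```

In this toy example the first two units lie on the frontier, while the third could produce its output with half of its input by imitating an equal mix of the other two.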

Columns 2 and 3 of Table 3 report the efficiency scores obtained with those two specifications.Footnote 12 The two models identify a similar number of efficient units (20 and 18, respectively) and present similar mean values. Nevertheless, there are some noteworthy divergences in the two rankings of the units, as reflected in the value of the Spearman correlation coefficient between them (0.549).Footnote 13 Indeed, further examination of the values shown in Table 3 highlights the difficulties that arise when activity and qualitative indicators are included in an efficiency analysis independently, since some units identified as efficient using the activity-output index present some of the lowest levels of efficiency when the quality-output index is selected (e.g. units 30, 33, 35, 63). The opposite situation also occurs. For example, units 14, 55 and 85 should not serve as benchmarks despite being fully efficient in the qualitative model, because they present a lower level of efficiency in terms of the activity model specification.
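As a reminder of how such a rank correlation is obtained, the following minimal example applies `scipy.stats.spearmanr` to two short vectors of scores. The values are invented for illustration and are unrelated to Table 3.

```python
from scipy.stats import spearmanr

# Hypothetical efficiency scores of four units under two model specifications.
activity_scores = [1.0, 0.8, 0.6, 0.9]
quality_scores = [0.7, 1.0, 0.5, 0.9]

# Spearman's coefficient compares the rankings rather than the raw scores.
rho, p_value = spearmanr(activity_scores, quality_scores)
```

A coefficient close to one would indicate that the two specifications rank the units almost identically, whereas lower values point to divergent rankings.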

Table 3 Efficiency scores of primary care centres (PCCs) under alternative model specifications

This problem is even more evident if one focuses on the units used as references in each model (Table 4). According to the values given in the first two columns, some of the main reference units in the model based on activity indicators have less importance in the quality-output model (unit 17) or are even considered inefficient in that model (unit 29). The same happens in the quality-output model, where one of the main reference units (55) is considered inefficient in the activity-output model.

Table 4 Units used as a reference in more cases under alternative model specifications

These divergences would seem to be sufficiently important to merit the attention of researchers and policy-makers regarding the need to take both quantitative and qualitative aspects into account. In this way, it can be ensured that measures of the performance of health units in the primary health care sector are reliable and non-biased. The simplest way of including the two dimensions in the analysis is to run a new DEA model with the two previous synthetic indices, INDACT and INDQUA, used now as a dual output within a unique model (combined model). Column 4 of Table 3 reports the efficiency scores obtained with this alternative model specification.

According to the Spearman correlation coefficient reported in Table 5, the efficiency scores calculated with this new DEA model are very similar to those obtained with the previous activity model (0.921). However, some units undergo a slight improvement with the inclusion of the quality-output index in the combined model and even become efficient (units 14, 32, 53, 55 and 85). Likewise, those units that were efficient in terms of the activity model but inefficient in terms of the quality model are still efficient (4, 14, 55, 81 and 85). The same happens to those that were qualitatively efficient but quantitatively inefficient (units 30, 33, 35 and 63). The reason for this is that units that present clearly better results in one dimension are usually assigned a zero weight for the other. This is a common shortcoming of DEA models, which essentially ignore one of the dimensions in these cases.

Table 5 Spearman correlation coefficients among different model specifications

In order to overcome this limitation, we propose incorporating weight restrictions on the output measures in the combined efficiency model initially calculated. This approach allows one to ensure that both dimensions (activity and quality) are considered in the DEA results. Moreover, it ensures that units with a low value in one of those indicators cannot be placed on the frontier. In the present case, we acknowledge that such weight restrictions reflect the perceived relative importance of the two indicators, thus representing a value judgement. Health managers concerned about the results of the evaluation should therefore decide these restriction values. The contribution of the present work is to show how the establishment of such restrictions may affect the results. To this end, we tested three potential lower bounds on the ratios of the weights (10, 20 and 30 %, respectively), finding that they led to similar results.Footnote 14 Obviously, the average efficiency is lower when the constraint is higher. For the sake of simplicity, we present only the scores obtained using the intermediate value (20 %), which involves a sufficient level of restriction without greatly reducing the chance that a given unit is considered efficient if it reaches a certain level of performance in one of the two dimensions.
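For illustration, one way of writing such a restriction is sketched below in the multiplier form of the input-oriented VRS model for unit \( o \), with four inputs \( x_{i} \), the two outputs \( y_{1} \) (INDACT) and \( y_{2} \) (INDQUA), units indexed by \( k = 1, \ldots ,K \), and a lower bound \( \lambda \) on the share of each output weight. This specific form of the bound is an assumption made here for exposition only; the restriction actually imposed in the study may differ in detail.

$$ \begin{aligned} \max_{u,v,u_{0}} \quad & u_{1} y_{1o} + u_{2} y_{2o} + u_{0} \\ \text{s.t.} \quad & \sum\limits_{i = 1}^{4} v_{i} x_{io} = 1, \\ & u_{1} y_{1k} + u_{2} y_{2k} + u_{0} - \sum\limits_{i = 1}^{4} v_{i} x_{ik} \le 0, \quad k = 1, \ldots ,K, \\ & u_{r} \ge \lambda \left( u_{1} + u_{2} \right), \quad r = 1,2, \\ & u_{r} \ge 0, \quad v_{i} \ge 0, \quad u_{0} \ \text{free}. \end{aligned} $$

Under this reading, setting \( \lambda = 0.2 \) means that neither output can receive a zero weight while the other receives a positive one, so a unit cannot reach the frontier by relying on a single dimension of its output weights.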

A close examination of the efficiency scores calculated with this constrained combined model (Column 5 in Table 3) shows that a total of four units (4, 33, 35 and 63) that were fully efficient in the combined model (Column 4) become non-fully efficient when the weights are restricted. The consideration of the two dimensions also affects the identification of reference units. In particular, the constrained model identifies units 24 and 84 as the main references for the rest of the units (final column in Table 4), coinciding with the top-ranked reference units in the activity-output model (84) and the quality-output model (24).

In view of the results presented in this section, it seems clear that ignoring qualitative aspects in measuring technical efficiency of primary health care centres might lead to biased and inappropriate results. One potential option to correct this misspecification problem within a non-parametric context could be the implementation of a constrained combined model that considers both quantitative and qualitative output indicators.

DEA with exogenous variables

In this section, we consider the second phase of the empirical study, in which information about exogenous variables affecting the performance of units is assessed. In the context of our study, those factors are represented by the socio-demographic characteristics of the population served by the health centres. Under the reasoning just outlined, we estimate the efficiency levels of each PCC in our study by incorporating the set of exogenous variables into the specification of a weight-constrained combined model. Specifically, we consider total slacks (radial and non-radial components) of the four input variables included in the model, and regress them on the six exogenous variables selected as a representation of the patient characteristics (CBR, ER, DR, RR, DENSITY, AGRIEMP). In order to avoid bias in the estimates that result from the censored normal Tobit regressions in the original four-stage model, we apply the enhanced method based on a bootstrap procedure developed in [14].Footnote 15 Table 6 presents the results.

Table 6 Estimated parameters of total slacks using a Tobit bootstrap procedure. The standard errors are shown in brackets

The analysis of the estimated parameters allows some preliminary inferences to be drawn about the influence of the patient characteristics on efficiency. Firstly, it is noteworthy that the influence of external factors varies across different inputs, although we can identify several significant parameters that enable us to claim that the correction of the initial scores to include the effect of these variables is totally justified. The elderly ratio (ER) has a significant (and positive) effect on the slack of every input variable (with the exception of NHLAB), although its impact is greater on PHARMA and HLAB1. This positive effect means that a higher proportion of elderly population increases the inefficiency. In contrast, the population density (DENSITY) has a significant negative effect on the slack of every input variable (with the exception of PHARMA), which means that health zones with a higher density, usually large cities, perform more efficiently (the slacks are lower). Likewise, it is worth noting that some exogenous variables have a weak impact on almost every input, such as the replacement rate (RR) and the crude birth rate (CBR), while other variables affect only some specific variables, like the dependency rate (DR), which has a significant (and positive) impact only on the variables representing medical and technical staff (HLAB1 and HLAB2). Finally, the proportion of the population employed in agriculture (AGRIEMP), which may be interpreted as a proxy for the low income and low educational level section of the population, has a negative effect on almost every input (positive effect on slacks), although this influence is smaller than that of other variables.

We next used the mean values of the parameters estimated by the bootstrap procedure to predict the total input slack for each unit based on the values of the exogenous variables.Footnote 16 These predictions are used to adjust the primary input data according to the difference between the maximum predicted slack and the predicted slack of each unit. The final stage consists of using the adjusted input data to run a new DEA model, maintaining the weight restriction (λ = 20 %) on the original output values so that the results are comparable with those of the previous model. The new efficiency scores allow one to distinguish the inefficiency that is attributable to management once the characteristics of the population being served are taken into account. Table 7 reports the efficiency scores obtained with this constrained four-stage bootstrapping model.
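The adjustment step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy coefficients and the toy data are hypothetical, and the predicted slack is taken as the linear index floored at zero, which is a simplification of the Tobit prediction.

```python
def predict_slack(coefs, z):
    """Predicted slack for one unit: intercept plus weighted exogenous values.

    Assumption: the prediction is the linear index floored at zero, a
    simplification of the expected value implied by a Tobit model.
    """
    intercept, *slopes = coefs
    linear = intercept + sum(b * v for b, v in zip(slopes, z))
    return max(0.0, linear)


def adjust_input(x, exog, coefs):
    """Raise each unit's input by (maximum predicted slack - own predicted slack).

    Units in the least favourable environment (largest predicted slack) keep
    their observed input; units in more favourable environments have their
    input increased, so that all units are compared as if they faced the
    least favourable conditions.
    """
    predicted = [predict_slack(coefs, z) for z in exog]
    worst = max(predicted)
    return [xi + (worst - p) for xi, p in zip(x, predicted)]


# Toy example with made-up numbers: three units, two exogenous variables.
coefs = [1.0, 2.0, -0.5]                        # intercept and two slopes (hypothetical)
exog = [[1.0, 2.0], [3.0, 0.0], [0.0, 4.0]]     # exogenous values per unit
x = [10.0, 12.0, 8.0]                           # observed level of one input

adjusted = adjust_input(x, exog, coefs)
print(adjusted)
```

In this toy case the second unit has the largest predicted slack, so its input is left unchanged, while the inputs of the other two units are increased before the final DEA model is run.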

Table 7 Efficiency scores with and without exogenous variables

Comparing the values obtained from the constrained four-stage model with those calculated previously, one observes that there are some significant changes when socio-demographic patient characteristics are included in the analysis. In general terms, the average efficiency rises, although the patterns of specific cases are quite diverse. Many units have higher efficiency scores, and some of them even become efficient (units 32, 40, 44, 59, 63, 68, 79, 81, 88 and 92). Most of these units are located in zones with low population densities and high elderly ratios, which shows that, once the evaluation takes into account that these units are operating in an unfavourable context, they obtain higher efficiency scores.

In contrast, other PCCs obtain lower scores when these variables are included (units 6, 14, 17, 20, 24, 27, 29, 30, 49, 52, 54, 55, 75, 76, 84, 85 and 90). Those units benefited from the first evaluation because the characteristics of the population they were serving were ignored. Many PCCs experiencing a decrease in their efficiency scores belong to the two relatively large cities of the region, Badajoz and Caceres.Footnote 17 Among the units enumerated above, two (24 and 84) stand out because they were the main reference units in the constrained model without exogenous variables, but have now become inefficient in the new model (Table 8). Indeed, only units 32 and 81 maintain their position as main reference units in the four-stage model.
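The comparison between the two sets of scores discussed above amounts to sorting units into groups according to how their score changes. The following Python sketch illustrates this bookkeeping; it assumes scores on a scale where 1 denotes an efficient unit, and the function name, unit labels and scores are invented for illustration and do not correspond to Table 7.

```python
def classify_changes(before, after, tol=1e-9):
    """Group units by how their efficiency score changes between two models.

    Assumption: scores lie in (0, 1], with 1 meaning the unit is efficient.
    """
    groups = {"became_efficient": [], "improved": [], "worsened": [], "unchanged": []}
    for unit in sorted(before):
        b, a = before[unit], after[unit]
        if a > b + tol and a >= 1.0 - tol:
            groups["became_efficient"].append(unit)
        elif a > b + tol:
            groups["improved"].append(unit)
        elif a < b - tol:
            groups["worsened"].append(unit)
        else:
            groups["unchanged"].append(unit)
    return groups


# Made-up scores for four hypothetical units, labelled A to D.
without_exogenous = {"A": 0.80, "B": 1.00, "C": 0.70, "D": 0.90}
with_exogenous = {"A": 1.00, "B": 0.85, "C": 0.75, "D": 0.90}

groups = classify_changes(without_exogenous, with_exogenous)
print(groups)
```

Here unit B plays the role of a unit that was efficient before the correction and becomes inefficient once the characteristics of the population served are taken into account.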

Table 8 Reference units with and without exogenous variables

These results underline the importance of taking into account data about patient characteristics in the calculation of efficiency scores, so that the evaluated units can be assigned production targets according to the context in which they are operating. In addition, the method employed allows us to identify which exogenous variables have the greatest effect on input variables (for instance, the percentage of elderly population on the number of prescriptions), as well as to correct the initial values of those input variables based on unbiased parameters representing those effects.

Conclusions

The empirical investigation developed in this study contributes to extending the literature on the measurement of economic efficiency in primary health care. It does so by considering jointly two particularly relevant aspects of health care provision: the quality of care, and the effect of environmental factors on the performance of primary care centres. To date, and despite their relevance, these two aspects have not received much attention in the literature on the measurement of economic efficiency in the field of primary health care. Moreover, to our knowledge, this investigation is the first attempt to combine both issues within the same framework. By doing so, the methodological approach introduced in this study helps to minimise the bias affecting previous investigations.

As a first departure from previous literature, and using a newly constructed information system for the primary health care sector (APEX06), we were able to construct two different measures of output, accounting for both activity and quality features of health services. Hence, we overcome the criticisms associated with the use of solely quantitative indicators of output.

However, simply using quality measures as additional outputs can result in some serious modelling problems when DEA is the method used to measure efficiency. The main problem arises from the possibility that this technique may assign zero weights to quality indicators and hence allow units to be identified as efficient even though they actually have low levels of quality in their performance. In the present application, we dealt with this problem by incorporating weight restrictions in a combined DEA model including both quantity and quality indicators. This ensured that both dimensions were properly taken into account in the analysis. The results of our investigation show that the use of weight restrictions considerably modifies the composition of the efficient frontier, as well as the units identified as references to be used as benchmarks for the inefficient units.

A second aspect in which the present study improves on previous research concerns the specification procedure employed in modelling exogenous variables that are directly related to the population served by each health centre but over which the centre has no control. In this sense, our aim was not limited to identifying which variables may affect the performance of PCCs, but extended to including them explicitly in the calculation of the efficiency scores and so better measuring PCC performance. To this end, we used a recently developed four-stage model that includes a bootstrap procedure to ensure unbiased estimates of the parameters. The semi-parametric structure of this model, in which the effect of each exogenous variable can be tested by using a set of Tobit regressions, allowed us to identify two variables as the main factors influencing PCC performance: the population density and the elderly ratio. Also, the analysis of specific cases showed units operating in the largest cities to be the most negatively affected by the correction of efficiency scores derived from the inclusion of exogenous variables in the analysis. We therefore believe that it is particularly important to include population socio-demographic characteristics in calculating the efficiency levels of primary health care centres. Otherwise, the results will most likely be affected by a major bias in their statistical and economic meaning.

In sum, observing the changes in the efficiency scores of the different PCCs considered in our study as the model specification was made progressively more sophisticated, we conclude that greater accuracy should always be required in both the specification and the estimation of economic models for the measurement of efficiency in the health sector. Specifically, in the field of primary health care, further research is still necessary in order to better understand how these centres operate. There is also a clear need for additional effort in collecting reliable information about other aspects that may affect the performance of primary health care centres and that were not considered in our evaluation because of the lack of available data. Some examples are the structure of financing, the methods used to determine payments to physicians, the economic level of the municipalities served by the unit, and the proportion of immigrants in its target population. In this sense, the participation and involvement of practitioners and decision-makers in retrieving data would be essential. Methodological approaches such as that described in the present work could facilitate work towards this goal and provide satisfactory analytical tools for decision-makers in this sector.