Introduction

A number of new therapies for multiple sclerosis (MS) have been introduced in the past decade, and more are currently in development. Compared to the old and inexpensive symptomatic treatments, these disease-modifying drugs (DMDs) are costly, raising the question of the relationship between their cost and the health benefit they provide. Few disease areas have seen as much interest in economic evaluation as MS. The discussion has been triggered by the difficulty in assessing the cost-effectiveness in a disease where – as in most chronic progressive diseases – the major health benefit is not evident for some time. Treatment effects that can be measured in clinical trials or directly in clinical practice, such as the frequency of relapses or small changes in functional disability, represent partial rather than final outcomes. The full effect of disease-modifying treatments thus cannot be observed, but must be modelled. Similarly, the long-term effect on costs and quality of life (QOL) must be estimated in models, as it cannot be measured in the short and medium term.

Several such cost-effectiveness models were proposed in MS at the time of the introduction of currently available DMDs. They use somewhat different methodologies but essentially combine epidemiological data on the natural history of the disease, consumption of health care and other resources as well as productivity losses, QOL related to disease severity, and data on the effectiveness of treatments on relapses and disease progression. In these models DMDs were compared to no treatment, and the information on costs therefore came from studies performed prior to their introduction [1–7]. Thus the findings of these cost studies may no longer be entirely accurate, since the introduction of new therapies may have changed the management of the disease in several ways. Although more recent cost studies exist, these have focused on specific subgroups of patients treated with DMDs [8, 9] or have been performed outside Europe [10], and their results cannot be readily transferred to the patient population in Europe.

The objective of this Europe-wide observational study was therefore to establish the current cost of MS, at all levels of disease severity, to provide an input to economic evaluation of new treatments in different countries. The study was performed in nine countries (Austria, Belgium, Germany, Italy, The Netherlands, Spain, Sweden, Switzerland and the United Kingdom), including 13,186 patients (Fig. 1), and used the same methodology rigorously in eight of the countries. The study in the United Kingdom used a slightly modified methodology, but the same basic data. Thus, differences in results can be attributed to differences in the provision of health care, traditions of usage and unit prices of the different resources rather than to differences in study methods. For a summary of results, see Kobelt and colleagues [11].

Fig. 1 Summary of patients in the study

Study approach

Methods for cost-of-illness studies are well developed. These studies relate costs to diseases and estimate the economic burden that a specific disease places upon society.

Depending on the research question, studies can be prevalence- or incidence-based. In prevalence-based studies, all costs are estimated for a patient population in a defined geographical area for a given period of time (generally 1 year). In incidence-based studies, lifetime costs are estimated for a patient contracting the disease. Both types of studies are useful for policy makers for planning and budgeting purposes, but the choice between the two is often limited by the disease itself. Incidence-based studies in MS are virtually impossible due to the long duration of the disease, and lifetime costs for an MS patient will have to be modelled by combining disease progression with costs at each level of the disease. Therefore, we used a prevalence-based approach but collected sufficient data to estimate costs for all levels of severity, thus allowing for the possibility of estimating lifetime costs.

Data collection for such studies can be “top-down” or “bottom-up.” In the former, aggregate figures on resource consumption related to a diagnosis code from available databases, national statistics and registries are used to estimate costs for a given prevalence sample. The limitations of this approach for studies in MS are obvious. Such databases seldom include all relevant costs and – except for disease-specific registries – they rarely provide differential diagnoses (e.g. disease course) or disease severity. Thus, the bottom-up method, where information is collected directly for a sample of patients either through chart reviews or patient questionnaires or a combination of both, is better suited for studies in MS.

Costs can be presented from different viewpoints, that of society or that of specific payers, private or public. In general, however, the perspective of society is adopted as the one that provides the most complete view. In the societal perspective, all costs, regardless of who ultimately pays, are included. Costs are calculated as full costs (opportunity costs), while transfer costs (taxes including value-added tax, reimbursements, pensions) are ignored. From the point of view of society these represent transfers between different members of society and are hence not a cost. In an additional analysis, intangible costs (costs due to suffering) are estimated.

From this overall perspective, it is possible to derive the perspective of payers by including only resources covered by payers and valuing them at the reimbursement tariff rather than at the full cost. Therefore, we collected data in all countries in a way that allows estimating costs from both perspectives. The main analysis is presented from the societal perspective, including direct health-care costs, non-medical costs, informal care and productivity losses.

Patient selection

In bottom-up studies, the selection of patients is clearly one of the most important factors. Finding a sample that precisely mirrors the prevalence of a disease is difficult, and prevalence estimates differ. However, this is less critical when the main objective is to estimate costs by levels of severity, where the key requirement is to include enough patients at all levels of the disease to provide stable estimates. Total cost of illness can then be calculated using estimates of prevalence at different levels of severity. Recruiting patients through hospitals, registries, patient associations or by advertising may select different samples. Selecting MS patients in neurological clinics may result in a sample biased towards early disease, particularly as current DMDs are indicated for relapsing disease and early treatment is advocated. Patients at the very high end of severity may be followed in other specialized centres. Also, the proportion of patients treated with DMDs may be over-represented. This might be similar for patients followed in MS registries, as the interest in measuring the outcome of these treatments will promote inclusion of patients on treatment. On the other hand, including patients participating in patient associations might result in a sample biased towards longer-standing and more severe disease, as these patients may be more likely to look for support. It is, however, also possible that this latter bias is not very prominent in MS, as patients are young and many search for information online and may therefore join associations rapidly.

Again, these biases are of limited importance, as it is unlikely that patients at the same level of disease severity recruited differently would differ in any other aspect. We recruited patients with the help of patient associations in most countries (six), and where this was not possible, in neurology clinics (three countries). In the latter case, all patients in the files were invited, rather than patients coming for a consultation, in order to avoid selecting patients on the dependent variable of resource consumption. The final sample contained sufficient patients at all levels of disease severity in all countries to allow analysis (Fig. 2).

Fig. 2 Disease severity in the sample

Data collection

The study is based on the methodology used in several earlier studies of MS in Europe and the United States [4–6, 10]. Information was collected directly from patients in a cross-sectional survey with an adapted version of the questionnaire used in the earlier studies. In particular, questions regarding the disease were reviewed and re-tested with a small number of patients at the Karolinska Institute in Stockholm, Sweden.

Experience in previous studies had shown the information on resource use from patients to be very reliable. In a similar study in Germany, hospitalization indicated by 200 patients recruited in a clinic was compared with the information in their hospital charts, and the difference in the mean number of inpatient days was found to be half a day [6]. Such a control requires, however, that patient identity is known, thus reducing the anonymity of the study. We therefore decided to rely entirely on the information received from patients.

The questionnaires were mailed to patients by the patient associations or the neurology clinics, and answers were returned to the data management centre in each country. The answers were entirely anonymous, but patients were asked to indicate in which area they lived, to enable an assessment of the geographical representation of the sample. Data were entered in a specifically designed and secure Internet site with logical and numerical controls to minimize errors. In addition, data entry was performed by a small number of trained data managers (one to three) in order to further ensure the quality of the data.

Resource use

Patients were asked to indicate their use of health-care and non-medical resources, informal care and their work capacity. The objective of the study was to estimate costs related to MS, not costs for patients with the disease. Only MS-specific resource consumption was therefore included. It is possible that patients with a severe disease also consume more resources for other diseases and thus have higher overall costs. In these cases it is generally difficult to separate what part of total costs relates to the disease that is being investigated and what part relates to comorbidities. For patients with MS, this is thought to be less of a problem, as the consequences of the disease are rather well defined. In addition, patients are in an age group where unrelated comorbidity is generally limited. We therefore chose to instruct patients to include only MS-related costs rather than perform a comparative study.

Resource use was collected retrospectively, and each type of resource was collected for a time period that would minimize recall bias. It is generally considered that 3 months yields acceptable data, but we used variable periods to make it easier for patients. Thus, the number of hours of informal care from family or home care was collected as the average per day during the past month, drugs were collected for 1 month, and major investments such as the purchase of a wheelchair or transformations to the house or the car were collected for 1 year. Each question required a binary answer (yes/no) followed by details on the type and quantity of the resource. All resources that were expected to be used were specified in order to minimize errors due to spelling, and no open fields or free text (e.g. “other”) were permitted. This may have excluded some very rarely used resources but provided a clean and comparable dataset in all countries.

For all resources that were known or expected not to be covered by health insurance schemes (e.g. certain paramedical services, investments and devices, home help and transportation), patients were asked to indicate the total cost and how much of the amount they had to pay personally (out-of-pocket costs). An example of questions is shown in Fig. 3.

Fig. 3 Sample questions

Health-care resources

Hospitalization was collected both as the number of admissions and the number of inpatient days or day admissions in different departments to allow analysis using either diagnosis-related groups (DRG) or per diem rates, while stays in rehabilitation centres or nursing homes were collected only as the number of days. Consultations were differentiated by type of physician, nurse or paramedical profession. Major examinations such as MRI or CT scans were not assumed to take place more than once per 3 months, and patients had to indicate only yes/no, while for blood tests the number of tests was to be provided. Similarly, DMDs required only a yes/no answer, while all other drugs were collected as the number of days taken during the past month. Non-prescription drugs were not specified, but patients were asked to indicate how much they had spent on these overall during the past month.

Other direct resources

Typical investments (e.g. stair and bed lifts, ramps, rails) and devices (e.g. walking aids, wheelchairs) were listed by item, while transformation of the house or the car and small utensils were mentioned as categories. For assistance by community or social services, or help from family or friends, we collected the average hours per day and the number of days during the past month. Transportation was collected as average distance per trip and number of trips per month.

Work capacity

The questionnaire established the current employment situation, including normal working hours, and patients in employment were asked to indicate whether they had been on sick leave during the past 3 months. In addition, we enquired whether patients had to change their type of employment or reduce their normal working hours due to MS, and whether this had any financial consequences.

In many countries, long-term sick leave precedes permanent invalidity, and we therefore distinguished between these, asking patients for how many months they were on long-term absence and since what year they were on early retirement due to MS.

Quality of life (utility)

MS affects a number of body functions and provokes multiple symptoms that can affect patients’ well-being at different times during the course of the disease. Clinical measures are seldom adequate to fully express this effect, as they generally focus on individual symptoms. A number of instruments to measure health-related QOL, both generic and specific to MS, have been developed. They elicit patients’ subjective evaluation of the effects of a disease or a treatment and have become an important tool for clinicians in assessing outcome [12]. Most of these instruments are multidimensional and provide scores for different aspects of QOL, such as physical, mental and social well-being. They are therefore ill-suited for economic studies used for setting priorities in resource allocation in health care [13]. Such studies require a single outcome measure that is comparable across diseases. The quality-adjusted life-year (QALY) has been developed specifically for this purpose [14]. The QALY combines quantity and quality of life by weighting life years with a quality index, utility. In this framework, utility is defined as the preference of patients and/or the general population for given states of health. Utilities are expressed as a value on a cardinal scale anchored between 0 (death) and 1 (full health) and are measured using techniques from decision analysis (standard gamble or time trade-off [15–17]) or health state classification systems such as those developed for the EQ-5D [18] or the Health Utilities Index [19]. QALYs are then calculated as an area under the curve, by adjusting time with the utility weight. For instance, if a certain health state resulted in a utility of 0.5, living 1 year in that state would give 0.5 QALYs and be equivalent to living 6 months at full health, as illustrated in Fig. 4.

Fig. 4 Illustration of a quality-adjusted life-year. Quality-adjusted life years (QALYs) are calculated as an area under the curve, by adjusting time with the utility weight. For instance, if a certain health state resulted in a utility of 0.5, living 1 year in that state would give 0.5 QALYs and be equivalent to living 6 months at full health
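The area-under-the-curve calculation can be illustrated with a short sketch. It assumes that utility is constant within each period, so that the area reduces to a sum of duration multiplied by utility; the function name `qalys` and the input layout are illustrative assumptions and not part of the study.

```python
from typing import Sequence, Tuple


def qalys(periods: Sequence[Tuple[float, float]]) -> float:
    """Sum duration (in years) x utility over consecutive health-state periods.

    Assumption: utility is constant within each period (piecewise-constant curve).
    """
    total = 0.0
    for years, utility in periods:
        if years < 0:
            raise ValueError("duration must be non-negative")
        total += years * utility
    return total


# The example from the text: 1 year at utility 0.5 equals 6 months at full health.
one_year_at_half = qalys([(1.0, 0.5)])
six_months_full = qalys([(0.5, 1.0)])
print(one_year_at_half, six_months_full)  # 0.5 0.5
```

With several periods, the same function adds up the contributions of successive health states, which is how lifetime QALYs would be accumulated in a model.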

We collected data to calculate utility scores using the EQ-5D [18], as the simplicity of this instrument makes it easy to use in mail surveys. It covers five domains of health-related QOL (mobility, self-care, usual activities, pain/discomfort, anxiety/depression) with three levels of answers (no/some/severe problems). The resulting 243 combinations of answers can then be translated into utilities via a social tariff established with the general population using decision-analytic methods (time trade-off) [20]. Although reference scores for the EQ-5D are available in a number of countries, the tariff for the United Kingdom is the only one in Europe based on decision-analytic methods (time trade-off) [14] and has been widely used, including in MS. In these studies, utility scores have been shown to be highly correlated with functional capacity of MS patients [4–6, 10]. There has been some discussion as to whether there are additional explanatory variables for utility, notably cognitive impairment. Unfortunately, it is quite impossible to measure this type of impairment in a mail survey. Instead, we decided to test for “fatigue,” measured with a visual analogue scale anchored between 0 (no fatigue) and 10 (severe fatigue). However, in multiple regression analyses, fatigue showed no additional significant effect on utilities, most likely due to the fact that functional disability and fatigue are strongly linked.
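The figure of 243 health states follows directly from five domains with three answer levels each (3^5). The sketch below enumerates them, assuming the common convention of coding each state as a five-digit string with 1 = no problems, 2 = some problems and 3 = severe problems. The tariff that maps each state to a utility is not reproduced here.

```python
from itertools import product

DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = (1, 2, 3)  # assumed coding: 1 = no, 2 = some, 3 = severe problems

# One string per health state, e.g. "11111" = no problems in any domain.
states = [
    "".join(str(level) for level in combo)
    for combo in product(LEVELS, repeat=len(DIMENSIONS))
]

print(len(states))            # 243
print(states[0], states[-1])  # 11111 33333
```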

Intangible costs

Intangible costs (costs due to pain, grief, anxiety and social handicap) are difficult to measure, and there is no generally agreed-upon method for their calculation. They are therefore seldom calculated in cost-of-illness studies, while in cost-effectiveness studies they are included when outcome is expressed as QALYs. We have earlier proposed a method to estimate these costs [4, 10, 21, 22], where the loss of QALYs is calculated from the difference in utility scores between MS patients and the general population matched for gender and age (Fig. 5). By assigning a monetary value to a QALY, for instance the willingness to pay for a QALY gained in terms of reimbursement decisions, intangible costs can be calculated. The value for a QALY in Europe is generally assumed to be around €50,000, although no official threshold value exists. As an alternative, the World Health Organization has suggested using three times the Gross National Product per capita [23], and we have therefore calculated both values.

Fig. 5 Utility loss of MS patients compared to the normal population. The loss of utility, by gender and by age group, is represented by the difference between the curves, in the sample from the United Kingdom. Utilities were measured with the EQ-5D and scores calculated using the UK tariff. By multiplying utility with time (1 year), QALYs lost in this sample during a year are estimated
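The intangible-cost calculation described above can be sketched as follows. The utility values in the example are hypothetical and chosen only for illustration; only the value of €50,000 per QALY is taken from the text, and the function name is an assumption.

```python
def intangible_cost(
    population_utility: float,
    patient_utility: float,
    value_per_qaly: float,
    years: float = 1.0,
) -> float:
    """Value the QALYs lost relative to the matched general population."""
    qalys_lost = (population_utility - patient_utility) * years
    return qalys_lost * value_per_qaly


# Hypothetical utilities for one age and gender group over 1 year.
cost = intangible_cost(
    population_utility=0.85,  # hypothetical matched population norm
    patient_utility=0.55,     # hypothetical mean utility of patients
    value_per_qaly=50_000,    # value per QALY mentioned in the text (EUR)
)
print(round(cost))  # 15000
```

Replacing `value_per_qaly` with a country-specific threshold gives the alternative estimate without changing the rest of the calculation.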

Disease information

Disease information was limited to age at first MS symptoms, year of diagnosis, type of MS, exacerbations during the past 3 months and a self-assessment of disability.

Functional capacity

Functional capacity is assessed in all clinical and epidemiological studies, generally using the Expanded Disability Status Scale (EDSS) [24]. EDSS scores have been shown to be highly correlated with both costs and utilities [4, 7, 10, 21, 22, 25]. The EDSS was not originally designed for self-assessment and is therefore not directly suited to a mail survey. However, a number of alternative ways of assessment have been used, such as the Patient Determined Disease Steps (PDDS) instrument [26], validated for self-reporting to a large patient registry in the United States (NARCOMS) [27]. Similarly, the EDSS was self-assessed in earlier cost-of-illness studies, using a description of disease severity focusing on ambulation, based on the original description in the EDSS and verified against patient charts [4, 6]. This description was updated to combine elements from both the EDSS and PDDS and retested with a small group of patients, where it showed a good correlation (Fig. 6). The 11 levels of answer were then transformed into EDSS scores, keeping EDSS 6.5 as a separate level, as is often done in clinical trials.

Fig. 6 Patient assessment of EDSS levels. Levels correspond to full EDSS points up to level 6. Level 7 represents EDSS 6.5, and levels 8, 9 and 10 hence correspond to EDSS 7, 8 and 9
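The transformation of answer levels into EDSS scores can be written as a simple lookup. The sketch assumes that the 11 answer levels are numbered 0 to 10, which is consistent with the caption of Fig. 6 but is not stated explicitly; the names are illustrative.

```python
# Levels 0-6 map to the same full EDSS point (assumed numbering 0-10).
LEVEL_TO_EDSS = {level: float(level) for level in range(0, 7)}
# Level 7 is EDSS 6.5; levels 8, 9 and 10 correspond to EDSS 7, 8 and 9.
LEVEL_TO_EDSS.update({7: 6.5, 8: 7.0, 9: 8.0, 10: 9.0})


def level_to_edss(level: int) -> float:
    """Translate a self-assessed answer level into an EDSS score."""
    if level not in LEVEL_TO_EDSS:
        raise ValueError(f"unknown answer level: {level!r}")
    return LEVEL_TO_EDSS[level]


print([level_to_edss(level) for level in (0, 6, 7, 8, 10)])
# [0.0, 6.0, 6.5, 7.0, 9.0]
```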

Thus, while there may be some uncertainty in the EDSS scores, particularly in the early stages of the disease where differences are small, it should be borne in mind that there is also inter-rater variation in assessments performed by clinicians. The uncertainty in our study is reduced by the inclusion of large samples at all levels, and the validity of the scores is further confirmed by, for example, the utility results, which were almost identical across countries at given EDSS levels [11].

Disease course

In contrast to the EDSS scores, patients appeared to have some difficulty in assessing their current disease course (i.e. relapsing-remitting MS [RRMS], secondary progressive MS [SPMS] or primary progressive MS [PPMS]), despite a rather extensive description. This information is therefore only summarized in the results sections; it is, in any case, of limited importance within economic studies for two reasons.

First, at diagnosis, the majority of MS patients present with RRMS, which usually afflicts patients between the ages of 15 and 40 years. This form of MS is characterized by disease exacerbations that initially improve spontaneously or in response to treatment over several days or weeks. Over time, in a majority of patients, underlying disease progression and neurological damage continue to occur between relapses, resulting in SPMS. A minority of patients have PPMS at onset, often at a slightly higher age. However, this separation of MS into different disease types does not appear straightforward. The conversion from RRMS to SPMS is indeed not a well-defined event but rather a gradual process and therefore subject to interpretation [28]. An analysis of patients from the natural history cohort in Canada [29] revealed that patients converted to SPMS with scores between 1.0 and 6.0 on the Kurtzke Disability Status Scale (mean score 3.0) [30], 1 to 36 years after onset of the disease (mean 10.7 years), between the ages of 18 and 86 years (mean 40 years) [31]. At levels of functional disability between 3.0 and 5.0 there will thus be patients with both RRMS and SPMS disease types.

Second, we have shown earlier that costs and utilities of patients at the same EDSS level and not in relapse do not differ by MS course [31]. The same is true for patients on DMDs or not. Cost differences arise from a higher frequency of relapses earlier in the disease, or faster progression to higher levels of disability. These findings were confirmed again in the current study, and we therefore report results for the full sample and by level of disability rather than MS course.

One could nevertheless argue that the disease course is of importance when considering treatment with DMDs, as these are indicated for patients with RRMS. However, the results in our study indicate that treatment is primarily provided at the lower levels of EDSS, where patients have more frequent relapses, regardless of the type of disease.

Relapses

The questionnaire contained a description of a typical relapse and asked patients to specify whether they had had a relapse in the past month or 3 months. However, we did not collect information on health care consumption due specifically to relapses, as it may be difficult for patients to distinguish between resources related to normal care and those related to a relapse. Rather, we estimated the cost of relapse from the difference in total costs between patients at the same EDSS levels with and without a recent relapse. The same method was used to estimate the loss of QOL (utility) due to an exacerbation.
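The difference-in-means estimate described above can be sketched as follows. The record layout (EDSS score, relapse flag, total cost) and the example values are hypothetical; EDSS levels without patients in both groups are skipped because no difference can be computed for them.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def relapse_cost_by_edss(
    patients: Iterable[Tuple[float, bool, float]],
) -> Dict[float, float]:
    """Mean cost with a recent relapse minus mean cost without, per EDSS level."""
    groups = defaultdict(lambda: {True: [], False: []})
    for edss, had_relapse, total_cost in patients:
        groups[edss][bool(had_relapse)].append(total_cost)
    return {
        edss: mean(costs[True]) - mean(costs[False])
        for edss, costs in sorted(groups.items())
        if costs[True] and costs[False]  # need both groups at this level
    }


# Hypothetical records: (EDSS, recent relapse, total cost for the period).
sample = [
    (2.0, True, 9000.0),
    (2.0, True, 11000.0),
    (2.0, False, 6000.0),
    (2.0, False, 8000.0),
    (3.0, False, 12000.0),  # no relapse group at EDSS 3.0, so it is skipped
]
print(relapse_cost_by_edss(sample))  # {2.0: 3000.0}
```

Passing utilities instead of costs gives the corresponding estimate of the utility difference associated with a relapse.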

Analysis

Prior to analysis, missing answers and outliers were systematically verified. For instance, when a patient had indicated having used a resource but omitted the quantity, the mean quantity for users of the same resource was imputed. For items where the cost indicated by the patients had to be used, either because no standard unit cost was available (e.g. transformations to the house) or because not enough details were collected (e.g. non-prescription drugs), we set the upper limit to the sample mean plus one standard deviation.
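The two data-cleaning rules can be sketched as follows. Two points are assumptions rather than details given in the text: that the limit is applied by truncating higher values to it, and that the sample (n − 1) standard deviation is used. Function names and example values are illustrative.

```python
from statistics import mean, stdev
from typing import List, Optional


def impute_missing_quantity(quantities: List[Optional[float]]) -> List[float]:
    """Replace missing quantities (None) among users of a resource
    with the mean quantity of users who did report one."""
    reported = [q for q in quantities if q is not None]
    if not reported:
        raise ValueError("no reported quantities to impute from")
    fill = mean(reported)
    return [fill if q is None else q for q in quantities]


def cap_at_mean_plus_sd(costs: List[float]) -> List[float]:
    """Truncate patient-reported costs at the sample mean plus one SD.

    Assumptions: truncation (not exclusion) and the n - 1 standard deviation.
    """
    limit = mean(costs) + stdev(costs)
    return [min(cost, limit) for cost in costs]


imputed = impute_missing_quantity([2.0, None, 4.0])
print(imputed)  # [2.0, 3.0, 4.0]

capped = cap_at_mean_plus_sd([100.0, 100.0, 100.0, 100.0, 1000.0])
print(capped[:4], capped[-1] < 1000.0)
```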

Resource quantities were multiplied by their unit cost in each country. Unit costs were obtained from a number of publicly available sources and telephone interviews and, if necessary, adjusted to 2005 prices using the consumer price index. Costs were then annualized by multiplication, with the assumption that a similar proportion of patients would use the same quantity of resources in any given month or quarter, and presented as mean cost (standard deviation) per patient. It is, however, obvious that mean results for the sample are highly dependent on the distribution of disease severity in the group, and results are therefore also analyzed by level of disability.
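The costing steps (price adjustment, multiplication by unit cost and annualization) can be sketched as follows. The recall periods follow the data collection described earlier (1 month, 3 months, 1 year); all quantities, unit costs and index values in the example are hypothetical.

```python
# Number of recall periods in a year, used as the annualization multiplier.
PERIODS_PER_YEAR = {"month": 12, "quarter": 4, "year": 1}


def adjust_to_price_year(cost: float, cpi_source: float, cpi_target: float) -> float:
    """Inflate a unit cost from its source year to the target price year."""
    return cost * cpi_target / cpi_source


def annual_cost(quantity: float, unit_cost: float, recall_period: str) -> float:
    """Quantity in the recall period x unit cost x periods per year."""
    return quantity * unit_cost * PERIODS_PER_YEAR[recall_period]


# Hypothetical patient: 2 consultations in 3 months, 30 drug days in the
# past month, and one device bought during the past year.
total = (
    annual_cost(2, 50.0, "quarter")
    + annual_cost(30, 4.0, "month")
    + annual_cost(1, 600.0, "year")
)
print(total)  # 2440.0

# Hypothetical index values: a unit cost of 100 at index 95, moved to index 100.
print(round(adjust_to_price_year(100.0, 95.0, 100.0), 2))  # 105.26
```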

It must be borne in mind that even within similar patient groups, health-care costs are highly skewed, with a minority of patients consuming the majority of the resources. Within our studies, the large sample sizes reduce this problem somewhat but not totally. Hence, when such data are used to model lifetime costs or estimate cost-effectiveness, the cost per patient needs to be calculated with multiple regression analysis, controlling for age, potentially also gender and disease duration in addition to the level of disability, as all have a significant effect.

Valuation

Standardized methods for the calculation of costs were used. However, there are substantial differences in the unit cost of resources, adding to the difficulty of making cross-country comparisons. The calculation methods are provided below (Table 1), while details on the unit costs and the sources from which they were taken can be found in the country results. In the societal perspective, valuation was based on total costs, including any copayment or self-payment. In the payer perspective, tariffs excluding all copayments were used, and non-reimbursed items were ignored.

Table 1 Valuation of resources

While the methods for valuation of direct resources are well established, there are different approaches to calculating informal care costs and productivity losses.

Informal care can be considered a direct or an indirect cost. We have treated it as a direct cost in this study, with the reasoning that in a severely disabling disease such as MS, absence of care is not an option. Hence, if family members did not provide it, particularly in more advanced disease, health-care professionals would have to provide it or patients be institutionalized. Thus, the cost of family care can be established using the concept of loss of leisure time, as we have done in this study. Alternatively, a replacement cost can be used, where the hours of care are valued with the cost of a professional. This latter method generally provides higher cost, as not all hours provided by the family could necessarily be replaced due to limited availability of resources, both human and financial. When considering informal care as an indirect cost, calculations are usually based on the loss of productivity of working family members. This may underestimate the cost, as it assigns no cost to care provided by non-employed members. This is similar to the way in which productivity losses for patients are estimated, where, for example, patients over 65 years of age will have no costs. There is, however, a major difference, as informal care is truly an activity that would otherwise have to be performed by professionals, albeit in different quantity and quality and with potentially different outcome.
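The difference between the two valuation approaches can be made concrete with a small sketch. It uses the quantities collected in the questionnaire (average hours per day and number of days during the past month); the hourly values are hypothetical and serve only to show that the replacement-cost approach yields a higher figure whenever the professional rate exceeds the value assigned to leisure time.

```python
def annual_informal_care_cost(
    hours_per_day: float,
    days_per_month: float,
    hourly_value: float,
) -> float:
    """Hours per day x days per month x 12 months x value of one hour."""
    return hours_per_day * days_per_month * 12 * hourly_value


# Hypothetical carer: 3 hours per day on 20 days of the past month.
leisure_value = 10.0      # hypothetical value of one hour of lost leisure time
replacement_value = 25.0  # hypothetical hourly cost of a professional carer

print(annual_informal_care_cost(3, 20, leisure_value))      # 7200.0
print(annual_informal_care_cost(3, 20, replacement_value))  # 18000.0
```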

Estimates of indirect costs are more straightforward. Most studies use the human capital approach, where the production of an individual is valued at the market price (in this case, the gross wage including employers’ costs). Costs are hence calculated only for patients of working age. While this is sometimes considered discriminatory towards older patients, it corresponds to the reality where at a higher age, consumption is higher than production, as opposed to younger people, who produce more than they consume. There has been intense debate as to whether this method in fact overestimates costs as, particularly in times of high unemployment, a worker would be rapidly replaced and hence no production loss would occur [32]. The difficulty with this argument is that in a situation with structural unemployment, the professional qualifications may not be the same and patients may not be easily replaced. Also, it ignores the basic production capacity of an individual. However, this method of calculation (friction cost method) is advocated and mandatory in The Netherlands. We have used the human capital theory in this study, except for The Netherlands, where we have used both. Indirect costs are hence valued using the cost of employment based on the average gender-specific national income plus employers’ costs, considering the predominance of women with MS. It would have been possible to further adjust this income to age groups, or even by socioeconomic groups. We have ignored both, as the sample already covered a large age spectrum, and because we had decided not to ask patients about their income in the questionnaire.

Results

Although mean total costs per patient are calculated in each country and are presented in the subsequent papers, they have to be considered in the light of the distribution of disease severity in the respective samples. A better approach is therefore to refer to groups of patients at different levels of function (EDSS), or even costs by EDSS. This latter level of detail is what would be required to calculate the incidence cost, or to estimate the economic impact of treatments in the long term. The EDSS is by far the most important predictor of cost, but one should remember that patient-assessed EDSS scores are somewhat uncertain, and scoring by clinicians is subject to inter-rater variability. Also, the fact that costs are highly skewed means that mean costs cannot be used directly, but the variability of usage needs to be incorporated in any such analysis.

Across the countries, similarities – but also differences – are striking. Direct health-care consumption such as hospitalization and consultations is very different, reflecting the influence of the health-care systems and medical traditions. Countries with an early introduction of DRG payment for hospitalization have fewer admissions and shorter lengths of stay than countries where payment is or was until recently based on per diem rates. In countries where specialist consultations are limited to outpatient departments in a limited number of specialized centres, the number of medical visits is low, but their cost high, as they are often more extensive. In contrast to this, in countries with a high number of private specialists and relatively free access, the number of consultations is high, but their unit cost is often low. Clearly also, the use of DMDs is very different across Europe, ranging from a mere 20% in the sample in the United Kingdom to around 50% in other samples. On the other hand, the consumption of non-medical resources and the loss of production are very similar, as they appear to be driven mostly by the disease rather than the systems. Work capacity was comparable across countries, even at given levels of disability. The proportion of patients that required special devices or had to adapt their car or house to facilitate daily activities was virtually the same, as was the need for assistance such as home help and other services. A special case is informal care, where the need is clearly driven by the amount of support provided by social systems, where families make up for the lack of services. In addition, however, informal care is also influenced by family structure and tradition.

When comparing total costs, with all necessary caution, it appears that differences between countries arise mainly in the late stage of the disease, driven by differences in social support. This can be best appreciated when looking at the overall distribution of costs on different types of resources in the samples (Figs. 7, 8).

Fig. 7 Total costs by EDSS score [adjusted for purchasing power parity (PPP)]. Patients are grouped into mild disease (EDSS score 0–3.5), moderate disease (EDSS score 4.0–6.5) and severe disease (EDSS score 7.0–9.5) and total mean annual cost per patient calculated from the societal perspective. Local currencies have been transformed using the GDP purchasing power parity index (OECD 2004)
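The grouping and currency adjustment behind this comparison can be sketched as follows. The severity bands are those given in the caption of Fig. 7; the patient records and the PPP factor are hypothetical, and the factor is assumed to be expressed as local currency units per unit of the common currency.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def severity_group(edss: float) -> str:
    """Assign an EDSS score to the bands used in Fig. 7."""
    if 0.0 <= edss <= 3.5:
        return "mild"
    if 4.0 <= edss <= 6.5:
        return "moderate"
    if 7.0 <= edss <= 9.5:
        return "severe"
    raise ValueError(f"EDSS score outside the defined bands: {edss!r}")


def mean_cost_by_group(
    patients: Iterable[Tuple[float, float]],
    ppp_factor: float,
) -> Dict[str, float]:
    """Mean annual cost per patient and severity group, PPP-adjusted.

    ppp_factor: assumed to be local currency units per common currency unit.
    """
    grouped = defaultdict(list)
    for edss, annual_cost_local in patients:
        grouped[severity_group(edss)].append(annual_cost_local / ppp_factor)
    return {group: mean(costs) for group, costs in grouped.items()}


# Hypothetical records: (EDSS score, annual cost in local currency).
records = [(2.0, 10000.0), (3.0, 14000.0), (6.5, 30000.0), (8.0, 60000.0)]
print(mean_cost_by_group(records, ppp_factor=1.25))
# {'mild': 9600.0, 'moderate': 24000.0, 'severe': 48000.0}
```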

Fig. 8 Proportion of costs falling on different resources in different countries. Other direct costs include all medical care (hospitalization, consultations, tests, prescription and OTC drugs, excluding DMDs), services (home care, home help, transportation) and investments (devices, appliances, changes to house and car). The largest differences appear to be in informal care, with low usage in three countries. For Germany and The Netherlands this is explained by samples with somewhat milder disease, while for Sweden the reason is the generous provision of services

Conclusion

This study provides an overview of resource usage patterns, costs and QOL in patients with MS across Europe. In addition, it contains a wealth of data that will be useful in a number of ways, not the least of which is for modelling the cost of MS and for economic evaluation of new treatments. As always, however, the study also triggers a number of questions that prompt further research. Examples would be data on changes in labour force participation over time, as the structure and content of jobs have changed; the use and valuation of informal care in different countries and settings and the definition of what constitutes informal care and what is normal everyday support; the use of new treatments; the reasons behind different unit costs; and many more.