Introduction

Orthopaedic registers are increasingly established in all parts of the world and the large majority are used for collection of data from total joint arthroplasties [5, 17, 18, 21, 26, 27].

The Swedish Hip Register still serves as the best example of a well-organised and functioning arthroplasty register. Against the background of the Scandinavian experience with the implementation and organisation of orthopaedic registries, the first reports of a national spine register also came from the northern parts of Europe [33]. The Swedish lumbar spine registry is not only the first but also the only national spine register that has reported its methodology and is continuously reporting its results in the peer-reviewed literature. Four key factors contribute to the success of the Swedish endeavour. Firstly, a national health service and existing unique patient identifiers make it possible to identify and follow up ‘migrating’ patients who change their centre of treatment, be it because of dissatisfaction with the outcome or for simpler reasons like relocation. Secondly, there is an established culture among the orthopaedic community of reporting procedures and outcomes in registers as a matter of course. Thirdly, the data collection technology employed is easy to use and provides performance feedback to the surgeons. Lastly, the initiator is convinced that the three most important aspects of any registry are ‘simplicity, simplicity and simplicity’ (P. Fritzell, personal communication).

Spine surgery and joint arthroplasty: the same needs?

The Scandinavian experience with the Christiansen total hip prosthesis in the 1970s and Boneloc cement in the mid-1990s underlines the need for broad-based central registration of implants and materials after they have been marketed [23, 34]. In both cases, products were released on the basis of laboratory results without proper assessment in the actual clinical arena. Thanks to the Scandinavian post-market surveillance systems in place, the poorly performing products were detected and withdrawn.

Whenever new technologies enter the market, they should be carefully surveyed and monitored since, as we have seen in Scandinavia, neither laboratory testing nor randomised controlled trials (RCTs) provide conclusive information about the performance and safety of a product in the multitude of different clinical and post-market settings, i.e. in reality. Although innovations like surface replacement, navigation and minimally invasive technologies are currently being introduced into the field of total joint replacement, new inventions are being introduced less frequently. Moreover, the arthroplasty sector can benefit from its long history and the experience gained from the enormous number of patients treated over the past 40 years. In contrast, spine surgery is a rather young orthopaedic subspecialty, and many questions regarding indications and the optimum use of treatments and technologies remain open or are supported by only limited evidence [15]. Nevertheless, certain procedures like spinal-fusion surgery are increasingly being used. In the USA, the annual number of these types of interventions rose by 77% between 1996 and 2001, whereas the number of total hip and knee replacements only increased by 13–14% during the same interval [2]. This trend is also reflected in the development of the US market for spinal implants and devices, with annual growth rates of 18–20% and an estimated overall value of $2 billion [24]. Despite the widespread use of spinal fusion, there are large geographic variations, which suggest a poor level of professional consensus on the indications [20]. This was confirmed by systematic reviews in 1999 and 2002, which concluded that there was no acceptable evidence for many indications of lumbar or cervical fusions [13, 16]. As with surgical techniques, widely used technologies like pedicle screws and intervertebral fusion cages were introduced without randomised trials or prospective cohort comparisons.
Accordingly, opinion leaders in spinal surgery recommend a cautious approach towards emerging techniques and devices, and closer scrutiny of spinal implants and their use for unapproved indications. They further suggest RCTs for new implants and indications and rigorous post-market surveillance for adverse events [10]. These proposals reflect that the need for outcomes research is probably greater in spinal surgery than in total joint arthroplasty. Whether this research is based on RCTs or on observational data largely depends on the research questions posed, the interventions and implants under study, the patients included, and other circumstantial factors.

The value of RCTs and observational data

After discussing the need for registration in spine surgery, we have to look at the value of the information that these data collections produce under ideal circumstances. A view is widely held that RCTs are the ‘gold standard’ for evaluation and that observational methods like prospective and retrospective cohort studies and case control studies have little or no value. Such a standpoint ignores the limitations of randomised trials, which may prove unnecessary, inappropriate, impossible, or inadequate. Many of the problems of conducting randomised trials could often, in theory, be overcome, but the practical implications for researchers and funding bodies mean that this is often not possible. The false conflict between those who advocate randomised trials in all situations and those who believe observational data provide sufficient evidence needs to be replaced with mutual recognition of the complementary roles of the two approaches. The attitude of ignoring the value of observational data limits our potential to evaluate health care and hence to improve the scientific basis of how to treat individuals and how to organise services [3].

The much underestimated gulf between scientific measurements based on RCTs and benefit measurements in the community was already recognised 30 years ago. In 1972, Archie Cochrane introduced the term ‘effectiveness’ to describe research results, and ‘efficiency’ to describe results obtained when a therapy is applied in routine clinical practice in a defined community [7]. In contrast to a controlled research setting, a wide variety of factors influence treatment efficiency, such as screening, diagnosis, place of treatment, length of stay, rehabilitation and the optimum use of personnel and materials.

The limitation of RCTs: principles vs practice

The limitations of randomised trials can derive either from the inherent nature of the method (a limitation in principle) or from the way trials are conducted (a limitation in procedure). The importance of this distinction is that while little can be done about the former, improvements in the conduct of randomised trials could, in theory, overcome some or all of the latter. As previously mentioned, there are four main reasons why observational methods are needed: experimentation may be unnecessary, inappropriate, impossible, or inadequate.

Non-necessity

When the effect of an intervention is dramatic, the likely importance of unknown confounding factors is so small that they can be ignored. Examples are penicillin for bacterial infection, anaesthesia for surgical operations or immobilisation of fractured bones.

Inappropriateness

Randomised trials may be inappropriate in four situations:

  • They are often not large enough to detect infrequent adverse outcomes.

  • As a result of insufficient study size, they have difficulties evaluating interventions that are designed to prevent rare events. This applies less to orthopaedics than to accident prevention schemes or the care of newborn children (e.g. correct positioning to prevent sudden infant death syndrome).

  • They have short observation and follow-up periods and are very limited when the outcomes of interest lie far in the future, as is the case, for example, with implant loosening in total joint arthroplasty.

  • They neutralise (with randomisation) the effectiveness of an intervention that depends on the subject’s active participation, which, in turn, depends on the subject’s beliefs and preferences.
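The first of these points, study size and infrequent adverse outcomes, can be illustrated with a short calculation. The following is a minimal sketch, not a method taken from the cited literature: it assumes that every patient carries the same, independent risk of the adverse event and asks how many patients must be observed so that at least one event is seen with a given probability. The function name patients_needed and the event rates of 1% and 0.1% are hypothetical and chosen purely for illustration.

```python
import math


def patients_needed(event_rate: float, confidence: float = 0.95) -> int:
    """Smallest n for which at least one event is observed with probability
    >= confidence, assuming independent patients with identical risk.
    (Illustrative helper, not taken from the article.)"""
    if not 0.0 < event_rate < 1.0:
        raise ValueError("event_rate must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # P(no event among n patients) = (1 - event_rate) ** n.
    # Solve (1 - event_rate) ** n <= 1 - confidence for n.
    n = math.log(1.0 - confidence) / math.log(1.0 - event_rate)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical event rates, for illustration only.
    for rate in (0.01, 0.001):
        print(f"event rate {rate:.3f}: {patients_needed(rate)} patients")
```

Under these assumptions, a complication occurring in 1 of 1,000 patients requires roughly 3,000 observed patients before a single case is likely to appear at all, and considerably more before two treatments can be compared. Numbers of this order are more realistically reached by a registry pooling many centres than by a single trial.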

Impossibility

In short, seven obstacles can be listed that researchers have to face all too often:

  • Reluctance or refusal of clinicians and key players to participate [22, 31].

  • Ethical objections [14, 36].

  • Political obstacles may arise if those who fund and manage health services do not want their policies studied [11].

  • There have been examples where researchers met legal obstacles in performing a randomised trial [6].

  • Some interventions simply cannot be allocated on a random basis [4].

  • Contamination can occur if a clinician is expected to provide care in more than one way. It is possible that each approach will influence the way care is provided to patients in the other arms of the study.

  • The scale of the task confronting the research community: an immense number of health care interventions are in use and most of them have several components. It will only ever be practical to subject a limited number of items to experimental evaluation [12].

The exact nature of the obstacles will depend on the cultural, political, and social characteristics of the situation and therefore, clearly, will vary over time.

Inadequacy

The external validity, or generalisability, of the results of randomised trials is often low [25, 28]. The results of drug trials can, in the main, be generalised to other doctors and settings. In contrast, the outcome of activities such as surgery, physiotherapy, psychotherapy and community nursing may be highly dependent on the characteristics of the provider, setting and patients. As a consequence, unless care is taken in the design and conduct of a randomised trial, a straightforward generalisation of results may not be possible. There are three reasons why randomised trials in many areas of health care may have low external validity:

  • Health care professionals who participate may be unrepresentative. They may have a particular interest in the topic or be enthusiasts and innovators. The setting may also be atypical, a teaching hospital, for example.

  • The patients who participate may be atypical. All trials exclude certain categories of patients. Often the exclusion criteria are so restrictive that the patients who are eligible for inclusion represent only a small proportion of the patients being treated in normal practice [19, 35].

  • The treatment may be atypical. Patients who participate may receive better care, regardless of which arm of the trial they are in [32].

This list of restrictions on generalising the results of RCTs, compiled by Black in 1996, was recently complemented by Rothwell [28] with a multitude of additional restrictions that go beyond the scope of this article. However, some of his recommendations underline the importance of this broadly neglected issue. Two of them are:

  • increased consideration of external validity in the CONSORT [9] and Cochrane Collaboration guidelines [8],

  • a requirement by the International Committee of Medical Journal Editors that all primary RCT reports and systematic reviews include a new section entitled ‘To whom do these results apply?’.

Spine registers: not an RCT alternative, but the observational adjunct

Observational studies like prospectively organised registries are no alternative to RCTs in terms of the level of evidence they generate. Nevertheless, we must appreciate the above-listed problems and restrictions of RCTs, especially their limited external validity, and recognise that observational data collections, although they offer a lower level of evidence, are more likely to reflect real life, i.e. the circumstances of day-to-day clinical practice. If we manage to implement well-organised registries with data of high validity and representativeness, we can take advantage of findings that are easier to generalise and yet have an acceptable level of evidence. If all spine surgeons contribute their bits and pieces of information to a large data pool, a mosaic-like picture will form. Within its structural pattern, clinical interrelationships can emerge. Rare adverse events may suddenly show relationships to certain patient characteristics or variants in surgical technique, and the assessment of long-term outcomes helps to separate effective from truly efficient treatments. Essential methodological prerequisites are a common terminology for reporting results and a sophisticated technology that networks all participants so that one central data pool is created and accessed. The Spine Society of Europe has taken up the challenge of such a supranational endeavour [30] and recently refined its technical setup [1, 29] in order to cope with aspects of patient and user confidentiality. Information about the Spine Tango project can be found on the SSE website, www.eurospine.org, under ‘Spine Tango’.