Introduction

Periprosthetic fractures (PPFs) are severe complications that are costly to society and lead to increased patient morbidity [1]. Complication rates are still unacceptably high and range between 18 % and 54 % [2–4]. Physicians are confronted with an increasing number of PPFs, particularly of the femur but also of the acetabulum, tibia and humerus [5]. This can be attributed to the steady worldwide growth in endoprosthetic treatments over recent decades, driven by the use of prostheses in younger and more active patients and by increased life expectancy. The incidence of post-operative PPFs following total hip arthroplasty (THA) has remained almost unchanged over the years and ranges from 0.3 % to 5.4 % [6]. Within the framework of a single treatment centre, absolute numbers appear to be very low. Characteristics of PPFs are strongly inhomogeneous regarding emergence, localisation and degree, as reflected in the numerous classification systems. This highlights the difficulty of conducting compelling prospective clinical analyses with valid study samples. Causes and mechanisms of PPFs and their predisposing factors remain poorly understood [2, 6]. In fact, it appears impossible to analyse interactions of risk factors, such as bone quality, implants and patient-related characteristics (e.g. age, sex, diagnosis, …) [6]. Efforts were made to address this problem in different ways:

  1. Retrospective and prospective register analyses

  2. Literature reviews

  3. Biomechanical in vitro analyses

  4. Computer in silico models

Most in vitro analyses investigate PPF refixation [7]; few biomechanical analyses characterise implant PPF behaviour or resulting fractures regarding initiation, localisation and severity of bony damage. This may be due to the type of investigation. Most analyses include a complex simulation of implants in their biomechanical environment under physiological or pathological loading conditions. Nevertheless, such an experiment only represents an approximation of the in vivo situation, which is influenced by a considerable number of parameters. This review provides an overview of biomechanical methods analysing PPFs around hip, knee and shoulder joints, thus enabling the reader to scrutinise methods and their scientific results in a targeted way and with respect to the clinical situation.

Methods for in vitro analyses

Although PPFs have been well recognised for >60 years [8], it was a very long time before the first experiments were performed to investigate them. It was not until 2000 that Lesh et al. [9] used biomechanical experiments to investigate the risk effects of anterior femoral cortex notching during total knee arthroplasty (TKA). Since this inauspicious beginning, only a small number of implants and bones have been investigated, by <20 research groups worldwide [9–32]. The majority of such studies focused directly on experimental PPF creation (destructive testing) [9–31]. In some cases, PPFs are initiated for further testing [11] or to validate numeric models [10]. One study evaluated the risk of PPF in a nondestructive manner [32]. Few but complex loading scenarios have been developed. The global distribution of research groups experimentally investigating PPFs, in decreasing order of frequency, is Germany (10), Canada (4), United States (4), Australia (3), Norway (1), Switzerland (1) and the United Kingdom (1). A direct comparison is challenging, if not impossible, due to different research objectives and results. PPF sites, corresponding implants and their fixation, as well as main research subjects, are listed in Table 1.

Table 1 Biomechanical periprosthetic fracture (PPF) investigations in recent literature

Choice of experimental material

As surrogates for human bone, artificial materials such as Sawbones® (Pacific Research Laboratories) are frequently used to examine the biomechanical behaviour of implants and to simulate standardised post-operative conditions [34]. Artificial bones were used in five PPF experiments [14–16, 21, 22] and, in three of these, as a preliminary to human specimens to validate experimental setups and/or surgical protocols [14, 16, 22]. Although artificial bones feature low variability in torsional and axial stiffness [35], their fracture pattern is not comparable with that of human bones [36]. This may be due to the different materials (i.e. epoxy-glass laminate vs. collagen fibres with embedded hydroxylapatite crystals, or closed-cell polyurethane foam vs. force-oriented trabeculae), as well as to the technical optimisation of mass density and stiffness distribution with respect to healthy human bones [36]. Donor bones are often affected by osteopenia or osteoporosis, as are the bones of many PPF patients [2]. Fresh-frozen (FF) cadaveric bones are most frequently used for PPF testing [9, 10, 12–14, 16–20, 22–27, 29–32], as they closely resemble the mechanical properties of the corresponding living bones. Although formalin-embalmed and FF bones show comparable mechanical characteristics (e.g. the axial load resistance of femora until fracture [36]), embalmed bones are the least used material [11, 28]. Both specimen types feature heterogeneities in interacting parameters, such as bone mineral density (BMD), bone mineral content (BMC), mechanical properties, structure, size and patient age or sex [11–14, 16, 18–20, 22, 23, 27–32].

Specimen handling

To reduce the number of influencing variables, it is appropriate to select specimens that match as closely as possible in one or more of these parameters [10, 18] or to account for the remaining relevant parameters later in a multifactorial regression model [13, 18, 19, 22, 23, 30, 32]. In PPF investigations, matched pairs of bones with side randomisation between the investigated subject and a control [9, 12, 14, 17, 19, 22–24, 27, 29–31] should be used. Bones can be matched with regard to BMD [10, 20, 22]. Since mechanical properties of bones depend on BMD [10] and BMC [13], one of these factors was examined in most studies prior to in vitro fracturing [10, 12, 13, 16, 18, 20, 22, 23, 25–32]. To differentiate between normal and osteoporotic bone, T scores [28] or the Singh index [30] may be calculated. Presumably due to its cost, quantitative computed tomography (qCT) [13, 20, 25–27] is used less frequently than dual-energy X-ray absorptiometry (DEXA) [10, 12, 13, 16, 18, 19, 22, 23, 28–32]. Studies using both qCT and DEXA showed that their results correlate [13, 37], so the two can be used equivalently in PPF analyses. All studies [9–14, 16–20, 22–32] show sex imbalances, but only a fraction consider sex as a statistically influencing variable [13, 18]. Few studies performed power analyses [9, 16, 21–23, 32], which are important, particularly with human specimens, to obtain meaningful results. Specimens must be embedded in a fixture frame, as described below, using fast-curing synthetic plastic. Fixtures may consist of square [12, 13, 17, 18, 22–24, 29, 31] or tube [9, 10, 15, 16, 21, 27] steel chambers, and the plastic of polyurethane (PU) [12, 17, 18, 20, 25–27, 29, 31] or polymethylmethacrylate (PMMA) [9–11, 13–16, 19, 21–23]; the embedding can be reinforced by pins or screws [9, 15, 16, 19, 21, 25, 27, 30].

Measures and independent variables

Almost all research groups consider fracture loads as the main outcome parameter [9–31]. Loads are measured during testing and can be used as absolute values or normalised by donor-specific parameters, e.g. body weight [12, 17, 18, 29, 31], to improve comparability. With quasi-static load application, which is most frequently carried out [9–27, 29–32], two different types of force are generally preferred: the load-to-failure that provokes the PPF, and the ultimate failure load, indicating the maximum possible force prior to the actual PPF (cp. Fig. 1a and b). Although not explicitly indicated, e.g. by published load curves, the ultimate fracture load seems to be reported more frequently for hip resurfacings [22–24, 27] than for stemmed hip prostheses or for shoulder and knee prostheses. Loads are measured with force sensors, and forces are vectors with three components. On-board sensors of the materials-testing machines used in biomechanics usually offer only as many dimensions as there are applicable force directions, namely, one or two. Therefore, actual fracture loads may go partly undetected when transverse forces are not prevented by technical devices such as x-y slides [10, 12, 14, 17, 18, 29, 31].

Fig. 1 Fictive data of force-displacement-failure curves exemplifying: (a) failure load; (b) ultimate failure load as a result of in vitro periprosthetic fracture (PPF) generation
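The distinction between the two load types, and the normalisation by body weight, can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed studies: the function names, the 20 % drop criterion used to detect the fracture event and the sample curves are hypothetical assumptions chosen only to mirror the fictive curves of Fig. 1.

```python
from typing import Sequence

G = 9.81  # m/s^2, converts donor body mass (kg) into body weight (N)


def ultimate_failure_load(force_n: Sequence[float]) -> float:
    """Maximum force reached anywhere on the recorded curve."""
    return max(force_n)


def failure_load(force_n: Sequence[float], drop_fraction: float = 0.2) -> float:
    """Force recorded immediately before the first abrupt load drop.

    Assumption: the fracture event is taken as the first sample-to-sample
    drop larger than drop_fraction of the preceding force value. If no such
    drop exists, the curve maximum is returned instead.
    """
    for prev, curr in zip(force_n, force_n[1:]):
        if prev > 0 and (prev - curr) > drop_fraction * prev:
            return prev
    return max(force_n)


def percent_body_weight(load_n: float, body_mass_kg: float) -> float:
    """Normalise a load in newtons to percent of donor body weight."""
    return 100.0 * load_n / (body_mass_kg * G)


# Fictive curves (N): (a) abrupt failure at the peak,
# (b) peak reached before the actual fracture.
curve_a = [0, 1000, 2000, 3000, 4000, 500]
curve_b = [0, 1500, 3000, 3600, 3400, 3200, 400]

print(failure_load(curve_a), ultimate_failure_load(curve_a))
print(failure_load(curve_b), ultimate_failure_load(curve_b))
print(round(percent_body_weight(4000, 75), 1))
```

For a curve of type (a) both values coincide, whereas for type (b) the ultimate failure load exceeds the force at which the fracture finally occurs; which of the two a study reports therefore affects comparability.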

Loading configurations

The emergence of post-operative PPFs is associated with a combination of events, such as overloading or trauma [2, 10, 16–18, 25, 26]. Standardised loading conditions are missing, and there is no consensus on how bones with implanted prostheses should be loaded in vitro to achieve realistic PPFs. Some research groups, e.g. Hamburg [10, 25, 26], Toronto [13, 22, 23] or Heidelberg [17, 18, 31], evidently regard their own loading conditions as validated and use them repeatedly for different research questions. Most groups try to simulate loadings based on activities of daily living (ADLs) and pick out a single (static) load vector from (dynamic) joint-loading curves [38], e.g. for the hip at single-legged stance [15, 19, 22, 23] or at loading response [17, 18, 31, 32]. Regardless of whether testing machines with a uniaxial (1 DOF) [10, 12, 17, 18, 27–31] or biaxial (2 DOF) [11, 13–16, 19, 21] loading mode are used, there is general agreement that PPFs should be provoked experimentally under combined loading conditions, i.e. forces and moments [9–11, 13, 15, 17–19, 21–28, 31, 32]. When using uniaxial testing machines, bi-planar alignment is needed to enable combined loading conditions. In contrast, alignment of subjects in one body plane is sufficient when using biaxial testing machines [15, 19, 21]. Alignment may be performed by either of two fundamental procedures:

  1. Alignment after initial fixation, shortly before testing, with respect to the spatial directions of a reference coordinate system [17, 18, 31]. Typical examples are the coordinate systems from Bergmann et al. for hip joints [38] or from the International Society of Biomechanics (ISB) for the upper limb [39].

  2. Direct combination of alignment and fixation of subjects in a desired position, as in the ISO 7206-4 implant-testing standard [10, 27, 28].

The advantage of the first procedure is that orientation with regard to bony landmarks is more precise than with the second. Authors who report using the second option often prefer a so-called “anatomical” alignment, similar to the neutral position and referring to skeletal body planes [15, 21]. Where only one plane is considered [9, 12, 13, 19, 22–24, 32], at least one resulting moment is missing when the force is applied uniaxially. To avoid this effect, bone alignment should be ensured in two planes [17, 18, 27, 28, 31]. Another approach is to align subjects “perpendicular” or “in parallel” to the loading vector [12, 14, 16, 20, 25, 26, 29, 30], sometimes to induce a completely different loading [20, 25, 26, 30] or a loading characteristic of the investigated PPF event [12, 14, 16, 30]. Other authors attempt to create fracture patterns as close as possible to real PPFs and apply the loads mainly responsible for each investigated fracture pattern, i.e. bending [9, 20, 25, 26] or torque [9, 11, 14, 16, 30]. What all studies have in common is that muscle forces are not considered experimentally, although they influence the loading behaviour of the bone [40]. The only femoral study simulating single-legged stance with a load-dependent contraction of the tractus iliotibialis, by means of a lever arm, a wire cable and a pulley, is that of Wik et al. [32].
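Why bi-planar alignment is needed on a uniaxial machine can be made concrete with a small rigid-body sketch in Python. It is an illustration under stated assumptions, not a description of any reviewed setup: the bone-fixed axes (x medio-lateral, y antero-posterior, z along the shaft), the rotation sequence, the sign conventions, the tilt angles and the geometry values are all hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def force_in_bone_frame(force_n: float, frontal_deg: float, sagittal_deg: float) -> Vec3:
    """Express a vertical machine force in bone-fixed axes.

    Assumed convention: the bone is first tilted by frontal_deg in the
    frontal plane, then by sagittal_deg in the sagittal plane, and the
    machine pushes straight down with magnitude force_n.
    """
    a = math.radians(frontal_deg)
    b = math.radians(sagittal_deg)
    fx = force_n * math.sin(a) * math.cos(b)   # medio-lateral component
    fy = -force_n * math.sin(b)                # antero-posterior component
    fz = -force_n * math.cos(a) * math.cos(b)  # axial (compressive) component
    return (fx, fy, fz)


def moments_at_section(force: Vec3, offset_m: float, lever_m: float) -> Vec3:
    """Moments r x F at a shaft cross-section.

    r = (offset_m, 0, lever_m) points from the section to the load
    application point (hypothetical medial offset and proximal lever arm).
    """
    fx, fy, fz = force
    mx = -lever_m * fy                 # sagittal-plane bending
    my = lever_m * fx - offset_m * fz  # frontal-plane bending
    mz = offset_m * fy                 # torsion about the shaft axis
    return (mx, my, mz)


# One-plane alignment: only frontal-plane bending arises.
one_plane = moments_at_section(force_in_bone_frame(1000.0, 10.0, 0.0), 0.04, 0.15)
# Two-plane alignment: sagittal bending and torsion appear as well.
two_plane = moments_at_section(force_in_bone_frame(1000.0, 10.0, 9.0), 0.04, 0.15)
print(one_plane)
print(two_plane)
```

With the sagittal tilt set to zero, the antero-posterior force component vanishes and with it both the sagittal bending moment and the torsional moment, which corresponds to the missing moment noted for one-plane alignment.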

Interpreting results

Comparison with clinical fracture pattern

Comparison of artificial with clinically relevant PPF patterns may provide an adequate control when interpreting in vitro results. Comparable patterns provide evidence as to whether realistic fractures are obtained. Around hip stems, PPFs are classified using anatomic location with regard to the implant, a possibly pre-existing loosening, fracture type and/or a combination of these criteria [26]. Classifications by Johansson et al. [1] and Duncan and Masri (Vancouver classification) [33] are frequently used for comparisons, even though both are based on cases of fractures induced by falls associated with axial-bending loads, direct-impact loads or torsional loads [16]. Thomsen et al. [31] induced Vancouver type A [33] and Johansson type I [1] fractures with cementless stems and Vancouver type C and Johansson type III fractures with cemented stems. These fractures are close to PPF patterns expected in traumatic events, even though the applied loading corresponds to walking. In contrast, clinically observed PPF patterns are only occasionally obtained when applying loads that are actually expected to be involved in traumatic events [19, 20, 25, 26]. A lateral four-point bending of the femur induces comparable PPFs, whereas torsional loading and ventral four-point bending do not. On the other hand, primarily induced torsional loads seem to cause Vancouver type B fractures [11, 15, 21]. Jakubowitz et al. [17, 18] observed longitudinal femoral splitting due to experimentally subsiding stems and therefore could not provoke clinically observed PPFs. They concluded that splitting may be concealed by implants during radiological assessment and therefore remain undetected as a PPF when the stem is still fixated. Since PPFs can emerge from a loosened stem [41] and longitudinal fractures are rarely found clinically, this splitting is considered a pre-PPF condition [17, 18]. With regard to the femoral-neck-preserving Silent Hip™ (DePuy), Bishop et al. [10] validated their in vitro results against clinically observed medial calcar cutouts. Since femoral-neck fractures were consistently observed in hip resurfacing, a corresponding PPF classification was never published. These fractures could be reproduced in their entirety in vitro [13, 22–24, 27, 28]. Schlegel et al. [27] performed an initial characterisation of these resurfacing-induced fractures and formed subgroups within their in vitro analysis.

Although they reported that in vitro loading can only represent aspects of physiological loads, Lesh et al. [9] achieved high comparability with the clinical supracondylar PPFs that frequently occur with ventral femoral notching. Shawen et al. [30] came to similar conclusions when loading notched femora in axial torsion. Both groups of authors reported good agreement with clinical PPF patterns published by Culp et al. [41] but did not consult the established classification reported by Rorabeck and Taylor [42].

Comparison of fracture loads

Comparison of in vitro PPF loads with clinical data is not possible. However, in vivo joint loadings may serve as reference values. Reported joint forces and torques acting on a femoral head during ADLs are not higher than 870 % body weight (BW) and 25 Nm [38]. A tibial knee implant will be loaded up to 3372 N (walking) or 5165 N (jogging) and up to 10.5 Nm [43]. Typical values of up to 123 % BW and 0.5 % BWm can be seen in the humeral head, although they comprise less demanding ADLs, such as lifting a coffee pot [44]. To attain a better level of comparability, we agree with calculating mean fracture loads and standard deviation (SD) or range.
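As a minimal sketch of such a summary, the following Python snippet computes mean, SD and range for a set of normalised fracture loads and relates them to an in vivo reference value. The sample loads are invented for illustration only; the 870 % BW reference is the ADL maximum quoted above [38], and the function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summarise(loads: Sequence[float]) -> Dict[str, float]:
    """Mean, sample standard deviation and range of a set of fracture loads."""
    return {
        "mean": mean(loads),
        "sd": stdev(loads),  # sample SD (n - 1 in the denominator)
        "min": min(loads),
        "max": max(loads),
    }


def fraction_below(loads: Sequence[float], reference: float) -> float:
    """Share of specimens that fractured below an in vivo reference load."""
    return sum(1 for load in loads if load < reference) / len(loads)


# Invented fracture loads in % BW (illustration only, not study data).
ppf_loads_pct_bw = [400.0, 600.0, 800.0, 1000.0]
ADL_REFERENCE_PCT_BW = 870.0  # maximum hip joint force during ADLs [38]

stats = summarise(ppf_loads_pct_bw)
print(f"{stats['mean']:.0f} ± {stats['sd']:.0f} "
      f"({stats['min']:.0f}-{stats['max']:.0f}) % BW")
print(fraction_below(ppf_loads_pct_bw, ADL_REFERENCE_PCT_BW))
```

Reporting the range alongside the mean matters here because a mean above the reference value can still hide individual specimens that fractured below it.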

There are few results reported for cemented hip stems. Morlock and co-workers [20, 26] reported 4692 ± 183 N for a four-point bending test with an Exeter™ stem (Stryker) in FF femurs. For the MS-30™ (Zimmer), Thomsen et al. [31] reported 7541 (2845–10,000) N when taking the withstanding FF femurs into account (machine-dependent 10 kN constraint). Rupprecht et al. [25, 26] found fracture torques of 41 ± 9 Nm for cemented Exeter™ stems in FF femurs. In contrast, 117 (89–133) Nm were reported by Brew et al. [11] using Exeter™ stems and embalmed femurs. Values of 44 ± 13 Nm and 105 ± 39 Nm were found for debonded compared with fixed Endurance™ stems (DePuy) in FF femurs [16]. In the same study, Harris et al. found values of 26 ± 4 and 41 ± 4 Nm using Sawbones® (Model 1121) [16]. Ginsel et al. [15] found lower median values for the Exeter™ stem with a regular proximal body (157 Nm) than for a large one with a higher offset (237 Nm) implanted in Sawbones® (Model 3403). With the same in vitro setup, Morishima et al. [21] described comparable values, with 156 Nm for regular stems, but a difference for distally shortened stems (132 Nm). Reduced offset of the Exeter™ stem leads to lower PPF torques (137 Nm) compared with an extended offset (180 Nm) [15, 21]. PPF loads for cemented stems are therefore always higher than reported in vivo loads [38]. Similar data apply to hip resurfacing. Davis et al. [13] found PPF loads of 5743 (1820–10,292) N for the BHR® (Smith & Nephew) in FF specimens. PPF loads in males were twice as high as in females (7387 N vs. 3155 N; p < 0.01). Using the same in vitro setup and resurfacing implants, Olsen et al. [23] found 7012 ± 2619 N to be necessary for PPF emergence. Davis et al. [13] reported a neck BMD of 0.68 g/cm² and Olsen et al. [23] of 0.87 g/cm². Richards et al. [24] found ultimate PPF loads of 6218 (3888–10,940) N for neutral compared with 7185 (2428–13,122) N for valgus orientation (CCD + 16° ± 4°) of the BHR®, while BMD values were comparable. Schlegel et al. [27] also used FF femurs for the ASR® (DePuy). For strongly impacted implants, the PPF load of 8873 (4398–9840) N was smaller than the 9237 (5241–15,302) N for regularly impacted implants. Using the ASR®, Schnurr et al. [28] performed the only cyclic in vitro loading test. Ignoring the number of cycles, they found PPF loads of 6000 N for regular and up to 3000 N for osteoporotic specimens (T score less than −2.5).

Although cementless fixation is one of the leading PPF risk factors [2], such hip implants have been investigated much less frequently in in vitro studies. Jakubowitz et al. [17] described no differences, with PPF loads of 4825 (2651–7368) N for the short-stemmed Mayo® hip (Zimmer) and 5545 (3294–8102) N for the normal-stemmed CLS® (Zimmer). These values correspond to normalised PPF loadings of 751 (256–1669) and 855 (311–1655) % BW. Correlations exist between absolute PPF loads and BMD for the CLS® but not for the Mayo® stem. PPF loads of 5308 (3216–7647) N are described for the CLS® in FF femurs from individuals <70 years old at death and 2519 (1725–3951) N in femur specimens from individuals >77 years [18]. These results correspond to relative PPF loads of 654 (311–1075) and 445 (194–961) % BW. Kannan et al. [19] used FF femurs and found no differences in PPF torque loads between a loose and a well-fixed ABG® stem (Stryker), with 64 ± 20 and 65 ± 19 Nm, respectively, and rotational angles of 18° and 16°. Regarding the neck-preserving Silent Hip™, Bishop et al. [10] found PPF loads of 1724 (1095–2337) N. For the metaphyseal-fixating BMHR® (Smith & Nephew), Olsen et al. [22, 23] reported ultimate values of 4474 (1377–7363) N for a notched femoral stem and 4438 (1360–7011) N for the control group using FF femurs, with no differences [22]. They revealed differences (p < 0.01) in ultimate loads and PPF behaviour between the cemented BHR® and the cementless BMHR® [23]. FF femurs fractured at 7012 (2167–11,736) N for the BHR® and at 5434 (1497–9504) N for the BMHR®. Most PPF loads of cementless hip implants are far below ADL loadings, e.g. a step to keep from falling (870 % BW [38]). In vitro PPF torque loads are around three times higher than during ADLs, at least for the cementless ABG® stem [19], which effectively rules out rotatory ADL events as PPF risk factors.

For the cemented medial unicompartmental Oxford Knee® (Biomet), Clarius et al. [12] describe PPF loads of 3900 (2300–8500) N for standard and 2600 (1100–5000) N for incorrect implantations with sawing defects at the dorsal tibial cortex. Regarding cemented and cementless tibial components, Seeger et al. [29] reported PPF loads of 3617 (700–7000) N and 1950 (200–4300) N, respectively. As already seen for hip stems, cementless tibial knee components showed PPF loads below (half of) in vivo loadings. As indicated by the ranges, there also seem to be risks for cemented components. Regarding femoral knee components, Lesh et al. [9] found differences (p ≤ 0.01) in PPF bending and torsional loads between regular and anteriorly notched femurs. Their reported values were 11,813 ± 1980 N and 9690 ± 2130 N, respectively, as well as 135 ± 35 and 82 ± 28 Nm. Similar torsional values were found by Shawen et al. [30]. Torsional PPF loads differed (p < 0.01) between regular (144 Nm) and anteriorly notched femurs (99 Nm). In addition, correlations between PPF loads and distal and proximal femur BMDs are described. Although the bending results of Lesh et al. [9] are not readily comparable with ADL loadings, all supracondylar PPFs showed in vitro torque loads beyond the in vivo moments of tibiae [43].

Flurry et al. [14] performed the only shoulder PPF study. They used FF humeri and compared the rectangular Promos® stem (Smith & Nephew) with the cylindrical Univers® (Arthrex). There were no differences in torque PPF loads, with 9 (3–23) and 16 (10–20) Nm, respectively. As a function of body weight, these values could well be exceeded by in vivo loadings [44].

Discussion

Despite the clinical increase of PPFs [2], experimental efforts to investigate these fractures have been made only during the last 15 years. It can be assumed that the late beginning of these efforts depended on improved knowledge of in vivo joint loadings, since both events approximately coincide. Although the number of research groups performing such experiments remains limited, considerable variations in experimental protocols are found. Differences relate primarily to the type of bones used, the investigated implant/fixation system, the type and orientation of load application and the inclusion of donor and/or bone-related data. With the exception of Schnurr et al. [28] (cyclic loading) and Wik et al. [32] (tractus iliotibialis simulation), all studies have three points in common: (i) they are carried out quasi-statically; (ii) they disregard any influencing muscle forces, as stated by Kassi et al. [40]; and (iii) they always simulate a situation directly after implantation [17, 18]. This is not surprising: for point (i), forces and torques leading to PPFs were mostly traumatic, single overload events rather than fatigue bone failures [2, 5, 45]; for point (ii), cost and time would be out of proportion to the value of the individual research questions; and for point (iii), it has not been possible to simulate in vitro an implant condition later than the initial post-operative situation. However, experiments represent only an approximation of the in vivo situation, which is influenced, inter alia, by the mentioned limitations.

Three basic experimental approaches can be deduced from literature reports, determined by each investigator’s view as to how PPFs can be provoked or by the type of experimental PPF pattern. The first attempts to simulate ADL loadings to induce regular loading conditions on implants [13, 17–19, 22–24, 27, 28, 30–32]. The second uses hypothetical pathologic loadings, such as torsional stresses, to simulate traumatic events [9, 11, 14–16, 21]. The third attempts to produce clinically comparable fracture patterns, and the experimental protocol is therefore derived from the associated implant loadings [12, 20, 25, 26, 29]. It stands to reason that these differences in experimental approach result in a vast complexity of attained results. Apart from intragroup values achieved within the same test setup, results between studies are not readily comparable. Comparing intragroup values is what PPF in vitro analyses were actually designed for, and it is an essential source of information when at least one independent variable can be varied in order to examine its impact on PPF loads and/or patterns. The important advantage of in vitro PPF analyses compared with clinical results is the detailed control or, even better, the reduction of confounders such as BMD [10, 12, 13, 16, 18–20, 22, 23, 25–32] or age [13, 18, 32], which avoids superposition of results by known PPF risk factors such as reduced BMD or greater age [46]. Clinical differences, such as between loosened and fixed hip stems [16], can only be understood through in vitro experiments, since clinicians are not able to detect whether the implant was loosened prior to fracture. Nevertheless, these aspects first indicate a lack of clarity regarding the extent to which particular in vitro results are transferable into clinical practice. Second, they play a considerable role in assessing the suitability of particular experimental approaches for answering questions arising from clinical PPFs.