Abstract
Purpose
The objective of this study was to analyze the interobserver reliability and intraobserver reproducibility of the new AOSpine thoracolumbar spine injury classification system in young Chinese orthopedic surgeons with different levels of experience in spinal trauma. Previous reports suggest that the new AOSpine thoracolumbar spine injury classification system demonstrates acceptable interobserver reliability and intraobserver reproducibility. However, there are few studies in Asia, especially in China.
Methods
The AOSpine thoracolumbar spine injury classification system was applied to 109 patients with acute, traumatic thoracolumbar spinal injuries by two groups of spinal surgeons with different levels of clinical experience. The Kappa coefficient was used to determine interobserver reliability and intraobserver reproducibility.
Results
The overall Kappa coefficient for all cases was 0.362, which represents fair reliability. The Kappa statistic was 0.385 for A-type injuries and 0.292 for B-type injuries, which represents fair reliability, and 0.552 for C-type injuries, which represents moderate reliability. The Kappa coefficient for intraobserver reproducibility was 0.442 for A-type injuries, 0.485 for B-type injuries, and 0.412 for C-type injuries. These values represent moderate reproducibility for all injury types. The raters in Group A provided significantly better interobserver reliability than Group B (P < 0.05). There were no between-group differences in intraobserver reproducibility.
Conclusions
This study suggests that the new AO spine injury classification system may be applied in day-to-day clinical practice in China following extensive training of healthcare providers. Further prospective studies in different healthcare providers and clinical settings are essential for validation of this classification system and to assess its utility.
Introduction
With industrialization, modernization of the transport and construction industries, and the evolution of sports, there is a growing incidence of traumatic injuries. Globally, spinal injuries constitute a significant proportion of traumatic musculoskeletal injuries. Evidence suggests that 75–90% of spinal fractures occur in the thoracic and lumbar regions, most commonly involving the thoracolumbar junction (T10–L2) [1–3]. To promote communication between physicians, guide treatment decisions, improve patient outcomes, and further research, several thoracolumbar spine injury classification systems have been proposed. However, none of them is universally accepted or has attained widespread clinical use. Therefore, in 2013, the AOSpine Spinal Cord Injury and Trauma Knowledge Forum proposed a new AOSpine thoracolumbar spine injury classification system [4]. This classification is based on the evaluation of three basic parameters: (1) morphologic classification of the fracture; (2) neurological status; and (3) clinical modifiers.
A spinal fracture classification that is to be universally adopted should be comprehensive, be clinically relevant, and demonstrate adequate reliability and reproducibility. Previously proposed thoracolumbar spine fracture classification systems, such as the Denis classification system [5], are not comprehensive and have low reliability and reproducibility [6]. The Magerl classification system is comprehensive [7], but it is too complicated to achieve universal acceptance in clinical practice. The Thoracolumbar Injury Classification System (TLICS) [8] proposed by the American spinal injury study group in 2005 requires magnetic resonance imaging (MRI) to demonstrate compromise of the posterior ligamentous complex, restricting its use in acute trauma settings and developing countries [9, 10]. In addition, this classification system has poor reliability in identifying injury to the posterior ligamentous complex [9].
The new AOSpine thoracolumbar spine injury classification system is the most recent thoracolumbar spine fracture classification system. It includes the merits of the multiple classification systems that are available in the literature and refines them. The AOSpine thoracolumbar spine injury classification system is based on a computed tomographic (CT) scan rather than magnetic resonance imaging. As a revision of the original Magerl AO classification system, it simplifies morphologic classification of the fracture, includes an evaluation of neurological deficit, and accounts for the presence or absence of important medical conditions that may affect treatment decisions. The AOSpine thoracolumbar spine injury classification system has good interobserver reliability and intraobserver reproducibility [4, 11, 12]. Kepler et al. developed a spine injury score for the AOSpine thoracolumbar spine injury classification system (TL AOSIS), and confirmed that the system was ideal for the establishment of a globally accepted treatment algorithm for thoracolumbar trauma [13].
Currently, reports on the reliability and reproducibility of the AOSpine thoracolumbar spine injury classification system by Chinese spinal surgeons are scarce. The objective of this study was to determine if the AOSpine thoracolumbar spine injury classification system can be reliably applied by Chinese orthopedic surgeons with different levels of experience in spinal trauma.
Methods
Patient population
This retrospective study included patients with acute, traumatic thoracolumbar spinal injuries treated at XXX Hospital between January 2015 and October 2015. Patient records were retrieved from the hospital database. This study was approved by the institutional review board.
Patients were included if they had: (1) acute, traumatic thoracolumbar spinal injuries and (2) complete clinical records with imaging. Exclusion criteria were: patients with nontraumatic thoracolumbar fractures, including pathological bone fractures (i.e., fractures associated with spinal tumors and infections) and osteoporotic fractures. A consecutive series could not be used for this study, as a broad spectrum of spinal injuries was analyzed.
Procedure
Anteroposterior and lateral radiographs, as well as CT scans (axial images, sagittal reconstructions, and coronal reconstructions), were rated by six orthopedic surgeons who were divided into two groups according to their level of training in spinal trauma. Group A included three orthopedic surgeons who had 2 years of clinical experience in spinal trauma. Group B included three orthopedic surgeons who were postgraduates with 1 year of clinical experience in spinal trauma. The training of orthopedic surgeons in China includes 5 years of undergraduate study in the medical sciences to obtain a Bachelor’s degree and ≥6 years of graduate study to specialize and obtain a Master’s degree and doctorate. After obtaining a Bachelor’s degree and passing the National Medical Licensing Examination, it is possible to practice medicine in a hospital. In the current study, Group A included surgeons who had obtained a Master’s degree and had practiced in spinal trauma for 2 years. Group B included surgeons who were first-year postgraduate students studying for a Master’s degree in orthopedic (spine) surgery.
Cases were graded on two different occasions, one month apart. On the second occasion, the order of the cases was scrambled using a random number generator to avoid recall bias. When multiple injuries were present, the level of injury to be graded was designated. For A-type injuries, to ensure that the raters were assessing the same injury, only cases with single vertebral body injury (disregarding B and C coding) were included. For B-type and C-type injuries, only the most severe injury was considered; however, concurrent A-type or B-type injuries at the same level were graded.
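The re-ordering step described above can be illustrated with a minimal sketch. The study does not report which tool or seed was used; the function name `scrambled_order` and the seed value below are hypothetical and serve only to show how a seeded random number generator yields a reproducible, scrambled presentation order for the second reading.

```python
import random


def scrambled_order(case_ids, seed):
    """Return a shuffled copy of case_ids; the same seed gives the same order."""
    rng = random.Random(seed)   # private generator, so global state is untouched
    order = list(case_ids)      # copy, so the first-round order is preserved
    rng.shuffle(order)          # in-place Fisher-Yates shuffle
    return order


# 109 cases, identified here simply by their position in the first reading.
first_round = list(range(1, 110))
# The seed is an illustrative assumption, not a value from the study.
second_round = scrambled_order(first_round, seed=2015)
```

Keeping the seed on record would allow the second-round order to be regenerated later when matching each rater's two gradings case by case.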
Statistical analysis
Statistical analysis was conducted using SPSS version 22. The Kappa coefficient (κ) was used to assess the interobserver reliability and intraobserver reproducibility of the classification system for the most severe injury type (i.e., A, B, or C) and subtypes for A-type and B-type injuries. Kappa coefficients were interpreted according to Landis and Koch [14], whereby κ values of 0.00–0.20 were defined as slight agreement or reproducibility; 0.21–0.40 were defined as fair agreement or reproducibility; 0.41–0.60 were defined as moderate agreement or reproducibility; 0.61–0.80 were defined as substantial agreement or reproducibility; and 0.81–1.00 were defined as almost perfect agreement or reproducibility. ANOVA was used to evaluate differences between the orthopedic surgeons in Group A and Group B. Statistical significance was set at P < 0.05.
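As an illustration of the statistic and its interpretation, the following sketch computes Cohen's kappa for two sets of gradings and maps a κ value to the Landis and Koch bands listed above. It is not the study's SPSS procedure: the paper does not state which kappa variant was used for three raters per group, the function names are hypothetical, the toy gradings are invented for demonstration, and the "poor" label for negative values comes from Landis and Koch's original table rather than from this text.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two gradings of the same cases (two raters or two occasions)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both gradings must cover the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: share of cases given the same category twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # only one category was ever used, by both gradings
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map kappa to a Landis and Koch band, using the upper bound of each band."""
    if kappa < 0.0:
        return "poor"  # from the original Landis and Koch table
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented toy gradings of six cases by injury type (not study data).
first = ["A", "A", "B", "C", "A", "B"]
second = ["A", "B", "B", "C", "A", "A"]
toy_kappa = cohen_kappa(first, second)  # (4/6 - 14/36) / (1 - 14/36) = 5/11
toy_label = landis_koch(toy_kappa)
```

Applied to the values reported in this study, `landis_koch(0.362)` falls in the fair band and `landis_koch(0.552)` in the moderate band, matching the interpretation given in the Results.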
Results
This study included an analysis of 109 cases of acute, traumatic thoracolumbar spinal injuries (Table 1).
Interobserver reliability
The overall Kappa coefficient for all cases was 0.362, which represents fair reliability. The Kappa statistic was 0.385 for A-type injuries, 0.292 for B-type injuries, and 0.552 for C-type injuries (Table 2). These values represent fair reliability for A- and B-type injuries and moderate reliability for C-type injuries. Kappa coefficients by fracture subtypes are shown in Table 2.
Intraobserver reproducibility
The Kappa coefficient for intraobserver reproducibility was 0.442 for A-type injuries, 0.485 for B-type injuries, and 0.412 for C-type injuries. These values represent moderate reproducibility for all injury types. Kappa coefficients by fracture subtypes are shown in Table 3.
Between-group comparison
Interobserver reliability was significantly better in Group A than in Group B (P < 0.05) (Table 4). There were no significant between-group differences in intraobserver reproducibility (P > 0.05).
Discussion
In the current study, we retrospectively reviewed 109 patients with acute thoracolumbar spine fracture using the new AOSpine thoracolumbar spine injury classification system. It is known that morphologic classification of a spinal fracture is an important but challenging parameter to evaluate. In practice, not all clinical assessments and pre-operative plans are conducted by the most experienced surgeons. In fact, a well-designed classification system must show adequate reliability and reproducibility in residents as well as attending doctors. Therefore, we investigated whether the AOSpine thoracolumbar spine injury classification system can be reliably applied by Chinese orthopedic surgeons with different levels of experience in spinal trauma.
A spinal injury classification system should be comprehensive enough to include different patterns of spinal trauma and should demonstrate adequate reliability and reproducibility [15]. Previous classifications, including the Denis classification [5, 6], the Magerl classification [6, 16–18], and the TLICS [8], have shown low interobserver reliability, making their widespread adoption difficult. Our results demonstrated fair interobserver reliability for the morphologic grading of fracture type in the new AO spine injury classification system, including fair interobserver reliability for A-type and B-type injuries and moderate interobserver reliability for C-type injuries. Interobserver reliability was lower in the current study compared to that reported by the group of surgeons that developed the classification system, but it is usually difficult for independent studies to duplicate the reliability of classification systems as originally reported [16, 17]. The interobserver reliability of the new AO spine injury classification system for the morphologic grading of fracture type and subtype in the current study was also lower than previously reported by Urrutia [15] and Kepler [11]. The raters in the latter studies were more experienced in spinal fractures than the raters in the current study, and they may have previously used the Magerl classification or the TLICS classification in clinical practice, which could explain our discrepant results.
In accordance with the findings in the current study, Vaccaro [4] and Urrutia [15] found lower interobserver agreement when classifying B-type injuries compared to A-type or C-type injuries. This observation confirms that accurately evaluating the posterior or anterior tension band is challenging, as was previously reported for the Magerl [19, 20] and TLICS classification systems [9, 10, 21]. The new AO spine injury classification system uses CT scans to evaluate morphologic classification of the fracture. Therefore, diagnosis of posterior ligamentous complex damage relies on clinical examination, X-ray, and CT scan. X-ray and CT scan are considered useful in the diagnosis of bone injury [22, 23]. However, it has been suggested that X-ray and CT scan have limited utility in the diagnosis of ligamentous injuries [24]. As reported by the Spine Trauma Study Group [25], certain indirect factors indicate the presence of posterior ligamentous complex lesions, including vertebral translation, interspinous space greater than in adjacent levels (over 2 mm according to Daffner [26]), facet joint diastasis seen in CT scan, local kyphosis without vertebral injury (over 20° according to Nagel [27]), facet joint diastasis seen in X-ray, palpable interspinous gap, spinous avulsion, and vertebral compression exceeding 50% without lesion of the posterior wall. Rajasekaran et al. confirmed that CT was necessary for accurate classification of all injuries based on the new AOSpine thoracolumbar classification system and that, compared to MRI, no significant difference was found in terms of assessment of fracture stability or management, with the exception of improved identification of B2 fractures [28]. Therefore, it is necessary for spinal surgeons to be trained in the new AO spine injury classification system and to have a certain level of clinical experience when evaluating the integrity of the posterior wall of the vertebral body and the posterior or anterior tension band by X-ray and CT scan.
In the current study, the observers were divided into two groups according to the surgeon’s clinical experience: Group A (higher level of experience) and Group B (lower level of experience). It has been suggested that a spinal surgeon’s level of experience does not substantially influence the application of a classification system or interobserver reliability [4, 15]. In contrast, our results suggest that the interobserver reliability was significantly better in Group A than in Group B. In accordance with Sadiqi [29], significant differences were not observed between Group A and Group B in intraobserver reproducibility. Group A achieved interobserver reliability >0.55, which may be considered the minimal level for a classification system to be useful [30]. These data suggest that the new AO spine injury classification system may be applied in day-to-day clinical practice in China following extensive training of healthcare providers; however, this classification system is associated with a steep learning curve.
Vaccaro [8] recognized the limitations of MRI, namely, the relatively poor reliability associated with the identification of posterior ligamentous complex injuries, and acknowledged that a classification system heavily dependent on MRI would be unlikely to gain widespread use in the developing world. Lee [31] demonstrated that the reliability of MRI for assessing posterior ligamentous complex integrity according to the TLICS was fair to moderate (κ = 0.440 for the first and 0.389 for the second review). However, MRI shows higher sensitivity, specificity, and accuracy than CT in distinguishing ligamentous lesions [32, 33] and may reduce the risk of failure to diagnose a posterior ligamentous complex injury and associated late deformity [17, 19]. Furthermore, MRI can be useful for diagnosing subtle posterior ligamentous complex injuries, particularly in situations where fracture displacement on presentation is not representative of maximal displacement at the time of injury. In addition, MRI is often helpful in determining the location and severity of neurological compromise and identifying injury to non-bony structures. Evidence suggests that posterior ligamentous complex integrity plays an important role in fracture stability [34]. In addition, evaluation of neurological status is critical for a complete assessment of a patient’s functional status and eventual prognosis, and is an important factor when making decisions about the need for surgery. Therefore, completing an MRI examination helps young surgeons to establish definitive diagnoses and make appropriate therapeutic decisions for patients with acute spinal injuries.
This study was associated with several limitations. First, we only investigated the reliability and reproducibility of the morphologic classification of the fracture using the new AO spine classification system. Second, our observers were all young orthopedic surgeons; a study including surgeons with a greater level of experience would be informative. Third, we did not verify the presence of a posterior ligamentous complex injury by MRI or surgically. Finally, our study was a preliminary retrospective study based on radiology. To identify fracture types that should be managed conservatively or with surgery and to minimize the limitations associated with the current study, prospective randomized controlled trials in different healthcare providers and clinical settings are warranted.
Conclusions
It is well known that a universally accepted classification system should be reliable and reproducible, have prognostic implications, predict probabilities of complications, and guide treatment decision making [35]. In the current study, our results showed relatively low overall interobserver reliability and intraobserver reproducibility and demonstrated that the spinal surgeon’s level of experience does substantially influence the classification and interobserver reliability of the new AOSpine thoracolumbar spine injury classification system in young Chinese spinal surgeons. However, the new AO spine injury classification system surpassed the minimal level for a classification scheme to be considered useful in the more experienced surgeons in this study. These data suggest that the new AO spine injury classification system may be applied in day-to-day clinical practice in China following extensive training of healthcare providers.
References
Hu R, Mustard CA, Burns C (1996) Epidemiology of incident spinal fracture in a complete population. Spine (Phila Pa 1976) 21:492–499
Wood KB, Buttermann GR, Phukan R, Harrod CC, Mehbod A, Shannon B, Bono CM, Harris MB (2015) Operative compared with nonoperative treatment of a thoracolumbar burst fracture without neurological deficit: a prospective randomized study with follow-up at sixteen to twenty-two years. J Bone Joint Surg Am 97:3–9
Gertzbein SD (1992) Scoliosis Research Society. Multicenter spine fracture study. Spine (Phila Pa 1976) 17:528–540
Vaccaro AR, Oner C, Kepler CK, Dvorak M, Schnake K, Bellabarba C, Reinhold M, Aarabi B, Kandziora F, Chapman J, Shanmuganathan R, Fehlings M, Vialle L (2013) AOSpine thoracolumbar spine injury classification system: fracture description, neurological status, and key modifiers. Spine (Phila Pa 1976) 38:2028–2037
Denis F (1983) The three column spine and its significance in the classification of acute thoracolumbar spinal injuries. Spine (Phila Pa 1976) 8:817–831
Wood KB, Khanna G, Vaccaro AR, Arnold PM, Harris MB, Mehbod AA (2005) Assessment of two thoracolumbar fracture classification systems as used by multiple surgeons. J Bone Joint Surg Am 87:1423–1429
Magerl F, Aebi M, Gertzbein SD, Harms J, Nazarian S (1994) A comprehensive classification of thoracic and lumbar injuries. Eur Spine J 3:184–201
Vaccaro AR, Lehman RA Jr, Hurlbert RJ, Anderson PA, Harris M, Hedlund R, Harrop J, Dvorak M, Wood K, Fehlings MG, Fisher C, Zeiller SC, Anderson DG, Bono CM, Stock GH, Brown AK, Kuklo T, Oner FC (2005) A new classification of thoracolumbar injuries: the importance of injury morphology, the integrity of the posterior ligamentous complex, and neurologic status. Spine (Phila Pa 1976) 30:2325–2333
Harrop JS, Vaccaro AR, Hurlbert RJ, Wilsey JT, Baron EM, Shaffrey CI, Fisher CG, Dvorak MF, Oner FC, Wood KB, Anand N, Anderson DG, Lim MR, Lee JY, Bono CM, Arnold PM, Rampersaud YR, Fehlings MG (2006) Intrarater and interrater reliability and validity in the assessment of the mechanism of injury and integrity of the posterior ligamentous complex: a novel injury severity scoring system for thoracolumbar injuries. Invited submission from the Joint Section Meeting on Disorders of the Spine and Peripheral Nerves, March 2005. J Neurosurg Spine 4:118–122
Rihn JA, Yang N, Fisher C, Saravanja D, Smith H, Morrison WB, Harrop J, Vaccaro AR (2010) Using magnetic resonance imaging to accurately assess injury to the posterior ligamentous complex of the spine: a prospective comparison of the surgeon and radiologist. J Neurosurg Spine 12:391–396
Kepler CK, Vaccaro AR, Koerner JD, Dvorak MF, Kandziora F, Rajasekaran S, Aarabi B, Vialle LR, Fehlings MG, Schroeder GD, Reinhold M, Schnake KJ, Bellabarba C, Cumhur Oner F (2016) Reliability analysis of the AOSpine thoracolumbar spine injury classification system by a worldwide group of naive spinal surgeons. Eur Spine J 25(4):1082–1086
Azimi P, Mohammadi HR, Azhari S, Alizadeh P, Montazeri A (2015) The AOSpine thoracolumbar spine injury classification system: a reliability and agreement study. Asian J Neurosurg 10(4):282–285
Kepler CK, Vaccaro AR, Schroeder GD, Koerner JD, Vialle LR, Aarabi B, Rajasekaran S, Bellabarba C, Chapman JR, Kandziora F, Schnake KJ, Dvorak MF, Reinhold M, Oner FC (2015) The Thoracolumbar AOSpine Injury Score. Global Spine J 6(4):329–334
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174
Urrutia J, Zamora T, Yurac R, Campos M, Palma J, Mobarec S, Prada C (2015) An independent interobserver reliability and intraobserver reproducibility evaluation of the new AOSpine Thoracolumbar Spine Injury Classification System. Spine (Phila Pa 1976) 40:E54–E58
Oner FC, Ramos LM, Simmermacher RK, Kingma PT, Diekerhof CH, Dhert WJ, Verbout AJ (2002) Classification of thoracic and lumbar spine fractures: problems of reproducibility. A study of 53 patients using CT and MRI. Eur Spine J 11:235–245
Blauth M, Bastian L, Knop C, Lange U, Tusch G (1999) Inter-observer reliability in the classification of thoraco-lumbar spinal injuries. Orthopade 28:662–681
Aebi M (2010) Classification of thoracolumbar fractures and dislocations. Eur Spine J 19(Suppl 1):S2–S7
Schnake KJ, von Scotti F, Haas NP, Kandziora F (2008) Type B injuries of the thoracolumbar spine: misinterpretations of the integrity of the posterior ligament complex using radiologic diagnostics. Unfallchirurg 111:977–984
Leferink VJ, Veldhuis EF, Zimmerman KW, ten Vergert EM, ten Duis HJ (2002) Classificational problems in ligamentary distraction type vertebral fractures: 30% of all B-type fractures are initially unrecognised. Eur Spine J 11:246–250
Whang PG, Vaccaro AR, Poelstra KA, Patel AA, Anderson DG, Albert TJ, Hilibrand AS, Harrop JS, Sharan AD, Ratliff JK, Hurlbert RJ, Anderson P, Aarabi B, Sekhon LH, Gahr R, Carrino JA (2007) The influence of fracture mechanism and morphology on the reliability and validity of two novel thoracolumbar injury classification systems. Spine (Phila Pa 1976) 32:791–795
Parizel PM, van der Zijden T, Gaudino S, Spaepen M, Voormolen MH, Venstermans C, De Belder F, van den Hauwe L, Van Goethem J (2010) Trauma of the spine and spinal cord: imaging strategies. Eur Spine J 19(Suppl 1):S8–S17
Bagley LJ (2006) Imaging of spinal trauma. Radiol Clin North Am 44:1–12 (vii)
Vaccaro AR, Rihn JA, Saravanja D, Anderson DG, Hilibrand AS, Albert TJ, Fehlings MG, Morrison W, Flanders AE, France JC, Arnold P, Anderson PA, Friel B, Malfair D, Street J, Kwon B, Paquette S, Boyd M, Dvorak MF, Fisher C (2009) Injury of the posterior ligamentous complex of the thoracolumbar spine: a prospective evaluation of the diagnostic accuracy of magnetic resonance imaging. Spine (Phila Pa 1976) 34(23):E841–E847. doi:10.1097/BRS.0b013e3181bd11be
Vaccaro AR, Lee JY, Schweitzer KM Jr, Lim MR, Baron EM, Oner FC, Hurlbert RJ, Hedlund R, Fehlings MG, Arnold P, Harrop J, Bono CM, Anderson PA, Anderson DG, Harris MB (2006) Assessment of injury to the posterior ligamentous complex in thoracolumbar spine trauma. Spine J 6:524–528
Daffner RH, Deeb ZL, Goldberg AL, Kandabarow A, Rothfus WE (1990) The radiologic assessment of post-traumatic vertebral stability. Skeletal Radiol 19:103–108
Nagel DA, Koogle TA, Piziali RL, Perkash I (1981) Stability of the upper lumbar spine following progressive disruptions and the application of individual internal and external fixation devices. J Bone Joint Surg Am 63:62–70
Rajasekaran S, Vaccaro AR, Kanna RM, Schroeder GD, Oner FC, Vialle L, Chapman J, Dvorak M, Fehlings M, Shetty AP, Schnake K, Maheshwaran A, Kandziora F (2016) The value of CT and MRI in the classification and surgical decision-making among spine surgeons in thoracolumbar spinal injuries. Eur Spine J. doi:10.1007/s00586-016-4623-0
Sadiqi S, Oner FC, Dvorak MF, Aarabi B, Schroeder GD, Vaccaro AR (2015) The Influence of Spine Surgeons’ Experience on the Classification and Intraobserver Reliability of the Novel AOSpine Thoracolumbar Spine Injury Classification System-An International Study. Spine (Phila Pa 1976) 40:E1250–E1256
Sanders RW (1997) The problem with apples and oranges [editorial]. J Orthop Trauma 11:465–466
Lee GY, Lee JW, Choi SW, Lim HJ, Sun HY, Kang Y, Chai JW, Kim S, Kang HS (2015) MRI inter-reader and intra-reader reliabilities for assessing injury morphology and posterior ligamentous complex integrity of the spine according to the Thoracolumbar Injury Classification System and Severity Score. Korean J Radiol 16:889–898
Grunhagen J, Egbers HJ, Heller M, Reuter M (2005) Comparison of spine injuries by means of CT and MRI according to the classification of Magerl. Rofo 177:828–834
Lee HM, Kim HS, Kim DJ, Suk KS, Park JO, Kim NH (2000) Reliability of magnetic resonance imaging in detecting posterior ligament complex injury in thoracolumbar spinal fractures. Spine (Phila Pa 1976) 25:2079–2084
Oner FC, van Gils AP, Faber JA, Dhert WJ, Verbout AJ (2002) Some complications of common treatment schemes of thoracolumbar spine fractures can be predicted with magnetic resonance imaging: prospective study of 53 patients with 71 fractures. Spine (Phila Pa 1976) 27:629–636
van Middendorp JJ, Audige L, Hanson B, Chapman JR, Hosman AJ (2010) What should an ideal spinal injury classification system consist of? A methodological review and conceptual proposal for future classifications. Eur Spine J 19:1238–1249
Ethics declarations
Conflict of interest
None of the authors has any potential conflict of interest.
Cite this article
Cheng, J., Liu, P., Sun, D. et al. Reliability and reproducibility analysis of the AOSpine thoracolumbar spine injury classification system by Chinese spinal surgeons. Eur Spine J 26, 1477–1482 (2017). https://doi.org/10.1007/s00586-016-4842-4
DOI: https://doi.org/10.1007/s00586-016-4842-4