Introduction

Ultrasound of the wrist is increasingly used to detect abnormalities of the median nerve and related structures in patients with carpal tunnel syndrome (CTS) [1]. Important ultrasound findings include increased nerve cross-sectional area (CSA), other nerve abnormalities such as intraneural echogenic signal changes or vascularity, and abnormalities of related structures such as flexor tenosynovitis or bowing of the carpal ligament [2]. One important factor hampering the routine use of median nerve sonography is the difficulty of determining the CSA threshold, which varies widely (9 to 15 mm2) across reports [2,3,4,5]. The ratio between the median nerve CSA at the proximal forearm and at the carpal tunnel was reported to improve accuracy; however, reliability testing and control groups were lacking in those reports [6,7,8].

Several additional measures, such as the median nerve flattening ratio (FR), median nerve echogenicity and vascularity, palmar retinacular bowing, and carpal tunnel volume, have also been examined. Most studies concur that the best discriminatory ultrasound criterion for diagnosing CTS is the CSA at the proximal carpal tunnel (CSAp) [9,10,11,12]. One previous ultrasound study also showed some importance of the distal carpal tunnel CSA (CSAd) [13].

In the present study, we report results from a large prospective cohort of patients with CTS and controls, and we develop and validate a composite ultrasound (US) score for the diagnosis of CTS. We also report prediction of treatment response after US-guided carpal tunnel injection based on baseline ultrasonography.

Methods

This prospective observational study was conducted at the Department of Rheumatology, IPGME&R, Kolkata, India, from January 2017 to March 2019. Patients with symptomatic CTS were included. A diagnosis of CTS was made based on typical clinical symptoms and a positive nerve conduction study (NCS). NCS were performed according to the American Academy of Neurology standards [14]. A difference of > 0.4 ms between the median and ulnar sensory peak latencies or a prolonged median distal motor latency of > 4 ms was taken as confirmatory electrophysiological evidence of CTS. Patients who had received local corticosteroid injections within the past 6 months or who had undergone previous carpal tunnel release surgery were excluded. Patients who attended for wrist ultrasound examination for evaluation of rheumatological indications or hand arthritis, had no definite symptoms of CTS, and had a negative concurrent NCS examination for CTS were taken as control subjects.
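
The electrophysiological rule above can be written as a small check. This is only an illustrative sketch: the function and parameter names are hypothetical, and it assumes that the "difference" is the median latency minus the ulnar latency, which the text does not state explicitly.

```python
def meets_ncs_criteria(median_sensory_peak_ms: float,
                       ulnar_sensory_peak_ms: float,
                       median_distal_motor_ms: float) -> bool:
    """Return True if either electrophysiological criterion from the text is met.

    Assumption: the sensory difference is median minus ulnar peak latency.
    """
    sensory_difference_ms = median_sensory_peak_ms - ulnar_sensory_peak_ms
    # Criterion 1: median-ulnar sensory peak latency difference > 0.4 ms
    # Criterion 2: median distal motor latency > 4 ms
    return sensory_difference_ms > 0.4 or median_distal_motor_ms > 4.0


# Illustrative (made-up) latencies, not study data
print(meets_ncs_criteria(3.7, 3.1, 3.8))  # sensory criterion met
print(meets_ncs_criteria(3.2, 3.1, 3.8))  # neither criterion met
```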

All patients underwent ultrasound assessment of both carpal tunnels by one of the authors (RPG, 3 years of musculoskeletal ultrasound experience) using a high-resolution linear array 18 MHz transducer (Esaote MyLab 25 Gold, Genoa, Italy). Representative images are presented in Figures S1 and S2. Wrists were examined in the supine neutral position with the hand and wrist resting on a table and the fingers partially flexed. Transverse and longitudinal US scans of the volar distal forearm, wrist, and carpal regions were performed. The cross-sectional areas were measured proximally at the tunnel inlet at the level of the pisiform (CSAp) and distally at the tunnel outlet near the hook of the hamate (CSAd). The CSA was measured using a continuous tracing method along the inner border of the epineurium. Other measures noted were: presence or absence of flexor tenosynovitis; presence or absence of bowing of the transverse carpal ligament, defined as a height > 4 mm from the palmar apex of the retinaculum measured perpendicular to a tangential line drawn between the most volar aspects of the pisiform and scaphoid bones at the tunnel inlet; intraneural echogenicity changes, defined as non-visibility of normal neural fasciculation; and the presence or absence of intraneural vascularity at any part of the median nerve.

Inter-rater reliability was tested for CSAp, CSAd, and the other ultrasonographic parameters (bowing, flexor tenosynovitis, and intraneural echogenicity changes). Three authors (two rheumatologists (PG and RPG) and one radiologist (DL)) independently assessed all the measures described above on the same machine in 18 different patients, blinded to each other’s observations. Inter-rater reliability was measured for absolute agreement with a two-way mixed-effects model and reported as the intraclass correlation coefficient (ICC (3,3)). We interpreted ICC values as follows: values < 0.5 indicate poor reliability; values between 0.5 and 0.75 indicate moderate reliability; values between 0.75 and 0.9 indicate good reliability; and values > 0.90 indicate excellent reliability.

ICC (3,3) for CSAp was 0.92 (95% CI: 0.79–0.97, p < 0.001), for CSAd it was 0.88 (95% CI: 0.72–0.95, p < 0.001), for bowing it was 0.89 (95% CI: 0.77–0.96, p < 0.001), for flexor tenosynovitis it was 0.593 (95% CI: 0.085–0.84, p = 0.015), and for intraneural echogenicity changes it was 0.63 (95% CI: 0.2–0.85, p = 0.006).
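
As a reading aid, the interpretation bands stated above can be applied to the reported point estimates. The sketch below is illustrative only; the function name is hypothetical, and the handling of values falling exactly on a boundary (0.5, 0.75, 0.9) is an assumption, since the text leaves it open.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC point estimate to the reliability bands used in the text.

    Assumption: boundary values 0.5 and 0.75 are assigned to the higher band,
    and exactly 0.9 is still "good" (only values > 0.90 are "excellent").
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Point estimates as reported in the text
reported_icc = {
    "CSAp": 0.92,
    "CSAd": 0.88,
    "bowing": 0.89,
    "flexor tenosynovitis": 0.593,
    "intraneural echogenicity changes": 0.63,
}
for measure, value in reported_icc.items():
    print(f"{measure}: {value} -> {interpret_icc(value)}")
```

The bands are applied to point estimates only; the wide confidence intervals for flexor tenosynovitis and echogenicity changes extend into the "poor" band.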

The following additional variables were also noted: age in years, gender, duration of symptoms in months, and body mass index (BMI in kg/m2, taken both as a linear variable and as categories).

Statistical analysis

Nine potential risk factors were considered for model development (age, gender, BMI, CSAp, CSAd, bowing, flexor tenosynovitis, intraneural vascularity, and intraneural echogenicity changes). Baseline characteristics of those with and without CTS were compared by the Mann-Whitney U test, the Chi-squared test, or Fisher’s exact test. Receiver operating characteristic (ROC) curves were drawn for CSAp and CSAd, and optimal cut-offs were determined for the diagnosis of CTS. For entry into the model, adjusted odds ratios (aOR) were constructed by logistic regression or linear regression after correction for age and sex, and variables with a p value < 0.2 were included for multivariable model selection. Different models for predicting CTS were compared to assess goodness of fit, using the area under the ROC curve (AROC), the Hosmer–Lemeshow (HL) χ2 statistic (a measure of agreement between predicted and observed events: an HL χ2 statistic < 20 with p ≥ 0.01 represents good calibration) [15], the Akaike information criterion (AIC) [16], and the Bayesian information criterion (BIC) [17]. A simple scoring system was derived by dividing the beta coefficient for each variable in the final model by the absolute value of the lowest beta coefficient of the entered variables, multiplying by 2, and rounding to whole numbers. The optimal diagnostic cut-off was determined in a cohort of 26 patients and 16 controls by ROC analysis [18]. Model validation was done in a cohort of 39 subjects (24 with CTS and 15 without). Diagnostic sensitivities and specificities were calculated for the US Composite Score and also for the determined cut-offs for CSAp and CSAd.
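
The point-assignment rule described above (divide each beta by the absolute value of the lowest beta, multiply by 2, round) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the beta values are hypothetical, and "lowest beta coefficient" is assumed here to mean the coefficient with the smallest magnitude.

```python
from typing import Dict


def betas_to_points(betas: Dict[str, float], multiplier: int = 2) -> Dict[str, int]:
    """Convert regression coefficients to integer points.

    Assumption: the reference coefficient is the one with the smallest
    absolute value. Note that Python's round() rounds exact halves to the
    nearest even integer, which may differ from other software.
    """
    reference = min(abs(beta) for beta in betas.values())
    if reference == 0:
        raise ValueError("reference coefficient must be non-zero")
    return {name: round(beta / reference * multiplier) for name, beta in betas.items()}


# Hypothetical coefficients for illustration only (not the values in Table 2)
example_betas = {
    "predictor_a": 0.5,
    "predictor_b": 1.0,
    "predictor_c": 1.75,
    "predictor_d": -1.25,
}
print(betas_to_points(example_betas))
```

With this scaling the weakest predictor always receives 2 points in magnitude, and negative coefficients yield negative points, which is how a composite score can take values below zero.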

Prediction of relapse based on baseline demographics and US data was done with Cox proportional hazards regression. Analyses were conducted using SPSS ver. 21 (IBM, USA) and R version 3.6.0 (released on 26/04/2019).

Ethical statement

Our study complies with the Declaration of Helsinki. The research protocol was reviewed and approved by the Institutional Ethics Committee of IPGME&R, Kolkata (ethics no. IPGME&R/IEC/2018/042), and prior written informed consent was obtained from all subjects.

Results

Description of the cohort

The entire cohort consisted of 141 patients with CTS (85.8% (121/141) females, mean age 46.5 ± 8.6 years, median disease duration 12 months (interquartile range (IQR): 6–24)) and 99 control subjects (77.8% (77/99) females, mean age 44.3 ± 11.9 years). The prevalence of overweight and obesity was 33.3% (47/141) and 7.8% (11/141), respectively, among the CTS patients and 27.3% (30/99) and 12.1% (12/99), respectively, among control subjects. Among comorbidities, RA was the commonest, being present in 11.3% (16/141) of the patients with CTS and 12.1% (12/99) of control subjects, followed by T2DM (5.7% (8/141) of CTS patients and none in controls), hypothyroidism (2.1% (3/141) of CTS patients and 1% (1/99) of controls), SpA (1.4% (2/141) of CTS patients and 1% (1/99) of controls), and others (1 each of hand osteoarthritis, systemic lupus erythematosus, unclassified connective tissue disease, chronic urticaria, and interstitial lung disease, all among patients with CTS). We examined 479 wrists, of which 257 had carpal tunnel syndrome and 222 did not. A bifid median nerve was present in 4, all with CTS, and 1 had a persistent median artery.

Derivation of US score

Of the 479 wrists, the first 400 examined (209 with CTS and 191 without CTS) were used for derivation of the US score for the diagnosis of CTS. The ROC analysis–derived optimal cut-off values for CSAp and CSAd were 9.5 mm2 and 10.5 mm2, respectively. For inclusion in the multivariate model, the median nerve CSA (mm2) at both the proximal and the distal carpal tunnel was categorized as ≤ 8, > 8 to ≤ 12, or > 12. Univariate associations of individual independent variables are shown in Table S1. Variables with p < 0.2 in the univariate logistic regression adjusted for age and sex were included in the multivariate model. Four different models were tested with stepwise removal of BMI, gender, and intraneural vascularity (Table 1). Model 4 was retained as the final model as it had discriminative ability equivalent to that of the previous models (AROC), good calibration (HL chi-squared), and the lowest information criteria. The final model with beta coefficients, odds ratios, and point scores assigned to each category is given in Table 2. The cut-off value of the composite US score for a diagnosis of CTS was ≥ −1.
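
The CSA categories and the final decision rule can be expressed as a short sketch. It is illustrative only: the function names are hypothetical, the middle category is assumed to mean > 8 to ≤ 12 mm2, and the point values per category (Table 2) are not reproduced here, so the composite score is taken as an already computed input.

```python
def csa_category(csa_mm2: float) -> str:
    """Assign a median nerve CSA (mm2) to one of the three model categories.

    Assumption: the middle category covers values > 8 and <= 12 mm2.
    """
    if csa_mm2 <= 8:
        return "<=8"
    if csa_mm2 <= 12:
        return ">8 to <=12"
    return ">12"


def composite_score_positive(composite_score: int) -> bool:
    """Apply the reported cut-off: a composite US score >= -1 indicates CTS."""
    return composite_score >= -1


print(csa_category(9.5))             # falls in the middle category
print(composite_score_positive(-1))  # at the cut-off, classified as CTS
```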

Table 1 Diagnostic characteristics of logistic regression model development
Table 2 Final logistic regression model including US scoring for CTS (Model 4 in Table 1)

External validation of US score

External validation of the three metrics (composite US score, CSAp > 9.5 mm2, and CSAd > 10.5 mm2) was tested in a cohort of 40 subjects (24 with CTS and 16 without), in whom 79 wrists (48 with CTS and 31 without) were examined. Diagnostic performances of the composite index and individual US parameters are given in Table 3. Among all the US parameters, the composite US score had the highest diagnostic accuracy (89.87%, 95% CI: 81.02 to 95.53%).

Table 3 Results from external validation cohort: comparison of individual criterion and the composite US criteria

Response to treatment with US-guided carpal tunnel injection

Treatment response

Treatment responses from 88 injections were available with a median duration of follow-up of 6 months (IQR: 3–12). No initial response was seen in 7.95% (7/88, 95% CI: 3.91–15.52), and satisfactory response was observed in 69.32% (61/88, 95% CI: 59.04–77.98). Relapses occurred in 30.86% (25/81, 95% CI: 21.9–41.6) after a median time to relapse of 2 months (95% CI: 1.4–2.6).
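
The source does not name the method used for these confidence intervals. The sketch below assumes a Wilson score interval, which appears to reproduce the reported limits when applied to the stated counts; the function name is hypothetical.

```python
import math
from statistics import NormalDist
from typing import Tuple


def wilson_interval(successes: int, n: int, level: float = 0.95) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    p = successes / n
    denominator = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return centre - half_width, centre + half_width


# Counts taken from the text: no initial response, satisfactory response, relapse
for label, k, n in [("no response", 7, 88), ("satisfactory", 61, 88), ("relapse", 25, 81)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {100 * k / n:.2f}% ({100 * low:.2f} to {100 * high:.2f})")
```

The Wilson interval is asymmetric around the observed proportion, which is why the limits for 7/88 are not centred on 7.95%.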

Prediction of response

Associations of satisfactory response and relapse with demographic and US variables are given in Tables S2 and S3. Initial satisfactory response was predicted by the presence of flexor retinaculum bowing (odds ratio (OR): 5.43, 95% confidence interval (CI): 1.45–20.3, p = 0.012). In the multivariate Cox proportional hazards model (Fig. 1; Table S4), relapse was predicted positively by age (hazard ratio (HR): 1.168, 95% CI: 1.076–1.268, p = 0.0002), male gender (HR: 8.102, 95% CI: 2.394–27.422, p = 0.0007), and presence of bowing of the flexor retinaculum (HR: 46.982, 95% CI: 5.048–437.293, p = 0.0008), and negatively by overweight or obese BMI (HR: 0.238, 95% CI: 0.064–0.892, p = 0.0332). The hazard functions from the Cox regression model, plotted for the two most significant variables (gender and bowing) while keeping age at the median, are depicted in Figure S3. They indicate that the probability of relapse-free follow-up is significantly lower for male subjects than for female subjects. In both male and female patients, the presence of bowing significantly decreases the probability of relapse-free follow-up after an initially successful steroid injection.
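
To help read these hazard ratios, the sketch below rescales the per-year age HR to a 10-year difference and back-calculates an approximate Wald p value from a reported HR and its 95% CI. It assumes the log-hazard is linear in age and that the CI is symmetric on the log scale; the function names are hypothetical, and this is a plausibility check rather than a reanalysis of the study data.

```python
import math
from statistics import NormalDist


def rescale_hazard_ratio(hr_per_unit: float, units: float) -> float:
    """HR for a difference of `units`, assuming a log-linear effect."""
    return math.exp(math.log(hr_per_unit) * units)


def wald_p_from_ci(hr: float, lower: float, upper: float, level: float = 0.95) -> float:
    """Approximate two-sided Wald p value from an HR and its confidence limits."""
    z_critical = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of log(HR), recovered from the width of the CI on the log scale
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z_critical)
    z = math.log(hr) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Age: HR 1.168 per year (95% CI 1.076-1.268), as reported in the text
print(round(rescale_hazard_ratio(1.168, 10), 2))     # HR for a 10-year age difference
print(round(wald_p_from_ci(1.168, 1.076, 1.268), 4)) # approximate p value
```

Under these assumptions a 10-year age difference corresponds to a hazard ratio of roughly 4.7, and the back-calculated p value is of the same order as the reported p = 0.0002. The very wide interval for bowing (5.048–437.293) indicates that its effect size is imprecisely estimated.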

Fig. 1

Forest plot of hazard ratios (HR) for prediction of relapse of CTS symptoms after an initial response to local steroid injection, from the Cox proportional hazards model. Details of the model can be obtained from the text and Table S4. X-axis: probability of relapse; black diamonds depict the HR values and horizontal error bars depict the confidence intervals (CI). The dotted vertical line depicts HR = 1. Leftwards vertical lines depict the consecutive HRs 2 through 10. The Y-axis depicts the predictor variables. Abbreviations: Bowing, bowing of flexor retinaculum; CI, confidence interval; CSAp, proximal median nerve cross-sectional area in mm2 (either ≤ 8 or 8–12); CSAd, distal median nerve cross-sectional area in mm2 (either ≤ 8 or 8–12); HR, hazard ratio; p, p values

Discussion

Our data indicate a high diagnostic accuracy and reliability of the composite US score in the diagnosis of CTS. CSAp, together with additional criteria such as bowing, flexor tenosynovitis, and intraneural echogenicity changes, is an important component of this score. We also observed that while the majority of patients initially respond well to US-guided corticosteroid injection, almost one third experience an early relapse. Satisfactory initial response was predicted by the presence of flexor retinaculum bowing. Bowing also predicted relapse of symptoms after an initially good response.

We compared different ultrasound methods for determining median nerve swelling and other carpal tunnel abnormalities detectable on US to diagnose CTS. The majority of earlier studies investigated the median nerve at the scaphoid-pisiform level (CSAp), whereas CSAd was less commonly assessed [2]. In a recent study, Dejaco et al. compared various techniques and sites of CSA measurement and concluded that measuring the CSA at its maximal shape is probably the simplest way to investigate median nerve swelling in clinical practice and has a high diagnostic accuracy and good reproducibility. A recent systematic review also favored this technique because the site of maximal median nerve swelling varies between CTS patients [2, 19, 20].

Ultrasound is now considered a credible, dynamic, and real-time alternative test to NCS in the diagnosis of CTS, although a US CTS diagnostic criterion is still controversial. Ultrasound clearly has advantages in terms of patient comfort, accessibility, time, cost, and determining secondary causes of CTS. The more widespread use of ultrasound as a primary or alternative investigation rests largely on improving its diagnostic accuracy.

The CSA of the median nerve, the most commonly applied criterion for diagnosing CTS on ultrasound, is imperfect as a sole criterion for several reasons, including its relation to body size, physiological variability, and possible ethnic differences [14]. Therefore, instead of applying a single diagnostic criterion, as in the majority of previous studies, we sought to define a composite criterion to enhance accuracy [6, 9,10,11, 21, 22].

Both CSAp and CSAd were tested as standalone criteria. A CSAp of > 9.5 mm2 achieved 79% sensitivity, 84% specificity, and 84% diagnostic accuracy. Previously reported cut-off values for CSAp vary widely, from 10 to 14 mm2 [2, 5, 9, 14, 21, 23], and sensitivity and specificity in these studies varied from 38.2 to 97% and from 57 to 97%, respectively. Though the present cut-off is lower than the previously reported cut-off values, our diagnostic accuracy is well within the previously reported range. Whether this is related to ethnicity is an open question. CSAd is a less commonly studied criterion; one previous study reported that a CSAd cut-off value of > 14 mm2 yielded sensitivity, specificity, and accuracy of 63.6%, 100%, and 78.9%, respectively. Our CSAd criterion of > 10.5 mm2 yielded sensitivity, specificity, and accuracy of 45.8%, 93.5%, and 64.6%, respectively.
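
The relationship between these three metrics can be made concrete with a 2×2 table. The sketch below is illustrative: the function name is hypothetical, and the cell counts for CSAd > 10.5 mm2 are not stated in the text but are back-calculated here as the counts consistent with the reported percentages and the 48 CTS and 31 non-CTS wrists of the validation cohort.

```python
from typing import NamedTuple


class DiagnosticMetrics(NamedTuple):
    sensitivity: float
    specificity: float
    accuracy: float


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> DiagnosticMetrics:
    """Sensitivity, specificity, and accuracy from 2x2 table counts."""
    sensitivity = tp / (tp + fn)  # proportion of CTS wrists flagged
    specificity = tn / (tn + fp)  # proportion of non-CTS wrists not flagged
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return DiagnosticMetrics(sensitivity, specificity, accuracy)


# Inferred counts (assumption): 22 of 48 CTS wrists and 2 of 31 non-CTS wrists
# exceeded the CSAd cut-off of 10.5 mm2.
csad = diagnostic_metrics(tp=22, fn=26, tn=29, fp=2)
print([round(100 * value, 1) for value in csad])
```

Because accuracy is a prevalence-weighted mix of sensitivity and specificity, a criterion with high specificity but low sensitivity ends up with only moderate accuracy in a cohort where most wrists have CTS.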

Previous studies on retinacular bowing showed results varying from no bearing on the diagnosis of CTS [24, 25] to being a useful discriminatory sign [12, 14, 27,28,29,30]. Sarria et al. found that bowing yielded a sensitivity of 62.5 to 81.3% and a specificity of 64.3 to 96.9% [27]. We used a cut-off of 4 mm of bowing at the carpal tunnel inlet, which gave a sensitivity and specificity of 68.75% and 93.55%, respectively. Intraneural vascularity and signal changes had very low sensitivity but 100% specificity. To achieve a higher diagnostic accuracy, we combined these criteria, along with age, into a single US composite score derived from the logistic regression model. The composite score had superior diagnostic accuracy to all other criteria in external validation. It is a simple linear score that can be calculated at the bedside and implemented with high accuracy.

We also described the prognostic utility of baseline sonography after US-guided corticosteroid injection for CTS. We did not use any standardized questionnaire for evaluation of response to carpal tunnel injection, and this is a limitation of this study. However, our main outcome measure was relapse or reappearance of initial symptoms of CTS after an initial improvement deemed significant by the patient.

Previous studies have reported that local glucocorticoid injection improves symptoms in more than 75% of patients [30]. As long as the origin of the carpal tunnel syndrome is not known with certainty, results from different studies may vary to a large extent. However, most studies report high relapse rates, ranging from 28 to 68% within the first 12 months after local glucocorticoid injection [31,32,33,34,35]. A small previous study from India on mild CTS reported a lower relapse rate of 16.6% over 12 months [30]. In these studies, patients with severe CTS had a higher relapse rate than those with mild CTS (> 80% versus 20–60%). Previously identified predictors of relapse were age > 50 years, duration > 10 months, persistent paresthesia, triggering of digits, a positive Phalen’s test in ≤ 30 s, and electrophysiologically moderate to severe CTS [36, 37].

There are few studies reporting on ultrasonographic predictors of the outcome of CTS injection. In the study of Meys et al. [38], the CSA of the median nerve at the wrist was negatively correlated with better outcome of carpal tunnel steroid injection. The authors postulated that a smaller CSA might indicate a less severe stage of CTS and a consequently better response. In contrast, in a Korean study [39], patients with a larger CSA of the median nerve showed a better response to the steroid injection. In our study, on the other hand, neither CSAp nor CSAd was associated with initial response or relapse. There were some methodological differences among these studies. Our study and the study by Meys et al. [38] measured CSA at the pisiform, and the Korean study measured CSA at the proximal tunnel at the site of estimated maximum diameter. Average CSAp was 12.7 mm2 in our study, 12.5 mm2 in the study by Meys et al., and 13.9 mm2 in the Korean study. Moreover, other ancillary features such as bowing of the flexor retinaculum or flexor tenosynovitis were not evaluated in either of the previous studies. Finally, they either used univariate analysis only or did not use a Cox proportional hazards model for long-term response prediction. In the present study, short-term satisfactory improvement post-injection was predicted by the presence of bowing only. More pronounced bowing may indicate pronounced edema of the carpal tunnel, soft tissue accumulation within the carpal tunnel inlet, or perineural edema, which would be more readily amenable to treatment with local steroid injection. Interestingly, we found that bowing also predicted relapse of carpal tunnel syndrome after an initial satisfactory response. This may reflect deposition of proteoglycans as a second mechanism of bowing, which may not be readily amenable to a single injection of glucocorticoids in the carpal tunnel.

A recent high-resolution US study of CTS injection also reported that the CSA of the median nerve was not useful for monitoring patients with CTS treated with corticosteroid injections [40]. Another study using median nerve elastography and high-resolution ultrasound showed that elastographic measures of the carpal tunnel, rather than those of the median nerve itself, were correlated with response to injection [41]. The contents of the carpal tunnel in patients suffering from CTS include disarranged and degenerated type 1 collagen fibers, with increased and irregular connective tissue rich in type 3 collagen, fibroblasts, and a high concentration of glycosaminoglycans [42]. Steroid injection in the carpal tunnel decreases the amount of fibrinogen as well as the rate of fibroblast proliferation, slows down glycosaminoglycan synthesis, and increases the solubility of collagen by the second and third weeks [43, 44]. These changes reduce the pressure within the carpal tunnel, decompress the nerve, and provide clinical relief. The presence of bowing of the flexor retinaculum, a direct expression of increased extracellular matrix within the carpal tunnel, is therefore a natural choice of US marker of post-injection improvement. Since the majority of these changes after corticosteroid injection occur in the extracellular matrix of structures in the carpal tunnel around the nerve, we hypothesize that CSA measurements, as shown in our results, would be inadequate to predict long-term outcomes. Since bowing also predicted relapse of CTS symptoms, it might be a useful measure of injection responsiveness. Whether patients with baseline bowing or persistent bowing after an initial glucocorticoid injection would benefit from a second injection is an open question.

Our study has several limitations. We did not measure certain indices such as the flattening ratio, the anterior-posterior diameter of the median nerve, and the CSA at the distal forearm (pronator quadratus); however, previous studies have shown inconsistent results with these measures, and despite their omission we developed a succinct score that correctly diagnoses CTS with approximately 90% accuracy. We did not use predefined questionnaires for evaluation of CTS severity or response measurement. There was also attrition, as not all patients who were injected could be contacted for response measurement, and the definition of relapse was patient reported. However, 88 wrists were followed up for a median of 6 months, and the maximum duration of follow-up was 30 months. Another issue was that all control subjects attended the department for investigation of rheumatic diseases, which could introduce a potential source of bias. However, all of the controls were carefully investigated clinically and electrophysiologically, and they were included as controls only after negative results in both. Finally, a negative score may appear counter-intuitive at first glance. However, the intercept in the detailed final statistical model was significant, and we could not gloss over it for the sake of simplicity, as it might reflect unexplained variance of the dependent variable, and inclusion of the intercept value improved score diagnostics.

In conclusion, measurements of median nerve CSAp and CSAd and ancillary radiological features such as retinaculum bowing, flexor tenosynovitis, and intraneural echogenicity changes are important US diagnostic hallmarks of CTS. We have developed a US composite diagnostic score for higher diagnostic accuracy. Initial response to corticosteroid injection was satisfactory in up to 70% of cases, and bowing of the flexor retinaculum predicted a beneficial initial response to injection. However, relapse of symptoms occurred in up to 30% of cases, and this was also predicted by bowing.