Introduction

Morton’s neuroma is a very common cause of forefoot pain. It is a benign lesion of the plantar digital nerve that usually involves the second and third intermetatarsal spaces. It is not a true neuroma; histologically it consists of perineural fibrosis, local vascular proliferation, oedema of the endoneurium and axonal degeneration [1]. Morton’s neuroma is one of several causes of metatarsalgia, so diagnostic imaging is often requested when clinical examination is not straightforward. Both ultrasound (US) and magnetic resonance imaging (MRI) are believed to be sensitive and reliable means of evaluating patients with metatarsalgia and diagnosing Morton’s neuroma [2, 3]. In addition, US and MRI have a major effect on both diagnostic thinking and therapeutic decisions by clinicians when Morton’s neuroma is suspected [4]. Reducing the consumption of healthcare resources is critical to guarantee sustainable access to medical diagnoses and treatments for the majority of patients. For this reason, diagnostic accuracy is not the only parameter driving the choice of a diagnostic modality. To the best of our knowledge, the published studies dealing with Morton’s neuroma and diagnostic imaging do not clarify whether US and MRI are comparable. Methodological heterogeneity, technological differences due to hardware and software developments, different imaging parameters and different cut-off points in diagnostic tests may produce inconsistent results among studies, limiting a true ‘evidence-based’ choice between US and MRI in the management of Morton’s neuroma. We believe that this meta-analysis is necessary and that it will influence clinical practice, because wide variability in local practice and expertise still determines which test is performed, an approach that is not evidence based. Therefore, the aim of this study was to compare the diagnostic value of US and MRI in Morton’s neuroma with a systematic meta-analytic approach.

Methods

We followed the guidelines defined by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [5]. The protocol of this study was published on PROSPERO (International Prospective Register of Systematic Reviews; protocol number: CRD42014009866) on 21 May 2014 (http://www.crd.york.ac.uk/PROSPERO/).

According to the PICOS approach [5], the ‘PICOS’ questions pertinent to the purpose are: patients (P): over the age of 18 years with symptomatic Morton’s neuroma undergoing surgery; intervention (I): US and MRI; comparison (C): histopathological results; outcome (O): the raw data, i.e. true-positive, false-positive, true-negative and false-negative results based on the reference standard of histopathological confirmation from surgery or tissue biopsy; study type (S): diagnostic accuracy test.

The rationale for including symptomatic Morton’s neuroma is that this clinical condition supports surgical treatment, not only conservative treatment.

Search strategy

We sought to identify all relevant studies that assessed the diagnostic accuracy of US and MRI for Morton’s neuroma. A literature search using PubMed (http://www.pubmed.org), Embase (http://www.embase.com.proxy.medlib.iupui.edu/search), ISI Web of Science (http://apps.webofknowledge.com), SpringerLink, ScienceDirect and the Cochrane Library (http://www.thecochranelibrary.com) was performed independently by two reviewers (Alberto Tagliafico and Bianca Bignotti) with the assistance of a hospital librarian, covering publications up to 1 April 2014. The reference lists were also reviewed manually to supplement the initial search with any additional studies, and abstracts from recent conferences were screened. We did not consider it necessary to contact authors for additional data.

The search strategy included the following terms: ‘Morton neuroma’ and ‘Morton’s neuroma’ in combination with ‘ultrasound’, ‘magnetic resonance imaging’ and ‘diagnosis’. The species was defined as ‘Humans’. The detailed search strategy in PubMed is presented in Supplemental Appendix S1.

Inclusion criteria

Studies were included if they met all the following criteria:

  1. Patients older than 18 years with symptomatic metatarsalgia with, or suspected of having, Morton’s neuroma.

  2. US and/or MRI used for diagnostic purposes.

  3. Presence of an acceptable reference standard (surgery or pathology).

  4. At least one pair of the absolute numbers of true-positive and false-negative results, or of true-negative and false-positive results, was available or could be derived adequately. For true-positive, false-positive, true-negative and false-negative results to be included together in a meta-analysis, all four had to be available.

  5. Languages: only publications in English were included.

Exclusion criteria were: (1) case reports or case series, review articles, letters, comments; (2) duplicate publication; (3) fewer than ten cases confirmed by the reference standard; (4) post-surgical studies. No publication date restriction was used.

Study selection

Two authors (Alberto Tagliafico and Bianca Bignotti) independently reviewed article titles and abstracts for study selection, based on the pre-defined inclusion criteria. The same authors independently read the full text of the studies retained during the screening and eligibility evaluation process. Disagreements arising during each phase of the study selection were resolved by consensus. If consensus could not be reached, a clinical expert (Carlo Martinoli) was asked to resolve the disagreement. If a reviewer’s selection was ‘unclear’ with regard to any question, that particular question was resolved by rereading the text or by discussion.

Data extraction

Two authors (Alberto Tagliafico and Bianca Bignotti) independently extracted the data from eligible studies. Discrepancies were resolved by consensus. The following variables were extracted from each study: first author, journal and publication year, country of the study, study design (retrospective or prospective), demographic characteristics of the study population (including the percentage of women), number of patients with the reference standard (surgery), the diagnostic imaging modality used and, when specified, its technical characteristics, the mean duration of Morton’s neuroma, the total number of lesions scheduled for surgery, the number of Morton’s neuromas found, and the numbers of true-positive (TP), false-positive (FP), false-negative (FN) and true-negative (TN) findings.

Risk of bias

The quality of the eligible studies was assessed independently by two authors (Alberto Tagliafico and Bianca Bignotti) using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) checklist, which comprises four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three also in terms of concerns regarding applicability. The two authors then discussed the results of their quality assessments. Disagreements were resolved by consensus. The results of the quality assessments were recorded in a QUADAS-2 form retrieved from the Web page http://www.bris.ac.uk/quadas/quadas-2.

Data synthesis and statistical analysis

Sensitivity and specificity were extracted from the included studies where reported or calculated as follows: sensitivity was calculated as TP/(TP + FN), where TP is the number of true-positive findings and FN is the number of false-negative findings while specificity was calculated as TN/(TN + FP), where TN is the number of true-negative findings and FP is the number of false-positive findings.
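The two formulas above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in Stata); the function names and the example counts are hypothetical and serve only to show the arithmetic.

```python
from typing import Optional


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity = TP / (TP + FN), the proportion of diseased cases detected."""
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without diseased cases")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> Optional[float]:
    """Specificity = TN / (TN + FP); None when no non-diseased cases are reported."""
    if tn + fp == 0:
        # Many surgical series contain only operated (diseased) cases,
        # so specificity cannot be derived from them.
        return None
    return tn / (tn + fp)


# Hypothetical counts, for illustration only (not taken from any included study).
print(sensitivity(tp=45, fn=5))
print(specificity(tn=8, fp=2))
print(specificity(tn=0, fp=0))
```

Returning `None` instead of raising for specificity reflects the situation described later in the Methods, where most studies report no TN or FP counts.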

Since substantial heterogeneity between studies was detected, meta-analyses to obtain pooled estimates of sensitivity and specificity were performed using a random-effects model with the DerSimonian-Laird estimator, after applying the Freeman-Tukey double arcsine transformation.
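A rough Python sketch of this pooling step is given below. It is not the Stata routine used in the study. It assumes the common formulation of the double arcsine transform (sum of two arcsines, variance 1/(n + 0.5)) and, as a simplification, back-transforms with sin²(t/2) rather than an exact inverse. All names and study counts are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def freeman_tukey(x: int, n: int) -> Tuple[float, float]:
    """Double arcsine transform of the proportion x/n and its approximate variance."""
    t = math.asin(math.sqrt(x / (n + 1))) + math.asin(math.sqrt((x + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)


def back_transform(t: float) -> float:
    """Approximate inverse, sin^2(t/2); an assumption, not the exact inverse."""
    t = min(max(t, 0.0), math.pi)  # keep within the range where the map is monotone
    return math.sin(t / 2.0) ** 2


def pool_random_effects(studies: List[Tuple[int, int]]) -> Dict[str, float]:
    """DerSimonian-Laird pooling of proportions given (events, total) per study."""
    ys, vs = zip(*(freeman_tukey(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]          # random-effects weights
    pooled_t = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": back_transform(pooled_t),
        "ci_low": back_transform(pooled_t - 1.96 * se),
        "ci_high": back_transform(pooled_t + 1.96 * se),
        "tau2": tau2,
        "q": q,
    }


# Hypothetical (true positives, diseased cases) per study, for illustration only.
result = pool_random_effects([(45, 50), (18, 20), (27, 32), (9, 10)])
print(result)
```

When the studies agree closely, the estimated between-study variance is truncated at zero and the random-effects result coincides with the fixed-effect one.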

Analyses were run separately in two subgroups according to modality (US or MRI).

Forest plots were constructed to graphically illustrate the sensitivity and specificity values with the corresponding 95 % confidence intervals (CIs) calculated using the score (Wilson) approach [6].
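For readers unfamiliar with the score (Wilson) interval, a minimal Python sketch follows. The function name and the example counts are hypothetical, and the fixed critical value of 1.959964 assumes a two-sided 95 % level.

```python
import math
from typing import Tuple


def wilson_interval(x: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score confidence interval for the proportion x/n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p = x / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical study: 45 of 50 surgically confirmed neuromas detected.
low, high = wilson_interval(45, 50)
print(round(low, 3), round(high, 3))
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and remains informative when a study reports a sensitivity or specificity of exactly 1.00.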

Heterogeneity between studies was quantified using the I² statistic, while differences (heterogeneity) between subgroups (US vs MRI; prospective vs retrospective) were tested with Cochran’s Q-test.
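The two heterogeneity measures can be sketched in Python as follows. This is an illustration under stated assumptions, not the Stata output: the subgroup test is written for exactly two subgroups (one degree of freedom), the estimates are assumed to be on the transformed scale with known standard errors, and all names and numbers are hypothetical.

```python
import math
from typing import Tuple


def i_squared(q: float, k: int) -> float:
    """I^2 (%) from Cochran's Q over k studies: max(0, (Q - df) / Q) * 100."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def subgroup_q_test(est_a: float, se_a: float,
                    est_b: float, se_b: float) -> Tuple[float, float]:
    """Cochran's Q between two pooled subgroup estimates and its p value.

    With two subgroups Q reduces to (a - b)^2 / (se_a^2 + se_b^2) and follows
    a chi-square distribution with 1 degree of freedom under the null.
    """
    q = (est_a - est_b) ** 2 / (se_a ** 2 + se_b ** 2)
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value


# Hypothetical values, for illustration only.
print(i_squared(q=20.0, k=11))
print(subgroup_q_test(2.50, 0.10, 2.48, 0.12))
```

A large p value from the subgroup test indicates that the difference between the two pooled estimates is compatible with chance.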

Furthermore, although ignoring the possible (negative) correlation between sensitivity and specificity within studies could be misleading, a bivariate meta-regression model of sensitivity and specificity with construction of a summary ROC (sROC) curve was not performed, since most of the studies provided no information on TN, FP and specificity, focusing specifically on positive and diseased patients.

Stata (v.11; StataCorp, College Station, TX, USA) was used to run the meta-analysis.

Results

Figure 1 shows the results of the study search and screening. A total of 277 articles were identified from the database searches. After primary title and abstract screening, a total of 23 studies were submitted for a full-text review and 14 eligible articles were included in the systematic review [2, 4, 7–18].

Fig. 1 Flowchart depicting the inclusion and exclusion of identified studies

The meta-analysis was conducted based on 14 studies that assessed sensitivity of US and/or MRI. The characteristics of the 14 included studies are shown in Table 1. Assessment of the methodological quality of the included papers by the QUADAS-2 tool is shown in Table 2.

Table 1 Characteristics of the 14 studies included in the final analysis
Table 2 Overall risk of bias for each of the domains of patient selection, index test, reference standard, flow and timing

The domain of ‘flow and timing’ was the only domain to potentially contribute a high risk of bias in the papers evaluated. However, we believe that for Morton’s neuroma this domain could be considered of less importance since this condition is not an acute one. Unclear risk of bias in the papers by Oliver et al. [9], Perini et al. [12], Lee et al. [2] and Torres-Claramunt et al. [17] was noted in the domain ‘patient selection’. This issue may have influenced diagnostic accuracy but it is difficult to assess to what extent. In the papers by Owens et al. [10] and Pastides et al. [11], the index test was not described in detail. The remaining QUADAS-2 domains were all felt to be at low risk of bias for all studies.

Of the 14 eligible studies, 36 % (5/14) were published during 1989–1999 and 64 % (9/14) were published during 2000–2012.

Most of the studies were performed in the UK (5/14), Switzerland (2/14) and the USA (2/14). Six studies enrolled participants prospectively and eight studies enrolled participants retrospectively. The six prospective studies were published before 2004. The patient population in individual studies varied from nine to 100 patients. US sensitivity was studied in five studies, MRI sensitivity in three studies and both modalities in six studies. All studies used surgery as the reference standard for diagnosing Morton’s neuroma.

US vs MRI

A high sensitivity in diagnostic testing was observed both for US (SE (95 % CI) = 0.91 (0.83 – 0.96)) and MRI (SE (95 % CI) = 0.90 (0.82 – 0.96)), with no significant differences between the two modalities (Q test for heterogeneity between subgroups; p = 0.88) (Fig. 2).

Fig. 2 Sensitivity of ultrasound (US) and magnetic resonance imaging (MRI)

Specificity was obtained from only six studies: three on US, two on MRI and one (Sharp et al. [15]) on both modalities.

In all three studies on MRI diagnosis [4, 15, 18], test specificity was 1.00 with a pooled estimation of 1.00 (95 % CI: 0.73–1.00) while the pooled specificity was 0.854 (95 % CI: 0.41–1.00) for US diagnosis. No heterogeneity was observed between studies and between the two diagnostic subgroups (p = 0.25).

Prospective vs retrospective studies

A total of six studies were prospective and seven were retrospective. Four of the prospective studies and six of the retrospective studies had data on US diagnosis.

For US (Fig. 3), pooled sensitivity was 0.92 (95 % CI: 0.81–0.99) for prospective and 0.87 (95 % CI: 0.75–0.96) for retrospective studies, with no significant difference between the subgroups (p = 0.49). Similarly, for MRI (Fig. 4) pooled sensitivity was 0.93 (95 % CI: 0.78–1.00) for prospective and 0.90 (95 % CI: 0.79–0.98) for retrospective studies (p = 0.76 for differences between subgroups).

Fig. 3 Pooled sensitivity of ultrasound (US) for prospective and for retrospective studies

Fig. 4 Pooled sensitivity of magnetic resonance imaging (MRI) for prospective and for retrospective studies

Discussion

In clinical practice, history and clinical examination can raise suspicion of Morton’s neuroma. Clinical suspicion is present if the patient has pain or tingling on the plantar aspect of the foot, worsened by wearing tight shoes and relieved by rest. Clinical examination may reveal mild tenderness on palpation around the affected web space, and sensory impairment between the toes of the affected area [11]. Mulder’s click test describes an audible or palpable ‘click’ with pain when side compression is applied to the metatarsal heads. The Tinel sign may also be elicited during US and can help in diagnosing Morton's neuroma [3, 19–23].

Diagnostic imaging is useful in confirming the diagnosis, especially in cases where the clinical diagnosis is equivocal or the patient complains of pain around several web spaces, and to influence subsequent surgical treatment [11, 18]. In clinical practice, US and MRI are used. US has the advantage of being relatively inexpensive compared to MRI, is less time consuming and allows real-time localization and visualization of pain (ultrasonographic Tinel sign) [21, 22]. US is believed to be user dependent.

MRI is more expensive than US and more time consuming. It does, however, have the advantage of providing static, non-operator dependent, reproducible images that can be interpreted by several clinicians. MRI can also visualize all surrounding soft tissues [24, 25].

This meta-analysis summarizes the available evidence on the value of US compared to MRI in the diagnosis of Morton's neuroma. To our knowledge, this is the only meta-analysis on this topic. There are still few studies comparing US and MRI for the diagnosis of Morton's neuroma and it is currently difficult to draw a firm conclusion on the preferred imaging technique for diagnosis of Morton's neuroma. In particular, most of the existing studies have been performed in centres where one modality (US or MRI) was preferred.

As a consequence, few direct comparisons were made: only six studies compared US and MRI in the same study. However, Perini et al. [12] used only a 0.2 T MRI scanner, whereas Lee et al. [2], Fazal et al. [7] and Torres-Claramunt et al. [17] used US probes with frequencies below 12 MHz. In addition, only very few healthy controls were included in the examined papers, making it difficult to understand the specificity of the findings.

Regarding the publication dates of the papers, we acknowledge that some papers are relatively old. We did not use a filter to exclude studies on the basis of publication date. Six of the 14 papers included in the final analysis were published before the year 2001. In these papers the US and MRI machines used were different from those normally used in 2014. US frequencies were below 12 MHz in the papers by Redd et al. [14] and Quinn et al. [13]. We were not able to extrapolate the US frequency adopted in the papers by Sobiesk et al. [16] and Oliver et al. [9]. In the two papers by Zanetti et al. [4, 18] the MRI used had a 1.0 T magnetic field. Ultrasound technology has, however, improved considerably, and US probes with frequencies above 12 MHz are now widely used to assess the plantar aspect of the foot [20, 23]. MRI systems have also evolved and it is likely that the majority of feet are now evaluated with 1.5 T scanners. We were not able to find any study dealing with Morton's neuroma using a 3.0 T MRI system.

In spite of the above-mentioned differences among the studies evaluated, no heterogeneity was observed between studies and between the two subgroups of diagnostic modalities. This observation strengthens the results of this meta-analysis.

With regard to quality assessment, the domain of ‘flow and timing’ was the only domain to potentially provide a high risk of bias in the papers evaluated. Considering that Morton's neuroma is not an acute medical condition, we believe that the importance of this bias is limited. In general, the risk of bias in this study could be considered low.

A comparison of sensitivity of US and MRI showed a high sensitivity both for US (0.91) and for MRI (0.90), with no significant differences calculated between them. Specificity was obtained from only six studies, three on US, two on MRI and one (Sharp et al. [15]) on both modalities. In all three studies on MRI [4, 15, 18] test specificity was 1.00.

Comparing the study design, there were a total of six studies that were prospective and seven that were retrospective. For both US and MRI, pooled sensitivity was similar with no significant differences between retrospective or prospective designs. We were not able to find any prospective studies after 2004 to include in the analysis. It may be interesting to assess if the recent technological improvements in US and MRI can improve the diagnostic accuracy data in a prospective comparative study.

A strength of this study is that we excluded all the studies with no surgical reference standard. In addition, several studies were excluded because it was not possible to extrapolate the data to assess diagnostic accuracy. Another advantage of this study is that we had more than one observer for the literature search, data extraction and analysis.

Publication bias is a known drawback of meta-analyses. Studies with favourable results have a higher likelihood of being published, creating an inherent selection bias during a literature review. This factor has to be considered, and it was not possible to assess to what extent publication bias may have influenced the final results. Another limitation of this study is that it was not possible to assess specificity in all the studies, owing to the nature of the published studies. We did not find any prospective studies after 2004, and this could be a further limitation because the technology has improved since then. This meta-analysis shows that US and MRI are equally accurate with the technology evaluated in the included studies. New prospective studies with 3 T MRI and the latest generation of US for identification of Morton’s neuroma are not present in the literature. It is possible that a new prospective study will provide new insights into the diagnostic work-up of metatarsalgia.

In summary, MRI and US could be considered equivalent in diagnosing Morton's neuroma: US is as accurate as MRI. These results, combined with the lower cost of US, suggest that US may be the most cost-effective imaging method for Morton's neuroma if the examiner has been properly trained. For centres without specific US expertise, MRI can be used as well.