Key Points

Micro RNAs (miRNAs) are very promising biomarkers, especially when measured from blood or other easily accessed body fluids. They have potential as minimally invasive disease markers.

Often, miRNAs are not specific for a certain disease but are discovered as markers for different pathologies.

Marker signatures, e.g., from arrays or high-throughput sequencing, seem to be specific but require significantly more validation and replication.

At the same time, technologies for quantifying miRNAs in clinics are being developed and are maturing such that clinical application seems feasible. However, many challenges remain to be solved during the translation of miRNAs from bench to bedside.

1 Introduction

The importance of microRNAs (miRNAs) became evident at the end of the last century. Among the best-studied miRNAs are lin-4 and let-7 from Caenorhabditis elegans; lin-4 was first reported in 1993 [1]. In 2001, Ambros [2] highlighted the role of these “tiny regulators with great potential”. Over the past 3 decades, research on miRNAs has gained rapid traction. Today, the main reference database is miRBase, which was first published in 2004 with 506 miRNAs from six organisms [3]. The tenth release, published in 2008, contained 5071 miRNA precursors from 58 species [4]. Some of these precursors had been annotated with two forms: minor and major mature miRNAs. A total of 5922 mature miRNA sequences were included in miRBase version 10. Version 20, available in 2013, contained 24,521 miRNA loci from 206 species, and the number of mature miRNAs increased to 30,424 [5]. The latest version of miRBase (version 21, available since June 2014) contains 28,645 precursor miRNAs, expressing 35,828 mature miRNA products, in 223 different species.

Driven by the success of high-throughput sequencing, miRNAs have been associated with numerous human pathologies. Besides tissues and cell cultures, body fluids have also become potential sources for miRNA biomarkers. While the translational process of biomarker profiles from bench to bedside is starting to gain momentum, there is increasing discussion of the challenges to using miRNAs in diagnosis, prognosis, and therapy. Examples include acute myocardial infarction (AMI) [6], bladder cancer [7], and endocrine cancers [8]. A substantial bias, impacting miRNA signature discovery, was also reported for high-throughput techniques [9]. MiRNA iso-forms that stem from one miRNA locus are predominantly modified by the addition or subtraction of nucleotides resulting in 5′ or 3′ iso-forms or by editing of nucleotides, resulting in internal iso-forms, making the situation even more complex for sequencing data. miRNA iso-forms have been frequently neglected [10], especially with respect to their diagnostic information content. Previously, we reviewed potential confounding factors, including the nature of miRNAs, the source of miRNAs, and technical issues [11]. Here, we focus on the specificity and the degree of verification of both single miRNAs and complex miRNA signatures. Since the aim of miRNA biomarker discovery is non- or minimally invasive testing that allows diagnoses without taking biopsy material from patients, we specifically included studies on serum, plasma, blood cells, and microvesicles. We refer to these miRNAs from body fluids as ‘circulating miRNAs’.

2 Specificity of Single Micro RNAs (miRNAs) for Single Diseases

2.1 Specificity of Single miRNAs

For an initial insight into the specificity of single miRNAs for diseases, we pursued two different approaches. First, we analyzed the second version of the human miRNA disease database (HMDD) [12], which contains 10,368 entries for 572 miRNAs and 378 diseases, with 3511 manuscripts. Focusing on circulating miRNA markers from this database, we found 556 interactions between miRNAs and diseases. The majority of these (512) were unique associations, i.e. they were only reported in one manuscript each. For illustrative purposes, Fig. 1 shows the complex network of these 512 interactions between 240 miRNAs (represented as blue nodes) and 76 pathologies (represented as orange nodes). This graph highlights that many miRNAs are not specific for single diseases but are frequently observed in various pathologies. The complete list of interactions is provided in Table 1 in the online Electronic Supplementary Material (ESM). Among the least specific miRNAs, which are in the center of the network, were mir-20a, mir-122, mir-21, mir-155, mir-17, and mir-126. Notably, several diseases are redundant in this database, e.g. the HMDD contains the categories kidney diseases, acute kidney failure, chronic kidney failure, and kidney neoplasms. There is certainly an overlap between the categories, and future analysis may ideally consider miRNAs in a more detailed and disease-specific manner.

Fig. 1

Human miRNA Disease Database (HMDD) disease network. All circulating markers were downloaded from the HMDD and a bipartite graph was drawn. The node size approximates the number of connections for the respective microRNAs (miRNAs)/genes. The core miRNAs that are detected in many diseases are in the middle of the graph. miRNA nodes are shown in blue, disease nodes in orange
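The node sizes in such a bipartite network correspond to node degrees, i.e., the number of distinct partners each miRNA or disease has. The following is a minimal sketch of how these degrees could be computed from a list of (miRNA, disease) pairs. The function names and the toy input are hypothetical and are not taken from the HMDD; the sketch only illustrates the counting step, not the layout or drawing of the graph.

```python
from collections import defaultdict


def bipartite_degrees(associations):
    """Return (mirna_degree, disease_degree) from (miRNA, disease) pairs.

    Repeated reports of the same pair are counted once, so the degree is
    the number of distinct partners of a node.
    """
    mirna_to_diseases = defaultdict(set)
    disease_to_mirnas = defaultdict(set)
    for mirna, disease in associations:
        mirna_to_diseases[mirna].add(disease)
        disease_to_mirnas[disease].add(mirna)
    mirna_degree = {m: len(d) for m, d in mirna_to_diseases.items()}
    disease_degree = {d: len(m) for d, m in disease_to_mirnas.items()}
    return mirna_degree, disease_degree


def least_specific(mirna_degree, top_n=3):
    """miRNAs linked to the most diseases (ties broken alphabetically)."""
    return sorted(mirna_degree, key=lambda m: (-mirna_degree[m], m))[:top_n]


# Hypothetical toy input for illustration only, NOT HMDD content.
toy_associations = [
    ("mir-A", "disease 1"), ("mir-A", "disease 2"), ("mir-A", "disease 3"),
    ("mir-B", "disease 1"), ("mir-B", "disease 2"),
    ("mir-C", "disease 3"),
    ("mir-A", "disease 1"),  # duplicate report, counted once
]

mirna_degree, disease_degree = bipartite_degrees(toy_associations)
print(least_specific(mirna_degree, top_n=2))
```

In this toy input, the miRNA with the highest degree would sit in the center of the drawn network, analogous to the core miRNAs described above.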

In a second approach, we focused on ten of the most relevant diseases in developed countries: coronary artery disease, stroke, chronic obstructive pulmonary disease (COPD), diabetes mellitus (DM), Alzheimer’s disease, arthritis, lung cancer, colorectal cancer, breast cancer, and prostate cancer. For these ten diseases, we systematically searched PubMed for precursors of 491 known miRNAs in combination with the three keywords “blood”, “serum”, and “plasma”. This approach has the following limitations: mature forms are not explicitly queried, different histopathological types of diseases (e.g., small-cell lung cancer [SCLC] and non-small-cell lung cancer [NSCLC]) are not considered, and false positives may be retrieved as a result of the query. In total, out of the 14,730 (10 × 491 × 3) PubMed searches, 2256 requests (15 %) yielded at least one hit. Altogether, 4962 hits were reported, corresponding to 1165 unique manuscripts. To represent the results graphically, we generated a matrix with each row containing the combination of a miRNA and a disease and each column containing the number of hits for “blood”, “serum”, or “plasma” for the respective combination. Figure 2a is a heat map containing the 27 combinations of miRNAs and diseases, where at least 20 different manuscripts in at least two of the three body fluids were discovered. The results of the two analyses revealed a substantial overlap. Among the miRNAs that were found by the PubMed search and by the analysis of the HMDD were mir-21, mir-155, mir-17, and mir-126. In particular, mir-21 and mir-155 fulfilled the above criteria for five and four of the ten different diseases, respectively. We focus on these two miRNAs as an example.

Fig. 2

Heat map of associations between microRNAs (miRNAs) and diseases in blood, plasma, and serum. The color gradient represents the number of studies that have been found with this combination of keywords in PubMed. Included are the 27 associations that are found in at least two of the three biological sample types and where at least 20 manuscripts were reported in PubMed. The arrows highlight the two examples that are in Fig. 3
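The query grid and the filtering step behind this heat map can be sketched as follows. This is a minimal illustration under stated assumptions: the precursor identifiers are placeholders (the 491 names are not listed in the text), the hit counts are invented toy values, no PubMed request is made, and the selection rule is read as "at least 20 hits in each of at least two fluids", which is only one possible reading of the criterion.

```python
from itertools import product

FLUIDS = ("blood", "serum", "plasma")

DISEASES = [
    "coronary artery disease", "stroke",
    "chronic obstructive pulmonary disease", "diabetes mellitus",
    "Alzheimer's disease", "arthritis", "lung cancer",
    "colorectal cancer", "breast cancer", "prostate cancer",
]

# Placeholder identifiers; the real precursor names are not given in the text.
PRECURSORS = [f"precursor-{i}" for i in range(1, 492)]


def build_queries(diseases, precursors, fluids=FLUIDS):
    """Enumerate every disease x precursor x fluid keyword combination."""
    return list(product(diseases, precursors, fluids))


def hit_matrix(hit_counts, fluids=FLUIDS):
    """Rows are (miRNA, disease); columns are hits per fluid keyword."""
    rows = {}
    for (disease, mirna, fluid), hits in hit_counts.items():
        row = rows.setdefault((mirna, disease), dict.fromkeys(fluids, 0))
        row[fluid] = hits
    return rows


def select_rows(rows, min_hits=20, min_fluids=2):
    """Keep rows with >= min_hits hits in >= min_fluids fluids (assumed rule)."""
    return {
        key: row for key, row in rows.items()
        if sum(h >= min_hits for h in row.values()) >= min_fluids
    }


queries = build_queries(DISEASES, PRECURSORS)
print(len(queries))  # 14730 = 10 x 491 x 3

# Hypothetical hit counts for illustration only (not real PubMed results).
toy_hits = {
    ("lung cancer", "precursor-1", "blood"): 5,
    ("lung cancer", "precursor-1", "serum"): 25,
    ("lung cancer", "precursor-1", "plasma"): 30,
    ("stroke", "precursor-2", "serum"): 40,
}
selected = select_rows(hit_matrix(toy_hits))
print(sorted(selected))
```

If the criterion is instead read as "hits in at least two fluids and at least 20 manuscripts overall", only `select_rows` would need to change.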

The manuscripts reported the down-regulation of miR-155 in the blood of patients with coronary artery disease [13, 14], ischemic stroke [15], or type 2 DM [16]; up-regulation in blood [17] and down-regulation in serum of patients with rheumatoid arthritis (RA) [18]; down-regulation in the serum of patients with lung cancer [19]; up-regulation in the plasma of patients with lung cancer [20]; up-regulation in the serum of patients with colorectal cancer [21]; and up-regulation in the serum of patients with breast cancer [22]. Results reported for miR-155 were inconsistent: one study [19] showed dysregulation of miR-155 in the serum but constant levels in the plasma whereas another study [20] indicated up-regulation of this miRNA in the plasma of patients with lung cancer. These differences may be due to the different cohorts used for early-stage lung cancer tests.

Manuscripts reporting on miR-21 described up-regulation in the plasma of patients with vulnerable coronary artery disease [23]; up-regulation in the serum or plasma of patients with RA [18]; up-regulation in the serum of patients with stroke or arteriosclerosis [24]; up-regulation in the serum of patients with COPD [25]; down-regulation in the plasma of patients with type 2 DM [26]; up-regulation in the plasma of patients with pediatric type 1 DM [27]; lack of dysregulation in the plasma of patients with type 2 DM (following adjustment for confounding variables such as age, sex, and others) [28]; up-regulation in the serum of patients with non-alcoholic fatty liver disease [29]; up-regulation in the plasma of patients with lung cancer [20, 30, 31]; up-regulation in the plasma of patients with colorectal cancer [31, 32]; up-regulation in the plasma of patients with breast cancer [31, 33]; up-regulation in the plasma of patients with prostate cancer [34, 35]; and no dysregulation in the plasma of patients with prostate cancer [31]. As miR-21 is involved in many diseases, the serum level of this miRNA has been recognized as a potential broad-spectrum biomarker for the detection of solid cancers [31, 36]. Our literature review indicates that miR-21 may even serve as a general disease marker, not only for cancer but also for non-cancer diseases. Figure 3 summarizes the up- and down-regulation patterns of these two miRNAs in serum, plasma, and blood. Both the general HMDD and the PubMed analysis showed that many miRNAs are not specific for single diseases, as exemplified by miR-21 and miR-155.

Fig. 3

Pattern of dysregulation in various diseases for mir-155 and mir-21. Red arrows mean up-regulation, green arrows down-regulation, and grey arrows indicate that manuscripts explicitly mentioned that data were not dysregulated. B blood and blood cells, CAD coronary artery disease, COPD chronic obstructive pulmonary disease, P plasma, S serum

One drawback of the general HMDD and PubMed analysis is the high degree of heterogeneity of the considered studies, e.g., differences in sample handling, profiling techniques (microarrays, next-generation sequencing [NGS], reverse transcription quantitative polymerase chain reaction [RT-qPCR]), and analytical approaches. In addition, the HMDD relies predominantly on miRNA precursors and not mature forms. To overcome these challenges, we previously conducted a multi-centric study on different cancer and non-cancer pathologies [37]. In detail, we considered 454 blood samples from 13 diseases that were measured for 863 human miRNAs. All samples were collected and handled according to standardized operating procedures. The results of these analyses showed a lack of specificity for single markers. The more specific signatures were largely concordant with the aforementioned findings of the HMDD and PubMed investigation. Even the direction of dysregulation across different diseases was well matched between our study and data reported by others. Extending our analysis to 1049 patients with 19 different cancer and non-cancer diseases further confirmed these observations [38]. Specifically, miR-144* (now referred to as hsa-miR-144-5p) was down-regulated in the blood of almost all patients considered, including those with cancer and those with non-cancer diseases.

2.2 Specificity of Signatures

We recognized, as have others, that, as with genes and proteins, single markers are not necessarily specific, whereas combinations of markers show improved disease specificity [38]. Several miRNAs that are combined for use as a biomarker (in extreme cases, very small numbers such as three miRNAs [39]) are referred to as an miRNA signature. miRNA signatures seem to be more robust biomarkers than single miRNAs and are more likely to adequately reflect the complexity of disease phenotypes. The use of signatures has improved not only the diagnostic specificity but also the overall predictive power in analyzed diseases, including meningioma [40], ovarian cancer [41], breast cancer [42], heart failure [43], melanoma [44], and Alzheimer’s disease [45], among others. However, careful validation is mandatory, especially in the case of complex miRNA patterns.
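The intuition that a combination of markers narrows down the set of candidate diseases can be illustrated with the directions of dysregulation summarized in Sect. 2.1. The following sketch is a deliberate simplification: it collapses blood, serum, and plasma into one value per miRNA, uses only diseases for which the text reports a direction for both miR-155 and miR-21, and takes the down-regulation reported in [26] for miR-21 in type 2 DM although a conflicting report exists [28]. It is a toy lookup, not a diagnostic classifier.

```python
# Directions as summarized in Sect. 2.1, collapsed across specimen types
# (an assumption of this sketch; the cited studies used different specimens).
REPORTED = {
    "coronary artery disease": {"miR-155": "down", "miR-21": "up"},
    "stroke": {"miR-155": "down", "miR-21": "up"},
    "type 2 diabetes mellitus": {"miR-155": "down", "miR-21": "down"},
    "colorectal cancer": {"miR-155": "up", "miR-21": "up"},
    "breast cancer": {"miR-155": "up", "miR-21": "up"},
}


def diseases_matching(pattern, reported=REPORTED):
    """Diseases whose reported directions agree with every marker in pattern."""
    return sorted(
        disease for disease, directions in reported.items()
        if all(directions.get(m) == d for m, d in pattern.items())
    )


single = diseases_matching({"miR-21": "up"})
combined = diseases_matching({"miR-21": "up", "miR-155": "down"})
print(len(single), single)
print(len(combined), combined)
```

In this small table, the single marker is compatible with four diseases and the two-marker pattern with two. The combination is therefore more specific but still not unique, which is consistent with the need for larger signatures and careful validation.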

3 Reproducibility of Single miRNAs and miRNA Patterns

3.1 Single miRNAs

Some associations between a particular miRNA and a disease have been investigated in multiple manuscripts, e.g., the up-regulation of miR-1 in the serum and plasma in AMI [46–48] and the up-regulation of miR-133 in AMI [46, 48, 49]. In terms of the link between miR-133 and AMI, Eitel et al. [50] described significant correlations with infarct size, microvascular obstruction, and myocardial salvage index. However, miR-133a levels did not independently predict clinical events. Several studies report correlations between miR-155 serum levels and breast cancer, including the above-mentioned up-regulation of miR-155 in this cancer type [22, 51–53]. One study specifically reports up-regulation of miR-155 in patients with metastases [53]. Independent studies show elevated levels of miR-21 in the serum [54, 55] and plasma of patients with breast cancer [31, 33] or NSCLC [56–58]. Another reproduced miRNA–disease association was found between up-regulated miR-199a-3p and gastric cancer [59, 60]. However, the latter two studies have been published by the same group and are awaiting further independent validation.

In summary, results obtained for single miRNAs in the same specimens seem to be largely concordant. Non-concordant results have been reported in a few cases. For example, down-regulation [26] as well as a lack of dysregulation [28] have been reported for miR-21 in patients with type 2 DM. Furthermore, we found diverging results with respect to miR-21 in prostate cancer. While plasma levels of this miRNA were reported to be up-regulated in prostate cancer [34, 35], an independent study identified miR-21 as a non-specific noninvasive biomarker for lung cancer, colorectal carcinoma, and breast cancer but explicitly not for prostate cancer [31] (see also Fig. 3). Generally, the studies are difficult to compare since even minor variations in sample handling, for example permitting small platelet contamination, can severely impact the results [61]. Additionally, general aspects of the study set-up, including the patient recruitment criteria, must be carefully evaluated prior to any comparison of different miRNA studies.

3.2 miRNA Signatures

The comparison of different studies is far more challenging for miRNA signatures than for single markers. Differences in profiling techniques, sample handling (e.g., storage time [62], blood collection tubes [63]), and statistical analysis often impede direct comparison between studies. miRNA signatures are frequently generated by high-throughput approaches such as microarray or parallel sequencing followed by low- to medium-throughput validation using, for example, RT-qPCR. We, and others, have performed case–control studies for various diseases, including Alzheimer’s disease [45], heart failure [43], tuberculosis [64], pancreatic cancer [65], multiple sclerosis [66], neuromyelitis optica [67], lung cancer [68], and melanoma [69, 70]. Following the discovery phase, there are currently several studies underway to validate miRNA signatures in independent cohorts. We verified our initial Alzheimer’s disease signature [45], developed from samples collected in the USA, by using a German cohort [71]. Although absolute miRNA levels varied between the two cohorts, there was a good match between the overall patterns of dysregulated miRNAs. In particular, when techniques such as NGS are applied, the cohort sizes remain limited in most cases. The study on Alzheimer’s disease included, for example, 103 patients with Alzheimer’s disease, 77 unaffected controls, and 110 disease controls (e.g., mild cognitive impairment). For lung cancer patients, we successfully validated a signature that was originally generated by microarrays [68] by using high-throughput RT-qPCR [72]. However, there are also reports of failed attempts to reproduce miRNA signatures. Sapre et al. [73] initially suggested that plasma profiling of miRNAs at radical prostatectomy allows for prostate cancer prognosis. They also reported the failure to validate the expression signature in an independent cohort [73]. This study included plasma samples from 33 patients with high-risk prostate cancer, plasma samples from 37 patients with low-risk or indolent cancer, and urine samples from 38 patients.

Curated databases for storing circulating miRNAs and disease relations, similar to miRandola [74] or the HMDD [75], could facilitate the detection of miRNAs best suited for disease diagnosis. Failed verification approaches should be included in such databases. The results of longitudinal studies that monitor disease progression—including under treatment—should also be included to improve biomarker discovery.

4 Clinical Value of miRNAs

In many cases, miRNAs have been proposed as minimally invasive markers for the early detection of diseases. Unaffected individuals are frequently used as the control cohort without including early- and late-stage disease. However, validation studies have indicated that the dysregulation of miRNAs is often observed in late disease stages. Margue et al. [70] analyzed different stages of patients with melanoma. They analyzed miRNA expression profiles using qPCR arrays. Characteristic signatures with excellent prognostic scores were only discovered in patients with late-stage melanoma. Similarly, Eichelser et al. [53] found down-regulation of miR-155 only for metastatic breast cancer. Furthermore, miRNAs that correlate well with phenotypes do not necessarily have value as biomarkers. The aforementioned miR-133 that has been associated with AMI correlates well with infarct size, microvascular obstruction, and myocardial salvage index but does not independently predict clinical events [50]. As for other biomarkers, confounding factors must be considered. For example, the expression of miRNAs in blood is known to depend on the age and sex of patients [38]. Training condition also impacts the miRNA signature. While short-term effects of physical exercise in the blood of athletes seem to have a limited influence [76], strength and endurance athletes show divergent plasma patterns [77]. For the time being, only a subset of miRNA studies will be suitable for translational studies towards clinical requirements.

5 Future Trends and Challenges Towards Clinically Reliable miRNA Biomarkers

5.1 New Measurement Devices

High-throughput techniques are frequently biased. Willenbrock et al. [78] analyzed synthetic pools of 744 miRNAs, which were sequenced and hybridized using different concentrations for comparison. The authors found that measures based on microarray expression correlated best with the true RNA concentration. Furthermore, microarrays were sensitive, and the reproducibility was similar to that for sequencing. Hafner et al. [79] also reported a bias in sequencing of miRNAs that was introduced by RNA-ligase. When we analyzed blood-borne biomarkers, we also observed a substantial bias between arrays and sequencing, mainly depending on the miRNA sequence [9, 62]. Bias in sequencing data may affect the prediction of the abundance not only of known miRNAs but also of false-positive novel miRNAs. Londin et al. [80] proposed 3707 novel human miRNAs, which is more than double the number of human miRNAs deposited in miRBase at that time. However, technical and bioinformatics issues may have increased the number of predicted miRNAs, likely beyond the number of true miRNAs [81]. Thus, improved computational approaches [82] with in-depth validation are required to find true new miRNAs.

The abovementioned issues, in combination with complex experimental workflows and long turn-around times, limit the application of microarrays and high-throughput sequencing in clinical routine diagnosis. Low- and medium-throughput approaches such as RT-qPCR are easier to employ in a routine protocol. This notwithstanding, there are also potential hitches associated with RT-qPCR [83]. Recently, other formats have been proposed to allow for quick and robust miRNA analysis. Hofmann et al. [84] proposed a double-stranded ligation assay that yields results in 30 min. Liu et al. [85] presented a Mach–Zehnder interferometer-based approach with an even shorter turn-around time of 15 min. Labib and Berezovski [86] reviewed electrochemical-based detection strategies. We built and tested an immunoassay format for measuring miRNAs. With a turn-around time of 3 h, this approach was slower than the other abovementioned methods, but it runs on the standard immunoanalyzer systems available in many diagnostic laboratories [87]. Although these technical innovations for measuring miRNAs are mostly at the proof-of-concept phase [88], it is likely that the equipment will mature in the near future to allow for accurate, robust, and quick miRNA profiling.

5.2 Bioinformatics Aspects

The bioinformatics analysis of miRNA data represents a substantial challenge, especially for high-throughput profiling approaches. A recent review [89] found that 192 different tools are being used in diverse areas of miRNA research. Besides the general challenge of big data analysis in miRNA and other non-coding RNA research [90], solutions with concise result representation are required to further facilitate the interpretation of miRNA data. In this context, web-based services represent a reasonable solution, for example, for the analysis of miRNA iso-forms, so-called iso-miRs, from NGS data. miRNA iso-forms can strongly influence pathologies, e.g., in breast cancer [91]. We recently reported on a web-based service that allows visual inspection of statistically analyzed results [92]. These issues are not specific for miRNA expression profiles, as they are also inherent to other high-dimensional molecular data analyses that likewise require sophisticated computational tools. Particularly in the context of oncology, key issues, starting with pre-processing of data prior to clinical application of high-dimensional statistical analysis, are under investigation [93].

5.3 miRNA Profiles from Blood Cells and Microvesicles

miRNA patterns can be generated from blood components, including specific blood cells [94, 95] or exosomes, as well as from serum, plasma, and whole blood. Exosomes, which are microvesicles of around 40–100 nm in diameter, are relevant for intercellular communication [96]. They are present in many body fluids and released from almost all cell types [97]. Besides proteins [98], RNA particles including miRNA are components of exosomes, as reviewed by Sato-Kuwabara et al. [99]. The role and importance of exosome-derived miRNA, including for diagnostics, has been researched and reviewed extensively, for example by Zhang et al. [100]. Similarly, the role of exosomes as biomarkers for cancer as well as neurodegenerative disorders has been reviewed extensively [101]. Liu and Lu [102] reviewed the potential of exosomal miRNAs with a focus on RT-qPCR. Although the crucial role of exosomes for cell–cell communication is generally agreed upon, there are numerous unsolved issues, such as the miRNA content of exosomes. Chevillet et al. [103] quantified the number of exosomes and the number of miRNA molecules in plasma and other biological sources and found, on average, far less than one molecule of a given miRNA per exosome. Assuming homogeneous distribution of miRNAs in the exosome population, roughly 100 exosomes would carry a single miRNA copy.
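To make the order of magnitude concrete, the figure of one copy per 100 exosomes can be turned into a small calculation. The sketch below assumes, purely for illustration, that copies are distributed independently across exosomes so that the number of copies per exosome follows a Poisson distribution; this distributional assumption and the function names are ours and are not taken from the cited study.

```python
import math


def expected_copies(n_exosomes, mean_copies_per_exosome):
    """Expected number of copies of a given miRNA in n_exosomes vesicles."""
    return n_exosomes * mean_copies_per_exosome


def fraction_with_at_least_one(mean_copies_per_exosome):
    """Poisson probability that a single exosome carries >= 1 copy."""
    return 1.0 - math.exp(-mean_copies_per_exosome)


# One copy per 100 exosomes, as in the illustration in the text.
mean_copies = 1 / 100

print(expected_copies(100, mean_copies))
print(fraction_with_at_least_one(mean_copies))
```

Under these assumptions, about 1 % of exosomes would carry the miRNA of interest at all, which illustrates why assays on vesicle-borne miRNAs need large numbers of vesicles as input.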

In addition, the analysis of specific blood components can be challenging and difficult to implement in clinical practice. Baranyai et al. [104] recently compared ultracentrifugation and size exclusion chromatography (SEC), both of which are used to isolate exosomes. The authors found that, although selection by SEC is feasible, this method results in only a low vesicle yield compared with the technically more demanding isolation by ultracentrifugation.

Although miRNA diagnostics with blood components have high diagnostic potential, the understanding of miRNA load in vesicles is currently limited, and the technical demands of isolating vesicle-borne miRNAs complicate translation into routine clinical care.

6 Conclusion

miRNAs are valuable biomarker candidates. We, and others, have discussed the challenges that could potentially affect the process of translating circulating miRNAs from bench to patient care. These include technical considerations, e.g., bias in high-throughput approaches, and novel fast methods that are not sufficiently matured to be employed in routine applications. Another aspect is the degree of specificity and reproducibility. Apparently, selected miRNAs can be measured reproducibly. However, single miRNAs do not usually allow specific diagnosis of a disease. In contrast, complex miRNA signatures seem to be more disease-specific, but are less frequently validated and independently reproduced. Specific diagnosis also seems to be possible using blood cell sub-types and microvesicles, including exosomes. However, purifying specific blood cells or vesicles from patient specimens adds a time-consuming experimental step.

A multiplexed test of a small set of miRNAs appears to be a likely and reasonable scenario for large-scale application of blood-borne miRNAs in clinical settings.