One of the most feared complications following gastrointestinal surgery is anastomotic leakage (AL). AL is a complication with potentially dire consequences. Studies have shown an increased length of hospitalization, increased health expenses, increased risk of recurrence, reduced mobility, and reduced survival [1, 2]. AL is a multifactorial complication associated with several risk factors: Male gender, blood transfusion, steroids, nonsteroidal anti-inflammatory drugs, bacterial microbiome, and lack of surgical experience [3,4,5,6]. However, one of the most important reasons for AL is inadequate visceral perfusion [7,8,9]. Thus, securing good perfusion of the anastomosis is a vital piece of the puzzle towards reducing the rate of AL.

Traditionally, perfusion assessment has been performed visually and manually by surgeons. They rely on the color of the tissue, peristalsis, bleeding from the resection line, and palpation of the mesenteric pulse to determine if the visceral perfusion is acceptable [10]. However, these methods are limited when performing minimally invasive surgery [10]. Furthermore, this traditional assessment is subject to interpretation by the surgeon, as demonstrated in studies showing that surgeons’ sensitivity and specificity for predicting AL are low [11]. Hence, there is a need for a new method to evaluate visceral perfusion.

A strong contender for this method is fluorescence angiography (FA) with indocyanine green (ICG). FA can be divided into two methods: Visual fluorescence angiography (V-ICG), where the surgeon assesses the fluorescence signal, and quantitative fluorescence angiography (Q-ICG), where a computer algorithm interprets the fluorescence signal. In Q-ICG, the fluorescence signal is translated into a fluorescence–time curve. Subsequently, a computer algorithm can calculate different Q-ICG parameters (Fig. 1). The various Q-ICG parameters then reflect the perfusion of the examined tissue [12, 13].

Fig. 1

An illustration of the different Q-ICG parameters and how they are calculated. The parameters are divided into two categories: (1) The Intensity category: Fmax Maximum fluorescence intensity, R-Fmax relative maximum fluorescence intensity, and plateau intensity. (2) Inflow category: ttp Time-to-peak, slp slope, norm slp normalized slope, t1/2max time to 50% of the maximum fluorescence intensity, TR time ratio, Tmax time from ICG injection to maximum fluorescence, and T0 time to first fluorescence signal
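As a rough illustration of how the parameters in Fig. 1 could be derived from a sampled fluorescence–time curve, the sketch below uses plain Python. The exact formulas are not given in this text, so the definitions here are assumptions: the first sample is taken as background, T0 is the time at which the curve first exceeds 5% of its rise (the hypothetical `onset_fraction`), ttp and t1/2max are measured from T0, slp is the rise divided by ttp, norm slp is expressed in percent of the rise per second, and TR is t1/2max divided by ttp. The function name and the sample data are likewise hypothetical.

```python
from typing import Dict, List


def qicg_parameters(times: List[float], intensity: List[float],
                    onset_fraction: float = 0.05) -> Dict[str, float]:
    """Derive Q-ICG parameters from one fluorescence-time curve.

    times: seconds since ICG injection; intensity: fluorescence in AU.
    onset_fraction is an assumed threshold for "first fluorescence signal".
    """
    baseline = intensity[0]            # assumption: first sample is background
    fmax = max(intensity)              # raw peak intensity (AU)
    i_max = intensity.index(fmax)
    rise = fmax - baseline             # peak height above background
    if rise <= 0:
        raise ValueError("curve never rises above its baseline")

    def first_crossing(level: float) -> float:
        # Linearly interpolated time at which the curve first reaches `level`.
        for i in range(1, i_max + 1):
            lo, hi = intensity[i - 1], intensity[i]
            if lo < level <= hi:
                frac = (level - lo) / (hi - lo)
                return times[i - 1] + frac * (times[i] - times[i - 1])
        raise ValueError("level not reached before the peak")

    t0 = first_crossing(baseline + onset_fraction * rise)
    tmax = times[i_max]                # injection to peak
    ttp = tmax - t0                    # assumed: first signal to peak
    t_half = first_crossing(baseline + 0.5 * rise) - t0
    return {
        "Fmax": fmax, "T0": t0, "Tmax": tmax, "ttp": ttp, "t1/2max": t_half,
        "slp": rise / ttp,             # AU per second
        "norm_slp": 100.0 / ttp,       # percent of rise per second
        "TR": t_half / ttp,            # assumed time ratio
    }


times = [0, 5, 10, 15, 20, 25, 30, 35, 40]           # hypothetical samples (s)
curve = [20, 20, 20, 40, 100, 180, 220, 210, 200]    # hypothetical intensities (AU)
params = qicg_parameters(times, curve)
print(params)
```

A real Q-ICG system would need noise filtering and a more careful onset definition; the sketch only makes the relationship between the curve and the named parameters concrete.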

Reviews have found that surgeries assisted by V-ICG tend towards a lower risk of AL in both esophageal and colorectal surgery [14,15,16,17]. However, the reviews considered the results biased, as they mainly included retrospective and non-randomized studies. Two randomized controlled studies exist: one found a significantly lower rate of AL in patients who underwent V-ICG compared with the controls (14.4 vs. 25.7%, p = 0.04) when performing low colorectal anastomoses [18]. In contrast, the other found no significant difference between V-ICG and controls in the AL rate following colorectal resection [19]. Hence, several reviews of V-ICG have concluded that an objective (quantitative) methodology is needed to overcome the inherent observer bias of V-ICG [12, 14, 15, 17]. However, no gold standard regarding the optimal methodology and Q-ICG parameters has been established.

In this systematic review, we aimed to explore which Q-ICG parameters show the most robust association with clinical endpoints, such as re-resection or the occurrence of AL, and to search for the optimal methodology when performing Q-ICG.

Materials and methods

Search strategy

This study followed the PRISMA statement for systematic reviews, and the protocol was submitted to PROSPERO (reference number: CRD42020151477). This study required no approval from an institutional review board nor written consent. The search strategy was guided by the characteristics of our PICO-questions: Population: Clinical studies. Intervention: Q-ICG visceral perfusion assessment. Comparison: Different Q-ICG methodologies and parameters. Outcomes: Studies had to report a re-resection or the occurrence of AL.

We aimed for a very general search string so as not to risk excluding relevant studies. The search string was constructed using the PubMed search builder. Subsequently, the following databases were searched: PubMed, Embase, Scopus, and Cochrane. The search was performed on the 16th of April 2020. The following search string was used:

“(Fluorescence OR fluorescence angiography OR laser fluorescence videography OR fluorescence-Assisted Resection and Exploration OR near-infrared fluorescence angiography OR NIR fluorescence angiography OR angiography OR FLARE OR near-infrared imaging OR SPY OR near-infrared imaging OR video fluorescence OR Enhanced-Reality Video fluorescence) and (Indocyanine Green OR ICG OR Q-ICG) and (Quantitative Perfusion OR Perfusion angiography OR Perfusion monitoring OR Perfusion anastomotic OR Quantitative OR quantitative assessment OR Q-assessment OR Microcirculation)”

The selection process was assisted by an online tool (Rayyan®) [20]. Two investigators screened studies for inclusion and exclusion criteria. Any conflicts that arose during the evaluation were discussed in the author group to reach consensus. Finally, snowballing inclusion was performed through reference screening.

Eligibility criteria

The inclusion criteria were as follows: Studies had to relate a numerical Q-ICG parameter from the time–fluorescence curve to re-resection or AL. Studies had to assess the perfusion of gastrointestinal organs during a surgical procedure. Only English-language studies published in peer-reviewed journals were accepted. The exclusion criteria were: Wrong study design (case reports < 5 subjects, experimental animal studies, reviews, conference abstracts, and editorials) and ex vivo studies.

Quality assessment

The quality assessment was performed by two authors independently. Studies were evaluated using the Newcastle Ottawa Scale (NOS) for retrospective cohort studies [21].

Statistics

Normally distributed data were given as mean ± SD and non-normally distributed data as median with range. P values < 0.05 were considered significant. An unpaired two-sided t test was performed with IBM SPSS Statistics (v. 25, SPSS Inc, IBM, Chicago, IL, USA).

Results

Study characteristics

The initial search yielded a total of 1216 studies. Subsequently, 28 studies were included for a full-text screening, of which 13 were accepted for final inclusion [13, 22,23,24,25,26,27,28,29,30,31,32,33]. An overview of the process is depicted in a PRISMA flow diagram (Fig. 2).

Fig. 2

PRISMA flowchart depicting the literature search

The 13 studies included a total of 1918 patients. Q-ICG and V-ICG were performed in 1150 patients, while the remaining 768 patients were controls. The number of patients in the studies ranged from 9 to 657. The clinical studies were divided into the following fields: Two investigated esophageal surgery (n = 46) [27, 29], ten colorectal surgery (n = 1863) [13, 22,23,24, 26, 28, 30,31,32,33], and one gastrointestinal trauma surgery (n = 9) [25]. All studies were published during the period from 2015 to 2020. The surgical techniques varied among the studies; however, most of the studies used a laparoscopic technique (Table 1).

Table 1 Descriptive characteristics of clinical studies

Bias evaluation

The results of the NOS-bias evaluation showed a low to moderate level of evidence. Only four studies had a high level of evidence [13, 23, 30, 32], while the remaining nine had a poor level of evidence. Studies mainly scored low due to a lack of multivariate analysis, inadequate blinding, and an inadequate description of follow-up (Table 2).

Table 2 Quality assessment of clinical studies according to the Newcastle Ottawa Scale

Q-ICG methodology

Q-ICG was performed either intraoperatively or postoperatively based on video recordings. Most studies used video recordings; however, two studies did perform intraoperative Q-ICG, both assisted by the SPY Elite system [25, 31]. One study performed V-ICG transanally and examined the mucosal side of the anastomosis [22]. The remaining studies examined the serosal side of the anastomosis or of the tissue to be anastomosed.

The studies analyzed Q-ICG parameters of both the intensity category: Maximum fluorescence intensity (Fmax) and relative maximum fluorescence intensity (R-Fmax), and the inflow category: Time-to-peak (ttp), slope (slp), normalized slope (norm slp), time to 50% of the maximum fluorescence intensity (t1/2max), time ratio (TR), time from ICG injection to maximum fluorescence (Tmax), and time to first fluorescence signal (T0) (Fig. 1) [12, 13, 22, 30]. The relative Q-ICG parameters were calculated by selecting two or more regions of interest (ROIs). Many different FA systems were used, and the most popular was the SPY Elite system, utilized in four [23,24,25, 31] of 13 studies. The most popular Q-ICG parameters were the intensity-based ones (Fmax or R-Fmax), and six studies explored only one of these two parameters [23,24,25, 30,31,32]. A detailed description of the applied Q-ICG parameters is listed in Table 1.
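To make the idea of a relative parameter concrete, the sketch below computes R-Fmax from two ROIs in the same recording. It assumes that R-Fmax is the baseline-corrected peak of the target ROI expressed as a percentage of the baseline-corrected peak of a well-perfused reference ROI; the included studies may have defined it differently. The function name and the sample curves are hypothetical.

```python
from typing import Sequence


def relative_fmax(target: Sequence[float], reference: Sequence[float]) -> float:
    """R-Fmax in percent: target ROI peak relative to reference ROI peak.

    Assumption: the first sample of each curve is background fluorescence.
    """
    target_rise = max(target) - target[0]
    reference_rise = max(reference) - reference[0]
    if reference_rise <= 0:
        raise ValueError("reference ROI shows no fluorescence increase")
    return 100.0 * target_rise / reference_rise


# Hypothetical curves (AU) from one recording: the anastomotic site and a
# reference segment of bowel with presumed normal perfusion.
anastomosis = [20, 20, 40, 90, 120, 110]
reference = [20, 20, 80, 180, 220, 210]
r_fmax = relative_fmax(anastomosis, reference)
print(f"R-Fmax = {r_fmax:.1f}%")
```

Because both ROIs come from the same patient and the same recording, the ratio depends on how representative the chosen reference ROI is.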

The dosing of ICG was heterogeneous, as 9 of 13 studies used different dosing regimens. Most studies used a fixed dose ranging from 1.25 to 10 mg, while only two studies used a dosage dependent on body weight [13, 22]. Finally, the timing of fluorescence measurement was only reported in seven studies [13, 24, 26,27,28, 31, 33]. Five studies had a recording period of approximately five minutes following the administration of ICG [13, 26,27,28, 33]. The remaining two studies only performed Q-ICG once at a predetermined time point. These studies sampled an Fmax value once at 60 s post-ICG injection [24, 31].

Clinical endpoint–Re-resection

In all but two studies, the decision to perform a re-resection was based on V-ICG and not Q-ICG [25, 31]. In one study investigating laparoscopic colorectal resection with intraoperative Q-ICG using Fmax with no fixed cut-off, four patients (5.2%) had re-resection followed by an uncomplicated postoperative course [31]. In the other study, the repair of nine military trauma-related bowel injuries was assisted with perioperative Q-ICG. Arbitrary cut-off values of R-Fmax > 25% or Fmax > 6AU were used, resulting in re-resection in 3 patients (33.3%) [25].

In four studies, Q-ICG was performed post-surgery to determine if patients who underwent re-resection could be distinguished from the patients without re-resection by a Q-ICG system. In one study, no significant difference in Fmax was observed at the re-resection site in colorectal patients (169 vs. 192 AU, p = 0.11) [23]. Another study reported a lower Fmax in colorectal patients with re-resection but did not include a statistical analysis [24]. A third study analyzed Fmax and ttp in the gastric phase of esophagectomy but did not find a significant difference in re-resection patients [27]. Finally, a single study found a significantly lower Fmax in colorectal patients who had re-resection (51.7 vs. 82 AU, p < 0.05) [32].

Clinical endpoint–anastomotic leakage

Seven studies (n = 652) analyzed Q-ICG parameters in patients with AL and compared them with patients without AL [13, 22, 26, 28,29,30, 33]. The studies analyzed the following Q-ICG parameters: Fmax, slp, ttp, t1/2max, TR, Tmax, and T0. Six studies investigated colorectal AL [13, 22, 26, 28, 30, 33], and 1 looked at venous anastomotic failure during esophagectomy [29] (Table 3).

Table 3 Comparison of Q-ICG parameters in AL and non-AL patients

The Fmax parameter was examined in five studies (n = 317) [13, 26, 29, 33]. However, only one study (n = 112) had a significant difference in Fmax, comparing AL patients with non-AL patients (38.14 vs. 91.1AU, p < 0.001) [33].

The slp parameter was examined in three studies (n = 220). Two studies found a significantly different slp value between the AL and non-AL groups (0.98 vs. 3.6 AU/s, p = 0.009) [33] and (0.7 vs. 2.5 AU/s, p = 0.001) [13]. However, Hayami et al. (n = 22), who selectively included only patients with a high risk of AL, did not find a significant difference (3.4 vs. 5.5 AU/s, p = 0.27) [26].

The t1/2max parameter was examined in four studies (n = 246). Three studies found significantly different t1/2max values between the AL and non-AL group (p = 0.001; < 0.001; < 0.001) [13, 29, 33]. However, again, Hayami et al. did not find a significant difference (13.3 vs. 7.8 s, p = 0.12).

The ttp parameter was examined in six colorectal studies (n = 674) [13, 22, 26, 28, 30, 33]. Three studies found significantly different ttp values between AL and non-AL patients (p = 0.04; 0.001; 0.01) [13, 22, 33]. The remaining three studies did not find a significant difference (p = 0.09; 0.85; 0.33) [26, 28, 30]. Unlike the other studies, one study only performed FA before the creation of the anastomosis [28].

The T0 parameter was examined in three studies (n = 159) [26, 28, 33]. One study did not find any statistical difference between patients with and without AL, nor did it report any numerical T0 values [33]. Another study found a significantly longer T0 in patients with AL (64.3 s) compared with non-AL patients (18.3 s, p = 0.002) [26]. Finally, one study found a significantly longer T0 in patients with AL (37.5 s) compared with non-AL patients (11.0 s, p = 0.03) [28]. Data for the remaining Q-ICG parameters are listed in Table 3.

Q-ICG cut-off values–anastomotic leakage

Three studies performed a ROC analysis as a method to establish Q-ICG cut-off values associated with AL [13, 29, 33]. One study examining colorectal resections (n = 86) calculated an area under the curve (AUC) of 0.12, 0.96, and 0.93 at the following cut-off values for AL: slp < 0.7 AU/s, t1/2max > 18 s, and TR > 0.6 [13]. Subsequently, the authors performed a multivariate analysis and calculated an odds ratio of 130.8 for AL if TR > 0.6 (p = 0.002). Another colorectal study (n = 112) also performed a ROC analysis and found that Fmax and slp performed better than both ttp and t1/2max (no statistical data published) [33]. Furthermore, this study performed a sensitivity analysis and proposed the following cut-off values for anastomotic leakage: Fmax > 52 AU, ttp > 57 s, t1/2max > 14 s, and slp < 2.1 AU/s [33]. Finally, an esophageal study (n = 26) found an AUC of 0.82 for venous anastomotic failure at a cut-off value of t1/2max > 9.6 s [29].
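The general mechanics of such a ROC analysis can be sketched as follows. This is not the procedure of any included study: the AUC is computed here as the probability that a patient with AL has a higher parameter value than a patient without AL, and the cut-off is chosen with the Youden index, which is an assumption because the studies' selection rule is not described in this text. All values are invented for illustration and the function names are hypothetical.

```python
from typing import List, Tuple


def auc(leak: List[float], no_leak: List[float]) -> float:
    """AUC for the rule 'higher value means leak' (ties count one half)."""
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in leak for b in no_leak)
    return wins / (len(leak) * len(no_leak))


def youden_cutoff(leak: List[float],
                  no_leak: List[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximising the Youden index
    for the decision rule 'value > cutoff predicts leak'."""
    best = (float("nan"), 0.0, 0.0)
    best_j = -1.0
    for c in sorted(set(leak + no_leak)):
        sens = sum(v > c for v in leak) / len(leak)
        spec = sum(v <= c for v in no_leak) / len(no_leak)
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best = j, (c, sens, spec)
    return best


# Invented t1/2max values in seconds; NOT data from any included study.
t_half_leak = [19.0, 22.5, 16.0, 25.0]
t_half_no_leak = [8.0, 10.5, 12.0, 9.5, 17.0, 11.0]

area = auc(t_half_leak, t_half_no_leak)
cutoff, sensitivity, specificity = youden_cutoff(t_half_leak, t_half_no_leak)
print(f"AUC = {area:.2f}, cut-off: t1/2max > {cutoff} s")
```

Under this convention, a parameter that is lower in AL patients gives an AUC below 0.5, and the same discrimination in the opposite direction is described by 1 minus that value.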

Discussion

This systematic review found that Q-ICG can identify patients with a significantly increased risk of AL; however, only if performed with the correct methodology and Q-ICG parameters. On one hand, studies that analyzed the inflow parameters (slp, T0, and t1/2max) could significantly differentiate between AL patients and non-AL patients in most of the studies. The ttp parameter, while also an inflow parameter, had conflicting results, as three studies found statistical evidence for an association with AL [13, 22, 33], and three did not [26, 28, 30]. On the other hand, the intensity parameters (Fmax and R-Fmax) were only significantly associated with AL in 1 of 6 studies [33]. In addition, the Fmax parameter could not consistently differentiate between patients with re-resection and those without re-resection. Finally, a few studies performed a ROC analysis and found that the inflow parameter t1/2max performed best, with AUCs of 0.96 [13] and 0.82 [29] at identifying AL and venous anastomotic failure, respectively.

The optimal parameter(s) and methodology for Q-ICG for visceral perfusion evaluation

The foundation of Q-ICG rests upon the hypothesis that Q-ICG parameters accurately reflect the underlying visceral tissue perfusion.

This hypothesis is supported by the results of several exploratory animal studies [12, 34,35,36,37]. These studies have found a significant linear correlation between various Q-ICG parameters (ttp, slp, and Fmax) and levels of local lactate or radioactive microspheres under both normal [12] and compromised perfusion conditions [34,35,36,37]. One study directly comparing several Q-ICG parameters in a porcine model found that the Fmax parameter had less robust correlation than the slp and norm slp parameters (Fmax r = 0.78, p = 0.037; slp r = 0.97, p = 0.001, and norm slp r = 0.96, p = 0.004) [12].

Inflow versus intensity parameters

The animal studies found that both inflow and intensity Q-ICG parameters do correlate with visceral perfusion given the controlled settings of animal experiments. However, when evaluating the results of the clinical studies of this review, the inflow parameters perform better than intensity parameters. The following section explores some likely reasons for the poor performance of the intensity parameter.

The inherent problems with the intensity parameters quickly become noticeable in a clinical setting. The problems arise when either the ICG plasma concentration, amount of excitatory light, camera distance, or camera angulation is changed as intensity parameters are influenced by all [13, 38]. Achieving a uniform plasma concentration across different patients is near impossible due to differences in metabolization, vascular status, and distribution volume [39]. However, using a body-mass-adjusted dosage can reduce the impact of these factors. Unfortunately, only two of the included studies did this [13, 22], while the remaining used varying static dosages (1.25–10 mg) or did not disclose their dosage.

Furthermore, Fmax is susceptible to timing bias, as ICG will with time begin to enter even ischemic tissue due to capillary diffusion/retrograde flow [13, 40]. This may have been the case in two studies that sampled the Fmax value only once, at 60 s following injection [24, 31]. One might argue that an R-Fmax parameter would be less sensitive to changes in ICG plasma concentration, camera distance/angulation, and vascular status, as each patient becomes their own reference. However, R-Fmax still suffers from the problem of capillary diffusion/retrograde flow and from the need to select representative reference ROIs. One clinical study that analyzed R-Fmax could not find a difference in the AL rate between patients with R-Fmax > 50% and patients with R-Fmax < 50% at the anastomotic site [30]. This combination of factors is likely to be the reason for the poor performance of Fmax concerning clinical endpoints.

On the contrary, a significant strength of inflow parameters is that variables that massively impact the intensity parameters have far less of an impact on the inflow parameters. Inflow parameters are more resilient because they depend on the timing of “perfusion events” rather than the “numerical intensity values” determined by the Q-ICG system (Fig. 1). Another strength of the inflow parameters is that unlike the intensity parameters, they are not subject to the capillary diffusion/retrograde flow problem. It has been reported that the introduction of V-ICG initially increased the rate of surgical complications. The increase may have been a result of capillary diffusion/retrograde flow leading surgeons to misdiagnose the fluorescence angiogram as adequate perfusion even though it was judged to be deficient by traditional perfusion assessment [13].
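The argument can be illustrated with a small numerical sketch. It assumes, as a simplification, that a change in camera distance or excitation light scales the whole fluorescence–time curve by a constant factor; real recordings also contain noise and offsets that this ignores. The helper `time_to_fraction` and the curves are hypothetical.

```python
from typing import List


def time_to_fraction(times: List[float], intensity: List[float],
                     fraction: float) -> float:
    """Interpolated time at which the curve first reaches `fraction` of its
    rise above baseline (the first sample is assumed to be background)."""
    baseline = intensity[0]
    level = baseline + fraction * (max(intensity) - baseline)
    for i in range(1, len(intensity)):
        lo, hi = intensity[i - 1], intensity[i]
        if lo < level <= hi:
            frac = (level - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("curve never rises above its baseline")


times = [0, 5, 10, 15, 20, 25, 30, 35, 40]            # seconds after injection
near = [20, 20, 20, 40, 100, 180, 220, 210, 200]      # hypothetical curve (AU)
far = [0.4 * v for v in near]                         # same tissue, weaker signal

for label, curve in (("near", near), ("far", far)):
    fmax = max(curve)
    t_half = time_to_fraction(times, curve, 0.5)       # time to 50% of max
    t_peak = time_to_fraction(times, curve, 1.0)       # time to max
    print(f"{label}: Fmax = {fmax:.0f} AU, "
          f"t1/2max = {t_half:.2f} s, Tmax = {t_peak:.2f} s")
```

In this idealized case the peak intensity falls with the weaker signal while the timing-based values are unchanged, which is the property that makes timing-based parameters attractive when imaging conditions vary between patients.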

The superiority of inflow parameters is supported by theoretical work in computer models. One study identified the inflow phase of the FA as carrying the most information about tissue perfusion [41]. In another study, the gastric conduit was modeled, and the “time-to-20% of max intensity” was the best way to discriminate between sufficient and impaired perfusion, with an AUC of > 0.85 [42].

Finally, while T0 is defined by the inflow phase of the fluorescence–time curve, it suffers from an ambiguous definition as “time to first fluorescence signal.” Thus, it can be difficult to quantify due to background noise. Furthermore, the T0 parameter is likely to be unstable in a clinical setting, as it is altered by peripheral vs. central venous catheter, infusion speed, the height of the person, and vascular status [43]. Hence, we suggest exploring one of the other inflow Q-ICG parameters (TR, ttp, slp, and t1/2max).

Challenges and perspectives for Q-ICG in the future

Q-ICG has the potential to become a viable surgical tool; however, Q-ICG is faced with a couple of challenges.

Firstly, the camera and target ROI must be stationary to allow for quantification. A stationary camera can be achieved by either fixating the laparoscope/ICG camera in a mechanical holding arm or by locking the camera during robotic surgery. The target can be kept stationary by inducing a brief period of apnea. Another method would be the use of motion tracking, which could correct the image position based on the motion of the camera or the target [44]. Secondly, repeated perfusion measurements are valuable in many different clinical situations. Unfortunately, it takes approximately 15–20 min for the fluorescence signal to decrease to a negligible level. Thus, rapidly repeated Q-ICG and especially V-ICG measurements may not be viable, as the increased fluorescence intensity oversaturates the image, causing problems for both the naked human eye and computer software. However, this challenge might be overcome by utilizing normalization, as one study showed that Q-ICG measurements could be reliably repeated if based on the normalized slp [45]. Thirdly, Q-ICG still lacks verified procedure-specific cut-off values, which are necessary for swift, objective decision making. There was little agreement among the colorectal studies included in this review on cut-off values and on which parameter to use. Thus, we suggest a focus on the inflow parameters and relative inflow parameters. Relative Q-ICG parameters are not well explored in either animal or clinical studies. However, they offer multiple benefits, as the patients become their own control, and cut-off values based on relative parameters are more valid when comparing different patients [38].

Fourthly, there is a lack of commercial Q-ICG systems. This lack of industry-supported FA systems with Q-ICG options may be a consequence of manufacturers' legal concerns regarding their accountability for clinical decisions based on their Q-ICG systems. Fortunately, it seems that these concerns have been put to rest, as newly developed FA systems do support intraoperative Q-ICG: SPY-Elite® [46] and VisionSense® [47]. However, these systems still only employ intensity parameters. Thus, we urge manufacturers of FA systems to incorporate a method for easy and intuitive intraoperative Q-ICG measurement of inflow parameters. A final challenge is that some surgeons believe that surgical experience and the naked human eye are superior to technology-based techniques for perfusion assessment [48]. However, evidence suggests that surgeons assisted by a Q-ICG system can identify anastomotic sites with superior perfusion compared with unassisted surgeons [40, 49]. Furthermore, intraoperative Q-ICG might be more accessible than some surgeons believe, as it was found that real-time intraoperative Q-ICG with a plug-in device was both feasible and had an excellent usability score during gastroesophageal resection [50].

Limitations

A limitation of this review is that only 2 of 13 studies [25, 31] performed intraoperative Q-ICG. Hence, it was not possible to directly compare Q-ICG with V-ICG or traditional perfusion assessment. However, intraoperative Q-ICG has been shown to increase sensitivity, specificity, and positive and negative predictive values compared with V-ICG and traditional perfusion assessment for identifying ischemic bowel [51]. Furthermore, while it is unfortunate that only a few studies utilized intraoperative Q-ICG, the focus of this review was to identify the determining factors for whether Q-ICG does reflect clinical endpoints.

Another limitation was the heterogeneity between the studies with regard to the patient population, surgical procedures, ICG methodology, Q-ICG parameters, and endpoints. This is reflected in the bias evaluation, where only 4 of 13 studies achieved a high level of evidence (Table 2). Furthermore, most of the studies had relatively small patient populations, and only three studies included a historical control cohort [24, 30, 32]. No RCT studies were present, and only one study performed propensity matching to ensure that the historical cohort and the V-ICG cohort were comparable [32]. Propensity matching had a substantial impact on the difference in AL rate between V-ICG patients and the control cohort (AL rate in control cohort 6.9% before matching and 14.7% following matching). Among the studies looking at AL rates, Q-ICG measurements were repeated following re-resection in only two [29, 30]. Consequently, re-resected patients brought false Q-ICG values into the analysis of the AL rates, thus introducing bias. Finally, all included studies only looked at the arterial phase of perfusion. When dividing the mesentery, the venous outflow may also be impaired, leading to venous congestion, which also may contribute to poor anastomotic healing [52].

This comprehensive list of biases and differences between studies is a global challenge for fluorescence angiography during surgery. A recent meta-analysis exploring V-ICG in colorectal cancer reported similar results with a moderate level of bias and an intrinsic heterogeneity between the studies [53]. Ultimately, the authors believe that this heterogeneity is a consequence of the lack of a gold standard within the field of fluorescence angiography.

Conclusion

The results, while heterogeneous, all seem to point in the same direction. Q-ICG parameters based only on fluorescence intensity (Fmax and R-Fmax) are unstable and do not reflect clinical endpoints. In contrast, the inflow parameters (ttp, slp, T0, and t1/2max) are resilient to much of the variance that occurs when performing Q-ICG. The inflow parameters also correlate better with perfusion in animal models and have a stronger association with clinical endpoints. Regarding the Q-ICG methodology, we recommend that future studies utilize a body-mass-adjusted ICG administration and a fixated camera setup. We believe that standardization of the methodology is necessary to establish a gold standard within the field of Q-ICG.