
Introduction

Central serous chorioretinopathy (CSC) is characterized by serous retinal detachment (SRD) and/or pigment epithelial detachment (PED), most often in the macular region. It is usually associated with fluid leakage through a defect in the retinal pigment epithelium (RPE) into the subretinal space. CSC generally resolves spontaneously within a few months with minimal sequelae, but chronic disease may lead to widespread RPE damage, photoreceptor death, and permanent vision loss [1]. It is a leading threat to vision in middle-aged men [2].

Optical coherence tomography (OCT) and fundus fluorescein angiography (FFA) are well-established tools for diagnosing CSC. OCT provides objective, quantitative, but static morphologic information about the retina for disease diagnosis and treatment monitoring [3]. FFA provides detailed information on the structural and hemodynamic changes of the chorioretinal vasculature to detect leakage and blockage. It can sometimes help distinguish CSC from other retinal diseases through the classic FFA patterns of gradually expanding leakage, described as “ink-blot” or “smoke-stack” [4,5,6].

Typically, CSC is a self-limited process; hence, observation is the appropriate first-line approach [1, 2]. For refractory, persistent, or recurrent CSC, treatment options such as anti-vascular endothelial growth factor (VEGF) therapy, verteporfin photodynamic therapy (PDT), laser photocoagulation, and subthreshold micropulse laser (HSML) should be considered [5]. Although PDT stands out as the best available option, with positive randomized controlled trial data to support its use [7], there is solid evidence that laser photocoagulation hastens the resolution of SRD, which matters because long-lasting SRD could increase photoreceptor damage [8,9,10]. Given the risk of scarring and choroidal neovascularization (CNV) induced by laser photocoagulation, precise identification of the CSC leakage point(s) on FFA is paramount for safety and effectiveness. Accurate detection of the leakage point(s) on FFA is therefore crucial for both diagnosis and therapy.

The literature is dominated by papers on using artificial intelligence (AI) to detect common retinal lesions from color fundus photographs, including diabetic retinopathy [11,12,13,14,15,16], age-related macular degeneration [17, 18], and glaucoma [19, 20]. Similarly, AI-assisted detection of macular lesions from OCT images, such as epiretinal membrane [21, 22], macular edema [21,22,23,24,25], macular hole [22, 23], and age-related macular degeneration [23,24,25,26], has been well reported. Nonetheless, there is a paucity of literature on the use of AI for dynamic lesion detection from FFA images. Because FFA is a time sequence-based imaging procedure, lesion detection is more valuable when dynamic change is taken into consideration [27].

This study aims to use a deep learning algorithm (DLA) to automatically detect leakage points from dynamic FFA images to aid clinical diagnosis and improve treatment accuracy.

Methods

The key steps in the detection of FFA leakage points are summarized in Fig. 1. We used two networks, one for leakage segmentation and the other for macula and optic disk (OD) region segmentation. The macula and OD were used as the fixed anatomic landmarks, serving as the reference points to localize the leakage points. Then a false-positive elimination procedure was applied by analyzing the change of the spatially corresponding leakage points.

Fig. 1. The procedure for detecting CSC leakages in the FFA sequences. Abbreviations: FFA, fundus fluorescein angiography; OD, optic disk; AGN, attention gated network; CSC, central serous chorioretinopathy

Dataset and annotation

This was a retrospective cross-sectional study of patients from the Eye Center at the Second Affiliated Hospital of Zhejiang University from November 2017 to July 2019. All patients underwent FFA using tabletop systems HRA-II at 30° (Heidelberg Engineering, Heidelberg, Germany) at 768 × 768 pixels. The diagnosis of CSC was made based on clinical, FFA, and OCT findings. Patients with prior treatment (e.g., laser treatment) or other fundus diseases (e.g., diabetic retinopathy, age-related macular degeneration, choroidal neovascularization) were excluded. FFA images of poor quality due to extensive retinal hemorrhage and media opacities were excluded.

Three ophthalmologists (KJ, YX, and YW) independently delineated the borders of the leakage points on FFA. The final ground truth map was generated by assessing interobserver consistency and reaching consensus through discussion. A binary map for each FFA image was created using the flood-fill algorithm, which identified the annotated borders and then filled the areas inside and outside the borders with two specified colors. The manual annotations provided the ground truth for algorithm training and evaluation. The FFA images in this study were digitally normalized for luminosity and contrast using an enhancement algorithm. The OD and macula regions were annotated using the same method.
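The flood-fill step described above can be illustrated with a minimal sketch. It assumes the annotated border is available as a 2D grid in which 1 marks a border pixel; the function name `border_to_mask` and the choice of 4-connectivity are illustrative assumptions, not details taken from the study. The background is filled from the image frame, and every pixel that the fill cannot reach (the border and its interior) is labelled as lesion.

```python
from collections import deque


def border_to_mask(border):
    """Turn a closed border (1 = border pixel) into a filled binary mask.

    The background is flood-filled from the image frame; every pixel the
    fill does not reach (border plus interior) is labelled 1.
    """
    h, w = len(border), len(border[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the fill with every non-border pixel on the image frame.
    for y in range(h):
        for x in range(w):
            on_frame = y in (0, h - 1) or x in (0, w - 1)
            if on_frame and border[y][x] == 0:
                outside[y][x] = True
                queue.append((y, x))
    # Breadth-first fill using 4-connectivity (an assumption).
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and not outside[ny][nx] and border[ny][nx] == 0):
                outside[ny][nx] = True
                queue.append((ny, nx))
    return [[0 if outside[y][x] else 1 for x in range(w)] for y in range(h)]
```

Filling from the outside avoids having to pick a seed point inside each annotated contour, which would otherwise be needed for every lesion.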

Leakage detection

We used an attention gated network (AGN) for leakage segmentation, constructed by integrating an attention gate model into a convolutional neural network (CNN). The aim of the attention gate is to suppress feature activations in irrelevant regions. Compared with a standard CNN model, AGN has been shown to achieve better performance in a variety of medical image analysis tasks [28].

Multiple patches were generated from different locations of the retina using a sliding window with a size of 256 × 256 pixels and a stride of 128 pixels (illustrated in Fig. 1). As a result, each 768 × 768 FFA image was cropped into 25 patches (5 window positions along each axis). The patches were then input into the AGN, whose architecture is shown in Fig. 1. The successively connected convolutional layers transformed the input image and propagated the output to the next layer. Four attention gates were used to filter the features propagated through the skip connections. Cross entropy was used as the loss function and stochastic gradient descent (SGD) as the optimization algorithm. The batch size was set to 64. The initial learning rate was set to 0.01 and dropped to 10⁻³ and 10⁻⁴ at epochs 40 and 50, respectively. The weight decay coefficient and the momentum were set to 10⁻⁴ and 0.9, respectively. The resulting output was a probability map giving a probability value (from 0 to 1) for each pixel of the FFA image.
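The patch count and the step-wise learning rate can be checked with a short sketch. The function names are hypothetical, and the exact epoch at which each drop takes effect (here, from the start of epochs 40 and 50, counting from 0) is an assumption, since the text does not specify the indexing.

```python
def patch_origins(image_size=768, patch=256, stride=128):
    """Top-left (row, col) of every sliding-window patch in a square image."""
    starts = range(0, image_size - patch + 1, stride)
    return [(r, c) for r in starts for c in starts]


def learning_rate(epoch, base=0.01, milestones=(40, 50), factor=0.1):
    """Step schedule: multiply the rate by `factor` at each milestone passed."""
    drops = sum(epoch >= m for m in milestones)
    return base * factor ** drops


origins = patch_origins()
print(len(origins))  # 5 positions per axis -> 25 patches
```

With a stride of half the window size, every interior pixel is covered by more than one patch, so overlapping predictions would have to be merged (for example, by averaging) when the patches are reassembled; how the study merged them is not stated.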

A range of threshold values (from 0 to 1) was then applied to the probability map to create detection maps. Each pixel whose probability exceeded the threshold was classified as positive, and each connected component of positive pixels was considered a detected lesion. Thus, every threshold value had a corresponding detection map.
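A minimal sketch of this thresholding and grouping step follows. The function name `detect_lesions` is hypothetical, and 8-connectivity is an assumption, since the connectivity rule is not stated in the text.

```python
from collections import deque


def detect_lesions(prob_map, threshold=0.5):
    """Threshold a probability map and group positive pixels into lesions.

    Returns a list of lesions, each a list of (row, col) pixels.
    Uses 8-connectivity, which is an assumption of this sketch.
    """
    h, w = len(prob_map), len(prob_map[0])
    seen = [[False] * w for _ in range(h)]
    lesions = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or prob_map[y][x] <= threshold:
                continue
            # Start a new connected component at this positive pixel.
            seen[y][x] = True
            queue, component = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx]
                                and prob_map[ny][nx] > threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            lesions.append(component)
    return lesions
```

Calling the function once per threshold value yields the family of detection maps from which a precision-recall curve can be traced.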

Three metrics were used to evaluate model performance across thresholds: dice, precision, and recall. They are calculated from the numbers of true positives (TP), false positives (FP), and false negatives (FN) as follows:

$$ \begin{array}{c} \mathrm{Precision} = TP/\left( TP+ FP\right)\\ \mathrm{Recall} = TP/\left( TP+ FN\right)\\ \mathrm{Dice} = 2\cdot TP/\left(2\cdot TP+ FP+ FN\right)\end{array} $$

TP denotes the region present in both the detection result and the ground truth. FP denotes the region detected as leakage but absent from the ground truth. FN denotes the region present in the ground truth but not detected by the model. Recall is equivalent to the TP rate, while precision is TP divided by the sum of TP and FP. Dice represents the degree of overlap between the detection result of the proposed model and the ground truth. A precision-recall curve (PR curve) was plotted to represent model performance, with recall on the X axis and precision on the Y axis.
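As a sketch, the three metrics can be computed from a pair of binary maps as follows. The function name `pixel_metrics` and the conventions for empty maps (0 for precision and recall, 1 for dice when both maps are empty) are assumptions made here for illustration.

```python
def pixel_metrics(pred, truth):
    """Pixel-level precision, recall, and dice for two binary maps."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1   # detected and in the ground truth
            elif p:
                fp += 1   # detected but not in the ground truth
            elif t:
                fn += 1   # in the ground truth but missed
    # Empty-map conventions below are assumptions of this sketch.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return precision, recall, dice
```

For example, a detection with 2 TP, 1 FP, and 1 FN pixels gives precision 2/3, recall 2/3, and dice 4/6 = 2/3.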

Elimination of FP based on time sequence

To eliminate FP, an elimination procedure is needed. This procedure involves the following steps.

The field of view generally varies between frames in an FFA report, so the macula and the OD were used as fixed anatomic landmarks, serving as reference points to localize the leakage points. A U-net [29] was therefore built to segment the macula and OD simultaneously. Cross entropy was chosen as the loss function for both the macula and OD areas. The optimizer was adaptive moment estimation (Adam). The batch size was set to 4. The initial learning rate was set to 0.01 and dropped to 10⁻³ and 10⁻⁴ at epochs 45 and 60, respectively. The weight decay coefficient and the momentum were set to 10⁻⁴ and 0.9, respectively. The center of the detected macula region was taken as the center of the fovea. A circle with a radius of 100 pixels centered on the OD was masked as an area of exclusion, because the leakage points of CSC are predominantly located around the macula region. This exclusion improved detection performance by reducing sources of false targets.
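The landmark handling can be sketched as below. Both function names are hypothetical; taking the region center as the centroid of the segmented pixels, and clearing every detection pixel inside the circle, are assumptions about how the center estimate and the exclusion mask might be implemented.

```python
def region_center(mask):
    """Centroid (row, col) of a binary region, or None if the region is empty."""
    pts = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    n = len(pts)
    return (sum(y for y, _ in pts) / n, sum(x for _, x in pts) / n)


def mask_od_circle(detection, od_center, radius=100):
    """Clear every detection pixel within `radius` pixels of the OD center."""
    cy, cx = od_center
    r2 = radius * radius
    return [
        [0 if (y - cy) ** 2 + (x - cx) ** 2 <= r2 else v
         for x, v in enumerate(row)]
        for y, row in enumerate(detection)
    ]
```

In use, `region_center` would be applied to the segmented OD and macula masks, and the OD center then passed to `mask_od_circle` before the remaining detections are analyzed.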

Each identified leakage is described by the indices α, m, n, and S. α is the angle between the line from the center of the macula to the leakage (whose length is m) and the line from the center of the OD to the leakage (whose length is n). S is the area of the leakage point. Within each FFA report, the same leakage point can be identified across images by comparing the angle and distance indices (Fig. 2), and the change in leakage area is given by the difference in the area index. A leakage point is considered true if it meets both of the following criteria: (a) it appears in spatially corresponding regions in consecutive imaging frames and (b) the difference in the area index between its first and last appearance is at least 500 pixels. The value of 500 pixels was determined by repeated trials during training. A value that is too small weakens the ability to eliminate FP, whereas a value that is too large causes true leakages to be misidentified as FP. We tested candidate values of 200, 300, 400, 500, 600, and 700 pixels; 500 pixels performed best as assessed by the FP detection rate.
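The descriptor and the two criteria can be sketched as follows. All names are hypothetical. The matching tolerances `angle_tol` and `dist_tol` are placeholders: the actual alignment conditions appear only in the dashed box of Fig. 2 and are not given in the text. Using the lesion centroid as its position, and reading criterion (b) as growth (last area minus first area), are further assumptions.

```python
import math


def describe_leakage(pixels, macula_center, od_center):
    """Return (alpha_deg, m, n, S) for one lesion given as (row, col) pixels."""
    s = len(pixels)
    ly = sum(y for y, _ in pixels) / s  # lesion centroid (assumed position)
    lx = sum(x for _, x in pixels) / s
    vm = (macula_center[0] - ly, macula_center[1] - lx)
    vo = (od_center[0] - ly, od_center[1] - lx)
    m, n = math.hypot(*vm), math.hypot(*vo)
    if m == 0 or n == 0:
        alpha = 0.0  # angle undefined when the lesion sits on a landmark
    else:
        cos_a = (vm[0] * vo[0] + vm[1] * vo[1]) / (m * n)
        alpha = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
    return alpha, m, n, s


def same_leakage(a, b, angle_tol=10.0, dist_tol=20.0):
    """Spatial match between two descriptors; tolerances are placeholders."""
    return (abs(a[0] - b[0]) <= angle_tol
            and abs(a[1] - b[1]) <= dist_tol
            and abs(a[2] - b[2]) <= dist_tol)


def is_true_leakage(track, min_growth=500):
    """track: descriptors of one leakage, in time order, one per frame."""
    if len(track) < 2:  # criterion (a): seen in more than one frame
        return False
    return track[-1][3] - track[0][3] >= min_growth  # criterion (b)
```

Because the descriptor is built from distances and an angle relative to the two landmarks, it is unaffected by translation or rotation of the field of view between frames, which is what allows matching without explicit image registration.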

Fig. 2. Illustration of the registration pattern, showing how a leakage point is localized and how spatially corresponding regions are identified in different FFA frames. The two FFA images are from the same FFA sequence. The top image shows the earlier frame from 0:35.13, and the bottom image shows the later frame from 5:35.48. Each identified leakage (green region) is marked with (α, m, n, S), where α is the angle between the line from the center of the macula (the red region) to the leakage (measured as m) and the line from the center of the OD (the blue region) to the leakage (measured as n), and S represents the area of the leakage. The leakage points are spatially aligned when the conditions listed in the dashed box are met, which means they are the same leakage point on the retina

Results

We retrospectively collected 2104 FFA images from 291 FFA sequences of 291 eyes (137 right eyes and 154 left eyes) from 262 patients (mean age 44.7 ± 7.9 years; male-to-female ratio 2.73:1). The dataset was split into a training set (1229 FFA images from 169 FFA sequences), a validation set (439 FFA images from 61 FFA sequences), and a test set (440 FFA images from 61 FFA sequences). Table 1 describes the demographics of the study subjects.

Table 1 Characteristics of the study population

The models were trained on a system with two NVIDIA 2080Ti graphics cards (11 GB memory each) and an Intel Core i7 processor. For testing, one NVIDIA 2080Ti graphics card was used, and the average execution time for a single test (including detection of the optic disk, macula, and leakage regions in one FFA sequence, as well as FP elimination) was approximately 900 ms.

Results of leakage detection

The training set consisted of 1229 FFA images from 169 FFA sequences. Each FFA image was cropped into 25 patches. Figure 3 shows the precision-recall curve of the proposed algorithm before FP elimination. Detection performance varies with the preset threshold value: as the threshold increases, precision increases while recall decreases. The precision-recall curve shows this trade-off across thresholds. At a threshold of 0.5, the dice value was 0.79.

Fig. 3. The precision-recall curves of the proposed algorithm before the FP elimination

In particular, lesion-level performance at the threshold of 0.5 was evaluated by counting the TP, FP, and FN results per case, which were determined by the overlap between the detection result and the ground truth. In 37 of 61 cases (60.7%), the detected leakages perfectly matched the ground truth. The detection results of the other 24 cases (39.3%) are listed in Supplementary Table 1. The dice value on the lesion level was 0.811.

Results of FP elimination

For segmentation of the macula and the OD by U-net, the training set consisted of 1229 FFA images from 169 FFA sequences. The dice values for the macula and OD regions were 0.92 and 0.85, respectively. To present the effect of the elimination procedure directly, detection maps obtained at a threshold of 0.5 before and after FP elimination are shown. Figure 4 shows an accurately detected example, and Figure 5 shows two examples that illustrate the effect of the elimination procedure on the detection maps. Supplementary Table 1 summarizes the cases in which the number of missed or wrongly detected leakage points improved after FP elimination at a threshold of 0.5. After the elimination procedure, 57 of 61 cases (93.4%) were accurately detected, with 27 of 31 FP lesions eliminated. The dice value on the lesion level improved to 0.949.

Fig. 4. Representative segmentation results of the proposed method. Four original FFA images of case 13 are shown in the top row, and the four images in the second row show the corresponding detection results before FP elimination. The larger image on the right illustrates the detection result after FP elimination. The green regions indicate the identified leakage, the blue regions indicate the segmented OD, and the red regions indicate the segmented macula

Fig. 5. Illustrative examples of the FP elimination procedure in case 18 (a) and case 30 (b). The detected regions indicated by the yellow arrows were eliminated by the FP elimination procedure

We also qualitatively compared the segmentation results for the leakage, optic disk, and macular regions against those of three ophthalmologists. As shown in Fig. 6, the proposed algorithm segments these three regions with performance comparable to that of the human experts.

Fig. 6. Qualitative comparison of segmentation results for leakage, optic disk, and macular region versus three ophthalmologists

Discussion

We proposed a novel method to automatically detect CSC leakages that occur during dynamic FFA imaging. The dice value on the lesion level reached 0.949 at the threshold of 0.5, up from 0.811 before the FP elimination. Fifty-seven of 61 cases (93.4%) were accurately detected. The improvement in detection performance suggests that the applied FP elimination procedure resulted in a satisfactory outcome. Some typical examples (cases 18 and 30) which benefited from the elimination procedure are presented in Fig. 5 to show the improved detection results. The detected areas were removed by the FP elimination procedure for these 2 cases as they did not meet the inclusion criteria for a true leakage point. The use of a single-frame detection strategy is limited by the problem of wrong or missed detection. The DLA used in this study yielded superior detection rates as it accounted for changes in the lesion with time in dynamic FFA imaging by using spatially corresponding regions.

To investigate the FPs that failed to be eliminated in this study, we manually reviewed the remaining FPs to determine their cause(s). Figure 7a shows an example with several closely spaced leakage points that fuse together in the late phase of angiography. Figure 7b shows an example with the wrongly detected leakage point first appearing in the last FFA frame. A possible explanation for this phenomenon is the use of unevenly paced image frames which are not uncommon in real-world clinical imaging. Applying a standardized imaging protocol may address this problem.

Fig. 7. Two illustrative examples in which the FP failed to be eliminated by the proposed algorithm. a In case 07, several closely spaced leakages fuse together in the late phase of angiography (5:54.52). b In case 25, the leakage area spreads too far to be identified as one single leakage

To detect the leakage point, an AGN was used. This is a deep learning model that is adept at leveraging salient regions in medical images. It improves model performance for global and dense label predictions by suppressing feature activations in irrelevant regions, even when the leakages are small. However, if the entire image is used as the input, the overwhelmingly higher number of negative pixels will overshadow the positive pixels and compromise training of the network. To resolve this problem, the images were cropped into patches with a size of 256 × 256 pixels and a stride of 128 pixels to focus on the area of interest. Several studies have successfully applied this strategy to analyze medical images in ophthalmology [30,31,32].

Few studies to date have addressed leakage detection or automatic diagnosis of CSC in FFA images. Zhao et al. [33] proposed a saliency-based framework to detect three types of FFA leakage in malarial retinopathy. They reported sensitivities of 98.1, 88.2, and 82.7% in detecting large focal, punctate focal, and vessel segment leakage, respectively. However, their work was based on single-frame FFA images without any temporal information. Because FFA is a time sequence-based imaging procedure, its interpretation is not reliable without comparisons between consecutive frames. Phillip et al. [34, 35] used one early and one late FFA frame to measure macular leakage in diabetic maculopathy based on the rate of fluorescence change. An accuracy of 97% was reported. However, that method required manual alignment during image acquisition and was unable to avoid FP. These FPs arise from lesions that exhibit increasing fluorescence with time, such as drusen and peripapillary scleral crescent, but are not true leakages. Moreover, using only two frames to detect leakage is inadequate for diagnosis in FFA interpretation. Rabbani et al. [27] subtracted the mean early frame from the mean late frame to detect leakage areas in FFA images of diabetic retinopathy patients. That study, which involved 24 eyes, had a mean accuracy of 86%.

The impetus for this study is the growing demand for FFA image-guided navigated laser photocoagulation systems such as NAVILAS [36,37,38,39]. Such a system uses imported FFA images as a reference to tailor and monitor laser applications, thus enabling precision and customization. The use of NAVILAS increased the microaneurysm hit accuracy of the laser photocoagulator from 72 to 92% in the treatment of diabetic macular edema [37] and permitted the resolution of subretinal fluid in 75% of patients with chronic non-resolving CSC [39]. However, the target region and the forbidden zone of the laser (the OD and macula region) were identified manually, and registration could not be automated. Although interpreting the leakage point from FFA is not a difficult task in most cases, there is plenty that AI can do. Conventionally, doctors analyze the leakage sites from FFA reports beforehand and then conduct the laser procedure from memory, which is slow and laborious and inevitably introduces deviation. Our main focus was not so much on avoiding misinterpretation of the FFA leakage point by doctors as on improving doctors' efficiency and reducing this deviation. We believe that accurate detection of true leakage points will help pave the way for that purpose.

In addition, as an invasive therapy, laser photocoagulation needs to be chosen and conducted very cautiously. We made the following efforts to minimize avoidable damage. First, the distance between the detected leakage and the center of the macula, measured as “m,” is intended to serve as a “gatekeeper” before laser treatment, in case normal retina near the macula is misidentified as a leakage point. When “m” is less than 500 μm, the leakage is highlighted and requires special approval before laser treatment. Second, the planned target region of a true leakage point should be traced back to the FFA frame where it first appeared, which our proposed method makes possible. For example, in case 13 shown in Fig. 4, the detected leakage first appeared at 1:02.64. The segmentation area in this frame would be the most appropriate laser target.
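The “gatekeeper” check amounts to a unit conversion followed by a comparison, as in the sketch below. The function name and the `microns_per_pixel` parameter are hypothetical: the text gives m in pixels elsewhere and the limit in μm, but does not state the image scale, so the scale must be supplied from the imaging device's calibration.

```python
def needs_special_approval(m_pixels, microns_per_pixel, limit_um=500.0):
    """Flag a leakage whose distance to the macula center is under the limit.

    `microns_per_pixel` is a hypothetical calibration value; it is not
    given in the text and depends on the imaging system and the eye.
    """
    return m_pixels * microns_per_pixel < limit_um
```

Keeping the scale as an explicit argument avoids hard-coding a conversion factor that would differ between devices and fields of view.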

It is worth noting that we expect our method to have wider clinical applications. Considering the complex temporal variation of retinal leakages, we began with typical CSC sequences showing an “ink-blot” or “smoke-stack” pattern. Given the promising results of the present study, the work could, after proper adjustment, be extended to detect leakages in other retinal diseases such as diabetic retinopathy. In addition, from another perspective, a “time sequence” could also be formed from imaging data acquired at successive follow-up visits. The interpretation of such a “time sequence” might help monitor the progression, recovery, or activity of a given disease.

There are some limitations to our study. First, it is a single-center pilot study with a small sample size. Further studies with a larger study population are needed to verify our results. Second, although the sliding window method showed promising results, bias is inevitably introduced by the predefined patch size and stride length. The optimal patch size is difficult to determine: a size that is too small increases the computational time per image, and a size that is too large weakens detection performance. Third, poorly aligned FFA frames may exclude the OD or macula region, making accurate lesion localization difficult and causing segmentation errors. Fourth, given the variation in the appearance and duration of the leakage point, FFA frames are taken at different times, so there is no fixed time sequence in the FFA images. This variable timing causes problems in segmentation and lesion localization. Applying a standardized imaging protocol can be part of the solution. However, the time variation arising from different blood circulation times in different subjects remains unresolved.

Our study demonstrated the benefits of using a DLA to analyze consecutive FFA images over a temporal sequence to automatically detect leakage points in CSC. The detection performance of the DLA is improved with the employment of a FP elimination procedure.