
1 Introduction

Throughout history, nature has been an inexhaustible source of inspiration for scientists and researchers. Observations, many of them made unintentionally, have stimulated inquisitive minds for hundreds of years. Because problem solving is ever-present in the minds of scientists, the impact of these observations is amplified, and in some cases they have led to discoveries. Among others, researchers in mathematics, physics and the natural sciences have had their fair share of ‘observations-leading-to-discoveries’.

Observing the magnificently choreographed movements of birds, the foraging behaviour of ants, the convergence of honey bees on a food source, and so forth has led several researchers to propose models (inspired by, rather than identical to, the natural systems) for solving various optimisation problems. Genetic Algorithms [10], Particle Swarm Optimisation [11] and Ant Colony Optimisation [8] are only a few such techniques belonging to the broader category of swarm intelligence, which investigates collective intelligence and aims at modelling intelligence by looking at individuals in a social context and monitoring their interactions with one another as well as with their environment.

The work presented here proposes a novel nature-inspired algorithm based on the behaviour of flies hovering over food sources. This model, Dispersive Flies Optimisation (DFO), is first formulated mathematically, and a set of experiments is then conducted to examine its performance on various problems.

Afterwards, an introduction to metastatic disease is given, along with a brief explanation of how metastasis is detected, and the swarm intelligence algorithm is adapted for the purposes of this research. Next, a brief summary of x-ray mammography and its use is presented, emphasising mammographic film reading as a particularly demanding visual task that could be facilitated by the technique presented in this paper.

2 Dispersive Flies Optimisation

Dispersive Flies Optimisation (DFO) is an algorithm inspired by the swarming behaviour of flies hovering over food sources. This swarming behaviour is determined by several factors, and the presence of a threat can disturb the flies' convergence on the marker (or the optimum value). Therefore, in addition to the formation of swarms over the marker, the breaking or weakening of swarms is also modelled in the proposed algorithm.

In other words, the swarming behaviour of the flies in Dispersive Flies Optimisation consists of two tightly connected mechanisms: the formation of the swarms and their breaking or weakening. The algorithm and the mathematical formulation of the update equations are introduced below.

The position vectors of the population are defined as:

$$\begin{aligned} \varvec{x}_{i}^{t} = \left[ x_{i1}^{t}, x_{i2}^{t}, \ldots , x_{iD}^{t}\right] ,\qquad i = 1,2,\ldots ,\textit{NP} \end{aligned}$$
(1)

where t is the current time step, D is the dimension of the problem space and \(\textit{NP}\) is the number of flies (population size).

In the first generation, when \(t=0\), the ith vector’s dth component is initialised as:

$$\begin{aligned} x_{\textit{id}}^{0} = x_{{\textit{min}},d} + r\left( x_{{\textit{max}},d} - x_{{\textit{min}},d}\right) \end{aligned}$$
(2)

where r is a random number drawn from a uniform distribution on the unit interval, \(U\left( 0,1\right) \); \(x_{{\textit{min}},d}\) and \(x_{{\textit{max}},d}\) are the lower and upper initialisation bounds in the dth dimension, respectively. Therefore, a population of flies is randomly initialised, with a position assigned to each fly in the search space.

On each iteration, the components of the position vectors are updated independently, taking into account the component’s current value, the corresponding value of the best neighbouring fly (using a ring topology), and the corresponding value of the best fly in the whole swarm:

$$\begin{aligned} x_{\textit{id}}^{t} = x_{\textit{nb},d}^{t-1} + \text {U}\left( 0,1\right) \times \left( x_{\textit{sb},d}^{t-1} - x_{\textit{id}}^{t-1} \right) \end{aligned}$$
(3)

where \(x_{\textit{nb},d}^{t-1}\) is the value of the best neighbouring fly in the dth dimension at time step \(t-1\); \(x_{\textit{sb},d}^{t-1}\) is the value of the swarm’s best fly in the dth dimension at time step \(t-1\); and \(\text {U}\left( 0,1\right) \) is a random number drawn from a uniform distribution on the unit interval.

The algorithm is characterised by two principal components: a dynamic rule for updating the flies’ positions (assisted by a social neighbouring network that informs this update), and communication of the best found position to the other flies.

As stated earlier, the swarm can be disturbed for various reasons; one positive impact of such disturbances is the displacement of the disturbed flies, which may lead to the discovery of a better position. To account for this eventuality, an element of stochasticity is introduced into the update process: an individual component of a fly’s position vector is reset whenever a random number, r, drawn from a uniform distribution on the unit interval \(\mathrm{U}\left( 0,1\right) \), is less than the disturbance threshold, dt. This guarantees a proportionate disturbance to what would otherwise be permanent stagnation over a likely local minimum.

Algorithm 1 summarises the DFO algorithm.

Algorithm 1: Dispersive Flies Optimisation (pseudocode)
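For concreteness, the following Python sketch puts Eqs. 2 and 3 and the component-wise disturbance together. It is a sketch rather than the reference implementation: the in-place update order, the boundary handling, the treatment of the swarm's best fly and the sphere example at the end are assumptions not fixed by Algorithm 1.

```python
import numpy as np

def dfo(f, lower, upper, NP=100, dt=0.001, max_evals=300_000, seed=None):
    """Minimal DFO sketch (Eqs. 1-3 plus disturbance); minimisation assumed."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    D = lower.size

    # Eq. 2: uniform random initialisation of every component within the bounds
    X = lower + rng.random((NP, D)) * (upper - lower)
    fitness = np.array([f(x) for x in X])
    evals = NP

    while evals < max_evals:
        sb = int(np.argmin(fitness))                  # index of the swarm's best fly
        for i in range(NP):
            # ring topology: the fitter of the two immediate neighbours
            left, right = (i - 1) % NP, (i + 1) % NP
            nb = left if fitness[left] < fitness[right] else right
            for d in range(D):
                if rng.random() < dt:
                    # component-wise disturbance: restart this dimension
                    X[i, d] = lower[d] + rng.random() * (upper[d] - lower[d])
                else:
                    # Eq. 3: move around the best neighbour, pulled towards the swarm's best
                    X[i, d] = X[nb, d] + rng.random() * (X[sb, d] - X[i, d])
            fitness[i] = f(X[i])
            evals += 1
            if evals >= max_evals:
                break

    best = int(np.argmin(fitness))
    return X[best], fitness[best]

# Example usage on a 30-dimensional sphere function (an assumption of this sketch)
if __name__ == "__main__":
    sphere = lambda x: float(np.sum(x ** 2))
    pos, err = dfo(sphere, lower=[-100] * 30, upper=[100] * 30,
                   max_evals=30_000, seed=1)
    print("best fitness:", err)
```

Note that each fly is updated in place here, so a fly may read an already-updated neighbour within the same iteration; a strict reading of Eq. 3 would use only the positions from time step \(t-1\).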

The next section briefly presents three population-based algorithms which will be used to compare the performance of DFO, and then the results of a series of experiments conducted on DFO over a set of benchmark functions are reported.

3 Experiments

This section presents a set of experiments investigating the performance of the newly introduced Dispersive Flies Optimisation (DFO) and discusses the results. Then, to understand whether disturbance plays an important role in the optimisation process, a control algorithm, DFO-c, is presented in which no disturbance is inflicted upon the population of flies.

Recognising the loss of diversity as a common issue in all distribution-based evolutionary optimisers (since dispersion reduces with convergence), the impact of disturbance on preserving the diversity of the population is also studied. Additionally, an optimal value for the disturbance threshold, dt, is suggested. Afterwards, the performance of DFO is compared against a few other well-known population-based algorithms, namely Particle Swarm Optimisation (PSO), Differential Evolution (DE) and the Genetic Algorithm (GA).

3.1 Experiment Setup

The benchmarks used in the experiments (see Table 1) are divided into two sets, \(f_{1-14}\) and \(g_{1-14}\); more details about these functions (e.g. global optima, mathematical formulas) are reported in [3, 15]. The first set, \(f_{1-14}\), has been used by several authors [3, 12, 14] and contains the three classes of functions recommended by Yao et al. [17]: unimodal and high dimensional, multimodal and high dimensional, and low dimensional with few local minima. In order not to initialise the flies on or near a region of the search space known to contain the global optimum, the region scaling technique [9] is used, which ensures the flies are initialised in a corner of the search space that contains no optimal solutions.

Table 1 Benchmark Functions

The second test set, \(g_{1-14}\), comprises the first fourteen functions of the CEC 2005 test suite [15]; they present more challenging variants of common functions from the aforementioned test set (e.g. shifted by an arbitrary amount within the search space and/or rotated). This set has also been used by many researchers.

One hundred flies were used in the experiments, and the termination criterion was set to reaching 300,000 function evaluations (FEs). Fifty Monte Carlo simulations were run for each experiment, and the results are averaged over these independent simulations. Apart from the disturbance threshold, which is set to \(\textit{dt} = 0.001\), there are no adjustable parameters in DFO’s update equation.

The aim of the experiments is to study and demonstrate the qualities of the newly introduced algorithm as a population-based continuous optimiser. The behaviour of the DFO algorithm is compared against its control counterpart and some other population-based algorithms.

In this work, a standard particle swarm version, Clerc-Kennedy PSO (PSO-CK), is used. For DE, the DE/best/1 mutation scheme is deployed, with \(\textit{CR}\) and F both set to 0.5. In the GA, the probabilities of crossover and mutation are set to \(p_{c} = 0.7\) and \(p_{m} = 0.9\), respectively; the tournament size of the tournament selection is set to two, and elitism with an elite size of one is deployed to maintain the best found solution in the population.

The details of these algorithms and the rest of the configuration are given in [1].
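The parameter values stated above can be collected in a single configuration; the following snippet is purely illustrative (the key names are invented for this sketch), and the remaining settings are those given in [1].

```python
# Illustrative summary of the experimental configuration described above;
# key names are invented for this sketch, values are those stated in the text.
CONFIG = {
    "population_size": 100,            # flies / particles / individuals
    "max_function_evals": 300_000,     # termination criterion (FEs)
    "trials": 50,                      # Monte Carlo simulations per experiment
    "target_error": 1e-8,              # accuracy used for efficiency/reliability
    "DFO": {"dt": 0.001},              # disturbance threshold (only DFO parameter)
    "PSO": {"variant": "Clerc-Kennedy (PSO-CK)"},
    "DE":  {"mutation": "DE/best/1", "CR": 0.5, "F": 0.5},
    "GA":  {"p_crossover": 0.7, "p_mutation": 0.9,
            "tournament_size": 2, "elite_size": 1},
}
```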

3.2 Performance Measures and Statistical Analysis

In order to conduct the statistical analysis measuring the presence of any significant difference in the performance of the algorithms, the Wilcoxon \(1\times 1\) non-parametric statistical test is deployed. The performance measures used in this paper are error, efficiency, reliability and diversity, which are described below.
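As a concrete illustration of such a pairwise test, the sketch below applies SciPy's Wilcoxon signed-rank test to the per-trial errors of two algorithms on one benchmark; the hypothetical error arrays and the decision to treat the trials as paired samples are assumptions of this sketch (an unpaired rank-sum test would be an alternative).

```python
from scipy.stats import wilcoxon

def significantly_different(errors_a, errors_b, alpha=0.05):
    """Pairwise Wilcoxon test on two equal-length arrays of per-trial errors."""
    _, p_value = wilcoxon(errors_a, errors_b)
    return p_value < alpha
```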

Error is defined by the quality of the best agent in terms of its closeness to the optimum position (provided the optimum position is known a priori, which is the case here). Another measure is efficiency, the number of function evaluations before reaching a specified error, and reliability is the percentage of trials in which the specified error is reached. These performance measures are defined below:

$$\begin{aligned} \textsc {Error} = \left| f\left( \varvec{x}_g\right) -f\left( \varvec{x}_{o}\right) \right| \end{aligned}$$
(4)
$$\begin{aligned} \textsc {Efficiency} = \frac{1}{n}\sum _{i=1}^{n}\text {FEs}_i \end{aligned}$$
(5)
$$\begin{aligned} \textsc {Reliability} = \frac{n'}{n}\times 100 \end{aligned}$$
(6)

where \(\varvec{x}_g\) is the best position found and \(\varvec{x}_o\) is the position of the known optimum solution; n is the number of trials in the experiment and \(n^{'}\) is the number of successful trials; \(\text {FEs}_i\) is the number of function evaluations in trial i before reaching the specified error, which in these experiments is set to \(10^{-8}\).
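Assuming per-trial records of the best fitness found and the function evaluations used to reach the target error (hypothetical arrays), the three measures can be computed as in the sketch below; excluding unsuccessful trials from the efficiency average is an assumption of this sketch.

```python
import numpy as np

def performance_measures(best_fitness, f_opt, fes_to_target, target=1e-8):
    """Error (Eq. 4), efficiency (Eq. 5) and reliability (Eq. 6) from n trials.

    best_fitness  -- per-trial fitness of the best fly found
    f_opt         -- objective value at the known optimum
    fes_to_target -- per-trial FEs used to reach the target error (np.nan if never reached)
    """
    error = np.abs(np.asarray(best_fitness, dtype=float) - f_opt)   # Eq. 4
    success = error < target
    fes = np.asarray(fes_to_target, dtype=float)
    efficiency = np.nanmean(np.where(success, fes, np.nan))         # Eq. 5 (successful trials only)
    reliability = 100.0 * success.sum() / success.size              # Eq. 6
    return error, efficiency, reliability
```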

In this work, diversity, a measure of the population’s degree of convergence or divergence, is used to study the population’s behaviour with regard to exploration and exploitation. There are various approaches to measuring diversity; the average distance around the population centre has been shown [13] to be robust in the presence of outliers and is defined as:

$$\begin{aligned} \textsc {Diversity} = \frac{1}{\textit{NP}}\sum _{i=1}^{\textit{NP}}\sqrt{\sum _{j=1}^{D}\left( x_{i}^{j} - \bar{x}^{j}\right) ^2} \end{aligned}$$
(7)
$$\begin{aligned} \bar{x}^j = \frac{1}{\textit{NP}}\sum _{i=1}^{\textit{NP}}x_{i}^{j} \end{aligned}$$
(8)

where \(\textit{NP}\) is the number of flies in the population, D is the dimensionality of the problem, \(x_{i}^{j}\) is the value of dimension j of agent i, and \(\bar{x}^{j}\) is the average value of dimension j over all agents.
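Equations 7 and 8 translate directly into a few lines of NumPy, as in the sketch below (X is assumed to be the NP-by-D matrix of fly positions):

```python
import numpy as np

def diversity(X):
    """Average Euclidean distance of the flies from the population centre (Eqs. 7-8)."""
    centre = X.mean(axis=0)                                    # Eq. 8: per-dimension mean
    return float(np.mean(np.linalg.norm(X - centre, axis=1)))  # Eq. 7
```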

3.3 Performance of Dispersive Flies Optimisation

The error, efficiency and reliability results of DFO over the benchmarks are reported in Table 2. The first five columns detail the error-related figures, and the last column gives the median efficiency along with the reliability (shown in brackets) of the algorithm in finding the optima. The algorithm exhibits promising performance on the presented problem set, with about half of the benchmarks (\(f_{1-2,5-11}\) and \(g_{1-2,7,9}\)) optimised to the specified accuracy. The figures in the table are expanded in the following categories:

Table 2 DFO—Dispersive Flies Optimisation

Unimodal, high dimensional ( \({{\varvec{f}}}_{\mathbf{1,2}}, {{\varvec{g}}}_{\mathbf{1-5}}\) ) The algorithm optimises \(57\,\%\) of the benchmarks in this category; while both functions in the first set are optimised (\(f_{1,2}\)), only two out of five benchmarks in the second and more challenging set are optimised to the specified accuracy. All optimised benchmarks achieve \(100\,\%\) success.

Low dimensional and few local minima ( \({{\varvec{f}}}_\mathbf{10-14}\) ) In this category, \(40\,\%\) of the benchmarks are optimised, with \(100\,\%\) reliability for \(f_{10}\) and \(32\,\%\) for \(f_{11}\). However, none of the Shekel functions (\(f_{12-14}\)) are optimised; Shekel is known to be a challenging function to optimise due to the presence of several broad sub-optimal minima, and the proximity of a small number of optima to the Shekel parameter \(\varvec{a}_i\) is another reason for the difficulty of optimising this set of functions.

Multimodal, high dimensional ( \({{\varvec{f}}}_\mathbf{3-9}, {{\varvec{g}}}_\mathbf{6-14}\) ) The optimiser is able to optimise \(50\,\%\) of the benchmarks in this category (\(f_{5-9}\) and \(g_{7,9}\)), \(71\,\%\) of which are optimised with a \(100\,\%\) success rate (all except \(f_7\) and \(g_7\), with 28 and \(10\,\%\) success rates, respectively). The optimiser exhibits promising performance when dealing with the difficult Rosenbrock functions (\(f_3,g_6\)), reaching errors of \(10^{-4}\) and \(10^{-3}\), respectively. The algorithm performs exceptionally well in optimising the infamous Rastrigin function, in both its common and shifted modes (i.e. \(f_5\) and \(g_9\)), achieving a \(100\,\%\) success rate; however, it does show weakness on the more challenging rotated version, \(g_{10}\).

The success of the optimiser on the notorious Rastrigin function in its common and shifted modes is discussed below in the context of DFO’s dimension-to-dimension disturbance mechanism.

In order to provide a better understanding of the behaviour of the algorithm, in the next section, the disturbance is discarded and the diversity of the algorithm is studied.

3.4 Diversity in DFO

Most swarm intelligence and evolutionary techniques commence with exploration and, over time (i.e. function evaluations or iterations), lean towards exploitation. Maintaining the right balance between the exploration and exploitation phases has proved to be difficult, and the absence of this balance leads to weaker diversity when a local minimum is encountered, giving rise to the common problem of premature convergence. Similar to other swarm intelligence and evolutionary algorithms, DFO commences with exploration and, over time, moves towards exploitation through a gradual decrease in the distance between the members of the population and, as such, between each agent’s position and the local and global best positions. However, the disturbance threshold, \(\textit{dt}\), introduces a dose of diversity into the population throughout the optimisation process, aiming to enhance the diversity of the algorithm.

Figure 1 illustrates the convergence of the population towards the optima and its diversity in three random trials over three benchmarks (\(g_{1,7,9}\), chosen from the second set) as examples of unimodal and multimodal functions. The difference between the error and the diversity values demonstrates the algorithm’s ability to keep exploring while converging on optima, with errors reaching as low as \(10^{-13}\) in \(g_{1}\) and \(g_9\).

Fig. 1 DFO and DFO-c: diversity and error in \(g_{1,7,9}\)

Exploring the role of disturbance in increasing diversity, a control algorithm (DFO-c) is proposed in which there is no disturbance (\(\textit{dt} = 0\)) during the position update process. The graphs in Fig. 1 illustrate the diversity of DFO-c populations in randomly chosen trials over three sample benchmarks (again \(g_{1,7,9}\)). They show that the diversity of the population in DFO-c is lower than in DFO, emphasising the impact of disturbance in injecting diversity, which in turn facilitates escape from local minima (e.g. as demonstrated in the case of the highly multimodal Rastrigin functions \(f_5,g_9\)). Note the gradual shrinkage of diversity in \(g_9\) (\(\approx \) \(10^{-13}\)), which is a clear indication of premature convergence to a local minimum with a very poor chance of escape.

In order to compare the performance of DFO and its control counterpart, Table 3 presents the results of optimising the benchmarks using DFO-c. Additionally, a statistical analysis is conducted and the output is reported in Table 4, where the performance is compared using the three aforementioned measures of error, efficiency and reliability (see Sect. 3.2 for the definitions of the measures). The results show that in \(89\,\%\) of cases (where there is a significant difference between the two algorithms), DFO performs significantly better than its control counterpart (DFO-c), which is stripped of the diversity-inducing disturbance. Furthermore, in all multimodal functions (\(f_{3-9}\) and \(g_{6-12}\)), whenever there is a statistically significant difference between DFO and DFO-c, the former significantly outperforms the latter.

Following on from the error results, Table 4 also shows that, in terms of the efficiency and reliability measures, DFO outperforms its control counterpart in \(79\,\%\) and \(92\,\%\) of cases, respectively.

Table 3 DFO-c—Control DFO Algorithm
Table 4 Comparing DFO and DFO-c Performance

3.5 Fine Tuning Disturbance Threshold

The role of disturbance in increasing the diversity of the DFO population was discussed earlier (Sect. 3.4), and its importance for the optimisation capability of DFO was investigated by introducing a control algorithm lacking the disturbance mechanism; the results demonstrate the positive impact of this mechanism. The aim of this section is to recommend a value for the disturbance threshold, \(\textit{dt}\). The disturbance probabilities used in this experiment range between 1 and \(10^{-9}\), with the values chosen according to:

$$\begin{aligned} \textit{dt}_n = 10^{-n}, \quad 0 \le n \le 9 \end{aligned}$$
Fig. 2 Fine tuning the disturbance threshold

Figure 2 illustrates the performance of DFO using these \(\textit{dt}\) probabilities. Both sets of benchmarks (i.e. \(f_{1-14}\) and \(g_{1-14}\)) have been used to find a suitable value for the disturbance threshold. As the heat map highlights, the optimal range lies between \(10^{-4}\) and \(10^{-2}\), and the overall value of \(\textit{dt} = 10^{-3}\) is recommended as a good compromise.

3.6 Comparing DFO with Other Population-Based Optimisers

Table 5 DE (Differential Evolution), PSO (Particle Swarm Optimisation) and GA (Genetic Algorithm)
Table 6 Comparing Error in DFO with DE, PSO and GA

Having presented the performance of the DFO algorithm (taking into account the three performance measures of error, efficiency and reliability, as well as the diversity of its population and the impact of disturbance on its behaviour), this section contrasts the introduced algorithm with a few well-known optimisation algorithms. The three population-based algorithms deployed for this comparison are Differential Evolution, Particle Swarm Optimisation and the Genetic Algorithm. In this comparison, only the second and more challenging set of benchmarks, \(g_{1-14}\), is used. Table 5 presents the optimisation results of the aforementioned algorithms; as shown, the algorithms have optimised some of the benchmarks to the specified accuracy, \(10^{-8}\).

Table 6 shows the result of the statistical analysis comparing DFO with the other three optimisers. Based on this comparison, whenever there is a significant difference between the performance of DFO and the other algorithms, DFO significantly outperforms DE, PSO and GA in 66.67, 58.33 and \(85.71\,\%\) of the cases, respectively. Table 7 summarises the efficiency of the three optimisers relative to DFO; note that only the efficiency of functions reaching the specified error is given. As shown in the table, DFO outperforms the other algorithms in the majority of cases: although it outperforms DE only marginally (\(60\,\%\)), it outperforms both PSO and GA in all cases (\(100\,\%\)). The reliability comparison of DFO with the other optimisers is given in Table 8; DFO is shown to be the most reliable algorithm in this comparison. While DFO outperforms DE in \(75\,\%\) of cases, it shows \(100\,\%\) outperformance when compared with PSO and GA.

In order to compare the diversity of the DFO algorithm with that of the other three optimisers, three benchmarks were chosen from the unimodal and multimodal categories (\(g_{1,7,9}\)). The result of this comparison is illustrated in Fig. 3. DE has the least diversity in both unimodal and multimodal functions. The diversity of the population in PSO decreases as the population converges towards an optimum (see \(g_{1}\)); however, when convergence does not occur (e.g. in \(g_{7,9}\)), PSO maintains its high diversity throughout the optimisation process. GA shows a similar pattern to that of PSO in multimodal functions, namely a gradual decrease in diversity over time; however, it maintains a higher diversity for the unimodal function than PSO (perhaps attributable to the difference in the fitness of the best positions found by the two algorithms). In DFO, diversity is less convergence-dependent and more stable across all modalities.

Table 7 Comparing Efficiency in DFO with DE, PSO and GA; in this table, 1–0 (0–1) indicates that the left (right) algorithm is more efficient
Table 8 Comparing Reliability in DFO with DE, PSO and GA; in this table, 1–0 (0–1) indicates that the left (right) algorithm is more reliable
Fig. 3 Diversity of the population in DFO, DE, PSO and GA over three random trials in \(g_{1,7,9}\)

4 Computer Aided Diagnosis and Metastatic Disease

Computer aided diagnosis (CAD) is an emerging field in medicine. The technique introduced in this paper can help radiologists examine images in greater depth and has the potential to help doctors from different medical disciplines interpret medical imaging with greater confidence. Furthermore, CAD is a promising learning tool for both medical students and junior doctors developing basic diagnostic skills. This paper presents a new CAD approach in which a recently developed swarm intelligence algorithm, Dispersive Flies Optimisation [1], is applied to a medical imaging modality in order to detect potential areas of microcalcification on x-ray mammographs (Fig. 4).

Fig. 4 Mammograph

X-ray mammography has been shown to be effective as a method for detecting early breast cancer, but the success of mass screening depends critically on the availability of highly skilled film readers to interpret the images. The majority of film readers in the UK are consultant radiologists, and in order to maintain a sufficiently high standard of interpretation, readers are required to undergo training, to keep in practice and to evaluate their performance at regular intervals [2].

Mammographic film reading is a particularly demanding visual task. In screening programmes, the film reader must search for extremely infrequent and often very subtle signs of cancer superimposed on complex and variable backgrounds. Early breast cancer may appear in a variety of forms: a few particles of microcalcification; a small ill-defined or spiculated mass; abnormal asymmetry between right and left breast images; or subtle distortion of the underlying structure of the breast. These abnormalities vary in size, shape, structure, brightness and location and may share a great deal of similarity with normal mammographic appearances. False negative cases, in which signs of cancer are missed by a reader, sometimes occur. Retrospective evaluation of the previous screening films of cancers detected between screening rounds (interval cancers) and of screen-detected cancers shows evidence of abnormality in between 16 and 27 % of cases. Some of these signs are very subtle, and may have been seen by the readers but dismissed as insignificant, but others are clear signs of malignancy [4, 5, 16]. However, different readers miss different cancers, as is evidenced by the success of double reading, in which two readers independently read the films [6]. The most accurate method of interpretation is double reading with arbitration, where a third reader reviews cases about which the two readers disagree [6, 7].

In the UK, particularly within the National Health Service Breast Screening Programme (NHSBSP), there is an increased demand for skilled manpower to interpret mammographs effectively, and double or triple reading is not a viable option owing to the increased workload. A novel and different method of coping with this is the use of computer-based aids. Researchers have been developing algorithms to detect mammographic abnormalities for more than 30 years, with the aim of either automating mammographic interpretation or, more realistically, providing a tool which will enhance human film-reading performance. There are two basic approaches to the problem of detecting abnormalities in mammograms: either to search the images for specific appearances suggestive of cancer, or to characterise normal mammographic appearance to the extent that it is possible to detect anything that fails to conform to the generated model of normality.

The purpose of the current study is to apply, for the first time, a swarm intelligence algorithm, namely Dispersive Flies Optimisation, to the task of identifying microcalcifications on mammographs.

5 Applying Dispersive Flies Optimisation

In this paper, we present a unique approach, deploying the recently developed DFO algorithm to detect microcalcifications on mammographs. The approach demonstrates a promising ability to undertake this task with a similar level of sensitivity. The scan used in this paper is processed by the DFO agents, which are responsible for locating the affected areas.

The reproducibility and accuracy of the DFO algorithm can be utilised in developing a standardised system for interpreting bone scans and mammographs, preventing operator errors and discrepancies. This technology can be employed as an adjunct to help radiologists assess the various parts of bone scans and mammographs, making the diagnosis of lesions more thorough and less time-consuming. Additionally, this technique can be used effectively to develop programmes for teaching and training medical students and junior doctors.

5.1 Experiments and Results

This section presents the technical details and the experiment setup, followed by the results and discussions of the performance of the algorithm.

The number of agents used in this experiment is 50,000. This figure depends on the size of the input scan (in the case of this paper, the scan is \(500 \times 667\) pixels) and the algorithm is run for 25 iterations (i.e. 25 cycles of test and diffusion phases). The output images shown later in the paper are snapshots taken after every 5 iterations, recording the behaviour of the agents at each stage. As stated earlier, at the beginning of the process, all the agents are initialised randomly throughout the search space.

DFO is adapted here to search for areas of metastasis or calcification in the feasible solution space. Given that the problem is multi-objective, in contrast to Eq. 3, the local neighbourhood architecture of the algorithm is implemented as shown below:

$$\begin{aligned} x_{\textit{id}}^{t} = x_{\textit{nb},d}^{t-1} + \text {U}\left( 0,1\right) \times \left( x_{\textit{nb},d}^{t-1} - x_{\textit{id}}^{t-1} \right) \end{aligned}$$
(9)

In order to evaluate the fitness of each agent, a radius (\(\textit{rad}\)) value is specified, which determines how many pixels around the pixel chosen by the agent are used to calculate that agent's fitness. In Model I of the algorithm the radius is set to \(\textit{rad} = 1\), as shown in Fig. 5. In this model, the purpose is to highlight the areas of calcification by allowing the DFO agents to converge on the areas of interest. In Model II, the radius is set to \(\textit{rad} = 10\) in order to segregate the areas to which radiologists should pay particular attention. In this model, the exact points of high calcification are not marked; instead the DFO agents form a border around the area of interest (Fig. 6).
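A minimal sketch of the adapted update (Eq. 9) and the radius-based fitness is given below; the mean-intensity fitness and the clipping at the image border are assumptions made here for illustration, since the exact fitness function is not spelled out above.

```python
import numpy as np

def neighbourhood_fitness(image, x, y, rad=1):
    """Fitness of an agent at pixel (x, y): mean intensity of the patch of
    radius rad around it (assumed here; rad=1 for Model I, rad=10 for Model II)."""
    h, w = image.shape
    patch = image[max(0, y - rad):min(h, y + rad + 1),
                  max(0, x - rad):min(w, x + rad + 1)]
    return float(patch.mean())

def update_component(x_id, x_nbd, lower, upper, dt, rng):
    """Eq. 9: only the best neighbour informs the move; disturbance as before."""
    if rng.random() < dt:
        return lower + rng.random() * (upper - lower)   # component restart
    return x_nbd + rng.random() * (x_nbd - x_id)
```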

Fig. 5 \(\textit{rad} = 1\) for Model I. The symbol x represents the position of the agent and the o’s represent the pixels used in the calculation of the fitness value of the DFO agent

Fig. 6 \(\textit{rad} = 10\) for Model II. The symbol x represents the position of the agent and the o’s represent the pixels used in the calculation of the fitness value of the DFO agent

As shown in Figs. 7 and 8, areas with a higher potential of metastasis and calcification are identified using Models I and II, respectively. These figures visually present the technique, illustrating how the agents congregate over the areas of interest over time (i.e. over the iterations) when the scans are fed to the algorithm as inputs. As the figures show, the DFO agents, spread throughout the entire search space, converge on the areas of interest (as confirmed by the medical experts).

Fig. 7 Model I: detecting calcifications

Fig. 8 Model II: detecting calcifications

6 Conclusion

Dispersive Flies Optimisation (DFO), a simple numerical optimiser over continuous search spaces, is a population-based stochastic algorithm proposed to search for an optimum value in the feasible solution space; despite its simplicity, the algorithm’s competitiveness over an exemplar set of benchmark functions has been demonstrated. As part of the study, a control algorithm was proposed to investigate the behaviour of the optimiser. In this experiment, the algorithm’s disturbance mechanism showed the ability to maintain a stable and convergence-independent diversity throughout the optimisation process. Additionally, a suitable value is recommended for the disturbance threshold, which is the only parameter in the update equations that needs to be tuned. This parameter controls the level of diversity by injecting a component-wise disturbance (or restart) into the flies, aiming to preserve a balance between exploration and exploitation.

In addition to diversity, DFO’s performance has been investigated using three other performance measures (error, efficiency and reliability). Using these measures, it is established that the newly introduced algorithm outperforms a few generic population-based algorithms (differential evolution, particle swarm optimisation and the genetic algorithm) in all of the aforementioned measures over the presented benchmarks. In other words, DFO is more efficient and more reliable in 84.62 and \(90\,\%\) of the cases, respectively; furthermore, when there exists a statistically significant difference, DFO converges to better solutions in \(71.05\,\%\) of the problem set.

Additionally, this paper details the promising results of a novel application of DFO in detecting areas of interest and identifying potential microcalcifications on mammographs. Two models are proposed to further investigate the behaviour of the agents in the population and to offer two representations of the outcome, in order to emphasise the areas of interest and draw the attention of the clinicians in charge.

Finally, it is emphasised that the presented technique could be utilised effectively as an adjunct to the expert eye of the specialist.