1 Introduction

In computer science, algorithmic efficiency refers to the amount of computational resources used by an algorithm. Naturally, to achieve maximum efficiency the usage of resources must be minimized; however, different resources such as time and space cannot be directly compared, and the results of analyses depend on how efficiency is measured [1]. Actual efficiency may additionally be affected by implementation choices: the programming language, the way the algorithm is coded, the compiler or compilation options, and even the operating system can all have their own effect. In short, the efficiency of different algorithms devised to solve the same problem may differ dramatically. It is acknowledged that differences in design tend to be more significant than differences due to hardware and/or software [2], and because there is no unanimous agreement on how algorithmic efficiency should be measured and/or compared, the analysis of algorithmic efficiency remains a major challenge.

In this context, a vast body of the literature has focused on the efficiency of median filtering (MF) algorithms. MF is a sliding-window nonlinear smoothing process, best known for reducing impulsive or salt-and-pepper noise in a digital image while respecting its edges [3]. It is also the foundation upon which more advanced image filters are built [4], and some of its higher-level applications include object segmentation, speech and writing recognition, sonar and radar, among others [5,6,7,8,9,10,11]. Briefly, the MF window visits each image pixel and places its center on it; all the intensity values within the window are sorted, and the median intensity value is then used to replace the window’s center in the filtered image. By using the median value as a consensus, the MF turns out to be more robust and stable than alternative filters.

Probably the main drawback of MF is its computational cost, because sorting is a time-consuming task. Many authors have approached this problem; a rather thorough comparison is offered in [12], where (assuming, as usual, a square window of radius r) the efficiency of several sorting methods is reviewed: insertion sort, with a \(O\left( {r^{2}} \right) \) algorithmic complexity; selection sort, which is also \(O\left( {r^{2}} \right) \); bubble sort, \(O\left( {r^{2}} \right) \); bucket sort, whose complexity also drops to \(O\left( {r^{2}} \right) \) when the pixel bit-depth is assumed constant; and more [13,14,15]. More recently, Tibshirani’s binmedian algorithm [16] used recursive binning and a Chebyshev approximation of the median by the mean to reach a \(O\left( {r^{2}} \right) \) complexity on average; alternatively, the interesting idea of a doubly linked list is introduced by Suomela [17] for MF in 1D, but the corresponding extension to the 2D case is not detailed.

One of the main MF references is Huang et al.’s algorithm [18], which was the first to exhibit a \(O\left( r \right) \) complexity. Different approaches have since tried to break this linearity: the Weiss method [19] uses hierarchical histograms to reach a \(O\left( {\log r} \right) \) complexity at the cost of simplicity, and the Gil–Werman method [20], which is tree-based, has a \(O\left( {\log ^{2}r} \right) \) complexity. A \(O\left( {\log r} \right) \) lower bound for the complexity of any two-dimensional MF algorithm was also given in [20]; this claim was refuted by Perreault and Hébert [21], who present a variation of Huang et al.’s algorithm reaching a complexity of \(O\left( b \right) \) (with \(b=2^\mathrm{image\, bit-depth}\), the number of scale levels). In all these cases, complexity is computed per pixel (however, if Huang et al.’s algorithm were applied to an image of, e.g., height by width AxB, it can be shown that its global complexity would in fact be \(O\left( {AxBxr} \right) \)). Among all these algorithms, [18] and [21] are of special interest: the latter claims the lowest complexity to date, while the former is its core.

In this article, a new insight into MF capabilities, based on the optimal breakdown value (BV) of the median, is offered, and BV-based versions of two of the most popular MF algorithms [18, 21] are introduced. A general framework for the theoretical analysis and comparison of MF algorithms is presented in the process; it uses several abstract and objective metrics that aid in understanding the subtle features of MF. All the introduced ideas are experimentally tested using real and synthetic images.

The structure of this article is the following: in Sect. 2, the main features of the Huang et al. and Perreault and Hébert MF algorithms are reviewed. In Sect. 3, a new insight into the MF features is offered based on the optimal BV of the median; this concept is also used to present updated versions of both reviewed algorithms. A rather comprehensive framework for evaluating the theoretical efficiency of algorithms is introduced in Sect. 4: through different metrics, the four MF versions at hand are analyzed and compared; experimental results are also given based on real and synthetic images. The article ends with some final comments in Sect. 5.

2 The MF Algorithm

Let X be the matrix representing a digitized input image of size AxB (rows by columns), where \(X_{\left( {i,j} \right) } \) denotes the pixel value at the intersection of the image’s i-th row and j-th column. The basic scheme is the following: in each step, a square window of radius r (alternative window shapes could also be used, with the corresponding adaptations [21, 22]) is centered at \(X_{\left( {i,j} \right) } \); the \(\left( {2r+1} \right) ^{2}\) window values are then sorted, their median is computed and placed at pixel \(Y_{\left( {i,j} \right) } \) of the output image Y. The two most popular versions of the MF algorithm are reviewed next.
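
For concreteness, a minimal sketch of this basic scheme is given below (it is not the authors’ implementation); it assumes an 8-bit grayscale image stored row-major and handles border pixels by clamping coordinates, which is only one of several possible conventions.

```cpp
// Minimal sketch of the basic MF scheme: collect the window values, select the middle
// order statistic, and write the median. Assumes an 8-bit image stored row-major;
// borders are handled by clamping, one of several possible conventions.
#include <algorithm>
#include <vector>

void naiveMedianFilter(const std::vector<unsigned char>& X,  // input image, A rows x B columns
                       std::vector<unsigned char>& Y,        // output image, same size
                       int A, int B, int r) {
    std::vector<unsigned char> w;
    w.reserve((2 * r + 1) * (2 * r + 1));
    for (int i = 0; i < A; ++i) {
        for (int j = 0; j < B; ++j) {
            w.clear();
            for (int di = -r; di <= r; ++di)
                for (int dj = -r; dj <= r; ++dj) {
                    int ii = std::clamp(i + di, 0, A - 1);   // clamp at image borders
                    int jj = std::clamp(j + dj, 0, B - 1);
                    w.push_back(X[ii * B + jj]);
                }
            // The median of the (2r+1)^2 (odd) values is the middle order statistic.
            std::nth_element(w.begin(), w.begin() + w.size() / 2, w.end());
            Y[i * B + j] = w[w.size() / 2];
        }
    }
}
```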

2.1 Huang et al. Algorithm

The key idea of Huang et al.’s algorithm (H_alg in the following) is to use a kernel histogram \({\varvec{H}}\) to store and update all the values of the current window. Four basic steps can be distinguished in the filtering process (the pseudocode is shown in Algorithm A1):

  1. Creating the kernel histogram \({\varvec{H}}\). In this step, the data structure to store the kernel histogram \({\varvec{H}}\) [line 5 in A1(a)] is created, and the memory needed for an array of size bx1 is reserved (\(b=2^\mathrm{image\, bit-depth}\) indicates the number of histogram bins).

  2. Partial initialization (PI) of the kernel histogram \({\varvec{H}}\). This step gives \({\varvec{H}}\) most of its initial values [line 7 in A1(a)], an action to be repeated for every row. It can be further split into two substeps [lines 7.1–7.9 in A1(b)]:

    i. The kernel histogram \({\varvec{H}}\) is initialized with zero values [lines 7.1–7.3 in A1(b)].

    ii. For every row, the kernel histogram \({\varvec{H}}\) corresponding to the window centered at the first pixel is partially initialized with the appropriate image values [lines 7.4–7.9 in A1(b)]; this initialization will be completed through the first update step.

  3. Updating the kernel histogram \({\varvec{H}}\). Here \({\varvec{H}}\) is updated while sliding through the image [lines 9–14 in A1(a)]. Figure 1 shows an example for an AxB sized image with a square window of radius \(r=2\). When the window’s center shifts one pixel to the right, e.g., from \(X_{\left( {3,3} \right) } \) to \(X_{\left( {3,4} \right) } \), the update of \({\varvec{H}}\) requires the values \(X_{\left( {1,1} \right) } ,\ldots ,X_{\left( {5,1} \right) } \) to be removed from \({\varvec{H}}\) and the values \(X_{\left( {1,6} \right) } ,\ldots ,X_{\left( {5,6} \right) } \) to be added to it; \(2r+1\) removals and \(2r+1\) additions thus need to be carried out.

  4. Computing the median. Finally, the median value from the updated kernel histogram \({\varvec{H}}\) [line 15 in A1(a)] is obtained. Computing the median is done by accumulating frequencies from one extreme of the scale and stopping when the cumulative sum reaches the threshold \(\frac{\left( {2r+1} \right) ^{2}+1}{2}\); this is detailed in Sect. 3 [A4(a)]. A minimal sketch of steps 2–4 is given after this list.
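
To make steps 2–4 concrete, the fragment below sketches the processing of one interior image row (complete windows only, 8-bit image, \(b=256\)); names and structure are illustrative and do not reproduce the A1 pseudocode.

```cpp
// Sketch of H_alg's per-row work: build the histogram of the first window, then slide
// it with 2r+1 removals and 2r+1 additions per one-pixel shift (step 3), extracting
// the median from the histogram each time (step 4). Interior rows only: r <= row < A - r.
#include <cassert>
#include <vector>

static int medianFromHistogram(const std::vector<int>& H, int T) {
    int acc = 0;
    for (int i = 0; i < static_cast<int>(H.size()); ++i) {
        acc += H[i];                       // bottom-up accumulation, as in A4(a)
        if (acc >= T) return i;
    }
    return static_cast<int>(H.size()) - 1;
}

void hAlgRow(const std::vector<unsigned char>& X, std::vector<unsigned char>& Y,
             int A, int B, int r, int row) {
    assert(row >= r && row + r < A);
    const int T = ((2 * r + 1) * (2 * r + 1) + 1) / 2;   // rank threshold, complete window
    std::vector<int> H(256, 0);
    // Initialization: histogram of the first complete window of the row.
    for (int i = row - r; i <= row + r; ++i)
        for (int j = 0; j <= 2 * r; ++j)
            ++H[X[i * B + j]];
    Y[row * B + r] = static_cast<unsigned char>(medianFromHistogram(H, T));
    // Sliding update.
    for (int col = r + 1; col + r < B; ++col) {
        for (int i = row - r; i <= row + r; ++i) {
            --H[X[i * B + (col - r - 1)]];   // column leaving the window
            ++H[X[i * B + (col + r)]];       // column entering the window
        }
        Y[row * B + col] = static_cast<unsigned char>(medianFromHistogram(H, T));
    }
}
```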

Fig. 1 Each time the window’s center shifts to its right, H_alg requires \(2r+1\) additions to and \(2r+1\) removals from the kernel histogram \({\varvec{H}}\)


Note: for the sake of clarity when describing loops in algorithms, the following two cases are distinguished: (i) if a loop index always refers to real elements or structures (e.g., existing image pixels or image column histograms), the corresponding strict range is used; this fact is indicated by an underlined specification of the index; (ii) if a loop index instead refers sometimes to real elements or structures and at other times to artificial ones (e.g., pixel values outside the image limits), all the needed values are included in the corresponding range; in this case, the specification of the index is not underlined.

2.2 Perreault and Hébert’s Algorithm

The advantages of the Perreault and Hébert algorithm (P&H_alg in the following) over the H_alg can be understood if some inefficiencies of the latter are first discussed [21]. In H_alg each image pixel value is added to and subtracted from \(2r+1\) histograms, and no information is retained between consecutive rows. Intuitively, if each pixel were instead used a constant number of times, it would be possible to retain information between rows, with a corresponding increase in efficiency. To do this, P&H_alg uses the additivity property of histograms [19], by which the histogram of the union of two disjoint sets is simply the sum of their respective histograms. The addition of histograms only depends on the number of histogram bins b, itself a function of the image bit-depth.
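
As a small illustration of this property (a sketch under the stated assumptions, not the published pseudocode), the kernel histogram of a window spanning \(2r+1\) disjoint columns can be obtained as the bin-wise sum of the corresponding column histograms, at a cost of \(O\left( b \right) \) per histogram added:

```cpp
// Additivity of histograms: the histogram of a union of disjoint column slices is the
// bin-wise sum of their histograms, so each combination costs O(b), with b the number
// of bins (a function of the bit-depth only). Names are illustrative.
#include <vector>

std::vector<int> kernelFromColumns(const std::vector<std::vector<int>>& h,  // column histograms
                                   int firstCol, int r, int bins) {
    std::vector<int> H(bins, 0);
    for (int j = firstCol; j <= firstCol + 2 * r; ++j)   // 2r+1 disjoint columns
        for (int i = 0; i < bins; ++i)
            H[i] += h[j][i];                              // bin-wise addition
    return H;
}
```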

In this way, the P&H_alg reserves one histogram for each image column; this set of histograms is preserved across rows until the whole image is processed (see the pseudocode in Algorithm A2). Essentially, the P&H_alg consists of six steps:

  1. Creating the kernel histogram \({{\varvec{H}}}\) and column histograms \({{\varvec{h}}}^{j}\) (\(1\le j\le B)\). The data structures to store the kernel histogram \({{\varvec{H}}}\) and the column histograms \({{\varvec{h}}}^{j}\) [line 5 in A2(a)] are created. The memory needed to store the bx1 bins of \({{\varvec{H}}}\) is reserved, just as in A1; in addition, the memory needed to store the B column histograms \({{\varvec{h}}}^{j}\) of size bx1 is also reserved.

  2. Partial initialization of column histograms \({{\varvec{h}}}^{j}\) (\(1\le j\le B)\). This step consists of two substeps [lines 6.1 to 6.11 in A2(b)]:

    i. The column histograms \({{\varvec{h}}}^{j}\) are initialized with zero values [lines 6.1–6.5 in A2(b)].

    ii. The values of the first r rows are added to every column histogram \({{\varvec{h}}}^{j}\) (\(1\le j\le B)\) [lines 6.6–6.11 in A2(b)]; this initialization will be completed through the first update step.

  3. Initializing the kernel histogram \({{\varvec{H}}}\) and first update of column histograms \({{\varvec{h}}}^{j}\) (\(1\le j\le B)\). This step is performed on every row of the input image whenever the MF window shifts downward [lines 8.1–8.14 in A2(c)]. Three substeps can be distinguished:

    i. The column histograms \({{\varvec{h}}}^{j}\, (-r\le j\le r)\) are updated by removing and adding the r top and r bottom window pixel values, respectively [lines 8.1–8.6 in A2(c)].

    ii. The kernel histogram \({{\varvec{H}}}\) is initialized with zero values [lines 8.7–8.9 in A2(c)].

    iii. The updated first r column histograms \({{\varvec{h}}}^{j}\) are added to the kernel histogram \({{\varvec{H}}}\) [lines 8.10–8.14 in A2(c)]. Histograms \({{\varvec{H}}}\) and \({{\varvec{h}}}^{j}\, (1\le j\le r)\) are afterward ready to be used to compute the median.

  4. Updating the rightmost column histogram \({{\varvec{h}}}^{col+r}\) (where col indicates the column of the current window’s center) when the window shifts to the right. This step is applied every time the MF window shifts to the right [lines 10–13 in A2(a)]. A top and a bottom pixel value are, respectively, removed from and added to the rightmost column histogram \({{\varvec{h}}}^{col+r}\).

  5. Updating the kernel histogram \({{\varvec{H}}}\). In this step, the values from the window’s leftmost column histogram are removed from the kernel histogram \({{\varvec{H}}}\) while those from the new rightmost column histogram are added to it [line 14 in A2(a)]; several operations are involved in this task, which depend on the image bit-depth but not, as in H_alg, on the radius size r [lines 14.1–14.3 in A2(a)] (a condensed sketch of steps 4 and 5 is given after this list). It is worth mentioning that Perreault and Hébert use a parallel processing SIMD (Single Instruction Multiple Data) hardware optimization to improve runtime performance; this fact was also noted by Alekseychuk [23], who attempted to lower the runtime constant but could not show practical improvements.

  6. Computing the median. This step is analogous to that of H_alg [see line 15 in A1(a) and A2(a)] and is detailed in Sect. 3 [A4(a)].

3 A New Insight Into the MF Algorithm

It is acknowledged that, as a summary measure, the median is much more stable and hence far more reliable than the classical mean or average [24]. In fact, the mean of a set of numbers changes when at least one of those numbers is changed or replaced; the median, instead, does not necessarily change even if almost 50% of the numbers are changed or replaced. For example, assume that n (odd) ordered numbers are at hand; if the \(\frac{\left( {n+1} \right) }{2}-1\) lowest numbers are replaced by arbitrary other numbers, keeping them below the number ranked \(\frac{\left( {n+1} \right) }{2}\) (the median), the median will remain the same.


This property can be described through the concept of breakdown value: in general terms, it denotes the percentage of data in a set that could be arbitrarily replaced without grossly modifying the value of an estimation or a computation. Clearly, the mean has a 0% BV while the median has almost a 50% BV [24].

Each time the MF window shifts, some of its values are removed and simultaneously replaced by new values, and a new median is computed. Two shortcomings can be identified in this task. First, the median computation typically uses a bottom-up accumulating strategy (e.g., for an 8-bit grayscale where values range from 0 to 255, frequencies are accumulated from the zero value upward) until the value ranked \(\frac{\left( {2r+1} \right) ^{2}+1}{2}\) is reached; in this way, if the MF window eventually processes an image region with most values near the top of the grayscale, the algorithm will get slower (an analogous problem would take place if frequencies were accumulated top-down and image regions with mostly low pixel values were found). Second, it seems rather inefficient not to take into account that successive windows share most of their values and thus result in similar medians. More precisely, the proportion of shared values between consecutive square windows of radius r is essentially \(\frac{\left( {2r+1} \right) ^{2}-2\left( {2r+1} \right) }{\left( {2r+1} \right) ^{2}}\) (the exception being the case when the window shifts downward, assuming the typical left-right and top-down image processing). The percentage of shared information between successive windows thus increases very fast with the radius: 33% of shared values for radius \(r=1\), 60% for radius \(r=2\), about 90% for radius \(r=10\), and so on. Both inefficiencies can be overcome by making the most of the median’s optimal BV: the median of a new window can be computed significantly faster by retaining the median of the previously processed window and updating it; in turn, this strategy enables a more efficient processing of image regions with values at either extreme of the scale, handling both cases equally. It is experimentally verified in Sect. 4 that this approach leads to a significant increase in efficiency as the bit-depth, the radius size and even the number of channels increase. Based on these considerations, updated formulations of both H_alg and P&H_alg are offered next.
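
As a quick numerical check of the quoted percentages (a throwaway snippet, not part of the algorithms), the shared-value expression above can be evaluated directly:

```cpp
// Shared-value proportion between consecutive windows, using the expression
// ((2r+1)^2 - 2(2r+1)) / (2r+1)^2 given in the text for a horizontal one-pixel shift.
#include <cstdio>

int main() {
    for (int r : {1, 2, 10}) {
        double w = (2 * r + 1) * (2 * r + 1);
        double shared = (w - 2 * (2 * r + 1)) / w;
        std::printf("r = %2d: %.0f%% of values shared\n", r, 100.0 * shared);
    }
    return 0;   // prints roughly 33%, 60% and 90%, matching the text
}
```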

3.1 BV-Based Version of H_alg

The H_alg (A1) is now revisited to incorporate the BV concept by adding or modifying some of its instructions; the corresponding changes are shown in a lighter color in Algorithm A3.


Three main modifications can be distinguished:

  1. Initializing the auxiliary variables Pm, Lm and Gm. Through this step [line 7 in A3(a)], the kernel histogram \({{\varvec{H}}}\) is partially initialized and the new auxiliary variables previous median Pm, lower than (previous) median Lm and greater than or equal to (previous) median Gm are all initialized; this must be done for every image row. The initialization of \({{\varvec{H}}}\) is just the same as in A1(b), while the initialization of variables Pm, Lm and Gm consists of two substeps:

    i. Their values are set to zero [line 7.4 in A3(b)].

    ii. Before its addition to \({{\varvec{H}}}\), each new pixel value is compared with the previous median Pm: if the new value is greater than or equal to Pm, Gm is incremented by one; otherwise, Lm is incremented by one [lines 7.9–7.13 in A3(b)]. Whenever a new row is to be processed, variables Pm, Gm and Lm could all be reinitialized to 0; a more efficient procedure, however, is again to retain their values from the previously processed row.

  2. Updating the auxiliary variables Lm and Gm. Each time a pixel value is either added to or removed from the kernel histogram \({{\varvec{H}}}\), it is compared with Pm and the variables Lm and Gm are correspondingly updated [lines 12–16 and 19–23 in A3(a)].

  3. Computing the new median. This step computes the new window’s median by using the kernel histogram \({{\varvec{H}}}\) and the auxiliary variables Lm, Gm and Pm [line 25 in A3(a)]. For a better understanding of this step, an example is given next which compares the standard median computation [A4(a)] and the BV-based one [A4(b)].

In A4(a) the median is computed by accumulating frequencies in \({\varvec{H}}\) until a specific threshold T is reached [lines 7–10 in A4(a)]; the value of T naturally depends on the window’s radius r, but also on the window’s shape and on the distribution of the updated pixel values in \({\varvec{H}}\). For a complete square window of radius r, \(T=\frac{\left( {2r+1} \right) ^{2}+1}{2}\); this value decreases when incomplete windows are found at the image edges. Algorithm A4(b) shows instead the BV-based computation of the median. First, the previous median Pm is stored in the auxiliary variable m [line 5 in A4(b)]; next, the threshold T is computed [line 6 in A4(b)] and compared with Lm [line 7 in A4(b)]: if \(T\le Lm\), the new median will be lower than the previous one and will be found by moving downward from the current Pm bin in the kernel histogram \({\varvec{H}}\) [lines 8–14 in A4(b)]; otherwise, the new median will be greater than or equal to the previous one and will be found by moving upward from the current Pm bin in \({\varvec{H}}\) [lines 16–21 in A4(b)]. The auxiliary variables Lm and Gm are accordingly updated in the process [lines 11–13 or 18–20 in A4(b)].

Consider now the following situation: a square window W of radius \(r=1\) with \(\left( {2r+1} \right) ^{2}=9\) pixel values is given; then \(T=\left[ {\frac{\left( {2r+1} \right) ^{2}+1}{2}} \right] =5\). Assume that \(W=\left[ {1,3,3,4,5,5,7,7,9} \right] \) are the current window’s ordered values, so \(Pm=5\), \(Lm=4\) and \(Gm=5\). Assume next that the values 1, 1 and 1 (the new rightmost column) are added to W while the values 7, 7 and 1 (the leftmost column) are removed from it. This update of W results in \(W=\left[ {1,1,1,3,3,4,5,5,9} \right] \), which in turn updates \(Lm=6\) and \(Gm=3\) following the comparison with \(Pm=5\). Since \(T=5\le 6=Lm\), the new median will be found by moving downward from the \(Pm=5\) bin in the kernel histogram \({\varvec{H}}\). Here, only two bin-displacements through \({\varvec{H}}\) are needed to obtain the new median value; the auxiliary variables are accordingly updated to \(Pm=3\), \(Lm=3\) and \(Gm=6\), respectively. Since T is now greater than Lm, the next displacement will start from Pm and move upward.
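
The following sketch condenses the BV-based selection just described; it mirrors the roles of \({\varvec{H}}\), Pm, Lm and Gm but is a simplified rendering under the assumption of consistent counters, not a verbatim transcription of A4(b).

```cpp
// BV-based median update: depending on whether T <= Lm, walk downward or upward from
// the previous median bin Pm, keeping Lm (# values < Pm) and Gm (# values >= Pm)
// consistent. H holds the b bin counts of the current window; T is the rank threshold.
#include <vector>

struct BvState {
    int Pm = 0;   // previous median (bin index)
    int Lm = 0;   // number of window values strictly below Pm
    int Gm = 0;   // number of window values greater than or equal to Pm
};

int bvMedian(const std::vector<int>& H, BvState& s, int T) {
    if (T <= s.Lm) {
        // New median lies strictly below the previous one: walk downward from Pm.
        int i = s.Pm - 1;
        while (T <= s.Lm - H[i]) {     // the median is still below bin i
            s.Lm -= H[i];
            s.Gm += H[i];
            --i;
        }
        s.Lm -= H[i];                  // bin i holds the median: its values are no
        s.Gm += H[i];                  // longer counted as "below the median"
        s.Pm = i;
    } else {
        // New median is at or above the previous one: walk upward from Pm.
        int i = s.Pm;
        while (T > s.Lm + H[i]) {      // the median is still above bin i
            s.Lm += H[i];
            s.Gm -= H[i];
            ++i;
        }
        s.Pm = i;
    }
    return s.Pm;
}
```

Running this sketch on the example above (bins 1, 3, 4, 5, 9 holding counts 3, 2, 1, 2, 1, with Pm = 5, Lm = 6, Gm = 3 and T = 5) visits bins 4 and 3 and returns 3, with Lm = 3 and Gm = 6 afterward, matching the two bin-displacements described in the text.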


3.2 BV-Based Version of the P&H_alg

The P&H_alg (A2) is now revisited to incorporate the BV concept by adding or modifying some instructions, which are shown in a lighter color in Algorithm A5.


Three main changes can be distinguished:

  1. Partial initialization of the kernel histogram \({\varvec{H}}\) and initialization of auxiliary variables Pm, Lm and Gm. This step is similar to step 3 from A2(c). Only two modifications are made [see A5(b)]:

    i. The auxiliary variables Pm, Lm and Gm are all initialized with zero values [line 8.10 in A5(b)].

    ii. Each pixel value in the window is compared with the previous median Pm: if its value is greater than or equal to Pm, variable Gm is incremented by one; otherwise, variable Lm is incremented by one [lines 8.14–8.18 in A5(b)].

  2. Updating the auxiliary variables Lm and Gm. This is analogous to step 2 in Sect. 3.1 [lines 15–27 in A5(a)].

  3. Computing the new median. This is analogous to step 3 in Sect. 3.1 [line 29 in A5(a)].

4 A Theoretical Framework for MF Analysis and Comparison

A general framework for the analysis and comparison of MF algorithms at the theoretical level is now presented and used to evaluate the four discussed versions. As mentioned, algorithmic efficiency is an important part of computational complexity theory, which aims to provide estimates of the resources needed by any algorithm solving a given computational problem; such estimates provide valuable insight for designing more efficient algorithms.

The performance of algorithms can be studied independently of specific languages or machines, and several tools aid in assessing their efficiency without considering implementation. Among them, the widely known asymptotic analysis of worst-case complexity [1, 2] and the random access machine abstract model of computation are indeed relevant. Generally speaking, the latter is a rather simple theoretical model which helps to understand how an algorithm performs on actual machines; although alternative theoretical models exist [25, 26], the random access machine model strikes a fine balance between capturing the essential behavior of the computer and being simple to work with. In addition, it has proven useful in practice: under this model, the algorithmic analysis of efficiency seldom produces substantially misleading results. The framework presented next is based on all these considerations.

4.1 Theoretical Framework

Table 1 exhibits the comparison for the two analyzed algorithms, both in their standard and BV-based versions. The MF was applied to an input image X of size AxB using a square window of radius r, producing a filtered output image Y of the same size.

Table 1 Comparison metrics and the corresponding results for the four analyzed MF algorithms

To be consistent with the theoretical model described above, this comparison should ideally be based on metrics that are independent of specific languages or machine implementations [25, 26]. The chosen metrics are capable of capturing differences in the theoretical efficiency of the algorithms; some are space metrics, while the remaining ones are time metrics:

  • algorithmic complexity (AC);

  • dynamic memory (DM);

  • static memory (SM);

  • dynamic memory accesses (DMA);

  • arithmetic operations (AO);

  • logic comparisons (LC).

A last ad hoc time metric was also included to specifically measure the efficiency of the median computation: the transition effort (TE), interpreted as the effort rate associated with the transition between consecutive median values in the MF dynamic, emerged as a useful tool to explain the superior experimental performance of the BV-based algorithms.

For most of these metrics the worst, the average and the best case (defined, respectively, by the maximum, the average and the minimum number of steps taken in any instance) were considered, but only the worst case was used for the introduced ad hoc metric. Metrics DM, SM and TE are globally quantified while AC, DMA, AO and LC are quantified per-pixel, as usual.

As shown in Table 1, the AC for both the standard and the BV-based versions of the H_alg (A1 and A3, respectively) is simply \(O\left( r \right) \): i.e., the AC does not vary with the median computation method. A similar pattern is found when comparing the standard P&H_alg (A2) with the BV-based P&H_alg (A5): both share an AC of \(O\left( b \right) \).

To compute the DM for each of the four MF algorithms (A1–A3 and A5), the amount of memory needed to allocate both the input and output images of size AxB has to be established, as well as the corresponding amount of memory to store the kernel histogram \({\varvec{H}}\) of size bx1. MF algorithms A2 and A5 need extra DM due to the B column histograms \({\varvec{h}}^{j}\, (1\le j\le B)\). On the other hand, the BV-based MF algorithms A3 and A5 need more SM, involved in managing the variables Pm, Gm, Lm and the remaining auxiliary ones.

For each pixel value, the H_alg requires [lines 9–15 in A1(a)]:

  i. A loop of six instructions to update \({\varvec{H}}\). This loop needs \(2r+1\) LC and \(2r+1\) AO [used by the loop index k, where \(-r\le k\le r\); line 9 in A1(a)]. Inside the loop, the addition and the removal of values [lines 10–13 in A1(a)] require six DMA (to read the pixel value from image X, to read the previous value from kernel histogram \({\varvec{H}}\) and to update the new value in \({{\varvec{H}}}\)) and seven AO (to compute the indexes from image X and for the addition/subtraction operations). In total, \(6\left( {2r+1} \right) \) DMA, \(8\left( {2r+1} \right) \) AO and \(2r+1\) LC are needed, respectively.

  ii. A function to calculate the median value [line 15 in A1(a), which uses A4(a)]. The initialization of the threshold T requires five AO [line 6 in A4(a)], and a loop with four instructions [lines 7–10 in A4(a)] is also needed to compute the median. In general, the worst case needs 2b LC (since \(0\le i<b\) and \(acc<T\)), b DMA (to read the pixel value from \({{\varvec{H}}}\)) and 2b AO (b increments of the loop index i, and b additions to accumulate the kernel histogram values); the average case uses b LC, \(\frac{b}{2}\) DMA and b AO, while the best case uses two LC, one DMA and two AO. Globally, the TE worst case for the standard computation of the median is obtained as the rate of maximum effort transitions between consecutive median values: the maximum effort takes place each time the new median lies at the top of the scale, when transitioning from any of the b possible current median values. Therefore, b different maximum effort transitions are associated with the worst case, an occurrence rate that varies linearly with b.

  iii. Finally, one DMA is required to store the median in the output image Y [line 15 in A1(a)].

The BV-based H_alg [A3(a)] requires the same amount of DMA but more AO and more LC than the standard H_alg [A1(a)]:

  i. Two LC and two AO are required to update the auxiliary variables Lm and Gm [lines 12–16 and lines 19–23 in A3(a)]. This results in \(6\left( {2r+1} \right) \) DMA, \(10\left( {2r+1} \right) \) AO and \(3\left( {2r+1} \right) \) LC, respectively.

  ii. An additional function uses the BV concept to compute the median [line 25 in A3(a), using A4(b)]. The initialization of the threshold T requires the same five AO as in the standard version [line 6 in A4(b)]. In addition, one LC is used to establish the direction of search [line 7 in A4(b)]. If the condition \(T\le Lm\) is true, a loop of seven instructions is needed to obtain the median value [lines 8–14 in A4(b)]; more precisely, the worst case requires 2b LC (since \(0\le i<b\) and \(T\le Lm-{{\varvec{H}}}\left( i \right) \)), b DMA (to read the pixel value from \({{\varvec{H}}}\)) and 5b AO (b subtractions to check \(T\le Lm-{{\varvec{H}}}\left( i \right) \), b increments/decrements for the loop index i and 3b to update variables Pm, Lm and Gm). If the condition \(T\le Lm\) is instead false, a loop of five instructions is required to obtain the median value [lines 16–21 in A4(b)]; in this case, the same amount of DMA and LC resources is required as before, but fewer AO. In summary, the worst case requires b DMA, 5b AO and 2b LC; the average case uses \(\frac{b}{2}\) DMA, \(\frac{5b}{2}\) AO and b LC, while the best case needs one DMA, five AO and two LC. The TE worst case for the BV-based computation of the median is obtained as the rate of maximum effort transitions between consecutive medians: this maximum effort is b, which takes place when transitioning from a current median 0 to a new median at the top of the scale or vice versa. Only two realizations of the worst case thus exist, resulting in a TE rate that is constant.

  iii. Finally, one DMA is required to store the median result [line 26 in A3(a)].

The amount of needed resources for both the standard and the BV-based H_alg is shown in Table 1.

For each pixel, the P&H_alg requires [lines 10–15 in A2(a)]:

  i. Five instructions to update the column histograms \({\varvec{h}}^{j}\) (\(1\le j\le B)\) and the kernel histogram \({\varvec{H}}\). Removing or adding a pixel value from \({\varvec{h}}^{j}\) [lines 10–11 or 12–13 in A2(a), respectively] requires six DMA (to read the pixel value from image X, to read the previous value from column histogram \({\varvec{h}}^{j}\) and to update the new value in \({\varvec{h}}^{j}\)) and eleven AO (to compute indexes from locations in image X and in column histograms \({\varvec{h}}^{j}\), and also for the subtraction/addition operations). A loop of three instructions to update the kernel histogram \({\varvec{H}}\) [lines 14.1–14.3 in A2(a)] requires 4b DMA (\(b+b=2b\) readings used in the addition/removal of values from \({\varvec{h}}^{j}\), b readings of the current \({\varvec{H}}\) values and b updates in \({{\varvec{H}}}\)), 6b AO (to compute indexes from column histograms \({\varvec{h}}^{j}\) and an extra addition/removal operation) and b LC (for the loop index i, where \(0\le i<b\)).

  ii. The same function as in H_alg [A4(a)] to compute the median value. The initialization of the threshold T requires five AO [line 6 in A4(a)], and a loop of four instructions [lines 7–10 in A4(a)] is used to compute the median. Typically, the worst case needs 2b LC, b DMA and 2b AO; the average case in turn needs b LC, \(\frac{b}{2}\) DMA and b AO, while the best case needs two LC, one DMA and two AO. Finally, one DMA is required to assign the median value in the output image Y [line 15 in A2(a)]. The TE worst case depends linearly on b, just as in the standard H_alg.

The BV-based P&H_alg [A5(a)] requires the same amount of DMA, but more AO and LC resources than the standard P&H_alg [A2(a)]:

  i. 2b LC and 2b AO are required to update variables Lm and Gm [lines 18–27 in A5(a)].

  ii. Using the BV concept, the median value is computed by the same function as in the BV-based H_alg [A4(b)]. Initializing the threshold T requires five AO [line 6 in A4(b)]. In addition, one LC is needed to establish the direction of search [line 7 in A4(b)], and a loop of seven or six instructions [lines 8–14 or lines 16–21 in A4(b), respectively] is needed to compute the median. In summary, the worst case requires b DMA, 5b AO and 2b LC; the average case uses \(\frac{b}{2}\) DMA, \(\frac{5b}{2}\) AO and b LC, while the best case needs one DMA, five AO and two LC. The TE worst case is constant, just as in the BV-based H_alg.

  iii. Finally, one DMA is required to store the median result [line 29 in A5(a)].

The amount of needed resources for both the standard and the BV-based P&H_alg is shown in Table 1.

In summary, in terms of DM both the standard and the BV-based P&H_alg require more resources than their H_alg analogues, due to the use of column histograms \({\varvec{h}}^{j}\) (\(1\le j\le B)\). On the other hand, both BV-based MF algorithms require more SM, more LC and more AO resources in the worst, the average and the best cases alike. This is balanced by the higher efficiency of the BV-based versions in terms of the ad hoc time metric TE.

4.2 Experimental Results

To experimentally test the theoretical results, a comparison based on common implementations of each of the four considered MF versions is required; a specific environment was therefore defined to instantiate all of them. They were first implemented in MATLAB\(^\circledR \) and afterward migrated to C++ using the QtCreator (Nokia) Integrated Development Environment (IDE) for GNU/Linux. Tests were performed on an Intel\(^\circledR \) Core\(^{\mathrm{TM}}\) i7-3630QM CPU at 2.4 GHz with 8 GB of RAM, and the chosen operating system was Ubuntu 14.04 LTS (32 bits). It should be apparent that alternative implementations in other programming languages and/or on other machines are also possible; for some of these languages and machines, specific optimization shortcuts can be found, but they highly depend on the implementation strategy (such as SIMD instructions [21, 27], cache friendliness [21], multilevel histograms [28] and others).

Fig. 2 Processing time versus radius size for the four MF versions

Fig. 3 Test image: sunken boat in Lake Washington (center). Five ROI and the corresponding MF outputs for radius sizes \(r=1,3,5\)

Figure 2 exhibits the experimental results (processing time against radius size) for the common implementation of the four MF versions; only in this context does the processing time make sense as a measure of experimental efficiency. These four MF versions were tested on the classic side-scan sonar image of a sunken boat in Lake Washington (Fig. 3, center), whose size is \(470\times 470\, (AxB)\) pixels from an 8-bit \((b=256)\) grayscale. In Fig. 2, the vertical axis indicates the processing time (in seconds) while the horizontal axis indicates the MF window’s radius size.

Fig. 4 Column-average bin-displacements versus column number for each ROI when MF radius sizes \(r=1,3,5\) are used

As shown, the processing time varies linearly with the radius size for the H_alg versions; conversely, the processing time of the P&H_alg versions does not depend on the radius size (e.g., for a window radius of up to \(r=20\), corresponding to \(\left( {2r+1} \right) ^{2}=1681\) window pixel values, the H_alg is faster than the P&H_alg in both versions). It is also shown that each MF algorithm (H_alg and P&H_alg) exhibits approximately the same trend for both versions, but lower processing times are consumed by the BV-based ones. It is worth noting that the extra computational resources required by the corresponding BV-based versions (e.g., in terms of SM, AO or LC) do not result in additional processing time; this feature could not be perceived if only the AC metric were used for the comparison.

Table 2 Min–max column-average bin-displacements comparison by radius size

In Fig. 3, five different regions of interest (ROI) extracted from the test image are shown; their sizes are \(100\hbox {x}100\) (AxB) pixels in every case, and the Cartesian coordinates of the corresponding upper-left vertex used as reference are, respectively: (i) \(\left( {0,0} \right) \); (ii) \(\left( {270,0} \right) \); (iii) \(\left( {110,270} \right) \); (iv) \(\left( {215,340} \right) \) and (v) \(\left( {320,250} \right) \). These five regions were deliberately chosen to capture different features often found in side-scan sonar image processing [10, 29, 30]: (1) seabed reverberation; (2) seabed reverberation and little acoustic shadow; (3) seabed reverberation and acoustic highlight plus a significant shadow zone; (4) the highest acoustic highlight, and (5) seabed reverberation and shadow zones plus strong acoustic highlight.

Fig. 5 Column-average bin-displacements versus column number curves for different synthetic images when a MF of radius size \(r=1\) is used

Figure 3 exhibits the corresponding filtered images when radius sizes \(r=1\) (9 processed pixels in every window), \(r=3\) (49 processed pixels) and \(r=5\) (121 processed pixels) are used. The smaller the radius size, the smaller the number of processed pixel values in each iteration of the MF; the output image will include more geometrical details but will also be potentially more affected by noise. Conversely, the greater the radius size, the higher the noise reduction but the fewer the geometrical details that are retained.

As the radius size increases, so does the number of incomplete windows processed at specific image locations; this is indeed the case whenever the MF window approaches the image edges or corners, where the number of missing window values is maximal. The expression \(\frac{100\left( {AB-2r\left( {A+B-2r} \right) } \right) }{AB}\) gives the guaranteed percentage of complete windows (its complement being the percentage of incomplete ones); it depends on the image dimensions AxB as well as on the window radius r, and can be used to attain a desired percentage of complete windows. For example, if a filter of radius \(r=5\) is applied to the whole test image in Fig. 3, 95.78% of complete windows are guaranteed; this percentage decreases to 83.70% when \(r=20\) and to 61.97% when \(r=50\). In this way, not only the radius size but a trade-off between the radius size, the image size, the desired geometrical detail of the output image and the target processing time must be taken into account when the performance of a MF algorithm is evaluated.
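
The quoted figures can be checked directly with the expression above (a throwaway snippet, not part of the MF implementations):

```cpp
// Guaranteed percentage of complete windows, 100*(AB - 2r(A+B-2r))/(AB), for the
// 470x470 test image and the three radius sizes quoted in the text.
#include <cstdio>

int main() {
    const double A = 470.0, B = 470.0;
    for (int r : {5, 20, 50}) {
        double p = 100.0 * (A * B - 2.0 * r * (A + B - 2.0 * r)) / (A * B);
        std::printf("r = %2d: %.2f%% complete windows\n", r, p);
    }
    return 0;   // prints about 95.79%, 83.70% and 61.97%, matching the text up to rounding
}
```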

Figure 4 exhibits the column-average bin-displacements needed to update the median (vertical axis) versus the column number (horizontal axis) for each ROI. The former metric can be thought of as an experimental realization of the ad hoc theoretical TE metric, measuring the effort of the transition between consecutive medians as a column average. Note that for a fixed radius r the same performance in terms of this metric is shared by both standard versions (H_alg and P&H_alg; dotted line) and by both BV-based versions (BV-based H_alg and BV-based P&H_alg; solid line).

Blue, green and red curves, respectively, describe the effort when radius sizes \(r=1\), \(r=3\) and \(r=5\) are used. Table 2, in turn, shows the minimum and maximum column-average bin-displacements from Fig. 4. Several inferences can be drawn:

  • For each ROI, as the radius increases the curves corresponding to the BV-based median computation stabilize in comparison with their standard analogues. This is caused by the greater number of processed values in the MF window, which allows the median’s optimal BV to come into play; the median is thus quickly updated, and the effort to do so tends to zero.

  • The curves corresponding to the MF (by using both the standard and the BV-based median computation) of ROI-1 and ROI-2 in Fig. 4 exhibit a rather similar pattern. This might be caused by the shared pattern of seabed reverberation and high variability of pixel values.

  • ROI-3 exhibits a higher proportion of acoustic shadow (equivalently, most near zero pixel values): the curves for the standard version indicate that whenever the MF window processes the shadow zone, the column-average effort to update the median decreases (Table 2 suggests that at least 13 fewer displacements are used by the BV-based version in this case, whatever the radius size). Toward the end of the curve the average effort increases again, because this region includes acoustic highlight. As shown in Table 2, a maximum of about 200 displacements are needed in any case. Finally, the average effort needed by the BV-based version tends to stabilize around zero as the radius size increases; this pattern is seen for every ROI.

  • ROI-4 includes a significant amount of acoustic highlight, caused by the reflective object (with pixel values near 255). The standard median computation curves in Fig. 4 indicate a higher column-average effort at the beginning of the region, reaching a local maximum of approximately 178 displacements to update the median. Next, the curves drop as the presence of seabed reverberation increases: a minimum of approximately 56 displacements is reached for every radius size. Finally, the standard curves climb again until a maximum of 200 displacements is reached whatever the radius size, which seems to be caused by the increasing amount of acoustic highlight. For the BV-based version, all curves remain stable near zero and become flatter as the radius increases.

  • When ROI-5 is analyzed, the standard version curves indicate that a significant amount of acoustic shadow (or zero pixel values) is found in this region: these curves drop to a minimum of 15 displacements for all three radius sizes. Approaching the right border, the amount of seabed reverberation increases and a small reflective object appears: a maximum of approximately 127 displacements is thus reached for all three radius sizes. The corresponding curves for the BV-based version remain stable near zero most of the time, but as the variability of pixel values increases on the right, a maximum of 27 displacements is needed for radius size \(r=1\); this is caused by a high contrast between light gray levels and dark gray levels (or vice versa), which makes the median change suddenly.

Fig. 6 Column-average bin-displacements versus column number curves for the RGB image when a MF with radius size \(r=5\) is applied

Figure 5 shows a set of synthetic images designed to illustrate the advantages of the BV-based versions. All images are of size \(100 \times 100\) and only the extreme grayscale levels, 0 (black) and 255 (white), were used. Several inferences can be drawn:

  • Figure 5 \((\hbox {a}_{1})\) shows a diagonal pattern of color change. Assuming that MF is performed following the usual top-down and left-right pattern, the first column has all white pixels (255 values). As the MF window shifts to the right, the frequency of 255 pixel values decreases while the frequency of 0 (black) values increases, until a last column full of black pixels is reached. This feature can be perceived in the curve of the standard median computation (Fig. 5 \((\hbox {a}_{2})\)), where the column-average bin-displacements required to update the median start at 255 and descend to 0. Because the search in the kernel histogram begins at level 0 and moves upward, an average effort of 255 displacements is required to update the median at the beginning; the last column only includes 0 values, so no bin-displacement is required on average at the end. Alternatively, the displacement dynamics for the BV-based version indicate that for every column only one maximum change (when switching from 255 to 0) is needed to update the median. Figure 5 \((\hbox {b}_{1})\) and \((\hbox {b}_{2})\), in turn, exhibit the corresponding pattern for the symmetric case.

  • In Fig. 5 \((\hbox {c}_{1})\) the pattern of grayscale variation is now vertical, considering the same two extreme levels: the first 50 rows are completely white, and the remaining 50 rows are completely black. The dynamics of the standard version described in Fig. 5 \((\hbox {c}_{2})\) show a column-average effort of 127 bin-displacements to update the median. On the other hand, the BV-based version keeps using only one displacement per column. Figure 5 \((\hbox {d}_{1})\) and \((\hbox {d}_{2})\) show the symmetric case.

  • Figure 5 \((\hbox {e}_{1})\), in turn, shows a horizontal variation pattern: the first 50 columns are completely white while the last 50 columns are completely black. The dynamics of the standard version described in Fig. 5 \((\hbox {e}_{2})\) stabilize around 255 bin-displacements per column to update the median for the first 50 white columns. For the next 50 black columns, a constant average of zero effort is needed to update the median. On the other hand, the BV-based version reaches a peak of 255 average bin-displacements only in column 50, because all windows centered there include approximately the same number of black and white pixels; the effort is due to the transition between the 0 and 255 grayscale levels. For the remaining black columns, no displacement is needed on average. Figure 5 \((\hbox {f}_{1})\) and \((\hbox {f}_{2})\) exhibit the dynamics for the symmetric image.

  • Figure 5 \((\hbox {g}_{1})\) and \((\hbox {h}_{1})\) exhibit a variation pattern rather similar to those of Fig. 5 \((\hbox {e}_{1})\) and \((\hbox {f}_{1})\), but with black and white columns interspersed at a higher frequency. Figure 5 \((\hbox {g}_{1})\) has runs of 5 consecutive columns of the same color, while the run length is 10 in Fig. 5 \((\hbox {h}_{1})\). Analyzing the dynamic curves (Fig. 5 \((\hbox {g}_{2})\)), the standard version keeps an average of 255 bin-displacements to update the median for all white columns; for all the black ones, no displacement is needed on average. Instead, the BV-based version reaches peaks of 255 average displacements every 5 columns. A similar pattern applies to Fig. 5 \((\hbox {h}_{2})\), but every 10 columns in this case.

Finally, Fig. 6 shows an RGB image with three 8-bit channels \((b=256)\) and the result of MF with radius sizes 1 (9 pixels per window), 3 (49 pixels) and 5 (121 pixels); each channel was processed independently. The image size is 189x267 pixels. The comparison between the standard and the BV-based version is shown in Fig. 6 for each channel. As in Fig. 4, a similar pattern of differences between both versions can be perceived.

5 Final Comments

By assuming a clear distinction between the theoretical analysis of efficiency and the comparison of experimental implementations, this article intends to bring a new insight into the MF capabilities. A detailed review of two of the most popular MF algorithms (Huang et al. and Perreault and Hébert) is offered first in order to understand their main features and also to underline their differences. Updated versions of them, based on the optimal BV of the median, are then presented and analyzed in detail. A comprehensive framework to assess the efficiency of MF algorithms in general is introduced in the process. Through six different and objective-abstract metrics, the theoretical efficiency of the four MF algorithms under consideration is assessed in detail; these metrics could also be used to approach the theoretical analysis of a wider class of algorithms. An additional ad hoc metric, the transition effort rate, provides a rather intuitive explanation of the experimental results: the BV-based versions clearly outperform their corresponding standard versions, even though the BV-based versions are equally or more expensive in terms of the remaining metrics.

The experimental results combined real and synthetic data, including an RGB example and the experimental computation of the transition effort rate. The differences between the standard and BV-based versions of the analyzed MF algorithms are thus highlighted and objectively quantified. Overall, the BV-based algorithms noticeably outperform their corresponding standard versions, for single-channel as well as multichannel images.

Lines of future work might include the use of BV-based MF versions for edge detection in object segmentation, and the analysis of MF algorithms under alternative theoretical models of computation.