1 Introduction

Video noise invariably degrades the quality of the displayed image. Noise can be divided into two categories: (1) non-continuous noise, such as impulse noise, which can be easily removed with a non-linear filter [1–4]; and (2) continuous noise, such as Gaussian noise. Filtering continuous noise is more problematic because it is difficult to distinguish edge pixels from noisy pixels, and the non-linear filters used for impulse noise cannot remove it effectively. As an alternative, a powerful algorithm [5, 6] has been proposed to remove Gaussian noise from camera signals. Gaussian noise is locally distributed and can be processed with local information. When the noise is widely distributed, however, so that the noise artifacts cover many pixels over a number of levels, spatial processing fails to remove it. Temporal filters have been used to remove noise in continuous frames [7], but they produce a dragging effect on moving objects. Temporal and spatial adaptive filters have been proposed to overcome this drawback [8–10].

Television (TV) noise is distributed over a large region of the display and cannot be modeled as a single noise type. As the signal-to-noise ratio decreases, the display noise increases accordingly. Signal processing to improve the image quality by reducing noise has been discussed in the literature. Nevertheless, the problem of noise on large-screen TVs has not been completely resolved. Improved variants of the widely known median filter are effective at removing impulse noise [1–4], but they are ineffective at reducing large-region TV noise. Recent studies [7–10] have discussed several highly efficient filtering techniques that draw on a wider range of information to remove the noise. To remove large-region TV noise, temporal processing is generally considered better than spatial processing because it uses more information.

In this study, we present a novel adaptive algorithm designed to improve filtering performance, particularly for large-region TV noise, based on real-time adaptation of the temporal and spatial processing. The new adaptive algorithm is described in Section 2. Simulation results and comparisons are shown in Section 3, and our conclusions are presented in Section 4.

2 Proposed noise processing algorithm

The adaptive algorithm comprises three parts: noise detection, motion detection, and the adaptive filter. The system architecture is shown in Fig. 1. The TV video input signals are stored in field memory (Mem). Motion detection uses inter-field information to determine whether the pixel being processed is moving. Noise detection determines the noise level using intra- and inter-field information. The adaptive filter determines the filtering strength required, based on the level of motion and noise detected. Finally, de-interlacing converts the filtered fields into a noise-reduced frame.

Fig. 1 The architecture of the proposed noise reduction method

2.1 Noise detection algorithm

To evaluate the noise level, the noise detection algorithm computes the amount of noise using both intra-field and inter-field information.

2.1.1 Intra-field noise detection

For intra-field detection, the block size is L × M. To reduce the number of line buffers, we set L = 2 and M = 40 (see Fig. 2), which covers a large region of video information. As the amount of noise in a block increases, the difference between neighboring pixels becomes greater. To determine the intra-field noise level, we compute the block mean variance (BV), the sum of absolute deviations from the block mean, for the (j,i)th block as follows:

$$ {\text{BV}}(j,i) = \sum\limits_{m = 0}^{L - 1} \sum\limits_{n = 0}^{M - 1} \left| {\text{Mean}}(j,i) - f(j + m,i + n) \right|, $$
$$ {\text{Mean}}(j,i) = \frac{1}{L \times M} \sum\limits_{m = 0}^{L - 1} \sum\limits_{n = 0}^{M - 1} f(j + m,i + n), $$
(1)

where Mean(j,i) is the mean value of the (j,i)th block, and f(j,i) is the (j,i)th input pixel.
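
As a minimal sketch of Eq. (1), the block statistic can be computed as shown below. The function name `block_bv` and the list-of-rows layout of the field are illustrative assumptions, not part of the original implementation.

```python
def block_bv(field, j, i, L=2, M=40):
    """Return (BV, Mean) for the L x M block whose top-left pixel is (j, i).

    `field` is assumed to be a list of rows of luminance values.
    """
    pixels = [field[j + m][i + n] for m in range(L) for n in range(M)]
    mean = sum(pixels) / (L * M)             # Mean(j, i) in Eq. (1)
    bv = sum(abs(mean - p) for p in pixels)  # BV(j, i) in Eq. (1)
    return bv, mean


# Hypothetical 2 x 40 block: one row at level 90, one row at level 110.
field = [[90] * 40, [110] * 40]
bv, mean = block_bv(field, 0, 0)
```

In this example every pixel deviates from the block mean of 100 by 10, so BV = 80 × 10 = 800.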

Fig. 2 The block processing for intra-field noise detection

If BV is high, the noise level of an intra field is correspondingly high. Since edge blocks also have high BVs, however, it is difficult to ascertain whether these blocks include noise. If a smooth block nevertheless shows a high degree of variance, the block is likely to contain noise. Therefore, the intra noise value can be estimated by summing the minimum BV values over one entire field, as follows:

$$ \begin{aligned} & \{ {\text{MBV}}_{(j,i)}^{S} \} = \left( {\text{BV}}(j,i)_{\min }^{1} ,{\text{BV}}(j,i)_{\min }^{2} , \ldots ,{\text{BV}}(j,i)_{\min }^{P} \right), \\ & {\text{Noise}}_{{\text{intra-field}}(n)} = \sum\limits_{S = 1}^{P} {\text{MBV}}_{(j,i)}^{S} \\ \end{aligned} $$
(2)

where \( {\text{BV}}(j,i)_{\min }^{1} ,{\text{BV}}(j,i)_{\min }^{2} , \ldots ,{\text{BV}}(j,i)_{\min }^{P} \) are the first, second, and so on up to the Pth smallest BV values among the blocks of the field. In our experiments, we used P = 16. These minimum BV values are stored in the set \( {\text{MBV}}_{(j,i)}^{S} \). The intra-field noise value \( {\text{Noise}}_{{\text{intra-field}}(n)} \) for the nth field is the sum of the 1st to Pth minimum BV values in that set. To represent the level of intra-field noise accurately, each BV value selected in (2) must satisfy two conditions: (a) to eliminate low-noise blocks, the BV must be greater than the threshold Th1; and (b) to eliminate regions that are too dark or too bright, the block mean must satisfy Th2 < Mean(j,i) < Th2 + 200.
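
The selection in (2) can be sketched as follows, assuming the field has already been divided into blocks and a (BV, Mean) pair is available for each one. The function name `intra_field_noise` and the input format are hypothetical, and the default thresholds are the experimental values reported in Section 2.1.3.

```python
def intra_field_noise(block_stats, P=16, th1=100, th2=10):
    """Sum the P smallest eligible BV values of one field, as in Eq. (2).

    `block_stats` is assumed to be an iterable of (bv, mean) pairs.
    """
    eligible = [
        bv
        for bv, mean in block_stats
        if bv > th1                   # condition (a): drop low-noise blocks
        and th2 < mean < th2 + 200    # condition (b): drop too dark / too bright
    ]
    return sum(sorted(eligible)[:P])


# Hypothetical block statistics for a tiny field, with P reduced to 2.
stats = [(50, 100), (150, 100), (120, 5), (300, 100), (110, 250), (200, 100)]
noise = intra_field_noise(stats, P=2)
```

Only the blocks with BV values 150, 200, and 300 pass both conditions here, and the two smallest of them are summed.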

The final intra noise value is derived from a summation of the results of the previous four fields, which can be expressed as

$$ {\text{Noise}}_{\text{intra}} = \left( {\sum\limits_{n = 1}^{4} {{\text{Noise}}_{{{\text{intra - field}}(n)}} } } \right) \gg k1 $$
(3)

where \( {\text{Noise}}_{{\text{intra-field}}(n)} \) denotes the intra-field noise value of the nth previous field, and k1 is the number of right-shifted bits, a constant determined experimentally.
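
A short sketch of (3) is given below; the function name `intra_noise` and the integer conversion are assumptions made for illustration. With k1 = 4, the right shift divides the four-field sum by 16.

```python
def intra_noise(last_four_fields, k1=4):
    """Combine the intra-field noise of the previous four fields, Eq. (3)."""
    total = sum(int(v) for v in last_four_fields)  # integer arithmetic assumed
    return total >> k1                             # right shift = divide by 2**k1


# Hypothetical per-field values.
value = intra_noise([16000, 16000, 16000, 16000])
```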

2.1.2 Inter-field noise detection

To detect temporal noise, five reference fields are employed, which encompass more information than a single field can provide (see Fig. 3). NTSC TV uses an interlaced signal composed of alternating odd and even fields. Field-5 is the one currently being processed. Two temporal differential parameters are computed between fields of the same parity (two odd or two even fields) as follows:

$$ \begin{aligned} {\text{Diff}}_{1} (j,i) & = \left| {f(j,i)_{(t)} - f(j,i)_{(t - 2)} } \right|, \\ {\text{Diff}}_{2} (j,i) & = \left| {f(j,i)_{(t)} - f(j,i)_{(t - 4)} } \right|, \\ \end{aligned} \quad {\text{for}}\;j = 0, \ldots ,H - 1,\;i = 0, \ldots ,W - 1 $$
(4)

where \( f_{(t)} \), \( f_{(t - 2)} \), and \( f_{(t - 4)} \) represent the current field and the 2nd and 4th previous fields, respectively, and H and W denote the height and width of one field. If a pixel shows low motion but a high temporal difference, it is likely to be an inter-field noise pixel. The amount of inter noise for the current field is the number of such inter-noise pixels, counted as follows:

$$ {\text{Noise}}_{\text{inter}} = \sum\limits_{j = 0}^{H - 1} \sum\limits_{i = 0}^{W - 1} \begin{cases} 1, & ({\text{Diff}}_{1} (j,i) > {\text{Th}}3\;||\;{\text{Diff}}_{2} (j,i) > {\text{Th}}3)\;\&\&\;{\text{MF}} \\ 0, & {\text{otherwise}} \end{cases} $$
(5)

where Th3 is a threshold. The symbols || and && denote the “Or” and “And” operations, respectively. The motion feature (MF) checks whether the current pixel lies in a low-motion region; it is set to 1 when the motion differential is small, which can be expressed by

$$ {\text{MF}} = \begin{cases} 1, & {\text{MD}}_{\text{dif}} (j,i) < ({\text{Noise}}_{\text{intra}} \gg k2) \\ 0, & {\text{otherwise}} \end{cases} $$
(6)

where k2 is a constant and “≫” denotes the right-shift operation, used in place of division. \( {\text{MD}}_{\text{dif}} \) represents the motion differential between fields, which is described in Section 2.2.
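
Equations (4) to (6) can be combined into one counting loop, sketched below. The function name `inter_noise`, the list-of-rows field layout, and the assumption that a per-pixel motion differential map `md_dif` is already available are illustrative; the default values of Th3 and k2 are those reported in Section 2.1.3.

```python
def inter_noise(f_t, f_t2, f_t4, md_dif, noise_intra, th3=25, k2=5):
    """Count inter-field noise pixels following Eqs. (4)-(6).

    f_t, f_t2, f_t4: fields t, t-2 and t-4 (same parity), as lists of rows.
    md_dif: per-pixel motion differential, same shape as the fields.
    """
    limit = noise_intra >> k2                      # right-hand side of Eq. (6)
    count = 0
    for j in range(len(f_t)):
        for i in range(len(f_t[0])):
            diff1 = abs(f_t[j][i] - f_t2[j][i])    # Eq. (4)
            diff2 = abs(f_t[j][i] - f_t4[j][i])
            mf = md_dif[j][i] < limit              # Eq. (6): low motion
            if (diff1 > th3 or diff2 > th3) and mf:
                count += 1                         # Eq. (5)
    return count


# Hypothetical 1 x 3 fields.
count = inter_noise(
    f_t=[[100, 100, 100]],
    f_t2=[[100, 130, 130]],
    f_t4=[[100, 100, 100]],
    md_dif=[[0, 0, 500]],
    noise_intra=3200,
)
```

In this example only the middle pixel is counted: the first has no temporal difference, and the third has a large difference but also a large motion differential, so it is attributed to motion rather than noise.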

Fig. 3 The five reference fields for noise detection

The final noise level for the sampling field can be evaluated as the summation of the inter- and intra- noise, expressed as follows:

$$ {\text{Noise}}_{\text{field}} = 4 \times {\text{Noise}}_{\text{inter}} + {\text{Noise}}_{\text{intra}} $$
(7)

Since the inter noise level is lower than the intra noise level, the amount of inter noise is amplified 4 times to balance the level predicted from the inter and intra estimates.

2.1.3 Performance evaluation

To evaluate the noise detection algorithm, low-noise and noisy sequences are employed to estimate the parameters. The images are sampled directly from TV programs on both noisy and noiseless channels. The test benchmarks are shown in Fig. 4a, b, respectively. Each sampled sequence contains 150 frames. To classify the video as noisy or noiseless, the following thresholds were found through exhaustive experiments: Th1 = 100 and Th2 = 10 in (2), and Th3 = 25 in (5). The shift bits k1 = 4 and k2 = 5 used in (3) and (6), respectively, produce the best results on average over all the test sequences. The results are less sensitive to the thresholds than to the bit shifts: when a threshold changes by 10 %, the resulting image shows no obvious change, whereas the noise removal performance is sensitive to the bit-shift parameters. Across the test sequences, for low-noise frames the inter-noise level is nearly zero and the intra-noise level is below 3,000. When the tested sequence contains visible noise, the intra noise detected can exceed 6,000. The proposed noise detection algorithm can therefore detect whether a sequence contains noisy images.

Fig. 4 a The noise estimation for a low-noise sequence. b The noise estimation for a noisy sequence

2.2 Motion detection

An L × M window is used to detect motion pixels; these values of L and M are separate from the block size used for noise detection. Odd line-buffers are employed for motion detection. Setting L = 3 and M = 8 provides a sufficient amount of motion differential coverage, as seen in Fig. 5. The first detection involves the 1st and 3rd rows between the current field, t, and the earliest reference field, t-4, which can be expressed as

$$ {\text{MD}}1 = \sum\limits_{k = 0}^{M - 1} \left[ \left| h1_{(t)} (k) - h1_{(t - 4)} (k) \right| + \left| h3_{(t)} (k) - h3_{(t - 4)} (k) \right| \right], $$
(8)

where hn is the nth row pixel in the processing block. The second detection involves the 2nd row between fields t-1 and t-3, which can be expressed as

$$ {\text{MD}}2 = \sum\limits_{k = 0}^{M - 1} {\left| {h2_{(t - 1)} (k) - h2_{(t - 3)} (k)} \right|} $$
(9)

Fig. 5 A 3 × 8 window for motion detection

The last detection involves the 2nd row between fields t and t-2, as follows:

$$ {\text{MD}}3 = \sum\limits_{k = 0}^{M - 1} {\left| {h2_{(t)} (k) - h2_{(t - 2)} (k)} \right|} . $$
(10)

The computation is illustrated in Fig. 6. The motion differential \( {\text{MD}}_{\text{dif}} \) is the summation of MD1, MD2, and MD3:

$$ {\text{MD}}_{\text{dif}} = {\text{MD}}1 + {\text{MD}}2 + {\text{MD}}3. $$
(11)

Fig. 6 The computations of motion detection
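
The three partial sums in (8) to (11) can be sketched as follows. The function name `motion_differential` and the representation of each field's window as a list of three rows (h1, h2, h3) are assumptions made for illustration; the row and field pairings follow the equations literally.

```python
def motion_differential(win_t, win_t1, win_t2, win_t3, win_t4):
    """Return MD_dif = MD1 + MD2 + MD3 for one 3 x M window, Eqs. (8)-(11).

    Each argument is the window taken from field t, t-1, ..., t-4,
    given as three rows [h1, h2, h3] of M pixels each.
    """
    M = len(win_t[0])
    md1 = sum(
        abs(win_t[0][k] - win_t4[0][k]) + abs(win_t[2][k] - win_t4[2][k])
        for k in range(M)
    )                                                                # Eq. (8)
    md2 = sum(abs(win_t1[1][k] - win_t3[1][k]) for k in range(M))    # Eq. (9)
    md3 = sum(abs(win_t[1][k] - win_t2[1][k]) for k in range(M))     # Eq. (10)
    return md1 + md2 + md3                                           # Eq. (11)


# Hypothetical 3 x 8 windows: flat at level 10 except for two rows.
flat = [[10] * 8 for _ in range(3)]
w_t2 = [[10] * 8, [11] * 8, [10] * 8]
w_t4 = [[12] * 8, [10] * 8, [10] * 8]
md_dif = motion_differential(flat, flat, w_t2, flat, w_t4)
```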

The \( {\text{MD}}_{\text{dif}} \) value is used to determine whether the current pixel is a motion pixel by comparing it with an adaptive threshold. The threshold varies according to the noise level established with (7), and can be expressed as

$$ {\text{MD}}_{\text{thr}} (j,i) = \begin{cases} 8{,}000/{\text{MD}}_{\text{fac}} , & {\text{Noise}}_{\text{field}} > 15{,}000 \\ 6{,}000/{\text{MD}}_{\text{fac}} , & 6{,}000 < {\text{Noise}}_{\text{field}} \le 15{,}000 \\ {\text{Noise}}_{\text{field}} /{\text{MD}}_{\text{fac}} , & {\text{Noise}}_{\text{field}} \le 6{,}000 \end{cases} $$
(12)

When the noise level is high, the threshold increases to avoid erroneously detecting a noisy pixel as a motion pixel. \( {\text{MD}}_{\text{fac}} \) represents the motion detection factor. Our experiments found that with \( {\text{MD}}_{\text{fac}} = 10 \) and with the range of parameters established in (12), we could eliminate noisy pixel detection errors and achieve accurate motion detection. The parameters are selected by incrementally changing the value by 500 to find the best detection settings. If the motion differential in (11) is higher than the threshold established in (12), the pixel is detected as a motion pixel; otherwise, it is considered a still pixel. The detection function can be expressed as

$$ {\text{MD}}(j,i) = \begin{cases} 0, & {\text{MD}}_{\text{dif}} (j,i) \le {\text{MD}}_{\text{thr}} (j,i) \\ 1, & {\text{otherwise}} \end{cases}, $$
(13)

for the (j,i)th pixel. An MD(j,i) value of zero denotes that the pixel is a still pixel.
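
A minimal sketch of the adaptive threshold in (12) and the decision in (13) is given below. The function names `motion_threshold` and `is_motion_pixel` are hypothetical.

```python
def motion_threshold(noise_field, md_fac=10):
    """Adaptive motion threshold MD_thr of Eq. (12)."""
    if noise_field > 15000:
        return 8000 / md_fac
    if noise_field > 6000:
        return 6000 / md_fac
    return noise_field / md_fac


def is_motion_pixel(md_dif, noise_field, md_fac=10):
    """Return MD(j, i) of Eq. (13): 0 for a still pixel, 1 for a motion pixel."""
    return 0 if md_dif <= motion_threshold(noise_field, md_fac) else 1
```

For instance, with a field noise level of 3,000 the threshold is 300, so a pixel with a motion differential of 300 is still and one with 301 is moving.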

2.3 Noise filter

We propose an adaptive filter to remove video noise based on noise and motion detection. A flowchart of the process is given in Fig. 7. If the noise detection results show a low noise level, the filtering operation is bypassed to avoid image blurring. When a high noise level is found in a still region, the filter employs temporal processing to remove the noise. If the noise is in a region of motion, however, a spatial low-pass filter [1] is employed to avoid dragging of moving objects and temporal aliasing.
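
The decision flow described above can be sketched as a small selector. The function name `choose_filter`, the returned labels, and the use of 2,500 as the bypass level (the value quoted for the Y signal in Section 2.3.1) are assumptions made for illustration.

```python
def choose_filter(noise_field, md, bypass_level=2500):
    """Pick the processing path for one pixel.

    noise_field: field noise level from Eq. (7).
    md: motion flag from Eq. (13), 0 = still pixel, 1 = motion pixel.
    """
    if noise_field < bypass_level:
        return "bypass"      # low noise: skip filtering to avoid blurring
    if md == 1:
        return "spatial"     # moving region: spatial low-pass filter
    return "temporal"        # still region: temporal filter
```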

Fig. 7 The processing flow of the proposed adaptive filter

In general, human eyesight is more sensitive to luminance than to chrominance. Because of this, processing is performed in the YUV domain. The Y (luminance) and UV (chrominance) components are processed with different filtering coefficients to achieve a better tradeoff between complexity and performance.

2.3.1 For Y signal filtering

The temporal filter for Y processing is divided into three operational levels; the choice of which level is used is based on the amount of noise detected. To effectively remove the noise, the filtering power increases as the noise level increases. When the noise level is the highest, the temporal filter removes noise using five fields with the weighting filter calculated as follows:

$$ \begin{aligned} \hat{Y}_{(t)} (j,i) & = Y_{(t)} (j,i) \times \frac{3}{8} + \hat{Y}_{(t - 1)} (j,i) \times \frac{1}{8} + \hat{Y}_{(t - 2)} (j,i) \times \frac{2}{8} \\ & \quad + \hat{Y}_{(t - 3)} (j,i) \times \frac{1}{8} + \hat{Y}_{(t - 4)} (j,i) \times \frac{1}{8} \\ \end{aligned} $$
(14)

where \( Y_{(n)} (j,i) \) denotes the luminance component of the (j,i)th pixel in the nth field, and \( \hat{Y}_{(t - 1)} (j,i), \ldots ,\hat{Y}_{(t - 4)} (j,i) \) are the filtered pixels of the previous fields.

In our experiments, this five-field temporal filter is invoked when \( {\text{Noise}}_{\text{field}} \) in (7) is above 3,000. The filtering operation is shown in Fig. 8. The process flow can be divided into three steps. First, the interlaced pixels are loaded into fields (t) to (t-4). Second, the noise is suppressed by calculating the weighted average of the various fields. Third, the filtered pixel is written back into field memory for the temporal recursive operation.
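
The weighted average in (14) and its recursive use can be sketched for a single pixel position as follows. The function names and the choice to initialise the history with the first sample are assumptions made for illustration.

```python
def filter_y_high(y_t, prev):
    """Five-field weighted average of Eq. (14).

    prev = [Y^(t-1), Y^(t-2), Y^(t-3), Y^(t-4)], the already-filtered values.
    """
    return (3 * y_t + prev[0] + 2 * prev[1] + prev[2] + prev[3]) / 8


def run_recursive(samples):
    """Filter one pixel position over successive fields, feeding outputs back."""
    history = [samples[0]] * 4          # assumed start-up state
    out = []
    for y in samples:
        y_hat = filter_y_high(y, history)
        out.append(y_hat)
        history = [y_hat] + history[:3]  # write the filtered value back
    return out


# Hypothetical still pixel at level 100 with one noisy sample of 108.
filtered = run_recursive([100, 108, 100])
```

The noisy sample of 108 is pulled down to 103, and because the filtered value is fed back, a small residue of 0.375 remains in the next field.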

Fig. 8 The weighting coefficient of five fields for high-noise processing

In a lower-noise field, when 2,500 < \( {\text{Noise}}_{\text{field}} \) < 3,000, the filtering strength is reduced, and the filter employs three fields as follows:

$$ \hat{Y}_{(t)} (j,i) = Y_{(t)} (j,i) \times \frac{2}{4} + \hat{Y}_{(t - 2)} (j,i) \times \frac{1}{4} + \hat{Y}_{(t - 4)} (j,i) \times \frac{1}{4} . $$
(15)

When \( {\text{Noise}}_{\text{field}} \) < 2,500, the field is regarded as noise-free, and no filtering is required. This adaptive filter uses temporal recursive processing to improve the filtering power: the filtered pixel is written back into field memory for processing the next field. The filter is thus capable of removing noise spread over a large range of the TV display.
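
The three operational levels for the Y signal can be sketched as a single function. The function name `filter_y` is hypothetical, and because the text does not state what happens exactly at 2,500 and 3,000, the handling of those boundary values below is an assumption.

```python
def filter_y(y_t, prev, noise_field):
    """Select the Y filter level from the field noise of Eq. (7).

    prev = [Y^(t-1), Y^(t-2), Y^(t-3), Y^(t-4)], the already-filtered values.
    """
    if noise_field > 3000:
        # High noise: five-field filter, Eq. (14).
        return (3 * y_t + prev[0] + 2 * prev[1] + prev[2] + prev[3]) / 8
    if noise_field > 2500:
        # Lower noise: three-field filter, Eq. (15), using fields t-2 and t-4.
        return (2 * y_t + prev[1] + prev[3]) / 4
    # Noise-free field: bypass.
    return y_t
```

For a current value of 108 against a history of 100, the five-field filter yields 103 and the three-field filter yields 104, so the weaker level keeps more of the current field.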

2.3.2 For UV signal filtering

When UV signals are corrupted by noise, a small amount of distortion appears in the image color. Since the eye is less sensitive to UV than to Y, however, the strength of the filter can be reduced, thereby decreasing the complexity of the calculations. The filtering operation is performed as follows:

$$ \hat{U}(\hat{V})_{(t)} (j,i) = U(V)_{(t)} (j,i) \times \frac{2}{4} + \hat{U}(\hat{V})_{(t - 1)} (j,i) \times \frac{1}{4} + \hat{U}(\hat{V})_{(t - 2)} (j,i) \times \frac{1}{4}, $$
(16)

when the field noise level \( {\text{Noise}}_{\text{field}} \) in (7) is above 2,500, the threshold used in our experiments. Recursive filtering is also employed for UV signal processing.
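
A minimal sketch of (16) for one chrominance sample follows; the function name `filter_uv` is hypothetical, and the same function would be applied to U and V separately.

```python
def filter_uv(c_t, prev1, prev2, noise_field, threshold=2500):
    """Three-field chrominance filter of Eq. (16).

    c_t: current U or V sample; prev1, prev2: filtered samples of fields
    t-1 and t-2. Below the threshold the sample is passed through unchanged.
    """
    if noise_field > threshold:
        return (2 * c_t + prev1 + prev2) / 4
    return c_t
```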

3 Performance and complexity evaluations

To simulate the adaptive filter, interlaced 720 × 240 fields from an NTSC TV program were sampled. The testing system is shown in Fig. 9. The composite NTSC program was decoded to a YUV signal. The noise and motion detection operations used the Y signal to find the motion and noise-level parameters. The filtering strength depended on the results of the noise and motion detection operations. The filtered field was de-interlaced to a frame. The YUV pixels were converted to RGB, allowing them to display on a VGA monitor.

Fig. 9 The testing system for noise reduction evaluation

To evaluate the filtering performance, both objective and subjective measurements were employed. For the objective analysis, we sampled noise-free sequences from TV channels. To compute the PSNR value, 12 % Gaussian noise was added to the noise-free sequences. To show the entire frame, de-interlacing [11] was used to interpolate the missing field. Five algorithms were used to filter the noisy sequence for comparison. The method in [2] uses spatial edge detection to control the filtering. This kind of method is effective at removing impulse noise; however, spatial processing alone [2] was not able to adequately remove TV noise. The filtering quality can be improved by employing spatial–temporal algorithms [8, 9]. Although these adaptive algorithms achieved good noise removal performance on the sampled videos, their computational complexity was high due to the amount of motion compensation used. The image results are shown in Fig. 10.
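
The text does not state how PSNR was computed. A sketch using the usual definition for 8-bit images, \( 10\log_{10}(255^{2}/{\text{MSE}}) \), is given below as an assumption; the function name `psnr` and the list-of-rows image layout are illustrative.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(peak ** 2 / mse)
```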

Fig. 10 a Added Gaussian noise; b standard median filter [1] (PSNR = 26.13 dB); c low-cost filter [2] (PSNR = 26.06 dB); d α-trimmed mean filter [8] (PSNR = 27.67 dB); e STS-DDWA filter [9] (PSNR = 25.27 dB); f proposed (PSNR = 28.02 dB)

For further experiments, we selected the standard benchmark sequences “Suzie” and “Carphone”. The noise ratio added to the benchmarks is the same as for the sampled TV programs. Generally, the image quality of the standard benchmarks is better than that of images sampled from TV channels. Noise detection is therefore more accurate for the benchmarks than for the sampled TV images, so the filtering performance on the benchmarks is generally better. In addition, two HD (high-definition) signals were tested with the same flow. We find that the filtering quality for HDTV is better than for NTSC images.

Table 1 lists the processing time and the average PSNR after filtering. The PSNR value was averaged over 100 frames. The processing time was measured on a system with a 2.4 GHz Intel Pentium 4 CPU and 2 GB of DDR3 RAM. The results demonstrate that the proposed method outperforms the previous methods in terms of both PSNR and the quality of the filtered images. Our speed is close to that of the fast impulse filter in [2], while the filtering performance is much better. The processing time of the proposed algorithm is shorter than that of the previous adaptive ones [8, 9]. For HD signal filtering, however, the processing time increases slightly because of the higher image resolution.

Table 1 The PSNR performance comparisons with various noise filters

Next, we sampled noisy sequences directly from the TV program. The noisy images were filtered with the various algorithms, and the results are shown in Figs. 11, 12, 13 and 14. The sampled images show ripple noise and Gaussian-like noise. Ripple noise looks like water ripples on the image (see Fig. 11). The conventional algorithms are unable to filter TV noise effectively: even after processing, most of the noise remains. Our proposed algorithm, by contrast, removes most of the noise and produces a greatly improved image.

Fig. 11 a Original sampling image; b standard median filter [1]; c low-cost filter [2]; d α-trimmed mean filter [8]; e STS-DDWA filter [9]; f proposed

Fig. 12 a Original sampling image; b standard median filter [1]; c low-cost filter [2]; d α-trimmed mean filter [8]; e STS-DDWA filter [9]; f proposed

Fig. 13 a Original sampling image; b standard median filter [1]; c low-cost filter [2]; d α-trimmed mean filter [8]; e STS-DDWA filter [9]; f proposed

Fig. 14 a Original sampling image; b standard median filter [1]; c low-cost filter [2]; d α-trimmed mean filter [8]; e STS-DDWA filter [9]; f proposed

4 Conclusions

In this paper, we presented a low-computation filter that adapts temporal and spatial processing to eliminate noise and produce a markedly improved image. The proposed noise and motion detection methods were able to efficiently distinguish noisy pixels from motion and edge pixels. The filtering strength is adaptive, increasing or decreasing with the detected levels of field noise and motion. To remove the large range of noise from the TV display, five reference fields were chosen to sample a large amount of information. Simulations show that the proposed algorithm outperforms competing ones in both objective and subjective measurements. With its high filtering quality and low computation cost, the proposed noise filter achieves a superior balance of performance and complexity, and is highly suitable for real-time TV video noise reduction.