1 Introduction

Traditionally, single broadband panchromatic (EO) imagers and red–green–blue (RGB) color video cameras have been widely used for daytime threat detection, while single broadband SWIR, MWIR, and LWIR cameras have been widely used for both daytime and nighttime threat detection. With more recent advances in multispectral and hyper-spectral sensing techniques, detection and identification of different threat types on the earth’s surface are conducted primarily in the VNIR-SWIR range, i.e., the 400 nm through 2500 nm bands. These spectral bands are favored by many sensing and imaging systems because of the high reflectivity and strong solar illumination available during the day.

In general, the 400–500 nm range may be utilized for illuminating material in shadow and water penetration bathymetry; the 500–600 nm range for water penetration bathymetry and discrimination of oil on water; the 600–700 nm range for vegetation differentiation; the 700–1100 nm range for camouflage detection and shoreline mapping; the 1100–3000 nm range for discrimination of oil on water, snow/cloud differentiation, camouflage detection, change detection, plume detection, and explosion detection.

In one of our recent SPIE conference papers [1], we showed results on Chemical Agent Resistant Coating (CARC) detection using hyper-spectral imaging (HSI), and also presented a newly developed feature transformation (FT) detection method. CARC is the paint commonly applied to military vehicles to protect against chemical and biological weapons. CARC comes in different colors; we presented results for detecting two of them: green and beige. We also developed a target insertion method that allows one to insert the target radiance into any HSI sensor scene while still preserving the sensor spatial–spectral noise at all pixel positions. Different CARC types were inserted into a DC Mall hyper-spectral scene with this target insertion tool. We used several current state-of-the-art HSI target detection methods [2–10], such as matched filter (MF), adaptive coherence estimator (ACE), constrained energy minimization (CEM), and spectral angle mapper (SAM), to detect the inserted CARCs (see Footnote 1).

There are two issues with the current state-of-the-art detection methods (such as the four mentioned above): (1) many targets under consideration may have spectral signatures similar to those of the background, leading to poor target detection performance (a high false detection rate); and (2) these methods need to use all the available spectral bands (high spectral dimensions). In fact, many bands in the spectral range (400–2500 nm) may be highly dependent on each other and carry redundant information. Therefore, if we can find a few bands that have the least redundant information, we may use these few best bands (low spectral dimensions) to obtain better detection performance. Furthermore, reducing several hundred spectral dimensions to a couple of spectral dimensions will considerably reduce the required computation time. Accordingly, we have developed an FT method to make the target signature less similar to the background signatures, and an automated best spectral band selection process to reduce the spectral dimensions for target detection.

In essence, the FT method, by transforming the original features to a different feature domain (e.g., the Fourier, wavelet, PCA, and local cosine domains), may considerably increase the statistical separation between the target and background probability density functions, and thus may significantly improve the target detection and identification performance, as evidenced by the test results presented in this paper. In our tests, we used signatures for green and beige CARCs in original measured reflectance value units against a small signature library of 433 frequently encountered background material signatures, such as paints, green trees (grass) and forests, metal, shingle, concrete, brick, sand, tar, asphalt, limestone, snow, and water. In the original spectral band domain, the beige and green CARC signatures overlap heavily with, and closely resemble, the signatures of background materials/objects across the whole spectral range (400–2500 nm), leading to poor detection results (see Footnote 2).

On the other hand, by observing the CARC and background signature curves, we noticed that the curve slopes are quite different at some spectral bands, and thus we have conducted a spectral FT by differentiating the originally measured reflectance spectral bands. We have shown that by differentiating the original spectral features (this operation can be considered as the 1st level Haar wavelet high-pass filtering), we can completely separate beige CARC from the background using a single band at 650 nm, and completely separate green CARC from the background using a single band at 1180 nm, leading to perfect detection results [1].

After the FT process, we apply an automated best spectral band selection process that selects the best band and ranks the available spectral bands from the best to the worst for target detection. With a double-threshold scheme [11], we rank the spectral bands by counting the background materials that fall within the double-threshold interval, which is centered at the target mean intensity level.

To further improve the detection performance in the lower spectral range (<1000 nm), we have developed an automated cross-spectrum fusion process that uses multivariate analysis to find spectral band pairs that are less correlated with each other. Preliminary tests have shown that a fused spectral band pair can considerably reduce false detections compared with the use of a single spectral band. Our automated cross-spectrum fusion process has found several such good spectral band pairs in the low spectral range (400–800 nm) for improved beige and green CARC detection.

In this paper, we present substantively expanded and improved results on the FT detection method and the automated best spectral band (as well as the fused best band-pair) selection process with more realistic testing data and additional target type (human skin):

  1.

    In [1], we only tested the FT and the best band selection methods using a small background signature library (that does not contain sensor noise) as a proof of concept. In this paper, we have tested our new methods with real HSI imagery cube data collected by a HYDICE sensor (imagery of the Washington DC Mall scene; see Footnote 3);

  2.

    In this paper, the receiver operating characteristic (ROC) performance curves have been newly estimated to quantitatively compare detection performance between different target types, different feature domains, and between the best single spectral band and the best fused band-pair;

  3.

    In addition to the green and beige CARCs used as targets in [1], we have also tested human skin detection with our new methods. Human skin detection has recently become a more popular topic for reliable human and pedestrian detection, tracking, identification, and behavior/activity estimation. As shown in one of our recent papers [12], reliable human skin detection is critical for human full body and body-part (head, arms, torso, and legs) detection and tracking, as well as pose and activity estimation. In [12], we showed that with an RGB camera, it is difficult to distinguish yellowish skin from clothes with yellow or reddish colors. In this paper, we show that we can reliably detect the skin signature with a few best single spectral bands and best fused band-pairs using the FT and the best band selection methods.

One motivation to use the low spectral range (visible to near IR spectrum) is that the cost of HSI equipment will be much lower. If the CARC and the skin detections can be conducted using a few low spectral bands, we may use a low cost single broadband panchromatic imager or a high-definition (HD) RGB camera with a few color filters to accomplish the tasks. This will further reduce the equipment cost and improve the image spatial and temporal resolution. In general, the HD RGB camera and the panchromatic imager have much higher spatial and temporal resolution than an HSI. As discussed in another recent SPIE conference paper [13], we have shown a new way to improve HSI spatial resolution (to deal with HSI sub-pixel un-mixing problem) with HSI sharpening using a high-resolution RGB camera and a panchromatic imager.

This paper is organized in the following way: a target insertion method is discussed in Sect. 2, the FT method in Sect. 3, automated best band selection process with double-threshold scheme in Sect. 4, automated cross-spectrum band-pair fusion process in Sect. 5, quantitative detection performance evaluation with ROC curve estimation in Sect. 6, detection performance evaluation with real HSI imagery background in Sect. 7, and finally we discuss and summarize the paper in Sect. 8.

2 Target Insertion Method

We would like to test CARC and human detection performance under many different background environments and weather conditions. However, accomplishing this task is very costly and time consuming. Alternatively, we can test target detection by inserting the target signature into different available hyper-spectral scene imagery cubes. Such a target insertion tool allows us to test complex target detection and identification performance under many different target conditions (resolved or un-resolved) and background environmental conditions.

Un-resolved (point) target insertion (implant) methods have been well developed. Originally, point target insertion methods were developed for single broadband IR sensors by inserting a point target into a single pixel [14] or by inserting the optical point spread function (PSF) into a 3 × 3 or 5 × 5 pixel area in the background scenes [15]. The Rotman–Bar Tal Algorithm (RBTA) developed in [14] has been adapted to the analysis of hyperspectral imagery in [7]. The authors of [8] have further fine-tuned and improved the RBTA method to account for target blurring (PSF) and pixel phasing (e.g., center, corner, or edge pixel phases). In this paper, we aim at inserting an extended (resolved) target with a relatively large target area (7 × 7 or 11 × 11) for HSI sensor spatial–spectral noise (see Footnote 4) estimation, and thus we do not need to deal with the un-resolved target pixel phasing problem.

There are three technical issues with the target insertion: (1) The target (e.g., CARC) signatures are measured in laboratories as reflectance, while the material signatures from the HSI imagery are usually measured as radiance. Therefore, we need a way to convert the target reflectance to radiance or, vice versa, to convert the whole scene radiance to reflectance. (2) The target signatures measured in laboratories are generally sampled at uniform intervals (e.g., 1 or 3 nm intervals), while the HSI imagery along the spectral dimension is generally sampled at larger non-uniform intervals (e.g., 5 or 10 nm intervals) depending on the HSI hardware design. Therefore, we need to resample the target signatures before inserting the target into the imagery scenes. (3) When we insert the target signature into pixels of a small area (replacing the original background in that area), we must preserve the original spatial–spectral sensor noise in that area with high fidelity.
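
Issue (2) can be illustrated with a minimal sketch. It assumes plain linear interpolation onto the sensor band centers; the paper does not state which resampling method was used, and a real sensor may call for convolution with each band's spectral response instead. The function name `resample_signature` and the wavelength grids are hypothetical.

```python
import numpy as np

def resample_signature(lab_wl_nm, lab_refl, sensor_wl_nm):
    """Linearly interpolate a lab reflectance signature onto sensor band centers.

    Assumption: linear interpolation is an adequate stand-in for the
    sensor's true spectral response; this is not the paper's stated method.
    """
    lab_wl_nm = np.asarray(lab_wl_nm, dtype=float)
    lab_refl = np.asarray(lab_refl, dtype=float)
    sensor_wl_nm = np.asarray(sensor_wl_nm, dtype=float)
    # Refuse to extrapolate beyond the measured wavelength range.
    if sensor_wl_nm.min() < lab_wl_nm.min() or sensor_wl_nm.max() > lab_wl_nm.max():
        raise ValueError("sensor bands fall outside the measured range")
    return np.interp(sensor_wl_nm, lab_wl_nm, lab_refl)

# Synthetic example: a lab signature sampled every 1 nm (a linear ramp)
# resampled onto a few non-uniformly spaced sensor band centers.
lab_wl = np.arange(400.0, 2501.0, 1.0)
lab_refl = 0.2 + 0.0001 * (lab_wl - 400.0)
sensor_wl = np.array([400.0, 407.5, 417.0, 1000.0, 2500.0])
resampled = resample_signature(lab_wl, lab_refl, sensor_wl)
```

The resampled vector has one value per sensor band and can then be used as tgt(λ) on the sensor's own spectral grid.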

In this study, we used the QUick Atmospheric Correction (QUAC) algorithm [16, 17, 18, 19] for the 400–2500 nm VNIR-SWIR spectral range to convert the whole hyper-spectral imagery radiance to reflectance (see Footnote 5). The QUAC algorithm performs this task well (some examples are presented in Fig. 21b in Sect. 7). We then developed a three-step process to insert the target reflectance into a small region of interest (ROI) area (e.g., 7 × 7 or 11 × 11) in the imagery scene:

  1.

    Find a homogeneous small area in the scene with the same background material (e.g., the green grass, or the asphalt roads): ROI(x, y, λ);

  2.

    Estimate the spatial–spectral noise:

    $$ ROI\_noise\left( {x,y,\lambda } \right) = ROI\left( {x,y,\lambda } \right) - ROI\_mean\left( \lambda \right), $$
    (1)

    where \( ROI\_mean\left( \lambda \right) \) is the spatial mean across (x, y) spatial ROI area at each spectral band λ as expressed below:

    $$ ROI\_mean\left( \lambda \right) = \frac{1}{n \cdot m}\mathop \sum \limits_{x = 1}^{n} \mathop \sum \limits_{y = 1}^{m} ROI\left( {x,y,\lambda } \right), $$

    where n and m are the ROI spatial sizes n × m.

  3.

    Finally, we add the target reflectance signature tgt(λ) to the spatial–spectral noise in the ROI area:

    $$ ROI\_insert\left( {x,y,\lambda } \right) = ROI\_noise\left( {x,y,\lambda } \right) + tgt\left( \lambda \right). $$
    (2)
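
The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration of Eqs. (1) and (2) on synthetic data, not the authors' tool; the function name `insert_target`, the array layout (rows, columns, bands), and the synthetic ROI are assumptions.

```python
import numpy as np

def insert_target(roi, tgt):
    """Replace the ROI background mean with a target signature, keeping the noise.

    roi: array of shape (n, m, bands), reflectance of a homogeneous ROI.
    tgt: array of shape (bands,), target reflectance on the same band grid.
    """
    roi = np.asarray(roi, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    roi_mean = roi.mean(axis=(0, 1))   # ROI_mean(lambda): spatial mean per band
    roi_noise = roi - roi_mean         # Eq. (1): spatial-spectral noise
    roi_insert = roi_noise + tgt       # Eq. (2): target plus preserved noise
    return roi_insert, roi_noise

# Synthetic 7 x 7 ROI with 80 bands: a smooth background plus small noise.
rng = np.random.default_rng(0)
bands = 80
background = np.linspace(0.05, 0.45, bands)
roi = background + 0.01 * rng.standard_normal((7, 7, bands))
tgt = np.full(bands, 0.30)             # hypothetical flat target signature
roi_insert, roi_noise = insert_target(roi, tgt)
```

By construction, the spatial mean of the inserted ROI equals tgt(λ) at every band, while the pixel-to-pixel deviations are exactly those of the original ROI.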

Figure 1 shows laboratory measured reflectance signatures used as tgt(λ) for beige CARC, green CARC Type II paints, and human skin (forearm) in the spectral range of 400–2500 nm. The left-side image in Fig. 2 shows a HYDICE HSI imagery data cube of the D.C. Capitol Hill area, and the right-side image shows an RGB image of the same area with a higher spatial resolution from Google Earth. The HYDICE imagery spatial size is 304 × 400, i.e., 121,600 pixels (spectral vectors). The full spectral range of this HYDICE sensor is 400–2500 nm. For most of the test cases in this paper, we only used part of the full sensor spectral range (400–1000 nm sampled into 80 spectral bands) for our CARC and human skin detection tests.

Fig. 1 Laboratory measured reflectance signatures for beige CARC, green CARC and human skin

Fig. 2 HYDICE HSI imagery data cube in the D.C. Capitol Hill area

As shown in Fig. 3, the green CARC has been inserted into an 11 × 11 (121 pixels) ROI on a grass background area, and the beige CARC and the skin have been inserted into two 7 × 7 (49 pixels) ROIs on grass and asphalt surface background areas, respectively, using Eqs. (1) and (2). Figure 3a–c show the three inserted target reflectance signatures with added sensor spatial–spectral noise (see Footnote 6). The spatial–spectral noise estimated using Eq. (1) from the 11 × 11 grass ROI area (121 pixels) is shown in Fig. 4.

Fig. 3 High-fidelity beige CARC, green CARC, and human skin insertion (the reflectance values are multiplied by a scale factor of 10,000 by the ENVI tool process)

Fig. 4 The 121 sensor spatial–spectral noise vectors from the 11 × 11 grass ROI area

In [1], we used the target inserted scene (Fig. 3d) to test detection performance using the conventional methods: MF, ACE, CEM, and SAM. In this paper, as will be discussed in Sect. 7, the estimated sensor spatial–spectral noise variance shown in Fig. 4 is used for setting the double-threshold interval for the best spectral band selection process when using the realistic background clutter for target detection. Figure 4 shows that the noise variances are higher within the spectral range 750–1400 nm at the 11 × 11 grass ROI area.

In addition to pure spatial–spectral noise, the variance shown in Fig. 4 also contains within-class material (grass in this case) variability. Therefore, the inserted target signature contains a small portion of the grass variance. In general, this will worsen the target detection performance, since the within-class material variability causes the inserted target spectral signatures to deviate from the true target signature that is used as the reference during the target detection process. Nevertheless, as shown in the detection results in [1], we can still obtain quite good detection performance, indicating that the impact of the within-class material (grass in this case) variability on the detection performance is relatively small.

To reduce the within-class material variability, one should try to find an ROI area with a highly homogeneous, flat surface. For example, an asphalt surface is better than a grass area, and a grass area is better than a tree area.

3 Feature Transformation Method

In our tests, we have used signatures for Green and Beige CARCs in original measured reflectance value units against a small background material signature library containing 433 reflectance signatures that are frequently encountered in our natural environment, such as paints with different colors, green trees and forests (Douglas Fir, Big Leaf Maple, Jasper Ridge), grass, rubber, pine wood, plate window glass, various metals, silt, shingle, roof tile, granite, concrete, brick, sand, tar, asphalt, limestone, clouds, snow, and water, etc.

Some example signatures from the small signature library are shown in Fig. 5. The green paints and green gloss paints signatures are shown in Fig. 5a as the green and blue curves, respectively. Green trees and forests (Douglas fir, Big Leaf maple, and Jasper Ridge) signatures are shown as red curves, various metals are shown as the yellow curves, and shingle is shown as the cyan curves in Fig. 5a. In Fig. 5b, concrete and brick are shown as red, snow as green, water as blue, sand as black, limestone as cyan, and asphalt and tar as yellow curves.

Fig. 5 Some example signatures from the small signature library (Color figure online)

The original beige CARC, green CARC, and human skin reflectance signatures are shown in Fig. 6a, while the original reflectance signatures of the 433 background signatures, which cover more than 100 different materials, are shown in Fig. 6c. It is seen from Fig. 6a, c that in the original spectral feature domain, the beige CARC, green CARC, and skin reflectance intensities have large overlaps with those of the background materials/objects across the whole spectral range (400–2500 nm). On the other hand, by observing the CARC, skin, and background signature curves, we noticed that the curve slopes are quite different at some spectral bands, and thus we have conducted an FT by differentiating the originally measured reflectance features. The differentiating operation can be considered as the 1st level Haar wavelet high-pass filtering, obtained by convolving the reflectance signatures with the Haar high-pass filter along the spectral dimension. The 2-tap Haar high-pass filter is expressed as: \( Haar\_hp\left( \lambda \right) = \left[ {1, - 1} \right]. \)
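
As a minimal sketch of this transformation, the spectral differentiation can be written with NumPy. It assumes the unnormalized filter [1, −1] given above and drops the first band, so a signature with B bands yields B − 1 differentiated values; how the paper assigns wavelength labels to the differentiated bands is not stated, and the function name and sample values are hypothetical.

```python
import numpy as np

HAAR_HP = np.array([1.0, -1.0])  # 2-tap Haar high-pass filter from the text

def differentiate_spectra(refl):
    """First difference along the spectral (last) axis.

    refl: array of shape (..., bands). Returns shape (..., bands - 1),
    where element k is refl[..., k + 1] - refl[..., k].
    """
    return np.diff(np.asarray(refl, dtype=float), axis=-1)

# A short synthetic signature with a steep slope between its 2nd and 3rd bands.
sig = np.array([0.10, 0.12, 0.20, 0.19])
d_sig = differentiate_spectra(sig)

# The same result via explicit convolution with the Haar high-pass filter.
d_conv = np.convolve(sig, HAAR_HP, mode="valid")
```

Both routes give the same values; `np.diff` is used in the helper because it extends directly to whole signature libraries or image cubes through the `axis` argument.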

Fig. 6 Original and differentiated spectral signatures for CARC, skin, and background

The differentiated reflectance signatures of the beige CARC, green CARC, and skin are shown in Fig. 6b, while the differentiated reflectance signatures of the 433 background materials are shown in Fig. 6d. It is seen that by differentiating the original spectral features, the CARC signature values, at some spectral bands (e.g., the spectral 600–700 nm region for the beige CARC and the spectral 1150–1250 nm region for the green CARC), are separated away from background signature values. Figure 7 shows two examples at spectral bands 650 and 640 nm for the beige CARC. The reflectance and differentiated-reflectance values for both the beige CARC and background in Fig. 7 have been normalized with the scale that sets the maximum beige CARC values to 1 (see Footnote 7).

Fig. 7 Original and differentiated reflectance values for beige CARC and background (Color figure online)

As shown in Fig. 7a, b, in the original reflectance domain, the beige CARC values (the five red circles; we have five beige CARC measurements) at both 650 and 640 nm overlap heavily with the 433 background values (the blue circles). On the other hand, as shown in Fig. 7c, d, in the differentiated reflectance domain, the five beige CARC values (the red circles) at both 650 and 640 nm are completely separated from the background values (the blue circles), leading to perfect detection results (e.g., if we set a differentiated reflectance threshold at 0.7).

Similarly, as shown in Fig. 8a, b, in the original reflectance domain, the green CARC values (the fifty red circles; we have fifty green CARC measurements) at both 1180 and 1190 nm overlap heavily with the 433 background values (the blue circles). On the other hand, as shown in Fig. 8c, d, in the differentiated reflectance domain, the fifty green CARC values (the red circles) at both 1180 and 1190 nm are completely separated from the background values (the blue circles).

Fig. 8 Original and differentiated reflectance values for green CARC and background (Color figure online)

4 Automated Best Band Selection Process with Double-Threshold Scheme

4.1 Target Detection with Single-, and Double-Threshold

For a physical image sensor, the sensing errors are mainly caused by the measurement noise, which is generally described as a random variable (RV). In 1938, Wold proposed and proved a theorem [20] that gives some insight into how a physical measurement can always be decomposed into two components: a deterministic component and a random noise component. Wold’s Fundamental Theorem states that any stationary discrete-time stochastic process, x(n), may be expressed in the form:

$$ x\left( n \right) = u\left( n \right) + s\left( n \right), $$
(3)

where u(n) and s(n) are uncorrelated processes, u(n) is a RV, and s(n) is a deterministic process.

In general, for target detection (e.g., CARC) with an HSI camera, the sensor and environmental measurement noise contribute to the u(n) component, and the target and background clutter (CARC, cars, trees, buildings, roads, etc.) intensities (see Footnote 8) contribute to the s(n) component. To increase the probability of detection (Pd), we must reduce the influence of the noise. Figure 9 shows an example of probability density functions (PDF) of target and clutter noise with assumed normal distribution. In Fig. 9, the influence of noise can be decreased by reducing the noise variances (\( \sigma^{2} \)) and/or by increasing the distance (\( d = m_{t} - m_{c} \)) between the means of the two intensity RVs related to the target and the clutter.

Fig. 9 PDFs of target and clutter noise (Color figure online)

The reduced noise variance and/or the increased intensity distances will increase the signal to clutter noise ratio (SCNR) and thus lead to a better ROC performance, that is, a higher Pd for the same probability of false alarm (Pfa). The distance d may be increased by applying a MF (CEM, ACE, or SAM) detection algorithm. With a single-threshold scheme, the ROC curve performance is obtained by moving the threshold level (the green bar with arrowhead in Fig. 9). When we reduce the threshold intensity level (move the threshold level from right to left in Fig. 9), both Pd and Pfa will increase.

As discussed above with Fig. 9, the conventional detection scheme uses a single threshold between the target PDF and clutter PDF to estimate the Pd and Pfa. As shown in Fig. 9, one may presume that the clutter has a lower (or higher) reflectance intensity level than the target. However, in reality, there are many different types of clutter in the scene. Some of them may have intensity levels lower and some others may have intensity levels higher than the target. For example, for the HSI scene shown in Fig. 2, the asphalt roads have intensity levels lower than the target (the green CARC), while the white/bright roof-tops have intensity levels higher than the target. Figure 10 illustrates this situation. In this case, if the traditional single threshold is set at a lower intensity level shown as the left-side green bar with an arrowhead in Fig. 10, then all the clutter 2 will be detected as false detections, resulting in very high Pfa. In one of the authors’ previous publications [11], a double-threshold process was proposed to reduce Pfa. As shown in Fig. 10, the second threshold (shown as the right-side green bar with an arrowhead) will reduce the false detections caused by the clutter 2.

Fig. 10 PDFs of two clutters and one target (Color figure online)

However, as noted in [11], this double-threshold process has the additional burden of requiring a priori knowledge of the target intensity range. This may be an issue for detection tasks using conventional EO or IR sensors under different weather and environmental conditions. Nevertheless, for HSI sensors, there are well developed radiometric and atmospheric correction algorithms and methods [16, 17, 18, 19], such as QUAC, FLAASH, and FLAASH-IR, to convert the measured target radiance under various environmental conditions to the invariant target reflectance or emissivity with high fidelity. Therefore, the target intensity range at different spectral bands should be available for us to apply the double-threshold scheme.

Accordingly, for the symmetric Gaussian PDF, to estimate the ROC curve performance, we vary the interval between the two green arrow-bars, keeping the target mean (\( m_{t} \)) at the center of this interval as shown in Fig. 10. When we increase this interval, both Pd and Pfa will increase.

4.2 Automated Best Band Selection Process

Here are the algorithm steps for the automated best band selection process with double-threshold scheme:

  1.

    Differentiate the background scene reflectance imagery Bckgnd(x, y, λ) and the target reflectance signature tgt(n, λ) along the spectral dimension, and obtain: d_Bckgnd(x, y, λ) and d_tgt(n, λ), where n is the number of available target signatures;

  2.

    Set the double-threshold at each spectral band as:

    $$ \begin{aligned} & Mean\left( {d\_tgt\left( {n,\lambda } \right)} \right) - g \cdot STD\left( {d\_tgt\left( {n,\lambda } \right)} \right) < d\_Bckgnd\left( {x,y,\lambda } \right) < Mean\left( {d\_tgt\left( {n,\lambda } \right)} \right) \\ & \quad + g \cdot STD\left( {d\_tgt\left( {n,\lambda } \right)} \right), \\ \end{aligned} $$
    (4)

    where the mean value (along the n dimension) of d_tgt(n, λ) corresponds to the \( m_{t} \) value shown in Fig. 10, STD(d_tgt(n, λ)) is the standard deviation (along the n dimension) of the differentiated target signature values at a specific spectral band, and g is a constant;

  3.

    Estimate the number of pixels from d_Bckgnd(x, y, λ) that fall into the double-threshold interval at each spectral band;

  4.

    Sort the estimated pixel numbers along the spectral dimension in ascending order to rank each spectral band from the best to the worst for target detection.
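
Steps 2 to 4 above can be sketched as follows, taking already differentiated inputs. This is an illustrative implementation under stated assumptions, not the authors' code: the background is flattened to shape (pixels, bands), the standard deviation is the population form (NumPy's default), the interval bounds are strict as in Eq. (4), and the function name `rank_bands` and the tiny synthetic arrays are hypothetical.

```python
import numpy as np

def rank_bands(d_bckgnd, d_tgt, g):
    """Rank bands by how many background samples fall inside the double threshold.

    d_bckgnd: (pixels, bands) differentiated background values.
    d_tgt:    (n, bands) differentiated target signatures.
    Returns (order, counts): band indices from best to worst, and the
    per-band number of background samples inside the interval of Eq. (4).
    """
    d_bckgnd = np.asarray(d_bckgnd, dtype=float)
    d_tgt = np.asarray(d_tgt, dtype=float)
    mean_t = d_tgt.mean(axis=0)
    std_t = d_tgt.std(axis=0)                 # population STD (assumption)
    lower, upper = mean_t - g * std_t, mean_t + g * std_t
    inside = (d_bckgnd > lower) & (d_bckgnd < upper)   # Eq. (4), per band
    counts = inside.sum(axis=0)               # step 3: false detections per band
    order = np.argsort(counts, kind="stable") # step 4: ascending = best first
    return order, counts

# Tiny synthetic example: 4 background samples, 2 target signatures, 3 bands.
d_bckgnd = np.array([[0.0, 0.50, 0.90],
                     [0.1, 0.52, 0.10],
                     [0.2, 0.48, 0.95],
                     [-0.1, 0.50, 0.20]])
d_tgt = np.array([[0.9, 0.45, 0.9],
                  [1.1, 0.55, 1.1]])
order, counts = rank_bands(d_bckgnd, d_tgt, g=2.0)
```

In this toy case band 0 captures no background samples and ranks first, band 2 captures two, and band 1 captures all four, so the ranking is band 0, band 2, band 1.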

The constant scale g in Eq. (4) is critical to set the double-threshold interval. For example, if the target RV distribution is Gaussian with g = 3 setting, then the double-threshold interval is equal to 6σ (where σ = STD(d_tgt(n, λ))). Based on Gaussian PDF property, the 6σ interval covers more than 99.7 % of the Gaussian PDF area, indicating Pd > 99.7 % for this double-threshold interval.

From the standard normal distribution PDF, we have estimated the Pd (the covered Gaussian PDF area) as a function of the σ number from 2 to 10, as shown in Fig. 11. It is seen that the Pd is very close to 100 % when the double-threshold interval is larger than 5.5 σ.
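
The relationship between the interval width and Pd for a Gaussian target can be checked numerically. The sketch below assumes a normal target distribution centered in the interval, so that an interval of total width kσ (k = 2g) covers the fraction erf(k / (2√2)) of the PDF; the function name is hypothetical.

```python
import math

def pd_from_interval(width_in_sigma):
    """Gaussian PDF area covered by a centered interval of the given total width.

    width_in_sigma: total double-threshold interval in units of sigma (2 * g).
    """
    half_width = width_in_sigma / 2.0          # this is g
    return math.erf(half_width / math.sqrt(2.0))

# Print Pd for a few interval widths, including the 6-sigma case (g = 3).
for width in (2.0, 4.6, 5.5, 6.0):
    print(f"{width:.1f} sigma -> Pd = {100.0 * pd_from_interval(width):.2f} %")
```

For the 6σ interval (g = 3) this gives about 99.73 %, in line with the figure quoted above, and a 4.6σ interval (g = 2.3) gives roughly 97.9 %.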

Fig. 11 Pd versus double-threshold interval at different σ number

In the detection tests shown in Figs. 7 and 8, we only have 5 (n = 5) measured beige CARC signatures and 50 (n = 50) measured green CARC signatures (and 10 measured human skin signatures). These numbers may not be large enough to obtain a reliable STD(d_tgt(n, λ)). On the other hand, we have more than 400 background signatures with multiple materials (see Footnote 9). In general, we have:

$$ STD\left( {d\_Bckgnd\left( {x,y,\lambda } \right)} \right) > STD\left( {d\_tgt\left( {n,\lambda } \right)} \right). $$
(5)

Accordingly, we use a conservative alternative by substituting STD(d_tgt(n, λ)) with STD(d_Bckgnd(x, y, λ)) in Eq. (4).

Similarly, the automated best band selection process can be applied in the original reflectance feature domain by setting the double-threshold at each spectral band as:

$$ Mean\left( {tgt\left( {n,\lambda } \right)} \right) - g \cdot STD\left( {tgt\left( {n,\lambda } \right)} \right) < Bckgnd\left( {x,y,\lambda } \right) < Mean\left( {tgt\left( {n,\lambda } \right)} \right) + g \cdot STD\left( {tgt\left( {n,\lambda } \right)} \right). $$
(6)

In the differentiated feature domain with Eq. (4), we have applied the automated best band selection process across a spectral range of 420–1420 nm with 201 bands and with the double-threshold interval set at g = 2.3 for CARC (g = 1.4 for human skin), and have successfully ranked the 201 bands from the best to the worst. The double-threshold interval was shown in Figs. 7 and 8 as the interval between the two horizontal green lines at different spectral bands. The top eight bands for the beige CARC with the false detection numbers are shown in Table 1, and those for the green CARC and human skin are shown in Tables 2 and 3, respectively.

Table 1 Band ranking results for beige CARC (g = 2.3)
Table 2 Band ranking results for green CARC (g = 2.3)
Table 3 Band ranking results for human skin (g = 1.4)

As indicated by the curve in Fig. 11, for a 4.6σ threshold interval (g = 2.3), we achieve Pd > 96 %. From Tables 1 and 2, it is seen that we achieve Pfa = 0 % at band 650 nm for the beige CARC, and at band 1180 nm (or 1190, 1205, 1210 nm) for the green CARC. For the same 4.6σ threshold interval, we have also applied the best band selection process for the original reflectance data using Eq. (6) for comparison. We obtain 108 false detections out of the 433 (Pfa = 25 %) background signatures at the best band 705 nm for the beige CARC, and obtain 299 false detections out of the 433 (Pfa = 69 %) background signatures at the best band 1365 nm for the green CARC. These results indicate that the FT method by differentiating the original spectral reflectance can significantly improve detection performance by reducing the Pfa.

As discussed in Sect. 1, we prefer to use low cost HSI cameras whose upper spectral limit is below 1000 nm. Accordingly, we have applied the automated best band selection process across a spectral range of 420–1020 nm with 121 sampled bands and with the double-threshold interval set at g = 1.5 in the differentiated feature domain using Eq. (4) for the green CARC. The resulting top eight bands with the false detection numbers are shown in Table 4. It is seen that we achieve Pfa = 0 % at band 760 nm for the green CARC. The original and differentiated reflectance values of the green CARC and background are illustrated in Fig. 12 for the top two bands at 760 and 775 nm.

Table 4 Band ranking results for green CARC (g = 1.5)
Fig. 12 Original and differentiated reflectance values for green CARC and background

5 Automated Cross-Spectrum Band-Pair Fusion Process

To further improve the detection performance in the lower spectral range (<1000 nm), we have developed an automated cross-spectrum fusion process that uses multivariate analysis to find spectral band pairs that are less correlated with each other. We show that a fused spectral band pair can considerably improve Pd and reduce false detections compared with the use of a single spectral band alone. The two fused bands occupy a 2D space (u, v). There are several ways to extend the 1D double-threshold to a 2D threshold bound:

  (a)

    Apply the 1D double-threshold for each band separately, and form a rectangular bound in the 2D space as the 2D threshold;

  (b)

    Use the larger 1D double-threshold from the two bands (\( \lambda_{u} \) and \( \lambda_{v} \)) as the diameter to form a circular 2D threshold:

    $$ \begin{aligned} v & = \pm \sqrt {\left( {g \cdot STD\left( {d\_tgt\left( {n,\lambda_{m} } \right)} \right)} \right)^{2} - \left( {u - m_{u} } \right)^{2} } + m_{v} , \\ \lambda_{m} & = \left\{ {\begin{array}{ll} {\lambda_{u} } & {if\,STD\left( {d\_tgt\left( {n,\lambda_{u} } \right)} \right) > STD\left( {d\_tgt\left( {n,\lambda_{v} } \right)} \right)} \\ {\lambda_{v} } & Otherwise \\ \end{array} } \right., \\ \end{aligned} $$
    (7)

    where m u and m v are the target mean values at these two bands;

  3. c)

    Use the 2D separable Gaussian function to form an elliptical 2D threshold:

    $$ v = \pm \sqrt {\left( {g \cdot STD\left( {d\_tgt\left( {n,\lambda_{v} } \right)} \right)} \right)^{2} - \left( {\frac{{STD\left( {d\_tgt\left( {n,\lambda_{v} } \right)} \right)}}{{STD\left( {d\_tgt\left( {n,\lambda_{u} } \right)} \right)}}} \right)^{2} \left( {u - m_{u} } \right)^{2} } + m_{v} $$
    (8)

Equation (8) is derived from the separable normal distribution function in the 2D space (u, v) with the correlation coefficient ρ set to zero. It is seen that Eq. (7) is a special case of Eq. (8) where \( STD\left( {d\_tgt\left( {n,\lambda_{u} } \right)} \right) = STD\left( {d\_tgt\left( {n,\lambda_{v} } \right)} \right) \).
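As a rough illustration of the three 2D threshold bounds, the following Python sketch tests whether points (u, v) fall inside the rectangular bound of option (a), the circular bound of Eq. (7), or the elliptical bound of Eq. (8). The function name `in_2d_threshold` and its argument names are hypothetical and do not come from the original work; `s_u` and `s_v` stand for the target STDs at the two bands, and both are assumed to be positive.

```python
import numpy as np

def in_2d_threshold(u, v, m_u, m_v, s_u, s_v, g, kind="ellipse"):
    """Return a boolean mask of the points (u, v) inside the 2D threshold bound.

    m_u, m_v : target mean values at the two bands
    s_u, s_v : target STDs at the two bands (assumed > 0)
    g        : threshold scale
    kind     : "rectangle" (option a), "circle" (Eq. 7) or "ellipse" (Eq. 8)
    """
    du = np.asarray(u, dtype=float) - m_u
    dv = np.asarray(v, dtype=float) - m_v
    if kind == "rectangle":
        # 1D double-threshold applied to each band separately
        return (np.abs(du) <= g * s_u) & (np.abs(dv) <= g * s_v)
    if kind == "circle":
        # radius g*STD of the band with the larger STD, as in Eq. (7)
        radius = g * max(s_u, s_v)
        return du**2 + dv**2 <= radius**2
    if kind == "ellipse":
        # semi-axes g*s_u and g*s_v, the implicit form of Eq. (8)
        return (du / (g * s_u))**2 + (dv / (g * s_v))**2 <= 1.0
    raise ValueError("kind must be 'rectangle', 'circle' or 'ellipse'")
```

Under these assumptions the ellipse lies inside both the rectangle and the circle, so a background point can be counted as a false detection by the rectangular or circular bound and still be rejected by the elliptical one.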

Here are the algorithm steps for the automated cross-spectrum fusion process:

  1. Select several top best bands (e.g., top 10–15) after running the automated best band selection process. Some examples of the top eight best bands were shown in Tables 1, 2 and 3;

  2. Pair each of the selected top bands with all the other available bands;

  3. In the 2D space (u, v) for each band pair, select one 2D threshold type from the three methods discussed above;

  4. Estimate the number of pixels from d_Bckgnd(x, y, λ) that fall into the 2D threshold area for each band pair;

  5. Sort the estimated pixel numbers along all the estimated spectral pair combinations in ascending order to rank each spectral band pair from the best to the worst for false detection.
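The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rank_band_pairs`, the array layout (one row per signature, one column per band), and the use of the population STD from `numpy.std` are all hypothetical choices made for illustration.

```python
import numpy as np

def rank_band_pairs(d_tgt, d_bckgnd, top_bands, g, kind="circle"):
    """Rank band pairs by the number of background pixels inside the 2D bound.

    d_tgt     : (n_tgt, n_bands) differentiated target signatures
    d_bckgnd  : (n_pix, n_bands) differentiated background signatures
    top_bands : band indices kept from the single-band ranking (step 1)
    Returns a list of (false_count, u, v) tuples, best pair first.
    """
    m = d_tgt.mean(axis=0)  # target mean per band
    s = d_tgt.std(axis=0)   # target STD per band
    n_bands = d_tgt.shape[1]
    ranked, seen = [], set()
    for u in top_bands:                  # step 2: pair with all other bands
        for v in range(n_bands):
            key = (min(u, v), max(u, v))
            if v == u or key in seen:    # skip self-pairs and repeated pairs
                continue
            seen.add(key)
            du = d_bckgnd[:, u] - m[u]
            dv = d_bckgnd[:, v] - m[v]
            if kind == "rectangle":      # step 3: choose a 2D threshold type
                inside = (np.abs(du) <= g * s[u]) & (np.abs(dv) <= g * s[v])
            elif kind == "circle":       # Eq. (7)
                inside = du**2 + dv**2 <= (g * max(s[u], s[v]))**2
            elif kind == "ellipse":      # Eq. (8), rearranged to avoid division
                inside = (du * s[v])**2 + (dv * s[u])**2 <= (g * s[u] * s[v])**2
            else:
                raise ValueError("unknown threshold kind")
            # step 4: count background pixels inside the bound
            ranked.append((int(inside.sum()), int(u), int(v)))
    ranked.sort()                        # step 5: ascending false detections
    return ranked
```

Sorting the tuples puts the pair with the fewest false detections first; ties are broken by band index, which is an arbitrary choice of this sketch.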

We have applied the automated cross-spectrum fusion process for the beige and green CARC across a spectral range of 420–1420 nm with 201 bands, with the threshold scale set at g = 4.4 and a circular 2D threshold as expressed in Eq. (7). The top six band-pairs for the beige CARC with their false detection numbers are shown in Table 5, and those for the green CARC are shown in Table 6. For the human skin with g = 3.3, the top six band-pairs with their false detection numbers are shown in Table 7. Based on the results in Table 4, we have also applied the automated cross-spectrum fusion process across a smaller spectral range of 420–1020 nm with 121 bands, with the threshold scale set at g = 2.1 and a circular 2D threshold. The resulting top six band-pairs for the green CARC with their false detection numbers are shown in Table 8. By comparing Tables 1 and 2 with Tables 5 and 6, it is seen that, with similar false detection numbers, the use of the fused band-pair can considerably increase the target–clutter separation from 4.6σ to 8.8σ.

Table 5 Band pair ranking results for beige CARC (g = 4.4)
Table 6 Band pair ranking results for green CARC (g = 4.4)
Table 7 Band pair ranking results for human skin (g = 3.3)
Table 8 Band pair ranking results for green CARC (g = 2.1)

The results of the top band-pair 650/620 nm in Table 5 (beige CARC) are plotted in Fig. 13a, b. The green rectangle indicates the 2D rectangular threshold, the cyan circle the 2D circular threshold, and the magenta ellipse the 2D elliptical threshold. Figure 13a shows the 2D threshold results in the original spectral domain using Eq. (6). It is seen that almost all of the 433 background signatures fall inside the 2D threshold bounds, i.e., they are false detections. On the other hand, in the differentiated spectral domain using Eq. (4), as shown in Fig. 13b, some false detections fall inside the upper-left corner of the rectangular threshold bound, but none fall inside the circular and elliptical threshold bounds.

Fig. 13 2D threshold of 650/620 nm and 650/645 nm pairs for beige CARC (g = 4.4) (Color figure online)

The results of the band-pair 650/645 nm (beige CARC) are plotted in Fig. 13c, d. It is interesting to see from Fig. 13d that 11 false detections fall inside the circular threshold bound, but only 2 fall inside the elliptical threshold bound. In general, the elliptical threshold performs better than the circular threshold, while the circular threshold performs better than the rectangular threshold. Finally, results of the best band-pair 1205/1200 nm in Table 6 (green CARC), the best band-pair 440/590 nm in Table 7 (human skin), and the best band-pair 760/680 nm in Table 8 (green CARC) are illustrated in Figs. 14, 15 and 16, respectively.

Fig. 14 2D threshold of 1205/1200 nm pair for green CARC (g = 4.4)

Fig. 15 2D threshold of 440/590 nm pair for human skin (g = 3.3)

Fig. 16 2D threshold of 760/680 nm pair for green CARC (g = 2.1)

6 Quantitative Detection Performance Evaluation with ROC Curve Estimation

As discussed in Sect. 4.1, the conventional ROC estimation method with a single-threshold scheme can be extended to estimate ROC curves using the double-threshold scheme. For a symmetric Gaussian noise PDF, we estimate the ROC curve by gradually increasing or decreasing the g constant in Eq. (4), which changes the double-threshold interval (\( 2 g \cdot STD\left( {d\_tgt\left( {n,\lambda } \right)} \right) \)) as illustrated in Fig. 10. When this interval is increased, both Pd and Pfa increase.
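The sweep over g can be sketched as below for a single differentiated band. This is only an illustrative sketch: the function name `double_threshold_roc` is hypothetical, and both Pd and Pfa are estimated here as empirical fractions of the samples that fall inside the interval, which is an assumption and may differ from how the curves in this paper were computed (for example, Pd could instead be taken from the Gaussian model).

```python
import numpy as np

def double_threshold_roc(d_tgt, d_bckgnd, g_values):
    """Estimate (Pfa, Pd) pairs by sweeping the double-threshold scale g.

    d_tgt    : 1D array of differentiated target values at one band
    d_bckgnd : 1D array of differentiated background values at the same band
    g_values : increasing sequence of threshold scales
    """
    d_tgt = np.asarray(d_tgt, dtype=float)
    d_bckgnd = np.asarray(d_bckgnd, dtype=float)
    m = d_tgt.mean()   # centre of the double-threshold interval
    s = d_tgt.std()    # target STD; the interval is m +/- g*s
    pd, pfa = [], []
    for g in g_values:
        half_width = g * s
        # fraction of target samples inside the interval (detections)
        pd.append(np.mean(np.abs(d_tgt - m) <= half_width))
        # fraction of background samples inside the interval (false alarms)
        pfa.append(np.mean(np.abs(d_bckgnd - m) <= half_width))
    return np.array(pfa), np.array(pd)
```

Because the interval only grows as g increases, both returned arrays are non-decreasing, which matches the behaviour described above.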

The estimated ROC curves of beige CARC for the best original band (705 nm), the best differentiated band (650 nm), and the best band-pair (650/620 nm) are plotted in Fig. 17a, b, and those for green CARC and human skin are plotted in Figs. 18 and 19, respectively. For example, as shown in Fig. 19 (human skin), the blue curve is the detection performance for the best original band at 495 nm estimated using Eq. (6). The red curve (in both Fig. 19a, b) is the detection performance for the best single differentiated band at 440 nm estimated using Eq. (4), and the green curve is the detection performance for the best differentiated fused band-pair at 440/590 nm. From Fig. 19, for Pd = 95 %, detection with the best original band at 495 nm (the blue curve) results in a high Pfa (>30 %), while detection with the best differentiated single band at 440 nm (the red curve) results in a low Pfa (=0.23 %). The best performance is obtained with the differentiated fused band-pair at 440/590 nm (the green curve), which results in Pfa = 0.

Fig. 17 ROC curves for the best original band, differentiated band, and band-pair (beige CARC)

Fig. 18 ROC curves for the best original band, differentiated band, and band-pair (green CARC)

Fig. 19 ROC curves for the best original band, differentiated band, and band-pair (human skin)

7 Detection Performance Evaluation with Real HSI Imagery Background

So far, we have only tested the FT and the best band selection methods as a proof of concept, using a small background signature library that contains many frequently encountered background materials. However, this signature library does not contain sensor noise, and we do not have enough measured target signatures for reliable estimation of the target statistics. In this section, we further test our new methods with real HSI imagery cube data collected by a HYDICE sensor, and we also use the real spatial–spectral noise estimated from an 11 × 11 ROI area of this imagery cube, as shown in Fig. 4 in Sect. 2. These 121 spectral noise vectors provide a relatively reliable estimate of the target noise σ at different spectral bands.

The 150 × 100 scene area used for the tests is shown in Fig. 20a. The 11 × 11 grass ROI area containing the spatial–spectral noise shown in Fig. 4 is also within this scene area. As discussed in Sect. 2, the CARC and skin signatures are measured in laboratories as reflectance, while the material signatures from the HSI imagery are measured as radiance. Here we have used the QUAC process (contained in the ENVI tool) [16] to convert the whole hyper-spectral imagery from radiance to reflectance. Figure 20b shows the 15,000 converted reflectance signatures from all the pixels in the imagery shown in Fig. 20a.Footnote 10 The smoothness of all 15,000 reflectance signature curves in Fig. 20b indicates that the atmospheric correction (QUAC) process has been executed successfully. Had ill-conditioning occurred, the resulting signature curves would be very noisy, with spikes and discontinuities.

Fig. 20 The 15,000 real HSI imagery background signatures from a 150 × 100 area

The QUAC algorithm performs well on this task, as the example in Fig. 21 shows. The red curve in Fig. 21a is the radiance signature of a tree pixel from the original imagery shown in Fig. 20a. Its converted relative reflectance signature (multiplied by a scale factor of 10,000, an ENVI process routine) is shown in Fig. 21b as the red curve. By comparing these two red spectral curves, it is seen that the deep atmospheric absorption dips in the radiance curve have been properly corrected.

Fig. 21 Original radiance signature and converted reflectance signature from a tree pixel (the red curves) (Color figure online)

For comparison, we have taken two reflectance signatures (the Jasper Ridge and the fir tree) from the small reflectance signature library with 433 signatures used in the previous section, and plotted them in Fig. 21b as the green curve (the Jasper Ridge) and the blue curve (the fir tree).Footnote 11 It is seen that the three curves are similar, and the curve differences may be caused by different tree types.

In these tests, we only used a spectral range of 400–1000 nm with 80 bands. As in the previous sections, we used three target signatures: green CARC, beige CARC, and human skin. Each of these three target signatures tgt(λ) is then added to the 121 ROI noise vectors (Fig. 4) to obtain tgt(n, λ) where n = 121, as expressed in Eq. (2). Figure 22a, b show the 121 original and differentiated spectral signatures for green CARC, while Fig. 22c, d show the 15,000 original and differentiated signatures for the background.Footnote 12
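The construction of the noisy target signatures can be sketched as follows. Several points here are assumptions rather than details taken from this section: the noise vectors are assumed to be the ROI pixels with the per-band ROI mean removed, and the differentiation is assumed to be a first difference between adjacent bands, which may not match the exact form of Eq. (4). The function names are hypothetical.

```python
import numpy as np

def roi_residuals(roi_pixels):
    """Assumed noise estimate: ROI spectra minus the per-band ROI mean.

    roi_pixels : (n, n_bands) spectra from the ROI (n = 121 for an 11 x 11 ROI)
    """
    roi_pixels = np.asarray(roi_pixels, dtype=float)
    return roi_pixels - roi_pixels.mean(axis=0)

def make_noisy_targets(tgt, roi_noise):
    """Add each noise vector to the single target signature tgt(lambda).

    tgt       : (n_bands,) target signature
    roi_noise : (n, n_bands) noise vectors
    Returns an (n, n_bands) array of noisy target signatures tgt(n, lambda).
    """
    return np.asarray(tgt, dtype=float)[np.newaxis, :] + np.asarray(roi_noise, dtype=float)

def differentiate(signatures):
    """Assumed differentiation: first difference between adjacent bands."""
    return np.diff(np.asarray(signatures, dtype=float), axis=-1)
```

With these helpers, the per-band target STD used to set the double-threshold interval would be taken over the rows of `differentiate(make_noisy_targets(tgt, noise))`.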

Fig. 22 Original and differentiated spectral signatures for green CARC and background

In the differentiated feature domain with Eq. (4), we have applied the automated best band selection process using the STD estimated from the 121 target signatures to set the double-threshold interval, and have successfully ranked the 80 bands from the best to the worst. The top eight bands for the beige CARC with their false detection numbers are shown in Table 9, and those for the green CARC and human skin are shown in Tables 10 and 11, respectively.
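A minimal sketch of this single-band ranking is given below, assuming the same array layout as before (one row per signature, one column per differentiated band). The name `rank_single_bands` is hypothetical, and counting background pixels inside the interval mean ± g·STD is a simplified reading of the double-threshold scheme rather than the exact procedure of Sect. 4.

```python
import numpy as np

def rank_single_bands(d_tgt, d_bckgnd, g):
    """Rank bands by the number of background pixels inside the 1D double-threshold.

    d_tgt    : (n_tgt, n_bands) differentiated target signatures
    d_bckgnd : (n_pix, n_bands) differentiated background signatures
    Returns a list of (band_index, false_count) tuples, best band first.
    """
    d_tgt = np.asarray(d_tgt, dtype=float)
    d_bckgnd = np.asarray(d_bckgnd, dtype=float)
    m = d_tgt.mean(axis=0)   # target mean per band
    s = d_tgt.std(axis=0)    # target STD per band, sets the interval width
    # a background pixel is a false detection if it lies within m +/- g*s
    inside = np.abs(d_bckgnd - m) <= g * s
    false_counts = inside.sum(axis=0)
    # stable sort keeps the original band order among ties
    order = np.argsort(false_counts, kind="stable")
    return [(int(b), int(false_counts[b])) for b in order]
```

The first few entries of the returned list would then play the role of the top bands that are passed on to the band-pair fusion step.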

Table 9 Band ranking results for beige CARC (g = 15)
Table 10 Band ranking results for green CARC (g = 3)
Table 11 Band ranking results for human skin (g = 5)

By comparing these results with the results from the previous sections (Table 9 compared to Table 1, Table 10 to Table 4, and Table 11 to Table 3), we notice that we have obtained very similar top spectral bands even though we have used entirely different background sets: a small signature library (with 433 signatures) versus a real HSI background area (with 15,000 signatures).

We have applied the automated cross-spectrum fusion process for the three target signatures across a spectral range of 400–1000 nm with 80 bands, using a circular 2D threshold as expressed in Eq. (7). The top six fused band-pairs for the beige CARC with their false detection numbers are shown in Table 12, and those for the green CARC and human skin are shown in Tables 13 and 14, respectively. By comparing these results with the results from the previous sections (Table 12 compared to Table 5, Table 13 to Table 8, and Table 14 to Table 7), we notice that we have obtained similar top fused spectral band-pairs even though we have used entirely different background sets.

Table 12 Band pair ranking results for beige CARC (g = 35)
Table 13 Band pair ranking results for green CARC (g = 8)
Table 14 Band pair ranking results for human skin (g = 12)

The results of the top two differentiated band-pairs 640/632 and 640/649 nm in Table 12 (beige CARC) are plotted in Fig. 23a, b. The red circles are the 121 target values and the blue squares are the 15,000 background values. Figures 24 and 25 show the fused band-pair plots for the green CARC and human skin, respectively.

Fig. 23 2D threshold of 640/632 nm and 640/649 nm pairs for beige CARC (g = 35) (Color figure online)

Fig. 24 2D threshold of 759/686 nm and 759/705 nm pairs for green CARC (g = 8)

Fig. 25 2D threshold of 601/580 nm and 443/601 nm pairs for human skin (g = 12)

It is worth pointing out that the g values related to the double-threshold interval are larger (corresponding to larger Pd) for the results in this section than for the results in the previous section. The main reason is that in this section we used the realistic spatial–spectral sensor noise to estimate the target STD, whereas in the previous section we used a conservative substitution of the background library STD. We also notice for the results in this section that the g value for the green CARC is smaller than those for the other two target signatures. This is related to the larger noise variance in the spectral range above 750 nm, as shown in Fig. 4.

8 Discussion and Summary

In this paper, we present results for detecting beige CARC, green CARC, and human skin. A target insertion method has been developed. This tool allows one to insert the target reflectance or radiance into any HSI sensor scene while still preserving the sensor spatial–spectral noise in the inserted ROI area. We have developed an FT algorithm that transforms the original spectral features to a different feature domain. We have shown that by differentiating the original spectral bands, we can considerably increase the statistical distance between the target and background clutter PDFs, leading to better performance for beige CARC, green CARC, and human skin detection.

One problem with the current state-of-the-art detection methods (e.g., MF, ACE, CEM, and SAM) is that they need to use all the available spectral bands (high spectral dimensions). Many bands in the spectral range may be highly dependent on one another and carry redundant information. In this paper, as discussed in Sect. 4, we have developed an automated best spectral band selection process. This process ranks the available spectral bands from the best to the worst for target detection with a double-threshold scheme, so that only a few of the best spectral bands (low spectral dimension) are needed to obtain better detection performance with faster processing times.

To further improve the detection performance, we have developed an automated cross-spectrum fusion process that uses multivariate analysis to find spectral band-pairs that are less correlated with each other. As discussed in the results from Sect. 5, the use of the fused band-pair can considerably increase the target–clutter separation from 4.6σ (when using the best single band) to 8.8σ (when using the best fused band-pair) for the CARC signatures. For a target with Gaussian random noise, an 8.8σ Gaussian PDF window means that the Pd is very close to 100 %. Based on the current algorithm design, it is straightforward to extend the spectral band fusion process from 2D to 3D by fusing three different bands for further detection improvements.
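The statement about Pd can be checked with a short calculation. For a Gaussian target value with mean m and standard deviation σ, the probability of falling inside a symmetric window m ± gσ is erf(g/√2). The sketch below evaluates this with the Python standard library; the function names are hypothetical. How the 8.8σ figure maps to g is an interpretation: one plausible reading, since Tables 5 and 6 use g = 4.4, is that 8.8σ is the full interval 2g·STD, so that g = 4.4, while the other reading is a half-width of 8.8σ.

```python
import math

def gaussian_window_pd(g):
    """Probability that a Gaussian sample lies within mean +/- g*sigma."""
    return math.erf(g / math.sqrt(2.0))

def gaussian_window_miss(g):
    """Complementary probability 1 - Pd, computed with erfc for accuracy at large g."""
    return math.erfc(g / math.sqrt(2.0))

# For g = 1 this gives the familiar value of about 0.683.
# Both readings of the 8.8-sigma window give a Pd very close to 1.
for g in (1.0, 4.4, 8.8):
    print(g, gaussian_window_pd(g), gaussian_window_miss(g))
```

Under either reading the miss probability is far below 0.01 %, which is consistent with the statement that Pd is very close to 100 %.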

In Sect. 6, we presented a way to evaluate detection performance quantitatively with ROC curve estimation, by extending the conventional ROC estimation method with a single-threshold scheme to ROC curve estimation using our double-threshold scheme. The estimated ROC curves for all three target signatures indicate that the FT method can considerably improve target detection performance using a few best spectral bands, and that the best fused band-pair selection process can further improve detection performance over the use of a single best band alone.

In Sect. 7, we have tested our new FT method using more realistic background signatures with real spatial–spectral sensor noise and non-uniform spectral sampling that depends on the sensor hardware. The D.C. scene imagery cube, as shown in Fig. 2, was used. The originally measured background radiance signatures were first converted to reflectance, and then used with the FT method and the best band selection processes. The detection results in Sect. 7 show that we can obtain detection performance similar to that obtained in the previous sections, where a small signature library was used as background for proof of concept, which further validates the new target detection methods developed in this paper.

In this paper, we have shown that CARC and human skin detection can be significantly improved with a spectral band differentiation FT method because the target signature curves have large slope changes in certain spectral bands. Reliable CARC detection is critical for distinguishing military vehicles from civilian vehicles, and also for timely warning of potential chemical or biological (war) activity. Reliable human skin detection is important for human full-body and body-part detection, tracking, and activity/behavior estimation. In general, the spectral band differentiation FT method should work well for targets with large slope changes in their signature curves, but not for targets with flat signature curves (small slope changes). For example, the Tyvek signature discussed in [13] is quite flat. Alternatively, we may apply a spectral band integral FT method to improve detection performance for targets with flat spectral signature curves. Our ongoing efforts include applying the new detection methods to other types of targets such as Tyvek, chemicals related to home-made explosives, cars, and vessels, and testing different types of FTs (spectral differentiation or integral, wavelet filtering, PCA, local cosine functions, etc.) depending on the target signature properties.

In [1], we have used several conventional HSI target detection methods such as MF, ACE, CEM, and SAM to detect the inserted green and beige CARC. If we use the whole 400–2500 nm spectral vector for detection, the conventional detection methods can still perform quite well with high Pd and low Pfa, as demonstrated in [1] and [13]. However, the performance became worse, with lower Pd and higher Pfa, when we reduced the spectral range to 400–1000 nm. In this paper, we aim to use only a few better spectral bands for target detection. For an apples-to-apples comparison, if we further limit the spectral range for the conventional detection methods to, e.g., 400–500, 500–600, or 600–700 nm, their performance will certainly worsen further.

One motivation for using the low spectral range (visible to near IR spectrum) is that HSI hardware with a reduced spectral range costs less. In this paper, we have demonstrated that CARC with different colors can be reliably detected with a few low spectral bands. For example, as shown in Tables 1 and 5 for the beige CARC detection results, the best band 650 nm (4.6σ separation) and the best band-pair 650/620 nm (8.8σ separation) can be used for very reliable beige CARC detection. Accordingly, we may design and build a low cost dedicated CARC (or human skin) detector with higher spatial and temporal resolution by using a low cost, high-resolution panchromatic imager or an HD RGB camera with a few color filters.