
1 Part I Principles of Matched Filtering

This section presents a compact tutorial exposition of the principles of matched filtering for the biological scientist. The matched filter is a concept from communications and radar theory (Van Trees 1968; Whalen 1971; Kay 1998; Papoulis 1965) that has been applied widely to various other applications in science and engineering (Carpranica and Moffat 1983; Wehner 1987; Clark et al. 1999, 2000, 2009; Waltz and Llinas 1990; Clark 1999; Jazwinski 1970; Candy 2006). It is a statistical signal processing algorithm designed for detecting the existence of a desired signal that is buried in a noisy measurement signal. In general and for our purposes, a “signal” can be interpreted to be a scalar or multidimensional construct; e.g., a time series, an image, a three-dimensional volume, a video sequence, etc. The fundamental mathematical approaches are common to all of these modalities. For tutorial purposes, we focus here on the fundamental matched filtering approach for a time series. The literature in matched filtering is vast, and a full understanding of the concept requires a great deal of study. This section endeavors to encapsulate the most important principles of matched filtering so as to aid the biologist in processing experimental data. The concepts are introduced with the idea that the reader can consult the referenced literature for in-depth treatments.

Imagine that we are conducting a scientific experiment involving a physical process that generates a noisy discrete-time temporal signal (time series). Our goal is to make a judgment or decision about whether or not the noisy measured signal contains a particular desired signal component of interest to us. We can say that we wish to detect the desired signal component. More specifically, imagine that we have a real noisy measured discrete-time signal

$$ x(n)=a\left(n-{n}_0\right)+\nu (n)\kern1em \left(\mathrm{Noisy}\kern0.5em \mathrm{Measurement}\kern0.5em \mathrm{Signal}\right) $$
(5.1)

where \( a\left(n-{n}_0\right) \) is a time-delayed version of the desired signal a(n) that we wish to detect, \( \nu (n) \) is noise (the undesired component of the signal), n denotes the discrete time index, n = 0, 1, 2, …, N − 1, \( n_0 \) is the time delay, and the sampling interval T is normalized to one (i.e., T = 1 s, so nT = n). We assume that the arrival time \( n_0 \) of the desired signal \( a\left(n-{n}_0\right) \) is unknown to the user.

Perhaps the simplest way to detect the signal a(n) in the noisy measurement x(n) is to choose various values of a threshold γ and compare the amplitude values of x(n) to that threshold at each value of time n. This is often called a “threshold detector” that uses the raw signal x(n) as the decision statistic (the quantity we compare to the threshold). At each time instant n, if x(n) is less than the threshold γ, then we declare that a(n) is not present in the measurement at time n. If x(n) is greater than or equal to the threshold γ, then we declare that a(n) is present in the measurement at time n. Of course, our declarations will vary as we vary the threshold value. This section discusses methods for dealing with the various declarations and measuring the performance of the detection algorithm.
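The raw-signal threshold detector just described can be sketched in a few lines of Python. This is only an illustration: the function name `threshold_detect` and the sample values are hypothetical and do not come from the example data used later in this section.

```python
def threshold_detect(x, gamma):
    """Return one declaration per time sample: True means 'a(n) present'.

    The raw sample x(n) is the decision statistic; we declare 'present'
    when x(n) >= gamma and 'not present' when x(n) < gamma.
    """
    return [sample >= gamma for sample in x]


# Hypothetical measurement values, used only to show how the
# declarations change as the threshold is varied.
x = [0.1, -0.3, 0.9, 1.4, 0.2]
for gamma in (0.5, 1.0):
    print(gamma, threshold_detect(x, gamma))
```

Raising the threshold reduces the number of "present" declarations, which is the tradeoff that the performance measures developed later are meant to quantify.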

Note that the terms “detection” and “classification” are often used interchangeably, for good reason. Detection theory is often regarded as a subset of classification theory. Classification generally describes methods for multidimensional hypothesis testing, but detection theory was originally developed for scalar (one dimensional) hypothesis testing. The distinction is not nearly as important as understanding their common underlying concepts.

Imagine now a scenario in which we have prior information about the desired signal a(n). In the signal processing world, we always welcome prior knowledge because we are often able to incorporate it into our processing algorithms to give us an advantage. In many applications (especially in communication and radar systems), we have available exemplars of the signal a(n) we wish to detect. In radar, for example, we have a pulse generator and antenna system that create waves in the form of a transient pulse. These waves propagate through a channel, reflect from a physical target, propagate back through the channel, and are measured by the antenna and radar system. In this scenario, we have prior knowledge of the transmitted transient pulse used to interrogate the target, because we generated it ourselves. We can, of course, generate additional exemplars of the transmitted pulse. The issue now is whether or not we can use this prior knowledge to help us detect such a transient pulse in a measured waveform. The idea behind the correlation detector is that we can.

1.1 The Correlation Detector

The term “correlation detector” refers to a signal detection algorithm that cross-correlates the measured data with a replica or exemplar of the desired signal. It is also sometimes called a “correlator” or a “replica correlator.” The basic idea behind the correlation detector is to use the mathematical operation called cross-correlation to scan the measured signal x(n) with an exemplar of a(n). The cross-correlation result is \( R_{ax}(k) \), a time waveform that is a function of the time delay k between a(n) and x(n) during the correlation process. The premise of the correlation detector is that the cross-correlation waveform will be large when the measurement x(n) contains nonzero a(n) and small when it does not. The correlation detector uses the cross-correlation waveform \( R_{ax}(k) \) (or a function of \( R_{ax}(k) \)) as the decision statistic in a threshold detection algorithm. The hope is that using \( R_{ax}(k) \) as the decision statistic will give better detection performance than that which would be obtained by using the raw signal x(n) as the decision statistic. This hoped-for result is generally realized in practice. Prior knowledge is very helpful.

1.2 Section Organization

In the remainder of this section, we introduce the matched filter by examining an example detection problem and showing the steps in the detection process. We show that the matched filter is really another name for the correlation detector and why. We then step through the process of detecting a signal buried in a noisy measurement and develop the appropriate measures for evaluating detector performance. We show that the matched filter can be an effective detection tool when exemplars of the desired signal are available a priori.

2 An Example Detection Problem

The data for an example event signal detection problem are depicted in Fig. 5.1. The top signal is a transient “event waveform” a(n) representing a physical event that we wish to detect. In this particular case, this event is a dissolver acid time series from a chemical processing plant. The middle signal is a delayed version of the transient waveform with delay equal to \( n_0 \). The bottom signal denotes the noisy measurement signal \( x(n)=a\left(n-{n}_0\right)+\nu (n) \). The noise \( \nu (n) \) is statistically white, zero mean and Gaussian distributed with noise variance \( {\sigma}_{\nu}^2=0.599 \). We denote this as \( \nu (n)\sim N(0,\kern0.28em {\sigma}_{\nu}^2) \). Clearly, the noise power is fairly large, so the desired event signal a(n) is significantly obscured in the measurement. This leads us to define the concept of signal-to-noise ratio.

Fig. 5.1

Simulated measurement signal x(n) created for our matched filtering example: (Top) Exemplar of the transient event waveform a(n) we wish to detect in the noisy measurement. (Middle) Time-delayed version \( a\left(n-{n}_0\right) \) of the transient waveform. (Bottom) Measured event plus noise signal \( x(n)=a\left(n-{n}_0\right)+\nu (n) \). The signal-to-noise ratio is SNR = 20 dB. The magnitude units are arbitrary and the temporal sampling period T = 1

2.1 Define the Signal-to-Noise Ratio (SNR)

Consider a noisy measured signal as described in Eq. 5.1. In a nice theoretical simulation experiment, we can easily know the desired signal a(n) and the noise \( \nu (n) \) separately, because we create them ourselves. However, in many real-world experiments, we can measure only the sum in Eq. 5.1. In some experiments, the measured signal consists of pre-event noise (before a signal event occurs) followed in time by the sum of the event signal a(n) and the noise \( \nu (n) \). This occurs in, for example, seismic event signals. In this case, we can cut out a section of pre-event noise and compute its noise variance. In some rare applications, we have available a noiseless signal a(n) before the physical system corrupts it (e.g., in communications and radar).

Let us assume for now that we have prior knowledge that allows us to separate the desired signal from the noise. In general, we define the signal-to-noise ratio (SNR) as follows (Kay 1998; Candy 2006):

$$ \mathrm{S}\mathrm{N}\mathrm{R}\triangleq \frac{\mathrm{Signal}\kern0.24em \mathrm{energy}}{\mathrm{Noise}\kern0.24em \mathrm{variance}}\kern1em \left(\mathrm{Signal}\hbox{-} \mathrm{t}\mathrm{o}\hbox{-} \mathrm{noise}\kern0.24em \mathrm{ratio}\right) $$
(5.2)
$$ \triangleq \frac{E_a}{\sigma_{\nu}^2} $$
(5.3)

where the energy in signal a(n) is given by

$$ {E}_a\triangleq {\displaystyle \sum_{n={n}_0}^{n_1}}{a}^2(n) $$
(5.4)

and we calculate the energy in signal a(n) over the time interval between appropriate time indices \( n_0 \) and \( n_1 \). We denote the noise variance by \( {\sigma}_{\nu}^2 \). We can express the SNR in the commonly used units of decibels (dB) by applying the following definition:

$$ \mathrm{SNR}(\mathrm{dB})\triangleq 10{\log}_{10}\left[\frac{E_a}{\sigma_{\nu}^2}\right] $$
(5.5)
$$ =10{\log}_{10}[R],\kern1em \mathrm{where}\kern0.24em R\triangleq {E}_a/{\sigma}_{\nu}^2 $$
(5.6)

Note that in a simulation experiment, once we have computed \( E_a \) for our particular signal and we know our desired SNR(dB), we can solve for the noise variance required to achieve that SNR(dB). If we define Q as follows, then we have:

$$ Q\triangleq \mathrm{S}\mathrm{N}\mathrm{R}(\mathrm{d}\mathrm{B})/10 $$
(5.7)
$$ R={10}^Q $$
(5.8)
$$ {\sigma}_v^2={E}_a/R $$
(5.9)

Consider a numerical example: Let \( E_a=4.2632 \) and the desired SNR(dB) = 40. Then, we see that Q = 4, \( R={10}^4 \), and \( {\sigma}_{\nu}^2={E}_a/R=4.2632\times {10}^{-4} \). Note that in applications such as the seismic event signal described earlier, we can define an approximate SNR, call it SNRE, that consists of the ratio of the energy in the measured signal x(n) and the noise variance (Clark and Rodgers 1981). This is one way in which we can cope with the lack of prior knowledge.
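The steps in Eqs. 5.7 through 5.9 are easy to script when setting up a simulation. The sketch below uses only the Python standard library. The function names and the fixed random seed are illustrative assumptions, and the inputs are the energy and SNR(dB) values stated in the numerical example.

```python
import math
import random


def signal_energy(a):
    """Energy E_a = sum of a(n)^2 over the samples supplied (Eq. 5.4)."""
    return sum(v * v for v in a)


def noise_variance_for_snr(energy, snr_db):
    """Noise variance that yields the requested SNR in dB (Eqs. 5.7-5.9)."""
    q = snr_db / 10.0      # Eq. 5.7
    r = 10.0 ** q          # Eq. 5.8
    return energy / r      # Eq. 5.9


def add_white_gaussian_noise(a, noise_var, seed=0):
    """Return a(n) + v(n) with v(n) ~ N(0, noise_var).

    The seed is an assumption made here only so that runs are repeatable.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(noise_var)
    return [v + rng.gauss(0.0, sigma) for v in a]


sigma2 = noise_variance_for_snr(4.2632, 40.0)
print(sigma2)                                # noise variance for 40 dB
print(10.0 * math.log10(4.2632 / sigma2))    # recovers about 40 dB
```

Working backward from a target SNR(dB) in this way lets us generate an ensemble of noisy measurements at a controlled noise level.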

3 The Matched Filter Detector

The term matched filter is another name for the correlation detector. The fundamental principle of matched filtering is to exploit the prior knowledge we have about the desired signal of interest a(n) to build a correlation detector. One might ask the question, “Then why do we call the correlation detector a matched filter?” The answer lies in the meaning of the mathematical operation of correlation, as we show next.

3.1 Convolution and Filtering

Given two discrete-time signals a(n) of length M samples and x(n) of length N samples, we can define the convolution y(n) of the two signals as follows:

$$ \begin{array}{rl}y(n)& =a(n)*x(n)=\sum_{k=-\mathrm{\infty}}^{\mathrm{\infty}}a(k)x(n-k)\\ {}& =x(n)*a(n)=\sum_{k=-\mathrm{\infty}}^{\mathrm{\infty}}x(k)a(n-k)\kern1em (\mathrm{C}\mathrm{o}\mathrm{n}\mathrm{v}\mathrm{o}\mathrm{l}\mathrm{u}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n})\end{array} $$
(5.10)

We see that convolution is commutative. The convolution operation can be interpreted as “flipping” (reversing) one of the two signals in time, then sliding it in time across the other signal and multiplying each of the values of the two signals together at each time sample and summing the products (Oppenheim and Schafer 1975). The time-reversal operation is described mathematically by x(−n) and a(−n). The key concept is that a linear filtering operation in the time domain can be written as a convolution summation (Oppenheim and Schafer 1975). Thus, if we filter a signal x(n) with a linear filter impulse response a(n), then that filtering operation is written as a convolution of the form in Eq. 5.10. Note that the convolution of a signal of length N with a signal of length M has length \( N_c=N+M-1 \).
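For finite-length signals, the convolution sum can be evaluated directly. The following is a minimal sketch, with the hypothetical name `convolve`, that treats both signals as zero outside their stored samples. It is meant to illustrate the commutativity and output-length properties rather than to be an efficient implementation.

```python
def convolve(a, x):
    """Full linear convolution y(n) = sum_k a(k) x(n - k).

    Both inputs are treated as zero outside their stored samples, so the
    output has len(a) + len(x) - 1 samples.
    """
    y = [0.0] * (len(a) + len(x) - 1)
    for i, a_i in enumerate(a):
        for j, x_j in enumerate(x):
            y[i + j] += a_i * x_j   # a(i) x(j) contributes to output n = i + j
    return y


a = [1.0, 2.0]          # length M = 2
x = [1.0, 1.0, 1.0]     # length N = 3
print(convolve(a, x))   # N + M - 1 = 4 output samples
print(convolve(a, x) == convolve(x, a))
```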

3.2 Correlation vs. Convolution

The correlation of two real discrete-time signals a(n) of length M samples and x(n) of length N samples is written as follows:

$$ \begin{array}{rlr}{R}_{ax}(n)& =a(n)*x(-n)=\sum_{k=-\mathrm{\infty}}^{\mathrm{\infty}}a(k)x(n+k)& \\ {}& =x(n)*a(-n)=\sum_{k=-\mathrm{\infty}}^{\mathrm{\infty}}x(k)a(n+k)={R}_{xa}(n)& (\mathrm{Correlation})\end{array} $$
(5.11)

We see from this equation that correlation is commutative, and the correlation operation can be written in terms of the convolution operation. If we do the convolution operation without reversing one of the two signals in time, then we get the correlation operation. We recall that the convolution operation reverses one of the signals in time before sliding it across the other signal. If we reverse one of the signals before doing the convolution, then the convolution reverses it again, so the result is an operation with a signal that has been reversed twice. This is equivalent to the correlation operation.
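The relationship between correlation and convolution can be checked numerically. In the sketch below, the cross-correlation is computed two ways: directly, by sliding the exemplar across the measurement without reversing it, and indirectly, by convolving the measurement with a time-reversed exemplar. The function names, the restriction to the lags at which the exemplar lies fully inside the measurement, and the sign convention for the lag are assumptions made for this illustration. Other texts and software packages index the lag differently.

```python
def convolve(a, x):
    """Full linear convolution of two finite-length sequences."""
    y = [0.0] * (len(a) + len(x) - 1)
    for i, a_i in enumerate(a):
        for j, x_j in enumerate(x):
            y[i + j] += a_i * x_j
    return y


def correlate_direct(x, a):
    """r(k) = sum_n x(n + k) a(n) for lags k = 0, ..., N - M."""
    n, m = len(x), len(a)
    return [sum(x[k + i] * a[i] for i in range(m)) for k in range(n - m + 1)]


def correlate_via_convolution(x, a):
    """Same lags, obtained by convolving x(n) with the reversed exemplar a(-n)."""
    m = len(a)
    full = convolve(x, a[::-1])
    return full[m - 1:len(x)]   # keep the N - M + 1 fully overlapping lags


x = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
a = [1.0, 2.0, 1.0]
print(correlate_direct(x, a))
print(correlate_via_convolution(x, a))
```

Both calls print the same sequence, and its largest value occurs at the lag where the exemplar lines up with its copy inside x(n).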

3.3 The Matched Filter

We are ready now to see why we call the correlation detector a matched filter. If we interpret the exemplar signal a(n) to be the impulse response of a linear filter, then if we convolve a(n) with x(n), we are filtering the measurement x(n) with a filter matched to the signal of interest a(n). We have chosen a filter impulse response that is matched to the desired signal of interest a(n). Because correlation is equivalent to convolution (with one of the signals flipped in time), we can think of the correlation operation as a filtering operation. Because the chosen filter impulse response is a signal matched to the desired signal, we call this filtering operation matched filtering. Of course, the term “matched filter” does not explicitly mention the fact that the filter output is used as the test statistic in a threshold detector, so the reader must infer that information without help from the name.

Figure 5.2 depicts the general block diagram for a matched filter detector, showing the filter, the test statistic, and the threshold test. In order to maximize detector performance and enable the proper measurement of performance, it can be shown (Kay 1998) that the threshold test must be conducted at the time sample corresponding to the largest value (peak) of the test statistic r[y(n)]. Letting n* denote the time index corresponding to the peak of the test statistic, we conduct the threshold test at r[y(n*)].

Fig. 5.2

General matched filter block diagram. Letting the time sample at which the test statistic is maximum be denoted by n*, note that the threshold test is conducted at the peak r[y(n*)] of the test statistic r. Ideally, \( n^{*}=n_0 \). The decision threshold is denoted by γ

3.4 Example: Applying the Matched Filter to Our Example Signals

Figure 5.3 depicts an example of a correlation detector scheme. We see that the scheme consists of a cross-correlation operation, a test statistic and a thresholding operation. Note the plots depicting the various signals at each step of the scheme. Note the very low noise level in the plot of the absolute value of the cross-correlation. This demonstrates a key property of the matched filter, that the matched filter maximizes the SNR at the output of the filter/correlator. We discuss the processing scheme in detail in the following sections.

Fig. 5.3

Block diagram for the example problem. Letting the correlation lag index at which the test statistic is maximum be denoted by k*, note that the threshold test is conducted at the peak of the test statistic, \( r\left[{R}_{xa}\left({k}^{*}\right)\right]=\left|{R}_{xa}\left({k}^{*}\right)\right| \)

4 Bayesian Binary Hypothesis Testing and Performance Measurement

Bayesian detection theory provides a rigorous foundation for evaluating detector performance (Whalen 1971; Van Trees 1968). Assume that we have a noisy one-dimensional (scalar) measured discrete-time signal (time series) as in Eq. 5.1. In general, the desired signal a(n) can be deterministic or stochastic, and the noise \( \nu (n) \) is modeled as stochastic and uncorrelated (statistically white) or correlated (statistically colored). In this tutorial, we model the desired signal a(n) as deterministic (Papoulis 1965). We treat the noise as either uncorrelated or correlated. Most textbooks focus on the special case when the noise is Gaussian distributed (Whalen 1971; Kay 1998; Van Trees 1968), and for that case, the reader is directed to the references. For generality, this tutorial makes no assumptions about the form of the noise distribution.

We can use binary hypothesis testing to make decisions or declarations about whether or not the desired signal a(n) is present in the measurement. The hypothesis that a(n) is not present in measurement x(n) is denoted \( H_0 \), and the hypothesis that a(n) is present in x(n) is denoted \( H_1 \).

$$ \begin{array}{rlrl}{H}_0:x(n)& =\nu (n)& & (\mathrm{Null}\ \mathrm{Hypothesis}: \mathrm{Desired}\ \mathrm{Signal}\ \mathrm{Not}\ \mathrm{Present})\\ {}{H}_1:x(n)& =a(n-{n}_0)+\nu (n)& & (\mathrm{Alternative}\ \mathrm{Hypothesis}: \mathrm{Desired}\ \mathrm{Signal}\ \mathrm{Present})\end{array} $$
(5.12)

Notice that this problem definition assumes that we do not know in advance the arrival time \( n_0 \) of the signal of interest \( a\left(n-{n}_0\right) \). We must estimate the arrival time as part of the detection/classification process. As depicted in Figs. 5.2 and 5.3, the general matched filter includes computing a test statistic r[y(n)]. This test statistic can take many forms; however, probably the most commonly used is the absolute value of the correlation result y(n), as shown in the figure. Recall from before that this threshold test is conducted at the time sample at which the test statistic is maximum (Kay 1998). For our case, the test statistic is the absolute value of the cross-correlation. Because the correlation lag index is different from the signal time index n, we denote the lag index by k, and we denote the lag index at which the test statistic is maximum by k*. This index k* corresponds to the arrival time \( n_0 \) of the desired signal \( a\left(n-{n}_0\right) \). Expressed mathematically, we define k* as the correlation lag index that satisfies:

$$ {k}^{*}=\underset{k\in \left[0,N-M\right]}{\arg \max}\left|{R}_{xa}(k)\right| $$
(5.13)

Recall that N is the number of time samples in x(n) and M is the length of a(n). The set of k values over which to search is [0, N − M] because that is the range over which the cross-correlation is affected by the signal a(n) (Kay 1998). The decision is made when the test statistic evaluated at k* is compared with the threshold \( \gamma \):

$$ \begin{array}{rlrl}r[{R}_{xa}({k}^{*})]& \overset{H_1}{\underset{H_0}{\gtreqless }}\gamma & & (\mathrm{Decision}\ \mathrm{Rule})\end{array} $$
(5.14)
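The search for k* and the decision rule can be combined into a short routine. The sketch below is an illustration under stated assumptions: the function name `matched_filter_detect`, the five-sample exemplar, the delay, the noise level, and the threshold are all hypothetical and are not the data of Fig. 5.1.

```python
import random


def matched_filter_detect(x, a, gamma):
    """Matched filter (correlation) detector for one measurement.

    Computes the test statistic |R_xa(k)| for lags k = 0, ..., N - M,
    locates the peak lag k*, and compares the peak with gamma.
    Returns (k_star, peak_value, declare_h1).
    """
    n, m = len(x), len(a)
    stat = [abs(sum(x[k + i] * a[i] for i in range(m))) for k in range(n - m + 1)]
    k_star = max(range(len(stat)), key=lambda k: stat[k])
    return k_star, stat[k_star], stat[k_star] >= gamma


# Hypothetical exemplar buried at delay n0 in white Gaussian noise.
rng = random.Random(1)
a = [1.0, 2.0, 3.0, 2.0, 1.0]
n0, num_samples = 20, 64
x = [rng.gauss(0.0, 0.5) for _ in range(num_samples)]
for i, value in enumerate(a):
    x[n0 + i] += value

print(matched_filter_detect(x, a, gamma=10.0))
```

When the signal is present and the noise is moderate, the returned peak lag serves as the estimate of the arrival time.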

4.1 The Confusion Matrix (or Contingency Table)

Each time the hypothesis test is conducted, one of four events can occur: (1) \( H_0 \) is true and we declare \( H_0 \), (2) \( H_0 \) is true and we declare \( H_1 \), (3) \( H_1 \) is true and we declare \( H_1 \), and (4) \( H_1 \) is true and we declare \( H_0 \). The first and third alternatives correspond to correct choices. The second and fourth alternatives correspond to errors. The confusion matrix (or contingency table) depicted in Fig. 5.4 summarizes these four events, their associated probabilities, and the method for computing the four probabilities. The Bayes test assumes that there exist prior probabilities (priors) for the hypotheses and costs associated with the four courses of action. The priors \( P\left({H}_0\right) \) and \( P\left({H}_1\right) \) represent information available about the source prior to conducting the experiments. The costs for the four possible courses of action are given by \( C_{00} \), \( C_{10} \), \( C_{11} \), and \( C_{01} \), where \( C_{ij} \) is the cost of deciding \( H_i \) given that \( H_j \) is true. Once the costs have been assigned, the decision rule is based on minimizing the expected cost, which is known as the Bayes risk ℜ (Van Trees 1968; Whalen 1971; Kay 1998):

Fig. 5.4

Confusion matrix (contingency table): the two hypotheses are denoted \( H_0 \), the null hypothesis, and \( H_1 \), the alternative hypothesis. Note that for the special case in which the prior probabilities are equal \( \left(P\left({H}_0\right)=P\left({H}_1\right)=\frac{1}{2}\right) \), the probability of correct classification becomes \( {P}_{CC}=\frac{1}{2}\left[{P}_D+\left(1-{P}_{FA}\right)\right] \). Note that we construct one confusion matrix for each value of the decision threshold γ

$$ \mathrm{\Re}=\sum_{i=0}^1\sum_{j=0}^1{C}_{ij}P({H}_i|{H}_j)P({H}_j)\kern1em (\mathrm{B}\mathrm{a}\mathrm{y}\mathrm{e}\mathrm{s}\kern0.5em \mathrm{R}\mathrm{i}\mathrm{s}\mathrm{k}) $$
(5.15)

We assume throughout this discussion that the cost of an incorrect decision is higher than the cost of a correct decision. In other words, \( C_{10}>C_{00} \) and \( C_{01}>C_{11} \). Note that in the confusion matrix, the probabilities in the two columns of the matrix each sum to one. An important special case of the Bayes criterion is that in which a correct classification is assigned zero cost and an incorrect classification is assigned full cost. In this case, we assign \( C_{00}=C_{11}=0 \) and \( C_{01}=C_{10}=1 \). Inserting these values in the Bayes risk of Eq. 5.15 and using the fact that P(error) + P(correct classification) = 1, we obtain the probability of correct classification. Note that \( P_{CC} \) is the weighted sum along the main diagonal of the confusion matrix, weighted by the priors:

$$ P\left(\mathrm{Correct}\kern0.24em \mathrm{Classification}\right)={P}_{CC}=P\left({H}_1,{H}_1\right)+P\left({H}_0,{H}_0\right) $$
(5.16)
$$ =P\left({H}_1\Big|{H}_1\right)P\left({H}_1\right)+P\left({H}_0\Big|{H}_0\right)P\left({H}_0\right) $$
(5.17)

Often in practice, there exists insufficient information about an experiment to allow the user to assign the values of the prior probabilities \( P\left({H}_0\right) \) and \( P\left({H}_1\right) \). In this case, it is common to assume that the priors are equal (no information), so \( P\left({H}_0\right)=P\left({H}_1\right)=\frac{1}{2} \). Under this condition, the probability of correct classification becomes

$$ {P}_{CC}=\frac{1}{2}\left[P\left({H}_1\Big|{H}_1\right)+P\left({H}_0\Big|{H}_0\right)\right] $$
(5.18)

This can now be written in terms of the probability of detection and probability of false alarm as follows:

$$ {P}_{CC}=\frac{1}{2}\left[{P}_D+\left(1-{P}_{FA}\right)\right]\kern1em \left(\mathrm{Probability}\kern0.5em \mathrm{of}\kern0.5em \mathrm{Correct}\kern0.5em \mathrm{Classification}\right) $$
(5.19)
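In a controlled experiment, the entries of the confusion matrix are estimated by counting declarations over labeled trials. The sketch below assumes that we already hold one peak test statistic per trial, separated by ground truth into an \( H_0 \) list and an \( H_1 \) list. The function names and the small data lists are hypothetical.

```python
def confusion_matrix(stats_h0, stats_h1, gamma):
    """Estimate the four conditional probabilities for one threshold gamma.

    stats_h0: test statistics from trials in which H0 is true (noise only).
    stats_h1: test statistics from trials in which H1 is true (signal present).
    Each column (conditioned on the true hypothesis) sums to one.
    """
    p_fa = sum(s >= gamma for s in stats_h0) / len(stats_h0)   # P(H1 | H0)
    p_d = sum(s >= gamma for s in stats_h1) / len(stats_h1)    # P(H1 | H1)
    return {"P_D": p_d, "P_miss": 1.0 - p_d, "P_FA": p_fa, "P_TN": 1.0 - p_fa}


def prob_correct(p_d, p_fa, prior_h1=0.5):
    """P_CC = P(H1|H1) P(H1) + P(H0|H0) P(H0), as in Eq. 5.17."""
    return p_d * prior_h1 + (1.0 - p_fa) * (1.0 - prior_h1)


stats_h0 = [0.1, 0.2, 0.6, 0.3]
stats_h1 = [0.7, 0.4, 0.9, 0.8]
cm = confusion_matrix(stats_h0, stats_h1, gamma=0.5)
print(cm)
print(prob_correct(cm["P_D"], cm["P_FA"]))   # equal priors, Eq. 5.19
```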

4.2 Bayesian Hypothesis Testing for Multidimensional (Vector) Data

Let us now generalize our discussion. Our example problem specifies a scalar measurement signal x(n). However, in general, we can have a vector of J observations denoted as follows.

$$ \underline{X}={\left[{x}_1,{x}_2,\dots, {x}_J\right]}^T\kern1em \left(\mathrm{Observation}\kern0.5em \mathrm{Vector}\right) $$
(5.20)

The observations \( x_j \), j = 1, 2, …, J, are called features of the physical process being observed, and T denotes the vector transpose. In our example problem, the vector has one element x(n), the measurement signal. As before, we assume that the cost of an incorrect decision is higher than the cost of a correct decision. In other words, \( C_{10}>C_{00} \) and \( C_{01}>C_{11} \). Under this assumption, the detector that minimizes the Bayes risk is given by the following (Van Trees 1968):

$$ \begin{array}{r}\frac{f(\underline{X}|{H}_1)}{f(\underline{X}|{H}_0)}\overset{H_1}{\underset{H_0}{\gtreqless }}\frac{P({H}_0)({C}_{10}-{C}_{00})}{P({H}_1)({C}_{01}-{C}_{11})}\kern1em (\mathrm{B}\mathrm{a}\mathrm{y}\mathrm{e}\mathrm{s}\kern0.5em \mathrm{D}\mathrm{e}\mathrm{c}\mathrm{i}\mathrm{s}\mathrm{i}\mathrm{o}\mathrm{n}\kern0.5em \mathrm{C}\mathrm{r}\mathrm{i}\mathrm{t}\mathrm{e}\mathrm{r}\mathrm{i}\mathrm{o}\mathrm{n})\end{array} $$
(5.21)

where \( f\left(\underline{X}\Big|{H}_1\right) \) denotes the conditional probability density function (pdf) of the observation vector \( \underline{X} \) given that hypothesis \( H_1 \) is true, and \( f\left(\underline{X}\Big|{H}_0\right) \) denotes the conditional pdf of \( \underline{X} \) given that hypothesis \( H_0 \) is true (Kay 1998). The ratio of the conditional densities is called the likelihood ratio and is denoted by \( \varLambda \left(\underline{X}\right) \):

$$ \varLambda \left(\underline{X}\right)=\frac{f\left(\underline{X}\Big|{H}_1\right)}{f\left(\underline{X}\Big|{H}_0\right)}\kern1em \left(\mathrm{Likelihood}\kern0.5em \mathrm{Ratio}\right) $$
(5.22)

Because this is a ratio of two functions of a random variable, the likelihood ratio is a random variable. A very important result is that regardless of the dimensionality of the observations \( \underline{X}, \) the likelihood ratio \( \varLambda \left(\underline{X}\right) \) is a one-dimensional variable. This idea is of fundamental importance in hypothesis testing. Regardless of the dimension of the observation space, the decision space is one dimensional. The quantity on the right-hand side of the relation (5.21) is the threshold of the test and is denoted by \( \gamma \):

$$ \gamma \triangleq \frac{P\left({H}_0\right)\left({C}_{10}-{C}_{00}\right)}{P\left({H}_1\right)\left({C}_{01}-{C}_{11}\right)}\kern1.56em \left(\mathrm{Decision}\kern0.24em \mathrm{Threshold}\right) $$
(5.23)

Thus, the Bayes criterion leads to a likelihood ratio test:

$$ \begin{array}{rlr}\Lambda (\underline{X})\overset{H_1}{\underset{H_0}{\gtreqless }}\gamma & & (\mathrm{Likelihood}\ \mathrm{Ratio}\ \mathrm{Test})\end{array} $$
(5.24)

We see that the test threshold allows for weighting according to the priors and the costs. This allows the user flexibility in choosing a threshold that is best for the problem at hand. Note that if we have available the conditional pdfs or estimates of them, we can construct the confusion matrices (one for each value of the threshold) by integrating under the pdfs as depicted in Fig. 5.6. In most practical problems, we do not have the pdfs, so we construct the confusion matrices using the threshold method shown in Fig. 5.2.
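To make the likelihood ratio test concrete, the sketch below evaluates it for a scalar observation. The Gaussian conditional pdfs with a common variance, the mean value under \( H_1 \), and all function names are assumptions introduced only for illustration. The tutorial itself makes no assumption about the form of the noise distribution.

```python
import math


def bayes_threshold(p_h0, p_h1, c00, c10, c11, c01):
    """Decision threshold gamma of Eq. 5.23 from the priors and the costs."""
    return (p_h0 * (c10 - c00)) / (p_h1 * (c01 - c11))


def gaussian_pdf(x, mean, var):
    """Scalar Gaussian density, assumed here for both hypotheses."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def likelihood_ratio_test(x, gamma, mean_h1=1.0, var=1.0):
    """Return (Lambda(x), declare_h1) for H0: N(0, var) versus H1: N(mean_h1, var)."""
    lam = gaussian_pdf(x, mean_h1, var) / gaussian_pdf(x, 0.0, var)
    return lam, lam >= gamma


# Equal priors with zero cost for correct and unit cost for incorrect decisions.
gamma = bayes_threshold(0.5, 0.5, c00=0.0, c10=1.0, c11=0.0, c01=1.0)
print(gamma)
print(likelihood_ratio_test(0.6, gamma))
print(likelihood_ratio_test(0.4, gamma))
```

Raising the cost of a false alarm or the prior probability of \( H_0 \) raises the threshold, so a larger likelihood ratio is required before \( H_1 \) is declared.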

Fig. 5.5

(Left) Training data sets for controlled experiments are depicted on the left side. These include the standard labeled training and testing sets, as well as a possible blind test set that can be used if enough data are available. When using this, we can ask some unbiased users to conduct the experiment. (Right) The unlabeled test set is used after training and testing in the controlled experiments. To minimize the “leap of faith,” these data should be representative of the data in the training and testing sets

Fig. 5.6

The Receiver Operating Characteristic (ROC) curve can be constructed by integrating the conditional probability density functions depicted on the left. This, of course, assumes that the pdfs or estimates of them are available. If not, then we generally use the confusion matrix method

4.3 Training and Testing Phases of the Detection Process

In order to measure detection performance, we must be able to conduct controlled experiments in which we know the correct experimental outcomes (“ground truth”) a priori (Duda and Hart 1973; Duda et al. 2001). This is a very important point that is often overlooked. Our use of the threshold detector occurs in phases. (1) First, in the Training Phase, we conduct controlled experiments in which the ground truth is known in order to construct a Receiver Operating Characteristic (ROC) curve from which we determine the appropriate operating threshold for the detector. We use a set of known “training data” during this phase. (2) Second, in the Testing Phase, we use the selected operating threshold on a set of “testing data,” from which we produce operational detection results. The remainder of this section concentrates on the training phase. Note that the training and testing process involves a considerable “leap of faith” in which we assume that the training data are representative of the test data. Therefore, extreme care must be taken to ensure that this assumption is valid. Otherwise, interpretations of the processing results may be meaningless.

The user should carefully design the experiments so the statistical sample size (number of data samples we can use for performance evaluation) is large enough to enable the computation of meaningful confidence intervals (see the following two subsections). Figure 5.5 summarizes some rules of thumb for picking the training and testing data sets. Using ground truth, we create a labeled data set in which each data sample is labeled \( H_0 \) or \( H_1 \). Rule of Thumb 1: We generally want to have a large sample size (approximately 100 or 1,000 or more). However, the real world often does not allow us this luxury. Rule of Thumb 2: If the data are Gaussian distributed, we need a minimum of about 30 or more samples to obtain reasonable confidence interval estimates (Hogg and Craig 1978). If our sample size is small, we turn to “hold one out” and bootstrap algorithms for estimating confidence intervals (Zoubir and Iskander 2004). Even with these methods, we cannot take too seriously any results with very small sample sizes. Rule of Thumb 3: Generally, we should set aside about 60% of our labeled samples for the training phase and the other 40% for the testing phase (Hand 1981; Devijver and Kittler 1987). Hard Rule: Never test on the training set. That is cheating and leads to improper inferences. Despite warnings, many people continue to do this. Please do not become one of them.
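Rule of Thumb 3 and the Hard Rule can be enforced mechanically by splitting the labeled data once, before any training takes place. The sketch below is one possible way to do this. The function name, the (value, label) tuple layout, the per-class shuffling, and the fixed seed are assumptions, not a procedure prescribed by the references.

```python
import random


def split_labeled(samples, train_fraction=0.6, seed=0):
    """Split labeled samples into disjoint training and testing sets.

    samples: list of (value, label) pairs with label 0 for H0 and 1 for H1.
    Each class is shuffled and split separately so that both hypotheses are
    represented in both sets. No sample appears in both sets.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        cut = round(train_fraction * len(group))
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Hypothetical labeled set: 10 samples per class.
labeled = [(i, i % 2) for i in range(20)]
train_set, test_set = split_labeled(labeled)
print(len(train_set), len(test_set))
```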

4.4 The Receiver Operating Characteristic (ROC) Curve

A Receiver Operating Characteristic (ROC) curve is constructed to quantify the tradeoff between the probability of detection \( P_D \) and the probability of false alarm \( P_{FA} \) versus the detection threshold \( \gamma \), as depicted in Fig. 5.6 (Whalen 1971; Duda et al. 2001; Duda and Hart 1973; Van Trees 1968; Kay 1998). Note that this requires the construction of one confusion matrix for each value of the threshold. We vary the value of the decision threshold \( \gamma \) over the full range of values of the decision statistic r[y(n)]. For each measurement signal in the ensemble, at the time sample n*, we compare the decision statistic to the decision threshold as in Fig. 5.2. If \( r\left[y\left({n}^{*}\right)\right]<\gamma \), we declare that \( H_0 \) is true. If \( r\left[y\left({n}^{*}\right)\right]\ge \gamma \), we declare that \( H_1 \) is true.

Once we have the ROC curve, we choose the “operating threshold” \( {\gamma}^{*} \) to be the one that maximizes \( P_D\left(\gamma \right) \) and minimizes \( P_{FA}\left(\gamma \right) \). This threshold value is found at the “knee” of the ROC curve at the appropriate SNR for the experiment. Note that this knee occurs for the threshold value at which the probability of correct classification \( P_{CC} \) is maximum. We denote this by \( P_{CC}\left({\gamma}^{*}\right) \).
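The threshold sweep and the choice of operating point can be sketched as follows. We assume equal priors, so that \( P_{CC} \) takes the form of Eq. 5.19, and we take the candidate thresholds to be the observed values of the test statistic. The function names and the short lists of statistics are hypothetical.

```python
def roc_curve(stats_h0, stats_h1):
    """Sweep the threshold over the observed statistics.

    Returns one tuple (gamma, P_FA, P_D, P_CC) per threshold, i.e., one
    confusion matrix summary per value of gamma. Equal priors are assumed.
    """
    thresholds = sorted(set(stats_h0) | set(stats_h1))
    points = []
    for gamma in thresholds:
        p_fa = sum(s >= gamma for s in stats_h0) / len(stats_h0)
        p_d = sum(s >= gamma for s in stats_h1) / len(stats_h1)
        p_cc = 0.5 * (p_d + (1.0 - p_fa))
        points.append((gamma, p_fa, p_d, p_cc))
    return points


def operating_point(points):
    """Pick the knee as the threshold at which P_CC is maximum."""
    return max(points, key=lambda point: point[3])


stats_h0 = [0.1, 0.2, 0.3, 0.6]   # noise-only trials
stats_h1 = [0.4, 0.5, 0.8, 0.9]   # signal-present trials
gamma_star, p_fa, p_d, p_cc = operating_point(roc_curve(stats_h0, stats_h1))
print(gamma_star, p_fa, p_d, p_cc)
```

Plotting the \( P_D \) values against the \( P_{FA} \) values from `roc_curve` gives the ROC curve itself.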

4.5 Statistical Confidence Interval About the Probability of Correct Classification

The author believes that “the ROC curve is not finished until we have computed the confidence interval on the probability of correct classification.” Unfortunately, this last step is almost always overlooked by most practitioners. \( P_{CC} \) is not a deterministic quantity. It can be viewed as a random variable with an associated distribution.

The classifier/detector performs a random experiment, the outcome of which can be classified in one of two mutually exclusive and exhaustive ways: success or failure. Success means that the classification is correct. Failure means that the classification is incorrect. Let N equal the number of independent trials. Let \( p=P_{CC} \) be the probability of correct classification. Assume that the true value p is the same on each repetition. Let q = 1 − p be the probability of error. For the classification problem in which we conduct an experiment, we can calculate the estimated quantities in the confusion matrix. The maximum likelihood estimate of p is given by \( \widehat{p} \) and is the estimated \( {\widehat{P}}_{CC} \) computed in the confusion matrix. We can write the confidence interval about the true value of p as follows, where α, a probability, is the significance of the test.

$$ P\left\{L<p<U\right\}=1-\alpha \kern1em \left(\mathrm{Confidence}\kern0.5em \mathrm{Interval}\kern0.5em \mathrm{about}\kern0.5em p\right) $$
(5.25)

where L and U are the lower and upper bounds, respectively, of the confidence interval. The most common interpretation is to read the confidence interval relation above as follows: “With confidence 1 − α, the true p lies between L and U.” However, this interpretation is not generally supported by statistical rigor. The preferred interpretation is: “Prior to the repeated independent performances of the random experiment, the probability is 1 − α that the random interval (L, U) includes the unknown fixed point (parameter) p (Hogg and Craig 1978).” For our example problem, we arbitrarily choose α = 0.05 so we have a “95 percent confidence interval.”

There exists a large literature on how to compute the lower and upper bounds L and U. In most practical cases, the author prefers to use bounds that do not assume a particular distribution and which can be used for both small and large sample sizes N. To this end, a reasonable set of bounds is the following (Hogg and Craig 1978):

$$ L=\frac{N\widehat{p}+2-2\sqrt{N\widehat{p}\left(1-\widehat{p}\right)+1}}{N+4}\kern2em \left(\mathrm{Lower}\kern0.24em \mathrm{Bound}\kern0.24em \mathrm{on}\kern0.24em p\right) $$
(5.26)
$$ U=\frac{N\widehat{p}+2+2\sqrt{N\widehat{p}\left(1-\widehat{p}\right)+1}}{N+4}\kern2em \left(\mathrm{Upper}\kern0.24em \mathrm{Bound}\kern0.24em \mathrm{on}\kern0.24em p\right) $$
(5.27)

We can evaluate L and U and plot them versus the true p and the estimated \( \widehat{p}, \) for various values of the sample size N, as in Figs. 5.8 and 5.10. These plots are very instructive in showing how confidence intervals tighten as the sample size increases. Bootstrap techniques for estimating confidence intervals from the data for small sample sizes are discussed in Zoubir and Iskander (2004).
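As a numerical check, the bounds can be evaluated directly. The sketch below assumes the bounds take the form \( \left(N\widehat{p}+2\pm 2\sqrt{N\widehat{p}\left(1-\widehat{p}\right)+1}\right)/\left(N+4\right) \), in which the factor 2 approximates the normal quantile for α = 0.05. This form reproduces the interval quoted in the caption of Fig. 5.8. The function name `pcc_confidence_bounds` is hypothetical.

```python
import math

def pcc_confidence_bounds(p_hat, n):
    """Approximate 95 % bounds (L, U) on the true probability p.

    p_hat: estimated probability of correct classification.
    n:     number of independent trials (labeled signals).
    Assumed form: (n*p_hat + 2 +/- 2*sqrt(n*p_hat*(1 - p_hat) + 1)) / (n + 4).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    center = (n * p_hat + 2.0) / (n + 4.0)
    half_width = 2.0 * math.sqrt(n * p_hat * (1.0 - p_hat) + 1.0) / (n + 4.0)
    return center - half_width, center + half_width

# The interval tightens as the sample size grows.
for n in (10, 600, 1000):
    lower, upper = pcc_confidence_bounds(0.698, n)
    print(f"N = {n:4d}: L = {lower:.3f}, U = {upper:.3f}, width = {upper - lower:.3f}")
```

For N = 600 and \( \widehat{p}=.698 \), this gives approximately (0.659, 0.734), in agreement with the bounds reported for the 3 dB example in Fig. 5.8.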

Fig. 5.7

Example problem for SNR = 3 dB: the ROC curve is plotted given the sample size P = 300 samples per class (300 for \( H_0 \) and 300 for \( H_1 \)). The abscissa is probability of false alarm \( P_{FA} \), and the ordinate is probability of detection \( P_D \) computed using confusion matrices. The knee of the curve (marked by a circle) occurs at \( {P}_{CC}\left({\gamma}^{*}\right)=.698,\kern0.28em {P}_{FA}\left({\gamma}^{*}\right)=.21 \) and \( {P}_D\left({\gamma}^{*}\right)=.60 \). The corresponding operating detection threshold is \( {\gamma}^{*}=0.37 \)

Fig. 5.8

Example problem for SNR = 3 dB: the abscissa is \( \widehat{p}= \) the maximum likelihood estimate of the probability of correct classification. The ordinate is \( p= \) the true value of the probability of correct classification. The 95 % (\( 1-\alpha =.95 \)) confidence interval bounds \( \left(L,U\right)=\left(.66,.734\right) \) for the probability of correct classification are plotted, given the sample size N = 600 and \( {\hat{p}}^{*}={P}_{CC}({\gamma}^{*})=.698 \) (the green vertical line). Note that the confidence interval tightens as the sample size N increases

Fig. 5.9

Example problem for SNR = 20 dB: the ROC curve is plotted given the sample size P = 300 samples per class (300 for \( H_0 \) and 300 for \( H_1 \)). The abscissa is probability of false alarm \( P_{FA} \), and the ordinate is probability of detection \( P_D \) computed using confusion matrices. The knee of the curve (marked by a circle) occurs at \( {P}_{CC}\left({\gamma}^{*}\right)=.997,\kern0.28em {P}_{FA}\left({\gamma}^{*}\right)=.01 \) and \( {P}_D\left({\gamma}^{*}\right)=.99 \). The corresponding operating detection threshold is \( {\gamma}^{*}=0.96 \)

Fig. 5.10

Example problem for SNR = 20 dB: the abscissa is \( \widehat{p}= \) the maximum likelihood estimate of the probability of correct classification. The ordinate is \( p= \) the true value of the probability of correct classification. The 95 % (1 − α = 0.95) confidence interval bounds \( \left(L,U\right)=\left(.988,.999\right) \) for the probability of correct classification are plotted, given the sample size N = 600 and \( {\hat{p}}^{*}={P}_{CC}({\gamma}^{*})=.997 \) (the green vertical line). Note that the confidence interval tightens as the sample size N increases

5 Processing for the Example Problem

5.1 Experiment Design for the Example Problem

The training data were created so as to allow the computation of the confusion matrices necessary for computing the ROC curve and the confidence interval on \( P_{CC} \). Please refer to Fig. 5.5. For the null hypothesis \( H_0 \), we simulated an ensemble of P = 300 labeled realizations (each realization is labeled \( H_0 \)) of the simulated measurement waveform x(n) specified in Eq. 5.15 in which the desired signal a(n) = 0. Each realization of x(n) is different because each has a different realization of the stochastic process v(n). For hypothesis \( H_1 \), we created an ensemble of P = 300 labeled realizations of the simulated measurement waveform x(n) in which the desired signal a(n) is nonzero. Again, each realization of x(n) is different because each has a different realization of the stochastic process v(n). These ensembles are sufficient to compute the performance indices. For controlled testing purposes, we can similarly simulate two more labeled ensembles of realizations of the stochastic process x(n). For blind controlled testing, we can similarly simulate two additional labeled ensembles. For uncontrolled testing, we can similarly simulate two more unlabeled ensembles.
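The ensemble construction can be illustrated with a short simulation. The following is only a sketch under stated assumptions: it assumes an additive model x(n) = a(n) + v(n) with white Gaussian noise v(n), which may differ in detail from Eq. 5.15; the windowed sinusoid used for a(n) is a hypothetical stand-in for the desired signal; and the SNR is taken to be the ratio of mean signal power to noise variance. The helper name `simulate_ensemble` is likewise illustrative.

```python
import numpy as np

def simulate_ensemble(a, num_realizations, noise_std, signal_present, rng):
    """Return an array of shape (num_realizations, len(a)).

    Each row is one realization x(n) = a(n) + v(n) under H1, or
    x(n) = v(n) under H0, with an independent noise draw per row.
    """
    a = np.asarray(a, dtype=float)
    v = rng.normal(0.0, noise_std, size=(num_realizations, a.size))
    return v + a if signal_present else v

rng = np.random.default_rng(1)
n = np.arange(256)
# Hypothetical desired signal: a Hann-windowed sinusoid (assumption).
a = np.sin(2 * np.pi * 0.05 * n) * np.hanning(n.size)

snr_db = 3.0
# Assumed SNR definition: mean signal power divided by noise variance.
noise_std = np.sqrt(np.mean(a**2) / 10 ** (snr_db / 10))

P = 300  # realizations per class, as in the text
x_h0 = simulate_ensemble(a, P, noise_std, signal_present=False, rng=rng)
x_h1 = simulate_ensemble(a, P, noise_std, signal_present=True, rng=rng)

# Stack into one labeled training set: label 0 for H0, 1 for H1.
x_train = np.vstack([x_h0, x_h1])
labels = np.concatenate([np.zeros(P, dtype=int), np.ones(P, dtype=int)])
print(x_train.shape, labels.sum())
```

Additional test ensembles can be generated by calling `simulate_ensemble` again with fresh noise draws; for uncontrolled testing the labels would simply be withheld.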

5.2 Processing Results for the Example Problem

For our example, we have no prior knowledge from which to derive prior probabilities \( P(H_0) \) and \( P(H_1) \), so we assume that the two hypotheses are equally likely, and \( P\left({H}_1\right)=P\left({H}_0\right)=\frac{1}{2} \). Under this condition, the expression for \( P_{CC} \) is simplified, as shown in Eq. 5.19.

Figure 5.7 depicts a ROC curve constructed for our threshold detector example in which the SNR is 3 dB. We see that with this low SNR, the curve is far away from the desired upper left-hand corner of the diagram, and the knee of the curve (marked by a circle) occurs at \( {P}_{CC}\left({\gamma}^{*}\right)=.667,\;{P}_{FA}\left({\gamma}^{*}\right)=.36 \) and \( {P}_D\left({\gamma}^{*}\right)=.76 \). Figure 5.8 presents the confidence interval about the P CC for this example in which the SNR is 3 dB. For our problem, the number of samples (signals) in the training data for both H 0 and H 1 is N = 600 and the estimate of \( {P}_{CC}\left({\gamma}^{*}\right)=\widehat{p}=.667 \). The green vertical line depicts \( \widehat{p} \), and the curves it crosses depict the lower and upper bounds on p. We see that the confidence interval is given by \( P\left(.627<p<.704\right)=.95 \). The corresponding operating detection threshold is γ* = 0.36. For tutorial purposes, the figure also shows what the bounds would be if N = 10 and N = 1000. We see that for small N, the bounds are very wide and for large N, the bounds are much narrower, as expected.

In Fig. 5.9, the SNR is 20 dB and the ROC curve lies in the desired upper left-hand corner. Here, the knee of the curve occurs at \( {P}_{CC}\left({\gamma}^{*}\right)=.997,\kern0.28em {P}_{FA}\left({\gamma}^{*}\right)=.01 \) and \( {P}_D\left({\gamma}^{*}\right)=.99 \). The corresponding operating detection threshold is \( {\gamma}^{*}=0.91 \). Figure 5.10 shows the confidence interval for N = 600 and \( \widehat{p}=.993 \). The estimated 95 % confidence interval is given by \( P\left(.988<p<.997\right)=.95 \). We see the clear performance improvement that occurs with increased SNR.

5.3 Conclusions

In this chapter, we introduce the concept of matched filtering for detecting desired signals buried in noisy measurement signals. We show that the matched filter is another name for the correlation detector, which exploits prior knowledge in the form of an exemplar of the desired signal. We use an example detection problem to demonstrate the matched filtering approach. We see that the detection methodology comes from hypothesis testing algorithms in Bayesian detection theory. This Bayes approach gives us very powerful methods to choose the detection threshold and evaluate detection performance in the form of the Receiver Operating Characteristic (ROC) curve and the statistical confidence interval about the probability of correct classification. We show that the matched filter can be an effective detection tool when exemplars of the desired signal are available a priori.

6 Part II Auditory Matched Filtering: Biological Examples from Selected Vertebrates

According to the auditory matched filter hypothesis (Capranica and Moffat 1983), auditory information processing in sub-mammalian vertebrates (e.g., fishes, amphibians, reptiles, and to some extent birds) relies on extensive peripheral prefiltering. In other words, the auditory sensory filter (ear) is often tuned to signals of biological importance to the species so that less post-processing is required by the reduced central nervous systems in these species. In contrast, mammals can afford to “take in” all sensory input and rely on their superior brain power to sort out the meaning behind the message. The optimum receiver strategy, according to one formulation of the matched filter hypothesis, is to “design” a bias into the frequency response of the auditory system (Capranica and Moffat 1983; Wehner 1987). Rather than a high-fidelity flat frequency response (no bias), the receiver’s auditory system should have a frequency response which closely matches the envelope of the energy spectrum of the emitter’s call. This ensures that the receiver maximizes the signal-to-noise ratio in the frequency domain for that particular call. Figure 5.11 illustrates the decision criterion of such a matched filter detector. The received signal has been contaminated by unwanted noise. After passing through the filter, the receiver must make a decision if the signal is present or not. One decision criterion is to simply ask if the power (energy per unit time) output of the matched filter at any time exceeds some internal threshold (θ); if it has, the signal is present; if not, the signal is not present. But this statistical decision-making process depends critically on the setting of θ. If θ is set too low, the false alarm rate will be high; if θ is set too high, the receiver runs the risk of missing some of the signals (Capranica and Moffat 1983). 
In the following section, we have chosen several examples from the literature of sub-mammalian vertebrates which appear to exhibit a conspicuous match between the properties of their acoustic signals and the tuning of their auditory systems and thus provide evidence of biological matched filters (Wehner 1987).
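The decision criterion described above can be sketched numerically. This is an illustrative model only, not a description of any animal's auditory system: it assumes a zero-phase filter whose magnitude response is proportional to the magnitude spectrum of the call (one reasonable reading of "matching the envelope of the energy spectrum"), estimates output power with a short moving average, and uses a hypothetical windowed tone as the call. The names `short_time_output_power` and `signal_present` and the value of θ are assumptions.

```python
import numpy as np

def short_time_output_power(received, call_template, window=64):
    """Filter `received` with a response shaped like the call spectrum
    and return the moving-average power of the filter output."""
    received = np.asarray(received, dtype=float)
    n = received.size
    # Magnitude response proportional to the call's magnitude spectrum,
    # normalized to a peak gain of 1 (assumption).
    response = np.abs(np.fft.rfft(call_template, n))
    response = response / response.max()
    filtered = np.fft.irfft(np.fft.rfft(received) * response, n)
    # Short-time power: moving average of the squared output.
    kernel = np.ones(window) / window
    return np.convolve(filtered**2, kernel, mode="same")

def signal_present(received, call_template, theta, window=64):
    """Declare the signal present if the output power exceeds theta at any time."""
    power = short_time_output_power(received, call_template, window)
    return bool(np.any(power > theta))

rng = np.random.default_rng(2)
n = np.arange(1024)
# Hypothetical call: a Hann-windowed tone at 0.1 cycles per sample.
call = np.sin(2 * np.pi * 0.1 * n) * np.hanning(n.size)
noise_only = rng.normal(0.0, 0.5, size=n.size)
noisy_call = call + rng.normal(0.0, 0.5, size=n.size)

theta = 0.05  # illustrative internal threshold
print("noise only ->", signal_present(noise_only, call, theta))
print("noisy call ->", signal_present(noisy_call, call, theta))
```

Lowering `theta` toward the noise-only output power raises the false alarm rate, while raising it toward the peak output power of the call increases the chance of a miss, which is the tradeoff described in the text.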

Fig. 5.11

Schematic diagram for the decision criterion of a matched filter detector. Signal propagation through the environment results in a noisy signal at the input to the receiver’s matched filter. If the power output (P out) of the filter exceeds a certain threshold (θ), the receiver decides that the emitter’s signal is present. If P out remains below this θ, the receiver decides that no signal is present. By adjusting the internal threshold, the reliability of the receiver’s decision can be modulated (Modified from Capranica and Rose (1983))

6.1 Weakly Electric Fish

6.1.1 Correspondence of Electric Organ Discharge and Electroreceptor Tuning

The wavelike electric organ discharge (EOD) in some gymnotoid species of South American weakly electric fish is one of the most regular of all known biological phenomena (Heiligenberg 1991). The internal fluctuation in the EOD rate is ca. 0.01 %. In these fish, the electric field is generated by the electric organ located in the tail, and the field is sensed by electroreceptors in the head region. In a landmark study of gymnotoids, the EOD frequencies of three species were recorded and compared to the best frequencies (BFs) of a population of tuning curves obtained electrophysiologically from individual fibers innervating electroreceptors (Hopkins 1976). Figure 5.12 is a plot of the electroreceptor BFs versus the EOD frequencies from the same fish, for three different species. Clearly, there is a high degree of matching between the electroreceptor tuning and the EOD rate.

Fig. 5.12

Plot of the electroreceptor BFs versus the EOD frequencies from the same fish, for three different species of South American weakly electric fish (Sternopygus macrurus, Eigenmannia virescens, and Apteronotus albifrons). Clearly, there is a high degree of matching between the electroreceptor tuning and the EOD rate. The diagonal line shows perfect correspondence between EOD frequency and BF (From Hopkins (1976))

6.1.2 Effect of Androgens on the Matched Filter

Meyer and Zakon (1982) confirmed the strong matching between the EOD rate and electroreceptor tuning in another South American weakly electric fish, Sternopygus. Moreover, they demonstrated that systematic treatment of these fish with androgens, in this case 5α-dihydrotestosterone (DHT), lowered their EOD rate. Concomitantly, DHT caused decreases in electroreceptor best frequencies over a 2-week period, maintaining the close match between discharge frequency and receptor tuning. Thus, electroreceptor tuning is dynamic and it parallels natural shifts in the EOD frequency.

6.2 Anuran Amphibians

To help ensure that an appropriate behavioral response is evoked during acoustic communication, the anuran auditory system is often tuned to salient spectral and/or temporal features of the conspecific call (Frishkopf et al. 1968; Capranica and Moffat 1975; Capranica and Rose 1983). The elegant coevolution of this relatively straightforward acoustic system has made anurans an extremely valuable neuroethological model for the study of acoustic communication.

6.2.1 The Puerto Rican Coqui (Eleutherodactylus coqui)

The Puerto Rican Coqui frog, Eleutherodactylus coqui (Anura: Leptodactylidae), is abundant in Puerto Rico where it can be found at altitudes from sea level to over 1,000 m above sea level (a.s.l.). Males of this arboreal amphibian are territorial, spaced several meters from each other, and call from tree branches or vegetation from sunset to shortly after midnight throughout 11 months of the year. They produce a characteristic two-note call (“Co-Qui”) in which each note has a different significance for each sex: males use the “Co”-note for territorial defense, while females are attracted to the “Qui”-note (Narins and Capranica 1976, 1978). In this species, the advertisement calls and snout-vent length (SVL) both exhibit an altitudinal gradient such that at 30 m a.s.l., small males produce short, rapidly repeating, high-pitched calls, whereas at 1,000 m a.s.l., males are larger and the calls are longer, lower pitched, and repeated more slowly (Narins and Smith 1986). For example, the “Co”-note frequency produced by males at 30 m a.s.l. is about a third of an octave higher in frequency than that produced by males at 1,000 m a.s.l. More recently, it was found that the spectral contents of the males’ “Co”-note calls and the frequency to which the inner ear is most sensitive are tightly correlated and change concomitantly along this same altitudinal gradient (Meenderink et al. 2010). In that study, advertisement calls of males of E. coqui were recorded in situ along an altitudinal gradient ranging from 30 to 1,005 m a.s.l. Following the recordings, males were captured and transported to a nearby lab in Puerto Rico where, the following day, distortion product otoacoustic emissions (DPOAEs) were measured from the anesthetized males.
This was done by sweeping two primary tones (f 1 and f 2) from low frequencies to high frequencies and plotting the amplitude of the resulting third-order distortion product emission (2f 1-f 2) versus the lower primary frequency, f 1, resulting in a DPOAE audiogram. From this audiogram, the frequency that results in maximum DPOAE amplitude (F maxDP) can be identified and interpreted as the frequency to which the ear is most sensitive, or the frequency to which the ear is “tuned” (Meenderink et al. 2010). This frequency was then plotted against the frequency of the animal’s “Co”-note in its advertisement call. The resulting strong correlation (Fig. 5.13) is good evidence for a close match between the call note frequency (signal) and the peripheral auditory tuning (receiver characteristic) along the entire altitudinal gradient inhabited by these vocal animals. It was suggested that the animal’s body size, conditioned by the calling site temperature, determines the frequencies of the emitted calls and the best sensitivity of the inner ear (Meenderink et al. 2010).

Fig. 5.13

Scatter plot showing the relationship between the frequency at maximum DPOAE amplitude (F maxDP) and the dominant Co-note frequency in the call of the Puerto Rican Coqui frog, Eleutherodactylus coqui. Data points indicate median values ± interquartile ranges. Gray circles, 2f 1-f 2; black stars, 2f 2-f 1. The diagonal (dashed) line represents equality between Co-note frequency and F maxDP (From Meenderink et al. (2010))

6.2.2 Ultrasonic Frogs

6.2.2.1 The Concave-Eared Torrent Frog (Odorrana tormota)

Odorrana tormota (previously Amolops tormotus) is known only from two provinces in central China (Zhou and Adler 1993). This species has unusually high-pitched calls containing substantial energy in the ultrasonic frequency range (above 20 kHz), and its hearing extends from 1 kHz or below to approximately 35 kHz (Narins et al. 2004; Feng et al. 2006), dramatically exceeding previously reported upper limits of anuran frequency sensitivity (e.g., 8 kHz, Loftus-Hills and Johnstone 1970; 5 kHz, Fay 1988). Playback experiments in the animal’s natural habitat demonstrated that the ultrasonic elements are behaviorally relevant, and thus this extraordinary upward extension into the ultrasonic range of both the harmonic content of the advertisement calls and the frog’s hearing sensitivity is likely to have coevolved in response to the intense, predominantly low-frequency ambient noise from local streams (Narins et al. 2004; Feng et al. 2006; Feng and Narins 2008). Because amphibians are a distinct evolutionary lineage from microchiropterans and cetaceans [which have evolved ultrasonic hearing to minimize spectral overlap in the frequency bands used for sound communication (Sales and Pye 1974) and to increase hunting efficacy in darkness (Bradbury and Vehrencamp 2011)], ultrasonic perception in amphibians represents a new example of independent evolution. Moreover, this example illustrates how a matched filter, when subject to selection pressure in the form of ambient noise, can respond appropriately to maintain the signal-to-noise ratio necessary for communication of biologically significant signals.

6.2.2.2 The Hole-in-the-Head Frog (Huia cavitympanum)

In addition to Odorrana tormota, only one other anuran species, Huia cavitympanum, is currently known to have recessed tympanic membranes (Inger 1966). Odorrana tormota and H. cavitympanum are both southeast Asian species in the family Ranidae, yet they do not overlap in geographical distribution and are unrelated at the generic level (Cai et al. 2007; Stuart 2008). The habitats in which the frogs are found, however, are remarkably similar; males of both species call in close proximity to rushing streams that produce substantial broadband background noise. Given the similarity of the species’ acoustic environment and peripheral auditory morphology, Arch et al. (2008) predicted that they may have converged on the use of ultrasound for intraspecific communication. Recordings of the calls of H. cavitympanum in their natural habitat in Borneo obtained with ultrasonic detection and recording equipment demonstrated that males of this species are able to produce calls that are composed entirely of ultrasound (Arch et al. 2008). To test the hypothesis that these frogs use purely ultrasonic vocalizations for intraspecific communication, a series of acoustic playback experiments with male frogs in their natural calling sites was performed (Arch et al. 2009). These workers found that the frogs responded with increased calling to broadcasts of conspecific calls containing only ultrasound. The field study was complemented by electrophysiological recordings from the auditory midbrain and by laser Doppler vibrometer measurements of the tympanic membrane’s response to acoustic stimulation. These measurements revealed that the frog’s auditory system is broadly tuned over high frequencies, with peak sensitivity occurring within the ultrasonic frequency range (>20 kHz). Thus, H. cavitympanum is the first non-mammalian vertebrate reported to communicate with purely ultrasonic acoustic signals.

6.2.2.3 Structural Basis of the Matched Filter

In a detailed study of the inner ears of the three species of frog known to detect ultrasounds (Odorrana tormota, Odorrana graminea, and Huia cavitympanum), Arch et al. (2012) attempted to identify morphological correlates of high-frequency sound detection. These workers found that the three ultrasound-detecting species have converged on a series of small-scale functional modifications of the basilar papilla (BP), the high-frequency hearing organ in the frog inner ear. These modifications include (1) reduced BP chamber volume, (2) reduced tectorial membrane mass, (3) reduced hair bundle length, and (4) reduced hair cell soma length. While none of these factors on its own could completely account for the ultrasonic sensitivity of the inner ears of these species, the combination of these factors appears to extend their hearing bandwidth and thus facilitate high frequency/ultrasound detection. Similar morphological modifications are seen in the inner ears of O. chloronota—a poorly known species from the mountains of Laos. In fact, the striking similarity of the BP features of O. chloronota to those of the three amphibian species known to detect ultrasound suggests that this species is a potential candidate for high-frequency hearing sensitivity. These data form the foundation for future functional work probing the physiological bases of ultrasound detection by a non-mammalian ear (Fig. 5.14).

Fig. 5.14

Comparison of morphometric data from the basilar papillae of six amphibians. (a) Recess entrance area (REA); (b) Epithelium surface area (ESA); (c) Hair cell count (HCC); (d) Hair cell soma length; (e) Hair cell bundle length. Numbers indicate sample sizes. Letters denote statistically significant differences in pairwise comparisons using Tukey’s post hoc analysis with α = 0.05. If a pair of species shares a common letter, they are not significantly different in that trait. Horizontal bars indicate the three amphibian species known to detect ultrasound (US). Vertical arrow indicates putative ultrasound detector—O. chloronota. Inset (upper right) shows the BP from Rana pipiens, a North American species known not to detect ultrasound (Non-US) (Modified from Arch et al. (2012))

6.2.3 Eupsophus roseus: A Leptodactylid Frog from the South American Temperate Forest

In a recent test of the matched filter hypothesis, Moreno-Gomez et al. (2013) sought to test the concordance between the acoustic sensitivity of female frogs of E. roseus and (a) the spectral characteristics of the advertisement calls of conspecific males and (b) the spectral characteristics of the ambient noise in which these frogs breed. Audiograms measured from the torus semicircularis in the midbrain of anesthetized females exhibited two sensitivity peaks: one in the low-frequency range (LFR <700 Hz) and the second in the high-frequency range (HFR >700 Hz). Advertisement calls of conspecific males were characterized by three dominant harmonics of which the second and third fell within the bandwidth of the lowest thresholds in the female’s HFR. In fact, the mean cross-correlation coefficient between the audiograms and the conspecific vocalization spectra was 0.4 (95 % CI: 0.3–0.5). This coefficient was significantly higher than that between the audiograms and the background noise spectra over the 4 months for which data were available (Moreno-Gomez et al. 2013). Both this measured concordance and the mismatch between the auditory sensitivity of E. roseus females with both the local abiotic and biotic background noise are interpreted as adaptations to increase the signal-to-noise ratio in this communication system.

6.3 Birds

6.3.1 Unmatched Filters Between Predators and Prey

By exploiting call frequencies heard well by conspecifics but poorly by a predator species, animals may use a species-specific “private channel” to their advantage. Using the method of constant stimuli in an operant positive reinforcement conditioning procedure, behavioral audiograms of the great tit (Parus major) and its principal avian predator, the European sparrowhawk (Accipiter nisus), were determined (Klump et al. 1986). The hawk was 6.5 dB more sensitive than the tit at 2 kHz, the best frequency of both species. Although the high-frequency cutoff was very similar in both species, at 8 kHz, the great tit was about 30 dB more sensitive than the sparrowhawk. Figure 5.15 shows the differences in the unmasked thresholds between the sparrowhawk and the great tit at various frequencies. When confronted by a European sparrowhawk, the great tit uses three different vocalizations: (a) the mobbing call (dominant frequency: 4.5 kHz), (b) the scolding call (dominant frequency: 4 kHz), and (c) the “seeet” call (dominant frequency: 8 kHz). The latter call is mainly used by the great tit when it detects a sparrowhawk flying at some distance (Klump et al. 1986) and is the aerial predator call described by Marler (1955). At the dominant frequency of the “seeet” call, the hearing of the great tit is 30 dB more sensitive than that of the European sparrowhawk. This example illustrates that prey species may warn other potential prey of an approaching predator by exploiting the mismatch between predator and prey auditory sensitivities.

Fig. 5.15

The difference in the absolute thresholds between the great tit and the European sparrowhawk (thresholds in both species calculated using a d′ = 1.5) (From Klump et al. (1986))

6.4 Conclusions

In this chapter, the concept of matched filtering for detecting desired signals buried in noisy measurement signals is presented. It is shown that the matched filter is another name for the correlation detector or replica-correlation detector, which exploits prior knowledge in the form of an exemplar (or replica) of the desired signal. An example detection problem is used to demonstrate the matched filtering approach. The matched filter can be an effective detection tool when exemplars of the desired signal are available a priori. In the second part, several key examples of matched filters in the auditory systems of several selected sub-mammalian vertebrates are provided. These auditory systems implement matched filters by sculpting the receiver characteristics to the spectral and temporal features of the species-specific signals of biological importance. With the examples provided, it is hoped that the reader will more fully appreciate the adaptive value of the matched filter concept for reducing the effective noise and thus maximizing the signal-to-noise ratio.