
1 Introduction

The goal of the winner-take-all (WTA) process is to identify the largest number from a set of n numbers [1]. The WTA process has many applications, including sorting and statistical filtering [2, 3]. An extension of the WTA process is the k-winner-take-all (kWTA) process [4, 5], which aims to identify the k largest numbers from the set. Based on the dual neural network (DNN) concept, Hu and Wang [5] proposed a low-complexity kWTA model, namely DNN-kWTA. This model contains n input-output (IO) neurons, one recurrent neuron, and only \(2n+1\) connections.

For ideal realization, the activation function of the IO neurons should behave like a step function and the inputs should be noise-free. However, in circuit realization, the activation function often behaves like a logistic function [6, 7]. In addition, the operation of the IO neurons may be affected by random drifts and thermal noise [8,9,10].

These two imperfections can affect the functional correctness. In [11, 12], Sum et al. and Feng et al. presented analytical results for a noisy DNN-kWTA network, including its convergence and performance degradation. However, those results are based on the assumption that the noise is additive. In some situations, the noise level is proportional to the signal level. For instance, when a signal is amplified, its noise is amplified too. Hence, it is more suitable to use a multiplicative noise model to describe the behavior of the input noise [13, 14].

This paper analyzes the imperfect DNN-kWTA model with a non-ideal activation function and multiplicative input noise. We first assume that the multiplicative input noise is zero-mean Gaussian distributed and carry out the analysis; afterwards, we generalize the results to non-Gaussian input noise. We derive an equivalent model to describe the behaviour of the imperfect DNN-kWTA model. From the equivalent model, we derive sufficient conditions for checking whether the imperfect model generates the desired results. These conditions allow us to study the probability of the model generating correct outputs without simulating the neural dynamics. For uniformly distributed inputs, we further derive a lower bound formula to estimate the probability that the imperfect model generates the correct outputs.

This paper is organized as follows. Section 2 presents the background of the DNN-kWTA model. Section 3 studies the properties and performance of the DNN-kWTA models under the two imperfections. Section 4 extends the result to the non-Gaussian input noise case. Experimental results are shown in Sect. 5. Section 6 summarizes our results.

Fig. 1. Structure of a DNN-kWTA network.

2 Basic DNN-kWTA

Figure 1 illustrates the DNN-kWTA structure, which consists of a recurrent neuron and n input-output (IO) neurons. The state of the recurrent neuron is denoted as y(t). Each IO neuron has an external input, denoted as \(u_i\) for \(i=1,\ldots ,n\), and an output, denoted as \(x_i\). All inputs \(u_i\) are distinct and range from 0 to 1. In the DNN-kWTA model, the recurrent state y(t) is governed by

$$\begin{aligned} \epsilon \frac{dy(t)}{dt} = \sum _{i=1}^{n}x_i(t)-k, \quad \text{ where } x_i(t)=h(u_i-y(t)) \text{ and } h(\varphi ) = \begin{cases} 1 & \text{ if } \varphi \ge 0,\\ 0 & \text{ otherwise, } \end{cases} \end{aligned}$$
(1)

where \(\epsilon \) is the characteristic time constant, which depends on the recurrent neuron’s capacitance and resistance. In (1), \(h(\cdot )\) denotes the activation function of IO neurons. In the original DNN-kWTA model, \(h(\cdot )\) is an ideal step function. A nice property of the DNN-kWTA model is that its state converges to an equilibrium state in finite time. At the equilibrium state, only the IO neurons with the k largest inputs produce outputs of 1. All other neurons produce outputs of 0.
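To make the dynamics in (1) concrete, the following minimal sketch integrates the ideal model with a plain Euler scheme. It is only an illustration: the function name ideal_dnn_kwta, the step size, the iteration count, and the example inputs are our own choices and are not taken from the paper.

```python
import numpy as np

# Minimal sketch (illustration only) of the ideal DNN-kWTA dynamics in (1),
# integrated with a plain Euler scheme; names and step sizes are our choices.
def ideal_dnn_kwta(u, k, eps=1e-2, dt=1e-4, steps=20000, y0=0.0):
    u = np.asarray(u, dtype=float)
    y = y0
    for _ in range(steps):
        x = (u - y >= 0).astype(float)      # step activation h(u_i - y(t))
        y += (dt / eps) * (x.sum() - k)     # eps * dy/dt = sum_i x_i(t) - k
    return y, (u - y >= 0).astype(float)

y_star, x = ideal_dnn_kwta([0.54, 0.61, 0.52, 0.55, 0.51], k=3)
print(y_star, x)   # the IO neurons with the three largest inputs output 1
```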

3 Logistic DNN-kWTA with Input Noise

3.1 DNN-kWTA Under Imperfections

In realization, the activation function of IO neurons often resembles a logistic function [6], and noise is unavoidable in analog circuits. This paper considers the coexistence of these two imperfections in the DNN-kWTA model. The first imperfection is that the activation function is a logistic function, given by

$$\begin{aligned} h_\alpha (\varphi )=\frac{1}{1+e^{-\alpha \varphi }}, \end{aligned}$$
(2)

where \(\alpha \) is the gain factor. Also, there is multiplicative noise at the inputs of the IO neurons. That is, the noisy inputs are given by

$$\begin{aligned} u_i+\varepsilon _i\left( t\right) u_i, \end{aligned}$$
(3)

where \(\varepsilon _i\left( t\right) u_i\) is the input noise for the i-th input. In this model, the noise level depends on the normalized noise \(\varepsilon _i\left( t\right) \), as well as on the input \(u_i\). In this paper, we assume that the \(\varepsilon _i\left( t\right) \)’s are Gaussian distributed with zero mean and variance \(\sigma ^2\).

With the two imperfections, the behaviour of the DNN-kWTA can be described as

$$\begin{aligned} \frac{dy(t)}{dt} = \sum _{i=1}^{n}{\widetilde{x}}_i(t)-k, \end{aligned}$$
(4)
$$\begin{aligned} \widetilde{x}_i(t) = h_\alpha (u_i+u_i \varepsilon _i\left( t\right) -y\left( t\right) ) , \end{aligned}$$
(5)
$$\begin{aligned} h_\alpha \left( \varphi '_i\right) = \frac{1}{1+e^{-\alpha \varphi '_i}}, \text { where } \varphi '_i=u_i+u_i\varepsilon _i\left( t\right) -y\left( t\right) . \end{aligned}$$
(6)

In the presence of input noise, the outputs \(\widetilde{x}_i(t)\) fluctuate over time. Therefore, for the DNN-kWTA model with input noise, multiple measurements of the IO neuron outputs should be taken, and their averages used as the neurons’ outputs.
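The sketch below simulates the imperfect dynamics (4)–(6) with output averaging, using the example inputs that appear in Fig. 2. The Euler step, the averaging window, the random seed, and the function name noisy_dnn_kwta are implementation choices of ours and are not specified in the paper.

```python
import numpy as np

# Minimal sketch of the imperfect dynamics (4)-(6): logistic activation with
# gain alpha and multiplicative Gaussian input noise. The Euler step and the
# output-averaging window are our own choices.
rng = np.random.default_rng(0)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def noisy_dnn_kwta(u, k, alpha, sigma, dt=1e-3, steps=50000, y0=0.5, avg_last=10000):
    u = np.asarray(u, dtype=float)
    y = y0
    x_sum = np.zeros_like(u)
    for s in range(steps):
        eps = rng.normal(0.0, sigma, size=u.shape)    # normalized noise eps_i(t)
        x = logistic(alpha * (u + eps * u - y))        # noisy outputs x~_i(t)
        y += dt * (x.sum() - k)
        if s >= steps - avg_last:                      # time-average the outputs
            x_sum += x
    return y, x_sum / avg_last

y_eq, x_bar = noisy_dnn_kwta([0.54, 0.61, 0.52, 0.55, 0.51], k=3, alpha=100, sigma=0.02)
print(y_eq, x_bar)   # expect y near 0.53, with x_bar_1, x_bar_2, x_bar_4 above 0.5
```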

Figure 2 illustrates the effect of the non-ideal activation function and the multiplicative Gaussian noise. In the first case, shown in Fig. 2(a), the gain factor is large enough and the noise variance is small; the recurrent state converges to around 0.5299 and the outputs of the network are correct. When the noise level is increased to \(\sigma =0.2\), the recurrent state converges to 0.5145 and the outputs are incorrect, as shown in Fig. 2(b). When the gain parameter \(\alpha \) is instead reduced to 15, the recurrent state converges to 0.5164 and the outputs are again incorrect, as shown in Fig. 2(c). Clearly, the gain parameter \(\alpha \) and the noise level \(\sigma ^2\) affect the operational correctness.

Fig. 2. The dynamics of the recurrent state in a DNN-kWTA with \(n=5\) and \(k=3\). The inputs are \(\{u_1,\cdots ,u_5\}=\{0.54,0.61,0.52,0.55,0.51\}\). (a) Gain \(\alpha =100\) and noise level \(\sigma =0.02\). At the equilibrium, the recurrent state converges to 0.5299 and the outputs are \(\{x_1,\cdots ,x_5\} = \{0.744,0.999,0.273,0.867,0.137\}\). Only \(x_1\), \(x_2\) and \(x_4\) are greater than 0.5, so the outputs are correct. (b) Gain \(\alpha =100\) and noise level \(\sigma =0.2\). At the equilibrium, the recurrent state is 0.5145 and the outputs are \(\{x_1,\cdots ,x_5\} = \{0.592,0.818,0.528,0.587,0.491\}\), which are incorrect. (c) Gain \(\alpha =15\) and noise level \(\sigma =0.02\). At the equilibrium, the recurrent state is 0.5164 and the outputs are \(\{x_1,\cdots ,x_5\} = \{0.591,0.806,0.503,0.632,0.485\}\), which are incorrect.

3.2 Equivalent Model

This subsection derives an equivalent model to describe the dynamic behaviour of the DNN-kWTA model under the two imperfections. We use Haley’s approximation for the Gaussian distribution [15].

Lemma 1

Haley’s approximation: a logistic function \(\frac{1}{1+e^{-\rho z}}\) can be modelled by the distribution function of a standard Gaussian random variable, given by

$$\begin{aligned} \frac{1}{1+e^{-\rho z}}\approx \int _{-\infty }^{z}{\frac{1}{\sqrt{2\pi }}e^{-\frac{v^2}{2}}}dv , \end{aligned}$$
(7)

where \(\rho =1.702\).
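As a quick numerical sanity check of Lemma 1 (not part of the original paper), the sketch below compares the logistic function with \(\rho =1.702\) against the standard normal distribution function; the grid of test points is arbitrary.

```python
import math

# Numerical check of Lemma 1 (Haley's approximation): the logistic function
# with rho = 1.702 stays close to the standard normal CDF.
rho = 1.702

def logistic(z):
    return 1.0 / (1.0 + math.exp(-rho * z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

max_gap = max(abs(logistic(z) - normal_cdf(z)) for z in [i / 100.0 for i in range(-500, 501)])
print(max_gap)   # about 0.01, i.e. the two curves agree to roughly one percent
```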

From Lemma 1, the equivalent dynamics can be described by Theorem 1.

Theorem 1

For the DNN-kWTA model with the two aforementioned imperfections, its dynamic behaviour can be described by the following equations:

$$\begin{aligned} \frac{dy(t)}{dt} = \sum _{i=1}^{n}\bar{x}_i (t)-k, \end{aligned}$$
(8)
$$\begin{aligned} \bar{x}_i(t) = h_{\tilde{\alpha }_i}(u_i-y(t)), \end{aligned}$$
(9)
$$\begin{aligned} h_{\tilde{\alpha }_i}(\varphi _i) = \frac{1}{1+e^{-\tilde{\alpha }_i \varphi _i}}, \text { where } \tilde{\alpha }_i = \frac{1}{\sqrt{\frac{1}{\alpha ^2}+\frac{\sigma ^2u_i^2}{\rho ^2}}} \, \, \, \text { and } \varphi _i=u_i-y\left( t\right) . \end{aligned}$$
(10)

Proof:

From (4), the update of the recurrent state can be written as

$$\begin{aligned} y(t+\delta )=y(t)+ \delta \frac{dy(t)}{dt}=y(t)+\int _{t}^{t+\delta } \frac{dy(\tau )}{d\tau } \,d\tau =y(t)+\left( \sum _{i=1}^{n}\int _{t}^{t+\delta }{\tilde{x}}_i (\tau )\,d\tau -k\delta \right) \end{aligned}$$
(11)

where \(\delta \) is a small positive real number. The term \(\int _{t}^{t+\delta } \tilde{x}_i (\tau ) d\tau \) can be expressed as \(\int _{t}^{t+\delta } \tilde{x}_i (\tau ) d\tau = \lim \limits _{M\rightarrow \infty } \zeta M \sum _{j=1}^{M} \frac{{\tilde{x}}_i (t+j\zeta )}{M}\), where \(\zeta \times M=\delta \). We can further rewrite \(\int _{t}^{t+\delta } \tilde{x}_i (\tau ) d\tau \) as

$$\begin{aligned} \int _{t}^{t+\delta }{\tilde{x}}_i\left( \tau \right) d\tau =\delta \times \left( \text {mean of } {\tilde{x}}_i\left( t\right) \right) =\delta \times E\left[ {\tilde{x}}_i\left( t\right) \right] = \delta \times \bar{x}_i (t) =\delta \times E\left[ h_\alpha \left( \varphi '_i\right) \right] , \end{aligned}$$

where \(\varphi '_i=u_i+\varepsilon _i\left( t\right) u_i -y\left( t\right) \) is the input of the i-th IO neuron. It contains the noise component \(\varepsilon _i\left( t\right) u_i\). As \(\varepsilon _i\left( t\right) \) is zero mean Gaussian distributed, we have

$$\begin{aligned} E\left[ h_\alpha (\varphi '_i)\right] =\int _{-\infty }^{\infty }{\frac{1}{1+e^{-\alpha (\varphi _i+\varepsilon _i u_i)}} \times \frac{1}{\sqrt{2\pi \sigma ^2}}e^{-\frac{\varepsilon _i^2}{2\sigma ^2}}d\varepsilon _i} , \end{aligned}$$

where \(\varphi _i=u_i-y\left( t\right) \). Furthermore, from Lemma 1,

$$\begin{aligned} E\left[ h_\alpha \left( \varphi '_i\right) \right] &= \int _{-\infty }^{\infty }\int _{-\infty }^{\varphi _i}\frac{1}{\sqrt{2\pi \eta }}e^{-\frac{\left( v+u_i\varepsilon _i\right) ^2}{2\eta }}\times \frac{1}{\sqrt{2\pi \sigma ^2}}e^{-\frac{\varepsilon _i^2}{2\sigma ^2}}\,dv\,d\varepsilon _i \nonumber \\ &= \frac{1}{\sqrt{2\pi \eta }}\frac{1}{\sqrt{2\pi \sigma ^2}} \int _{-\infty }^{\infty } \int _{-\infty }^{\varphi _i}\exp \left( -\frac{v^2}{2\left( \eta +\sigma ^2 u^2_i\right) }\right) \exp \left( -\frac{\left( \varepsilon _i+\frac{v u_i \sigma ^2}{\sigma ^2 u_i^2+\eta }\right) ^2}{2\eta \sigma ^2/\left( \sigma ^2u_i^2+\eta \right) }\right) dv\,d\varepsilon _i , \end{aligned}$$
(12)

where \(\eta =\rho ^2/\alpha ^2\). Integrating with respect to \(\varepsilon _i\) and applying Lemma 1 again, we obtain

$$\begin{aligned} E[h_\alpha (\varphi '_i)]=\frac{1}{1+e^{-\tilde{\alpha }_i \varphi _i}} \triangleq \bar{x}_i(t) \triangleq h_{\tilde{\alpha }_i}(\varphi _i), \end{aligned}$$
(13)

where \(\varphi _i=u_i-y(t)\), \(\eta =\rho ^2/\alpha ^2\) and \(\tilde{\alpha }_i=\frac{1}{\sqrt{\frac{1}{\alpha ^2}+\frac{\sigma ^2u_i^2}{\rho ^2}}}\). Equation (11) can be rewritten as

$$\begin{aligned} y(t+\delta ) = y(t)+\delta \left( \sum _{i=1}^{n} \bar{x}_i(t) -k \right) =y(t)+ \delta \frac{dy(t)}{dt}. \end{aligned}$$
(14)

With (13) and (14), (4)–(6) can be written as

$$\begin{aligned} \frac{dy(t)}{dt} = \sum _{i=1}^{n} \bar{x}_i (t)-k, \end{aligned}$$
(15)
$$\begin{aligned} \bar{x}_i(t) = h_{\tilde{\alpha }_i}(u_i-y(t)), \end{aligned}$$
(16)
$$\begin{aligned} h_{\tilde{\alpha }_i}(\varphi _i) = \frac{1}{1+e^{-\tilde{\alpha }_i \varphi _i}}, \end{aligned}$$
(17)

where \(\tilde{\alpha }_i = \frac{1}{\sqrt{\frac{1}{\alpha ^2}+\frac{\sigma ^2u_i^2}{\rho ^2}}}\) and \(\varphi _i=u_i-y(t)\). The proof is completed. \(\blacksquare \)
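The closed form in (13) can also be checked numerically. The sketch below compares a Monte Carlo estimate of \(E[h_\alpha (\varphi '_i)]\) with the reduced-gain logistic \(h_{\tilde{\alpha }_i}(\varphi _i)\); the particular values of \(\alpha \), \(\sigma \), \(u_i\), and y are arbitrary illustrations of ours.

```python
import numpy as np

# Numerical sanity check of (13): the noise-averaged output E[h_alpha(phi'_i)]
# matches the logistic function with the reduced gain alpha~_i.
RHO = 1.702
rng = np.random.default_rng(3)

alpha, sigma, u_i, y = 100.0, 0.2, 0.55, 0.53
eps = rng.normal(0.0, sigma, size=1_000_000)

monte_carlo = np.mean(1.0 / (1.0 + np.exp(-alpha * (u_i + u_i * eps - y))))
alpha_tilde = 1.0 / np.sqrt(1.0 / alpha**2 + (sigma**2 * u_i**2) / RHO**2)
closed_form = 1.0 / (1.0 + np.exp(-alpha_tilde * (u_i - y)))
print(monte_carlo, closed_form)   # the two values are close; the small gap is the Haley approximation error
```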

It is important to note that we are not proposing a new model. Introducing the equivalent model, as stated in equations (15)–(17), helps us to analyze the properties of the imperfect model.

From Theorem 1, a convergence result of the DNN-kWTA network with the multiplicative Gaussian noise and non-ideal activation function is obtained. The result is presented in Theorem 2.

Theorem 2

For the imperfect DNN-kWTA network, the recurrent state y(t) converges to a unique equilibrium point.

Proof:

Recall that \(\frac{dy}{dt}= \sum _{i=1}^{n}\frac{1}{1+e^{-\tilde{\alpha }_i\left( u_i-y\right) }}-k\). For very large y, \(\frac{dy}{dt}=-k<0\). For very small y, \(\frac{dy}{dt}=n-k>0\). Furthermore, it is worth noting that \(\frac{dy}{dt}\) is a strictly monotonically decreasing function of y. Therefore, there exists a unique equilibrium point \(y^{*}\) such that \(\frac{dy}{dt}|_{y=y^{*}}=0\). Additionally, we can obtain the following properties of the model:

$$\begin{aligned} \text{ if } y(t)>y^{*}\text{, } \text{ then } \frac{dy}{dt}<0\text{, } \text{ and } \text{ if } y(t)<y^{*}\text{, } \text{ then } \frac{dy}{dt}>0\text{. } \end{aligned}$$

Suppose that at time \(t_o\), the recurrent state \(y(t_o)\) is greater than \(y^{*}\). In this case, we have \(\frac{dy}{dt} < 0\), and y(t) decreases with time until it reaches \(y^{*}\). On the other hand, if at time \(t_o\) the recurrent state \(y(t_o)\) is less than \(y^{*}\), then y(t) increases with time until it reaches \(y^{*}\). This completes the proof. \(\blacksquare \)
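Because \(\frac{dy}{dt}\) of the equivalent model is strictly decreasing in y, the unique equilibrium \(y^{*}\) can also be located by bisection rather than by integrating the dynamics. The sketch below does this; the bracketing interval, iteration count, and function names are our own choices.

```python
import numpy as np

# Sketch: since dy/dt in the equivalent model (15)-(17) is strictly decreasing
# in y, the unique equilibrium y* can be located by bisection instead of
# simulating the dynamics. rho = 1.702 follows Lemma 1.
RHO = 1.702

def effective_gain(u, alpha, sigma):
    return 1.0 / np.sqrt(1.0 / alpha**2 + (sigma**2 * u**2) / RHO**2)   # alpha~_i

def drift(y, u, k, alpha, sigma):
    a = effective_gain(u, alpha, sigma)
    return np.sum(1.0 / (1.0 + np.exp(-a * (u - y)))) - k               # dy/dt

def equilibrium(u, k, alpha, sigma, lo=-1.0, hi=2.0, iters=60):
    u = np.asarray(u, dtype=float)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if drift(mid, u, k, alpha, sigma) > 0.0:
            lo = mid          # positive drift means y* lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = np.array([0.54, 0.61, 0.52, 0.55, 0.51])
print(equilibrium(u, k=3, alpha=100, sigma=0.02))   # close to the 0.5299 reported for Fig. 2(a)
```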

Since the output \(\bar{x}_i(t)\) of the imperfect model is not strictly binary, we need to introduce new definitions for “winner” neurons and “loser” neurons.

Definition 1

At the equilibrium, if \(\bar{x}_i \ge 0.5\), then we call the i-th IO neuron a winner. Otherwise, we call it a loser.

There are some relationships among the equilibrium point \(y^{*}\), the winners, and the losers. The results are summarized in Theorem 3.

Theorem 3

Consider that the inputs are \(\left\{ u_1,\cdots ,u_n\right\} \). Denote the inputs sorted in ascending order as \(\left\{ u_{\pi _1},\cdots ,u_{\pi _n}\right\} \), where \(\{\pi _1,\pi _2,\cdots ,\pi _n\}\) is the corresponding index list. If \(u_{\pi _{n-k}}<y^{*}\le u_{\pi _{n-k+1}}\), then the imperfect model generates correct outputs.

Proof:

From (16) and (17), we know that for a given y, if \(u_i<u_{i'}\), then \(h_{\tilde{\alpha }_i}(u_i-y) < h_{\tilde{\alpha }_{i'}}(u_{i'}-y)\). Also, \(h_{\tilde{\alpha }_i}(0)=0.5\) and \(h_{\tilde{\alpha }_i}(u_i-y)\) is an increasing function of \(u_i\). Thus, if \(u_{\pi _{n-k}}<y^{*}\), then \(\bar{x}_{\pi _1}<\cdots<\bar{x}_{\pi _{n-k}}<0.5\). Hence IO neurons \(\pi _1\) to \(\pi _{n-k}\) are losers. Similarly, if \(y^{*} \le u_{\pi _{n-k+1}}\), then \(0.5 \le \bar{x}_{\pi _{n-k+1}}<\cdots <\bar{x}_{\pi _{n}}\). Hence IO neurons \(\pi _{n-k+1}\) to \(\pi _{n}\) are winners. The proof is completed. \(\blacksquare \)

One direct way to use Theorem 3 to study the probability of the imperfect model operating correctly is to simulate the neural dynamics for many sets of inputs, obtain the equilibrium point \(y^{*}\) for each set, and then check whether the imperfect model produces correct outputs. However, simulating the dynamics is quite time-consuming. Therefore, it is of interest to find a more efficient way to estimate the probability of correct operation. The following theorem, based on the equivalent model, provides a convenient way to estimate this probability without simulating the neural dynamics.

Theorem 4

Denote the inputs sorted in ascending order as \(\left\{ u_{\pi _1},\cdots ,u_{\pi _n}\right\} \), where \(\{\pi _1,\pi _2,\cdots ,\pi _n\}\) is the corresponding index list. For the imperfect model,

$$\begin{aligned} \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y)\big |_{y=u_{\pi _{n-k}}}>k \, \, \text { and } \, \, \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y) \big |_{y=u_{\pi _{n-k+1}}} \le k, \end{aligned}$$
(18)

if and only if

$$\begin{aligned} u_{\pi _{n-k}}<y^{*}\le u_{\pi _{n-k+1}}. \end{aligned}$$

In addition, if

$$\begin{aligned} \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y)\big |_{y=u_{\pi _{n-k}}}>k \, \, \text { and } \, \, \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y) \big |_{y=u_{\pi _{n-k+1}}} \le k, \end{aligned}$$

then the model generates the desired outputs.

Proof:

Denote \(H(y)=\sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y)-k\). As \(y\rightarrow \infty \), \(H(y)\rightarrow -k\). Also, as \(y\rightarrow -\infty \), \(H(y)\rightarrow n-k\). Since H(y) is a strictly monotonically decreasing function of y, we have \(H(y)>0 \, \, \forall y <y^*\), and \(H(y) \le 0 \, \, \forall y \ge y^*\). Hence \(H(u_{\pi _{n-k}})>0\) if and only if \(u_{\pi _{n-k}}<y^{*}\). In addition, \(H(u_{\pi _{n-k+1}})\le 0\) if and only if \(u_{\pi _{n-k+1}} \ge y^{*}\). Furthermore,

$$\begin{aligned} &\text {if } \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y)\big |_{y=u_{\pi _{n-k}}}>k \, \, \text { and } \, \, \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y) \big |_{y=u_{\pi _{n-k+1}}} \le k, \nonumber \\ &\text {then } u_{\pi _{n-k}}<y^{*}\le u_{\pi _{n-k+1}}. \end{aligned}$$
(19)

According to the result of Theorem 3, the condition “\(u_{\pi _{n-k}}<y^{*}\le u_{\pi _{n-k+1}}\)” implies that the model correctly identifies the winner and loser neurons for the given n numbers. \(\blacksquare \)

Theorem 4 provides us with an efficient way to estimate the probability that the imperfect model correctly identifies the winner and loser neurons, without the need to simulate the neural dynamics. To do so, we consider many sets of inputs and sort the data in each set. Then, for each set, we compute the two quantities

$$ \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y)\big |_{y=u_{\pi _{n-k}}}-k \, \, \text { and } \, \, \sum _{i=1}^{n} h_{\tilde{\alpha }_i}(u_i-y) \big |_{y=u_{\pi _{n-k+1}}} -k $$

and check their signs to determine whether the model generates the correct outputs for that input set; a minimal sketch of this procedure is given below. It should be noted that this procedure is applicable to data with any distribution. When the inputs are iid uniform random variables between zero and one, Theorem 5 further provides a lower bound on the probability value.
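As an illustration of the above procedure, the sketch below draws many Beta(2, 2) input sets (as in Sect. 5.1), evaluates the two sums of (18) at \(y=u_{\pi _{n-k}}\) and \(y=u_{\pi _{n-k+1}}\), and reports the fraction of sets for which both conditions hold. The trial count, random seed, and function names are illustrative choices of ours.

```python
import numpy as np

# Sketch of the Theorem 4 check: for each input set, evaluate the two sums in
# (18) at y = u_{pi_{n-k}} and y = u_{pi_{n-k+1}} and count how often both
# conditions hold.
RHO = 1.702
rng = np.random.default_rng(1)

def h_tilde(u, y, alpha, sigma):
    a_tilde = 1.0 / np.sqrt(1.0 / alpha**2 + (sigma**2 * u**2) / RHO**2)
    return 1.0 / (1.0 + np.exp(-a_tilde * (u - y)))

def prob_correct(n, k, alpha, sigma, trials=10000):
    correct = 0
    for _ in range(trials):
        u = np.sort(rng.beta(2.0, 2.0, size=n))      # sorted inputs u_{pi_1}..u_{pi_n}
        low, high = u[n - k - 1], u[n - k]           # u_{pi_{n-k}}, u_{pi_{n-k+1}}
        if h_tilde(u, low, alpha, sigma).sum() > k and h_tilde(u, high, alpha, sigma).sum() <= k:
            correct += 1
    return correct / trials

print(prob_correct(n=6, k=2, alpha=500, sigma=0.063))   # compare with Fig. 3
```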

Theorem 5

If the inputs are iid uniform random variables over the range from zero to one, the probability \(\text {Prob(correct)}\) that the imperfect model correctly identifies the winner and loser neurons satisfies

$$\begin{aligned} \text {Prob(correct)} \ge 1-2\left( 1-\left( 1-\frac{2}{\tilde{\alpha }}\right) ^{n-1}\left( 1+\left( n-1\right) \frac{2}{\tilde{\alpha }}\right) \right) , \end{aligned}$$

where \(\tilde{\alpha }=1/\sqrt{\frac{1}{\alpha ^2}+\frac{\sigma ^2}{\rho ^2}}\).

Proof

Since the complete proof is lengthy, we only outline its flow here. Because the effect of the zero-mean multiplicative Gaussian noise is equivalent to decreasing the gain factor of the logistic function, we can follow the flow of the proof of Theorem 4 in [7] to obtain our result. \(\blacksquare \)

Probability theory tells us that any non-uniformly distributed data can be mapped into a uniform distribution through histogram equalization. This mapping does not affect the ordering of the original non-uniform inputs. Therefore, we can apply Theorem 5 to handle non-uniformly distributed data.
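For the uniform-input case, the Theorem 5 bound can be evaluated directly. The sketch below computes the bound; note that \(\tilde{\alpha }\) here uses \(\sigma ^2/\rho ^2\) as in the theorem statement, and the example parameters are taken from the setting discussed later in Sect. 5.2.

```python
import numpy as np

# Sketch: evaluating the Theorem 5 lower bound on Prob(correct) for iid
# uniform inputs, with alpha~ = 1/sqrt(1/alpha^2 + sigma^2/rho^2).
RHO = 1.702

def prob_correct_lower_bound(n, alpha, sigma):
    a_tilde = 1.0 / np.sqrt(1.0 / alpha**2 + sigma**2 / RHO**2)
    t = 2.0 / a_tilde
    return 1.0 - 2.0 * (1.0 - (1.0 - t)**(n - 1) * (1.0 + (n - 1) * t))

print(prob_correct_lower_bound(n=21, alpha=2500, sigma=0.00995))   # around the 0.95 target discussed in Sect. 5.2
```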

4 Non-gaussian Multiplicative Input Noise

Although we focus on the multiplicative input noise with the Gaussian distribution, our analysis can be extended to cases where the multiplicative input noise has a non-Gaussian distribution. This technique is based on the idea of the Gaussian mixture model (GMM) [16]. We can use the GMM concept to approximate the density function of the normalized noise component \(\varepsilon _i(t)\). In the context of GMM, the density function of \(\varepsilon _i(t)\) can be represented as follows:

$$\begin{aligned} f(\varepsilon _i(t))=\sum _{l=1}^{L} \frac{\Xi _l}{\sqrt{2 \pi \varsigma ^2_l}} \exp \Big \{ -\frac{(\varepsilon _i(t)-\mu _l)^2}{2 \varsigma ^2_l} \Big \}, \end{aligned}$$
(20)

where \(\mu _l\) is the mean of the l-th component, \(\varsigma ^2_l\) is the variance of the l-th component, and \(\Xi _l\) is the weighting of the l-th component. Note that the sum of the \(\Xi _l\)’s is equal to 1. By following the steps presented in Sect. 3, we can derive the equivalent dynamics for the case of non-Gaussian multiplicative input noise and a non-ideal activation function. The equivalent dynamics are stated in the following theorem.

Theorem 6

The equivalent dynamics for the non-Gaussian multiplicative input noise and non-ideal activation function are given by

$$\begin{aligned} \frac{dy}{dt} = \sum _{i=1}^{n} \bar{x}_i (t)-k, \end{aligned}$$
(21)
$$\begin{aligned} \bar{x}_i(t) = h_i(u_i-y(t)), \end{aligned}$$
(22)
$$\begin{aligned} h_i(u_i-y(t)) = \sum _{l=1}^{L} \frac{\Xi _l}{1+e^{-\tilde{\alpha }_{i,l} (u_i-y(t)+ u_i \mu _l)}}, \end{aligned}$$
(23)
$$\begin{aligned} \tilde{\alpha }_{i,l} = \frac{1}{\sqrt{\frac{1}{\alpha ^2}+ \frac{u_i^2 \varsigma ^2_{l}}{\rho ^2}}}. \end{aligned}$$
(24)

Similar to the approach presented in Sect. 3, we can also develop an efficient method to estimate the probability of the model generating correct winner and loser neurons for the case of non-Gaussian multiplicative input noise and non-ideal activation function.

Theorem 7

For the non-Gaussian multiplicative input noise and non-ideal activation function, if \(\sum _{i=1}^{n} \left. h_i (u_i-y(t))\right| _{y(t)=u_{\pi _{n-k}}}>k\) and \(\sum _{i=1}^{n} \left. h_i (u_i-y(t))\right| _{y(t)=u_{\pi _{n-k+1}}} \le k\), then the model operates correctly.
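A sketch of the Theorem 7 check for uniform multiplicative noise is given below. It fits a GMM to samples of \(\varepsilon _i(t)\) and then tests the two conditions. The use of scikit-learn's GaussianMixture, the sample-based fit, the random seed, and the single randomly drawn input set are our own implementation choices; the paper does not prescribe how the GMM is built.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch of the Theorem 7 check for uniform multiplicative noise: fit a GMM to
# samples of eps_i(t), build h_i(u_i - y) from (23)-(24), and test the two
# conditions at y = u_{pi_{n-k}} and y = u_{pi_{n-k+1}}.
RHO = 1.702
rng = np.random.default_rng(2)

delta = 0.2                                     # noise range [-delta/2, delta/2]
samples = rng.uniform(-delta / 2, delta / 2, size=(50000, 1))
gmm = GaussianMixture(n_components=11, random_state=0).fit(samples)
weights = gmm.weights_                          # Xi_l
means = gmm.means_.ravel()                      # mu_l
variances = gmm.covariances_.ravel()            # varsigma_l^2

def h_gmm(u, y, alpha):
    u = np.asarray(u, dtype=float)[:, None]     # shape (n, 1) against L components
    a_il = 1.0 / np.sqrt(1.0 / alpha**2 + (u**2 * variances) / RHO**2)
    return np.sum(weights / (1.0 + np.exp(-a_il * (u - y + u * means))), axis=1)

u = np.sort(rng.uniform(0.0, 1.0, size=6))      # one randomly drawn input set, n = 6
n, k, alpha = 6, 2, 500
low, high = u[n - k - 1], u[n - k]
ok = h_gmm(u, low, alpha).sum() > k and h_gmm(u, high, alpha).sum() <= k
print(ok)                                       # True when the model resolves this input set correctly
```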

5 Simulation Results

In Theorem 1 and Theorem 6, we introduce equivalent models to describe the behaviour of the imperfect DNN-kWTA model. Based on the equivalent models, we then propose methods (Theorems 4, 5 and 7) to predict the performance of the imperfect model. The aim of this section is to verify these results.

5.1 Effectiveness of Theorem 4

Three settings are considered: \(\left\{ n=6,k=2,\alpha =500\right\} \), \(\left\{ n=11,k=2,\alpha =1000 \right\} \) and \(\left\{ n=21,k=5,\alpha =2500\right\} \). In this subsection, we consider inputs that follow the Beta distribution \(\text{ Beta}_{c,d}(x)=\frac{\mathrm {\Gamma }\left( c+d\right) }{\mathrm {\Gamma } \left( c\right) \mathrm {\Gamma }\left( d\right) }x^{c-1}\left( 1-x\right) ^{d-1}\) with \(c=d=2\), where \(\mathrm {\Gamma }(\cdot )\) denotes the well known Gamma function. To study the probability of the imperfect model performing correctly, we generate 10,000 sets of inputs.

Time-varying multiplicative input noise components \(\varepsilon _i\left( t\right) u_i\) are added to the inputs, where the \(\varepsilon _i\left( t\right) \)’s are zero-mean Gaussian distributed with variance \(\sigma ^2\).

When dealing with the non-uniform input case, we have two methods to measure the probability values of the imperfect model correctly identifying the winner and loser neurons. One way is to use the original neural dynamics stated in (4)–(6).

Another method is to use Theorem 4 to check the performance of the imperfect model. In this method, we only need to use (18) to determine whether the imperfect model can correctly identify the winner and loser neurons for each set of inputs. The results are shown in Fig. 3. It can be seen that the results obtained from Theorem 4 are quite close to the results obtained from the original neural dynamics over a wide range of noise levels and various settings.

For example, from Fig. 3, for the case of \(\{n=6,k=2,\alpha =500,\sigma =0.06309\}\), the probability values from the two methods are 0.9416 and 0.9442, respectively.

Fig. 3. The inputs follow the Beta distribution on (0, 1), and the multiplicative input noise components are \(\varepsilon _i(t)u_i\), where the \(\varepsilon _i\left( t\right) \)’s are zero-mean Gaussian distributed with variance \(\sigma ^2\).

5.2 Effectiveness of Theorem 5

In this subsection, we study the effectiveness of Theorem 5. For uniform inputs, there is an additional method to estimate the performance. The simulation settings are similar to those of Sect. 5.1, except that the inputs are uniformly distributed.

When the inputs are uniformly distributed, we can use the lower bound from Theorem 5 to estimate the chance of identifying the correct winners and losers. There are three methods to estimate the probability values. The first method is based on the original neural dynamics, which is quite time consuming. The second method is based on Theorem 4, which requires input data sets. The last method uses the lower bound from Theorem 5; its advantage is that it requires neither time-consuming simulation of the neural dynamics nor input data sets.

The results are shown in Fig. 4. First, it can be seen that the probability values obtained from Theorem 4 are very close to those obtained from simulating the original neural dynamics. The probability values obtained from Theorem 5 are lower than those obtained from the original neural dynamics and from Theorem 4, because Theorem 5 gives lower bounds on the probability values. The advantage of Theorem 5, however, is that no input data sets are needed.

We can use our results to determine the noise tolerance level of the model. For example, for \(\{n=21,k=5,\alpha =2500\}\) with a target probability of 0.95, Theorem 4 tells us that the input noise level \(\sigma \) should be less than 0.0157, while the lower bound tells us that \(\sigma \) should be less than 0.00995.

Fig. 4. The inputs are uniformly distributed in (0, 1), and the multiplicative input noise components are \(\varepsilon _i(t)u_i\), where the \(\varepsilon _i\left( t\right) \)’s are zero-mean Gaussian distributed with variance \(\sigma ^2\).

5.3 Effectiveness of Theorem 7

We can use Theorem 7 to predict the performance of the model under non-Gaussian distributed multiplicative input noise. To study the performance in this case, we consider inputs that follow a uniform distribution over (0, 1) in this subsection. We generated 10,000 sets of inputs for this purpose.

We consider multiplicative input noise components \(\varepsilon _i\left( t\right) u_i\), where the \(\varepsilon _i\left( t\right) \)’s are uniformly distributed in the range \([-\Delta /2,\Delta /2]\). We chose a uniform distribution to demonstrate that the GMM concept is capable of handling non-bell-shaped distributions, since the uniform distribution has a rectangular shape that differs significantly from the Gaussian shape.

In the simulation, for each noise level, we build a GMM with 11 components. We consider three settings: \(\left\{ n=6,k=2,\alpha =500\right\} \), \(\left\{ n=11,k=2,\alpha =1000 \right\} \) and \(\left\{ n=21,k=5,\alpha =2000\right\} \).

To validate the effectiveness of Theorem 7, we also use the original neural dynamics to estimate the probability of the model having the correct operation. It should be noticed that this simulation method is quite time consuming. The results are shown in Fig. 5. From the figure, the result of Theorem 7 is very close to that of simulating the original neural dynamics.

Again, we can use Theorem 7 to predict the tolerance level for input noise. For example, for \(\{n=6,k=2,\alpha =500\}\) with a target probability of 0.95, Theorem 7 tells us that the input noise range \(\Delta \) should be less than 0.1999.

Fig. 5. The inputs are uniformly distributed in (0, 1), and the multiplicative input noise components are \(\varepsilon _i(t)u_i\), where the \(\varepsilon _i\left( t\right) \)’s are zero-mean uniformly distributed in the range \([-\Delta /2,\Delta /2]\).

6 Conclusion

This paper presented an analysis of the DNN-kWTA model with two imperfections, namely, multiplicative input noise and a non-ideal activation function in the IO neurons. We first developed an equivalent model to describe the dynamics of the imperfect DNN-kWTA model. It should be noted that the equivalent model is introduced for studying the behaviour of the imperfect DNN-kWTA model; it is not a new model. Using the equivalent model, we derived sufficient conditions for checking whether the imperfect model can correctly identify the winner and loser neurons. For uniformly distributed inputs, we provided a formula for a lower bound on the probability of correct operation. Lastly, we extended our results to handle the non-Gaussian multiplicative input noise case. We validated our theoretical results through various simulations.