1 Introduction

Research on complex-valued neural networks (CVNNs) has revealed various aspects of their dynamics. At the same time, it is true that a complex number can be represented by a pair of real numbers, namely its real and imaginary parts, or its amplitude and phase. Indeed, a variety of useful results on neural dynamics have been obtained by focusing on the real and imaginary parts [1, 10, 11] or on the amplitude and phase [2, 3]. This fact sometimes leads to the assumption that a CVNN is almost equivalent to a double-dimensional real-valued neural network.

In this paper, we compare complex- and real-valued neural networks by focusing on their generalization characteristics. We investigate the generalization ability of feedforward complex-valued and double-dimensional real-valued neural networks, in particular when they learn and process wave-related data for function approximation or filtering. We observe the characteristics by feeding signals with various degrees of wave nature, obtained by mixing a sinusoidal wave and white noise. Computer experiments demonstrate that the generalization characteristics of CVNNs differ from those of double-dimensional real-valued neural networks (RVNNs) depending on the degree of wave nature of the signals, that is, the coherence.

This paper is an extension of a conference paper [8]. A statistical evaluation and discussion on this topic are also given in Ref. [9]. In contrast, here we concentrate on the relationship between the amplitude and phase responses arising from time shifts and amplitude changes in the input signals.

This paper is organized as follows. Section 2 reviews a property of complex numbers by representing them as 2 × 2 matrices and discusses its effect on supervised learning in single-layer feedforward neural networks. Section 3 presents the construction of the computer experiments and the learning dynamics. In Sect. 4, we show the difference in the generalization characteristics experimentally. We find that the generalization ability of CVNNs is higher than that of double-dimensional RVNNs, especially when they process signals with a high degree of wave nature. We also discuss the characteristics specific to the respective networks. Section 5 concludes the paper.

2 Qualitative difference between complex- and real-valued neural networks

2.1 Complex number represented as real 2 × 2 matrix

First, we review the nature of a complex number [5]. Since we focus on multiplication among the four arithmetic operations on complex numbers, we can represent a complex number as a real 2 × 2 matrix. That is, with every complex number \(c = a + \sqrt{-1}\, b\), where a and b are real numbers, we associate the C-linear transformation

$$ T_{c} : \user2{C} \rightarrow \user2{C}, \quad z \mapsto cz = ax - by + \sqrt{-1} (bx + ay) $$
(1)

If we identify \(\user2{C}\) with \(\user2{R}^{2}\) by

$$ z = x + i y = \left( \begin{array}{l} x \\ y\\ \end{array} \right) $$
(2)

it follows that

$$ \begin{aligned} T_{c} \left( \begin{array}{l} x \\ y\\ \end{array} \right) &= \left( \begin{array}{l} ax - by \\ bx + ay\\ \end{array} \right) \\ & = \left( \begin{array}{ll} a & -b \\ b & a\\ \end{array} \right) \left( \begin{array}{l} x \\ y\\ \end{array} \right) \end{aligned} $$
(3)

In other words, the linear transformation \(T_{c}\) determined by \(c = a + \sqrt{-1}\, b\) is described by the matrix \(\left( \begin{array}{ll} a & -b\\ b & a\\ \end{array} \right)\). In general, mappings represented by 2 × 2 matrices do not commute with one another. Matrices of this particular form, however, do commute, just as complex multiplication does.

The most important point of this representation lies in the fact that it expresses explicitly the operation specific to complex numbers, namely a rotation combined with an amplification or attenuation, as

$$ \left( \begin{array}{ll} a & -b \\ b & a\\ \end{array} \right) = r \left( \begin{array}{ll} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta\\ \end{array} \right) $$
(4)

where \(r \equiv \sqrt{a^{2} + b^{2}}\) and \(\theta \equiv \arctan (b/a)\) denote the amplitude amplification or attenuation and the rotation angle applied to a complex signal z, respectively.
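
As a concrete illustration (added here, not part of the original argument), the following minimal NumPy check confirms that multiplying by \(c = a + \sqrt{-1}\, b\) acts on (x, y) exactly as the rotation-scaling matrix of (3) and (4); the particular values of a, b, and z are arbitrary.

```python
import numpy as np

# Check Eqs. (3)-(4): multiplication by c = a + jb equals the action of the
# 2x2 rotation-scaling matrix [[a, -b], [b, a]] on the vector (x, y).
a, b = 1.0, 0.5                      # arbitrary example values
c = a + 1j * b
z = 1.5 - 0.4j
x, y = z.real, z.imag

T_c = np.array([[a, -b],
                [b,  a]])            # matrix representation of c, Eq. (3)
r, theta = np.hypot(a, b), np.arctan2(b, a)
R = r * np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # Eq. (4)

print(c * z)                         # (1.7+0.35j)  -- direct complex product
print(T_c @ np.array([x, y]))        # [1.7  0.35]  -- same result as a 2-vector
print(np.allclose(T_c, R))           # True: rotation by theta, scaling by r
```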

2.2 Phase rotation and amplitude amplification/attenuation in neural networks

Let us consider how the above feature of complex numbers emerges in neural dynamics. Assume a task to realize a mapping that transforms an input \(\user2{x}^{{\rm IN}}\) to an output \(\user2{x}^{{\rm OUT}}\), shown in Fig. 1a, through supervised learning that adjusts the synaptic weights \(w_{ji}\). We assume only a single teacher pair of input and output signals here. Consider the very simple feedforward neural network in the real domain shown in Fig. 1b, having a single layer with two inputs and two outputs. For simplicity, we omit possible nonlinearity at the neurons, that is, the activation function is the identity function. Then, the general input–output relationship is described by four real numbers a, b, c, and d as

$$ \left( \begin{array}{l} x_{1}^{{\bf OUT}} \\ x_{2}^{{\bf OUT}} \\ \end{array} \right) = \left( \begin{array}{ll} a & b \\ c & d \\ \end{array} \right) \left( \begin{array}{l} x_{1}^{{\bf IN}} \\ x_{2}^{{\bf IN}} \\ \end{array} \right) $$
(5)

In the present case, the learning can result in a variety of possible mappings because the number of parameters to be determined is larger than the number of conditions, that is, the learning task is an ill-posed problem. The functional difference among the possible mappings emerges as a difference in the generalization characteristics. For example, the learning can result in the degenerate mapping shown in Fig. 1c, which is often not useful in practice.

Fig. 1

a A task to map \(\user2{x}^{{\rm IN}}\) to \(\user2{x}^{{\rm OUT}}\); a simple RVNN to learn the task, having b a 2-input, 2-output single-layer structure, and c a possible but degenerate solution that is often not useful; and a simple CVNN to learn the same task, having d a 1-input, 1-output single-layer structure, and e the expected learning result [5]

Next, let us consider the learning of the mapping in the complex domain, which transforms a complex value \(x^{{\rm IN}} = (x_{1}^{{\rm IN}}, x_{2}^{{\rm IN}})\) to another complex value \(x^{{\rm OUT}} = (x_{1}^{{\rm OUT}}, x_{2}^{{\rm OUT}})\). Figure 1d shows such a CVNN, where the weight is a single complex value \(w = | w | \exp (\sqrt{-1} \, \theta)\). The situation is expressed just as in (5), but with the constraint (4), as

$$ \left( \begin{array}{l} x_{1}^{{\bf OUT}} \\ x_{2}^{{\bf OUT}} \\ \end{array} \right) = \left( \begin{array}{ll} | w | \cos \theta & - | w | \sin \theta \\ | w | \sin \theta & | w | \cos \theta \\ \end{array} \right) \left( \begin{array}{l} x_{1}^{{\bf IN}} \\ x_{2}^{{\bf IN}} \\ \end{array} \right) $$
(6)

The degree of freedom is reduced, and so is the arbitrariness of the solution. In fact, the solution is unique in this case. Figure 1e illustrates the result of the learning: the mapping is a combination of a phase rotation and an amplitude attenuation.

This property can be a great advantage when we deal with information related to waves such as electromagnetic waves, lightwaves, and electron waves. This is an intuitive expectation, and it is investigated numerically in the following sections.
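
The reduced degree of freedom can also be checked numerically. The sketch below is an illustration we add here (assuming NumPy; the teacher values are hypothetical): a single teacher pair determines the complex weight of (6) uniquely, whereas infinitely many real 2 × 2 matrices satisfy (5).

```python
import numpy as np

# One teacher pair: x_in -> x_out (hypothetical values).
x_in  = 1.0 + 0.5j
x_out = 0.3 + 0.9j

# CVNN case, Eq. (6): the solution is unique, a pure rotation and scaling.
w = x_out / x_in
print(abs(w), np.angle(w))                 # |w| and theta are fully determined

# RVNN case, Eq. (5): infinitely many 2x2 matrices map x_in to x_out.
v_in  = np.array([x_in.real,  x_in.imag])
v_out = np.array([x_out.real, x_out.imag])
A = np.kron(np.eye(2), v_in)               # 2 equations, 4 unknowns (a, b, c, d)
W1 = np.linalg.lstsq(A, v_out, rcond=None)[0].reshape(2, 2)   # minimum-norm solution
W2 = W1 + np.outer([1.0, -2.0], [x_in.imag, -x_in.real])      # add a null-space term
print(np.allclose(W1 @ v_in, v_out), np.allclose(W2 @ v_in, v_out))   # True True
```

Both W1 and W2 reproduce the single teacher pair exactly, yet they behave differently on any other input; this arbitrariness is what appears later as the difference in generalization.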

3 Construction of experiments and learning dynamics

We organize our experiment as follows.

  • Preparation of input signals: Variously weighted sums of a sinusoidal wave (coherent wave) and non-wave data, that is, white noise with random amplitude and phase (or random real and imaginary parts).

  • Definition of the task to learn: Identity mapping, which is expected to show the learning characteristics clearly, for the above signals with various degrees of wave nature.

  • Evaluation of generalization: Observation of the generalization error when the input signals shift in time and/or when the amplitude is changed.

3.1 Forward processing and learning dynamics

3.1.1 Complex-valued neural network

We consider a layered feedforward network having input terminals, hidden neurons, and output neurons. In the case of a CVNN, we employ a phase-amplitude-type sigmoid activation function and the teacher-signal-backpropagation learning process [3, 7] with the following notation:

$$ \user2{z}^{{\rm I}} = [z_{1}, \ldots, z_{i}, \ldots, z_{I}, z_{I+1}]^{T} \quad (\hbox{Input signal vector}) $$
(7)
$$ \user2{z}^{{\rm H}} = [z_{1}, \ldots, z_{h}, \ldots, z_{H}, z_{H+1}]^{T} \quad (\hbox{Hidden-layer output signal vector}) $$
(8)
$$ \user2{z}^{{\rm O}} = [z_{1}, \ldots, z_{o}, \ldots, z_{O}]^{T} \quad (\hbox{Output-layer signal vector}) $$
(9)
$$ {\bf W}^{{\rm H}} = [w_{hi} ] \quad (\hbox{Hidden neuron weight matrix}) $$
(10)
$$ {\bf W}^{{\rm O}} = [w_{oh} ] \quad (\hbox{Output neuron weight matrix}) $$
(11)

where \([\cdot]^{T}\) denotes transpose. In (10) and (11), the weight matrices include additional weights \(w_{h \, I+1}\) and \(w_{o \, H+1}\), equivalent to neural thresholds, for which we add the formal constant inputs \(z_{I+1} = 1\) and \(z_{H+1} = 1\) in (7) and (8), respectively. The signal vectors and synaptic weights are related to one another through an activation function f(z) as

$$ \user2{z}^{{\rm H}} = f \left({\bf W}^{{\rm H}} \user2{z}^{{\rm I}} \right)\, , \quad\, \user2{z}^{{\rm O}} = f \left({\bf W}^{{\rm O}} \user2{z}^{{\rm H}} \right) $$
(12)

where f(z) is a function of each vector element \(z\, (\in \user2{C})\) defined as

$$ f(z) = \tanh \left( \left| z \right|\right) \exp \left( \sqrt{-1} \, \arg z \right) $$
(13)
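
For concreteness, the activation of (13) can be written as the following element-wise function. This is a minimal sketch we add for illustration (assuming NumPy), not the authors' implementation.

```python
import numpy as np

def f(z):
    """Amplitude-phase activation of Eq. (13): tanh squashes the amplitude,
    while the phase is passed through unchanged."""
    return np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))

z = np.array([0.2 * np.exp(1j * 0.3), 3.0 * np.exp(-1j * 1.2)])
print(np.abs(f(z)))     # [0.197 0.995]  -- amplitudes saturate toward 1
print(np.angle(f(z)))   # [ 0.3 -1.2 ]   -- phases are preserved
```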

Figure 2 is a diagram to explain the supervised learning process. We prepare a set of teacher signals at the input \(\hat{\user2{z}}_{s}^{{\rm I}} = [\hat{z}_{1 s}, \ldots, \hat{z}_{i s}, \ldots, \hat{z}_{I s}, \hat{z}_{I+1 \, s}]^{T}\) and the output \(\hat{\user2{z}}_{s}^{{\rm O}} = [\hat{z}_{1 s}, \ldots, \hat{z}_{o s}, \ldots, \hat{z}_{O s}]^{T} \,\, (s = 1, \ldots, S)\), for which we employ the teacher-signal backpropagation learning. We define an error function E to obtain the dynamics [3, 7] as

$$ E \equiv \frac{1}{2} \sum_{s = 1}^{S} \sum_{o = 1}^{O} \left| z_{o} (\hat{\user2{z}}_{s}^{{\rm I}} ) - \hat{z}_{o s} \right|^{2} $$
(14)
$$ \left| w_{oh}^{{\rm new}} \right| = \left| w_{oh}^{{\rm old}} \right| - K \frac{\partial E}{\partial \left| w_{oh} \right|} $$
(15)
$$ \arg w_{oh}^{{\rm new}} = \arg w_{oh}^{{\rm old}} - K \frac{1}{| w_{oh} |} \frac{\partial E}{\partial ( \arg w_{oh} )} $$
(16)
$$ \begin{aligned} \frac{\partial E}{\partial \left| w_{oh} \right|} =& \left(1 - \left| z_{o} \right|^{2}\right) \left(\left|z_{o}\right| - \left|\hat{z}_{o}\right| \cos \left( \arg z_{o} - \arg \hat{z}_{o} \right) \right) \left| z_{h} \right| \cos \left( \arg z_{o} - \arg \hat{z}_{o} - \arg w_{oh} \right) \\ & - \left| z_{o} \right| \left| \hat{z}_{o} \right| \sin \left( \arg z_{o} - \arg \hat{z}_{o} \right) \frac{\left| z_{h} \right|}{\tanh^{-1} \left| z_{o} \right|} \sin \left( \arg z_{o} - \arg \hat{z}_{o} - \arg w_{oh} \right) \end{aligned} $$
(17)
$$ \begin{aligned} \frac{1}{| w_{oh} |} \frac{\partial E}{\partial ( \arg w_{oh} )} =& \left( 1 - \left| z_{o} \right|^{2} \right) \left( \left| z_{o} \right| - \left| \hat{z}_{o} \right| \cos \left( \arg z_{o} - \arg \hat{z}_{o} \right) \right) \left| z_{h} \right| \sin \left( \arg z_{o} - \arg \hat{z}_{o} - \arg w_{oh} \right)\\ &+ \left| z_{o} \right| \left| \hat{z}_{o} \right| \sin \left( \arg z_{o} - \arg \hat{z}_{o} \right) \frac{\left| z_{h} \right|}{\tanh^{-1} \left| z_{o} \right|} \cos \left( \arg z_{o} - \arg \hat{z}_{o} - \arg w_{oh} \right) \end{aligned} $$
(18)

where \((\cdot)^{{\rm new}}\) and \((\cdot)^{{\rm old}}\) indicate the weight values after and before the update, respectively, and K is a learning constant. The teacher signals at the hidden layer \(\hat{\user2{z}}^{{\rm H}} = [\hat{z}_{1}, \ldots, \hat{z}_{h}, \ldots, \hat{z}_{H}]^{T}\) are obtained by making the output teacher vector \(\hat{\user2{z}}^{{\rm O}}\) itself propagate backward as

$$ \hat{\user2{z}}^{{\rm H}} = \left(f \left( \left( \hat{\user2{z}}^{{\rm O}} \right)^{*} \hat{\bf W}^{{\rm O}} \right) \right)^{*} $$
(19)

where \((\cdot)^{*}\) denotes the Hermitian conjugate. Using \(\hat{\user2{z}}^{{\rm H}}\), the hidden-layer neurons update their weights by following (15)–(18) with the suffixes oh replaced by hi [4, 6].
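
The update rules (15)–(18) and the teacher backpropagation (19) can be transcribed almost literally into code. The sketch below is our illustrative reading of these equations, not the authors' program; the random initialization and the example signals are assumptions, while H = 25, O = 16, and K = 0.01 follow Sect. 4.1.

```python
import numpy as np

rng = np.random.default_rng(0)
K, H, O = 0.01, 25, 16                          # learning constant and layer sizes
W_O = rng.normal(size=(O, H + 1)) + 1j * rng.normal(size=(O, H + 1))  # incl. thresholds

def f(z):                                       # amplitude-phase activation, Eq. (13)
    return np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))

def update_output_layer(W, z_h, z_o, z_o_hat, K):
    """One step of Eqs. (15)-(18) for the output weights w_oh."""
    d_arg = np.angle(z_o) - np.angle(z_o_hat)                  # arg z_o - arg of teacher
    psi = d_arg[:, None] - np.angle(W)                         # ... - arg w_oh
    common = (1 - np.abs(z_o)**2) * (np.abs(z_o) - np.abs(z_o_hat) * np.cos(d_arg))
    cross = np.abs(z_o) * np.abs(z_o_hat) * np.sin(d_arg) / np.arctanh(np.abs(z_o))
    abs_h = np.abs(z_h)[None, :]
    g_abs = common[:, None] * abs_h * np.cos(psi) - cross[:, None] * abs_h * np.sin(psi)  # (17)
    g_arg = common[:, None] * abs_h * np.sin(psi) + cross[:, None] * abs_h * np.cos(psi)  # (18)
    new_abs = np.abs(W) - K * g_abs                            # Eq. (15)
    new_arg = np.angle(W) - K * g_arg                          # Eq. (16)
    return new_abs * np.exp(1j * new_arg)

def hidden_teacher(W_O, z_o_hat):
    """Teacher signal for the hidden layer, Eq. (19); the last entry corresponds
    to the formal constant input and can be discarded."""
    return np.conj(f(np.conj(z_o_hat) @ W_O))

# One illustrative update with random hidden outputs and teacher signals.
z_h = np.append(0.5 * np.exp(1j * rng.uniform(0, 2 * np.pi, H)), 1.0)   # z_{H+1} = 1
z_o_hat = 0.5 * np.exp(1j * rng.uniform(0, 2 * np.pi, O))
z_o = f(W_O @ z_h)
W_O = update_output_layer(W_O, z_h, z_o, z_o_hat, K)
z_h_hat = hidden_teacher(W_O, z_o_hat)[:H]      # used to update the hidden weights likewise
```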

Fig. 2

Schematic diagram of the learning process of complex- and double-dimensional real-valued feedforward neural networks for pairs of input–output teachers

3.1.2 Double-dimensional real-valued neural network

Similarly, the forward processing and learning of a double-dimensional RVNN are explained as follows. Figure 2 also includes this case. We represent a complex number as a pair of real numbers as \(z_{i} = x_{2i-1} + \sqrt{-1} \, x_{2 i}\). That is, we have a double-dimensional real input vector \(\user2{z}_{{\rm R}}^{{\rm I}}\), a double-dimensional hidden signal vector \(\user2{z}_{{\rm R}}^{{\rm H}}\), and a double-dimensional output signal vector \(\user2{z}_{{\rm R}}^{{\rm O}}\). The forward signal processing connects these signal vectors as well as the hidden neuron weights \({\bf W}_{{\rm R}}^{{\rm H}}\) and output neuron weights \({\bf W}_{{\rm R}}^{{\rm O}}\) through a real-valued activation function \(f_{{\rm R}}\) as

$$ \begin{aligned} \user2{z}_{{\rm R}}^{{\rm I}} =& \left[\overbrace{x_{1}, \quad x_{2}}^{\hbox{real \& imaginary}}, \ldots, x_{2i-1}, x_{2i}, \ldots, x_{2I-1}, x_{2I}, x_{2I+1}, x_{2I+2}\right]^{T} \\ & \quad \left( = \user2{z}^{{\rm I}} \right) \quad (\hbox{Input signal vector}) \end{aligned} $$
(20)
$$ \begin{aligned} \user2{z}_{{\rm R}}^{{\rm H}} =& [x_{1}, x_{2}, \ldots, x_{2h-1}, x_{2h}, \ldots, x_{2H-1}, x_{2H}, x_{2H+1}, x_{2H+2}]^{T}\\ & \quad (\hbox{Hidden-layer output signal vector}) \end{aligned} $$
(21)
$$ \begin{aligned} \user2{z}_{{\rm R}}^{{\rm O}} =& [ x_{1}, x_{2}, \ldots, x_{2o-1}, x_{2o}, \ldots , x_{2O-1}, x_{2O}]^{T} \\ & \quad (\hbox{Output-layer signal vector}) \end{aligned} $$
(22)
$$ {\bf W}_{{\rm R}}^{{\rm H}} = [ w_{{\rm R} hi} ] \quad (\hbox{Hidden neuron weight matrix}) $$
(23)
$$ {\bf W}_{{\rm R}}^{{\rm O}} = [ w_{{\rm R} oh} ] \quad (\hbox{Output neuron weight matrix}) $$
(24)
$$ \user2{z}_{{\rm R}}^{{\rm H}} = f_{{\rm R}} \left( {\bf W}_{{\rm R}}^{{\rm H}} \user2{z}_{{\rm R}}^{{\rm I}} \right)\, , \quad \user2{z}_{{\rm R}}^{{\rm O}} = f_{{\rm R}} \left( {\bf W}_{{\rm R}}^{{\rm O}} \user2{z}_{{\rm R}}^{{\rm H}} \right) $$
(25)
$$ f_{{\rm R}} (x) = \tanh \left( x \right) $$
(26)

where the thresholds are \(w_{{\rm R} \, h \, 2I+1}, \, w_{{\rm R} \, h \, 2I+2}, \, w_{{\rm R} \, o \, 2H+1}\), and \(w_{{\rm R} \, o \, 2H+2}\) with formal additional inputs \(x_{2I+1} = 1\), \(x_{2I+2} = 1\), \(x_{2H+1} = 1\), and \(x_{2H+2} = 1\). We employ the conventional error backpropagation learning.
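
A minimal sketch of the forward processing of (20)–(26) follows; it is an illustration we add (assuming NumPy), with arbitrary random weights. Each complex entry is unpacked into its real and imaginary parts before the real tanh layers, and the output is repacked into complex form only for comparison with the CVNN.

```python
import numpy as np

rng = np.random.default_rng(1)
I, H, O = 16, 25, 16                                     # layer sizes from Sect. 4.1

def to_real(z):
    """Interleave real and imaginary parts: (I,) complex -> (2I,) real,
    following z_i = x_{2i-1} + j x_{2i}."""
    return np.column_stack([z.real, z.imag]).ravel()

W_H = rng.normal(scale=0.1, size=(2 * H, 2 * I + 2))     # hidden weights incl. thresholds
W_O = rng.normal(scale=0.1, size=(2 * O, 2 * H + 2))     # output weights incl. thresholds

def forward(z_complex):
    x_in = np.append(to_real(z_complex), [1.0, 1.0])     # formal constant inputs
    x_h = np.append(np.tanh(W_H @ x_in), [1.0, 1.0])     # Eqs. (25)-(26), hidden layer
    x_out = np.tanh(W_O @ x_h)                           # output layer
    return x_out[0::2] + 1j * x_out[1::2]                # repack as complex for comparison

print(forward(0.5 * np.exp(1j * 2 * np.pi * np.arange(I) / I)).shape)   # (16,)
```

The learning itself is the conventional real-valued error backpropagation applied to W_H and W_O, and is therefore not repeated here.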

4 Computer experiments

4.1 Experimental setup

We chose the identity mapping as the task to be learned, to show the network characteristics most clearly. To generate input signals z(t), as a function of time t, having several degrees of coherence, we added white Gaussian noise n(t) to a sinusoidal wave \(\sin \omega t\) (angular frequency ω) with various weightings as \(z(t) = a_{{\rm s}} \sin \omega t + a_{{\rm n}} n(t)\), where \(a_{{\rm s}}\) and \(a_{{\rm n}}\) denote the respective equivalent amplitudes. The degree of wave nature is then expressed by the signal-to-noise ratio \({\rm SNR} \equiv a_{{\rm s}}/a_{{\rm n}}\), where \({\rm SNR} = \infty\) means a complete wave, while SNR = 0 corresponds to complete non-wave. The network parameters are as follows: number of input neurons I = 16, hidden neurons H = 25, output neurons O = 16, learning constant K = 0.01, and number of learning iterations 3,000.
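
The signal generation can be sketched as follows. This is our reading of the setup, not the authors' code: the complex encoding of the sinusoid (amplitude a and phase \(\omega t + 2 \pi i /16\) at terminal i, anticipating the teacher points described in Sect. 4.2) and the amplitude-ratio dB conversion are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
I = 16                                                    # one wavelength spans 16 terminals

def make_input(amplitude, t_over_T, snr_db):
    """Complex input vector: coherent wave plus complex white Gaussian noise
    scaled so that SNR = a_s / a_n (amplitude ratio)."""
    phase = 2 * np.pi * (t_over_T + np.arange(I) / I)     # time shift + per-terminal offset
    wave = np.exp(1j * phase)                             # unit-amplitude coherent part
    a_n = 0.0 if np.isinf(snr_db) else 1.0 / 10 ** (snr_db / 20)
    noise = a_n * (rng.normal(size=I) + 1j * rng.normal(size=I)) / np.sqrt(2)
    return amplitude * (wave + noise)                     # overall amplitude is then swept

# Teacher set: four amplitudes x four time shifts (Sect. 4.2), here at SNR = 20 dB.
teachers = [make_input(a, s, 20.0)
            for a in (0.0, 0.25, 0.5, 0.75) for s in (0.0, 1/8, 2/8, 3/8)]
print(len(teachers), teachers[0].shape)                   # 16 (16,)
```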

4.2 Results and discussion

Figures 3, 4, 5, 6 display typical examples of the learning curves and output signals for \({\rm SNR} = \infty\), 20 dB, 10 dB, and 0 dB, respectively. Figure 3a shows an example of the learning curve when \({\rm SNR} = \infty\), that is, when the signal is purely sinusoidal. We find that the learning is completed almost successfully for both the CVNN and the RVNN. The learning errors converge almost to zero, which means that only a slight residual error remains at the teacher points. However, the curves differ from each other at the beginning of the learning. The curve of the CVNN decreases quickly. In contrast, that of the RVNN shows a plateau just after the beginning and then a steep decrease. This tendency is often observed when the RVNN learns high-coherence signals, which implies that the RVNN is prone to local minima.

Fig. 3

Example of a learning curve, b amplitude and c phase when the input signal amplitude is gradually changed, and d amplitude and e phase when the input signal is gradually shifted in time, for the real-valued and complex-valued neural networks (RVNN and CVNN) when no noise is added to the sinusoidal signals (\(S/N = \infty\))

After the learning, we feed other input signals to investigate the generalization. The wavelength is adjusted so that a single wave spans the 16 neural input terminals. Figure 3b and c presents examples of the output amplitude and phase, respectively, showing from left to right the ideal output of the identity mapping, the RVNN outputs, and the CVNN outputs of the 16 output neurons. The horizontal axis shows the input amplitude, changing from 0 to 1. Figure 3d and e shows the output amplitude and phase when the input signal is shifted in time. The horizontal axes present the time shift t normalized by the wave period T.

We find that the RVNN output amplitude and phase values are very different from the ideal ones. The learning was conducted at 16 snapshots of the signal waveform, that is, four amplitude points (a = 0, 0.25, 0.5, 0.75) multiplied by four phase shifts, or time shifts t normalized by the signal wave period T (t/T = 0, 1/8, 2/8, 3/8), plus an initial waveform phase of \(2 \pi i/16\ (i = 0, 1, 2, \ldots , 15)\) at the respective neurons. (For the details of the learning process, see Ref. [9].) In each chart of Fig. 3b–e, there are 16 curves corresponding to the 16 neuron outputs.

Figure 3b shows the output amplitude response when the input signal amplitude is changed. The ideal output (left-hand side) is proportional to the input, and the 16 curves are identical. However, the RVNN output amplitudes differ largely. At the learning points of 0, 0.25, and 0.5, the curves are forced to converge to almost ideal values. The phase values at 0.25 and 0.5 in Fig. 3c are also near the ideal ones. Overall, however, the response deviates very largely from the ideal line, though at the zero-amplitude point the phase value is meaningless.

At the last learning point, an amplitude of 0.75, the situation is different. The amplitude values do not converge but are scattered instead. The phase values also differ from the ideal ones. The result implies that the network settled on a solution with scattered amplitudes, which is a local minimum.

In contrast, the 16 amplitude outputs of the CVNN in Fig. 3b are identical to one another, just as in the ideal case, though the curves saturate in the large-amplitude region. This directly reflects the saturation characteristic of the neuron activation function. As an optimal learning result, the network shows a slightly larger amplitude in the small-input-amplitude region and a slightly smaller amplitude in the large-input region. The phase values in Fig. 3c are close to the ideal ones, though around zero amplitude they are meaningless and deviate.

Next, we observe the responses to the time shift (or phase shift) of the input signal. The horizontal axes in Fig. 3d and e show the time shift normalized by the wave period T. Figure 3d presents the output amplitude when the input amplitude is fixed at 0.5. Ideally, it should be constantly 0.5. However, the RVNN outputs deviate very largely again. The phase values in Fig. 3e also deviate from the ideal ones. The large phase-error regions in (e) correspond to the regions of steep amplitude change in (d). In contrast, the CVNN output amplitude values in Fig. 3d are almost constant. The value differs slightly from 0.5 because of the nonlinearity of the neuron activation function. The phase values of the CVNN in Fig. 3e are almost identical to the ideal lines.

As seen above, the CVNN presents better generalization in both amplitude and phase for coherent signals. Its characteristic is evident in the phase-rotational response, observed clearly as the phase stability against input amplitude changes as well as the linear phase change versus the input time shift. These characteristics match the single-neuron dynamics in Fig. 1e in Sect. 2, where the elemental process consists of a phase rotation together with an amplitude change when needed.

Figures 4, 5, 6 show the data for SNR = 20 dB, 10 dB, and 0 dB, respectively. As the degree of wave nature decreases, the generalization error increases. However, in every SNR case, both the amplitude and phase of the CVNN exhibit better generalization than those of the RVNN.

Fig. 4

Example of a learning curve, b amplitude and c phase when the input signal amplitude is gradually changed, and d amplitude and e phase when the input signal is gradually shifted in time, for the real-valued and complex-valued neural networks (RVNN and CVNN) when noise is added to the sinusoidal signals so that S/N = 20 dB

Fig. 5

Example of a learning curve, b amplitude and c phase when the input signal amplitude is gradually changed, and d amplitude and e phase when the input signal is gradually shifted in time, for the real-valued and complex-valued neural networks (RVNN and CVNN) when noise is added to the sinusoidal signals so that S/N = 10 dB

Fig. 6

Example of a learning curve, b amplitude and c phase when the input signal amplitude is gradually changed, and d amplitude and e phase when the input signal is gradually shifted in time, for the real-valued and complex-valued neural networks (RVNN and CVNN) when noise is added to the sinusoidal signals so that S/N = 0 dB

Note that the learning time of the CVNN can also be longer, in particular for signals with a lower degree of wave nature (the smaller the SNR, the lower the coherence). This fact is attributed to the smaller degree of freedom of the CVNN described in Sect. 2.2.

The correspondence between steep changes in amplitude and phase is sometimes observed also in these low-coherence cases. For example, in Fig. 6b, where SNR = 0 dB, the amplitude has a sharp dip at an input amplitude of about 0.75 for several neuron outputs. Correspondingly, in Fig. 6c, we find a steep phase change in the outputs of the same neurons. Such changes are observed only in the RVNN, where there is no implicit restriction to the phase-and-amplitude elemental dynamics.

5 Conclusion

This paper investigated numerically the generalization characteristics of feedforward complex-valued and real-valued neural networks (CVNN and RVNN). We compared a CVNN and a double-dimensional RVNN in a simple function-approximation task. Computer experiments demonstrated that the CVNN exhibits better generalization characteristics, in particular for signals with a high degree of wave nature, that is, high coherence. This fact is attributed to the smaller degree of freedom of the CVNN compared with that of the RVNN, which results in a learning tendency toward phase rotation and amplitude amplification or attenuation. We also investigated the relationship between the amplitude and phase errors. We found in the RVNN that an abrupt change in amplitude is often accompanied by a steep change in phase. This phenomenon is a consequence of local minima in the RVNN and is not observed in the CVNN. These characteristics of the CVNN are expected to be useful in many applications dealing with wave phenomena and wave-related information processing.