
1 Introduction

The methods of Artificial Intelligence are broadly used in the modern world. Genetic algorithms, neural networks and their applications are the subject of much research [11, 12, 14, 15]. The most common uses of modern AI can be found in the areas of optimization, classification and pattern recognition [1, 2, 13, 16, 17, 21]. The Levenberg-Marquardt algorithm (also called LM or LMA) is a highly efficient and popular teaching method for feedforward neural networks [9]. It is classified as a quasi-Newton method and is widely used for function minimization problems such as neural network teaching [8]. One of the biggest drawbacks of the LMA is its high computational complexity compared to classical teaching methods, e.g. the well-known backpropagation algorithm and all its variants [3, 7, 10]. This limitation makes the LMA impractical as a teaching algorithm for larger networks. The following paper describes a parallel modification of the LMA which can be applied to any feedforward multi-layered neural network.

2 The Classic Levenberg-Marquardt Algorithm

The Levenberg-Marquardt method is an iterative algorithm mostly used for non-linear function optimization. It combines the advantages of the steepest descent and Gauss-Newton methods. In order to teach a neural network, the LMA is used to minimize the following error function

$$\begin{aligned} E\left( {\mathbf{{w}}\left( n \right) } \right) = \frac{1}{2}\sum \limits _{t = 1}^Q {\sum \limits _{r = 1}^{{N_L}} {\varepsilon _r^{{{\left( L \right) }^2}}} \left( t \right) } = \frac{1}{2}\sum \limits _{t = 1}^Q {\sum \limits _{r = 1}^{{N_L}} {{{\left( {y_r^{\left( L \right) }\left( t \right) - d_r^{\left( L \right) }\left( t \right) } \right) }^2}} } \end{aligned}$$
(1)

where \(\varepsilon _r^{\left( L \right) } \) stands for the non-linear neuron error and is defined as

$$\begin{aligned} \varepsilon _r^{\left( L \right) }(t) = \varepsilon _r^{\left( {Lr} \right) }(t) = y_r^{\left( L \right) }(t) - d_r^{\left( L \right) }(t) \end{aligned}$$
(2)

while \(d_r^{\left( L \right) }(t)\) is the r-th expected network response to the t-th teaching sample. In the Levenberg-Marquardt algorithm the weight update is given by

$$\begin{aligned} \varDelta \left( {\mathbf{{w}}(n)} \right) = - {\left[ {{\nabla ^\mathbf{{2}}}{} \mathbf{{E}}\left( {\mathbf{{w}}(n)} \right) } \right] ^{ - 1}}\nabla \mathbf{{E}}\left( {\mathbf{{w}}(n)} \right) \end{aligned}$$
(3)

where \(\nabla \mathbf{{E}}\left( {\mathbf{{w}}(n)}\right) \) stands for the gradient vector

$$\begin{aligned} \nabla \mathbf{{E}}\left( {\mathbf{{w}}(n)} \right) = {\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{\varepsilon }}\left( {\mathbf{{w}}(n)} \right) \end{aligned}$$
(4)

and \(\nabla ^\mathbf{{2}}{} \mathbf{{E}}\left( {\mathbf{{w}}(n)} \right) \) stands for the Hessian matrix

$$\begin{aligned} {\nabla ^\mathbf{{2}}}{} \mathbf{{E}}\left( {\mathbf{{w}}(n)} \right) = {\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{J}}\left( {\mathbf{{w}}(n)} \right) + \mathbf{{S}}\left( {\mathbf{{w}}(n)} \right) \end{aligned}$$
(5)

while \(\mathbf{{J}}\left( {\mathbf{{w}}(n)}\right) \) is the Jacobian matrix

$$\begin{aligned} \mathbf{{J}}(\mathbf{{w}}\left( n \right) ) = \left[ {\begin{array}{*{20}{c}} {\frac{{\partial \varepsilon _1^{\left( L \right) }(1)}}{{\partial w_{10}^{(1)}}}}&{}{\frac{{\partial \varepsilon _1^{\left( L \right) }(1)}}{{\partial w_{11}^{(1)}}}}&{} \cdots &{}{\frac{{\partial \varepsilon _1^{\left( L \right) }(1)}}{{\partial w_{ij}^{(k)}}}}&{} \cdots &{}{\frac{{\partial \varepsilon _1^{\left( L \right) }(1)}}{{\partial w_{{N_L}{N_{L - 1}}}^{(L)}}}}\\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ {\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(1)}}{{\partial w_{10}^{(1)}}}}&{}{\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(1)}}{{\partial w_{11}^{(1)}}}}&{} \cdots &{}{\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(1)}}{{\partial w_{ij}^{(k)}}}}&{} \cdots &{}{\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(1)}}{{\partial w_{{N_L}{N_{L - 1}}}^{(L)}}}}\\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ {\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(Q)}}{{\partial w_{10}^{(1)}}}}&{}{\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(Q)}}{{\partial w_{11}^{(1)}}}}&{} \cdots &{}{\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(Q)}}{{\partial w_{ij}^{(k)}}}}&{} \cdots &{}{\frac{{\partial \varepsilon _{{N_L}}^{\left( L \right) }(Q)}}{{\partial w_{{N_L}{N_{L - 1}}}^{(L)}}}} \end{array}} \right] . \end{aligned}$$
(6)

The non-linear errors \(\varepsilon _i^{\left( lr \right) } \) of the neurons in hidden layers are calculated as follows

$$\begin{aligned} \varepsilon _i^{\left( {lr} \right) }\left( t \right) \buildrel \wedge \over = \sum \limits _{m = 1}^{{N_{l + 1}}} {\delta _m^{\left( {l + 1,r} \right) }\left( t \right) w_{mi}^{\left( {l + 1} \right) }}, \end{aligned}$$
(7)
$$\begin{aligned} \delta _i^{\left( {lr} \right) }\left( t \right) = \varepsilon _i^{\left( {lr} \right) }\left( t \right) f'\left( {s_i^{\left( {lr} \right) }\left( t \right) } \right) . \end{aligned}$$
(8)

This makes it possible to calculate the entries of the Jacobian matrix for all weights in the network

$$\begin{aligned} \frac{{\partial \varepsilon _r^{\left( L \right) }\left( t \right) }}{{\partial w_{ij}^{\left( l \right) }}} = \delta _i^{\left( {lr} \right) }\left( t \right) x_j^{\left( l \right) }\left( t \right) . \end{aligned}$$
(9)
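As an illustration, the following minimal sketch computes the rows of the Jacobian from Eq. (6) for a single teaching sample using the recursion of Eqs. (7)-(9). It assumes a plain NumPy representation of the network with tanh hidden units, a linear output layer and the bias stored as the first weight of every neuron; these choices are not dictated by the paper and are made only to keep the example self-contained.

```python
import numpy as np

def forward(weights, x):
    """Forward pass; weights[l] has shape (N_l, N_{l-1} + 1), bias in column 0."""
    xs, ss, y = [], [], np.asarray(x, dtype=float)
    for l, W in enumerate(weights):
        xl = np.concatenate(([1.0], y))                  # x^{(l)} with bias input
        s = W @ xl                                       # s^{(l)}
        y = s if l == len(weights) - 1 else np.tanh(s)   # linear output layer (assumption)
        xs.append(xl)
        ss.append(s)
    return xs, ss, y                                     # y is the network output y^{(L)}

def jacobian_rows(weights, x):
    """Rows d(eps_r)/d(w_ij^{(l)}) of Eq. (6) for one sample, via Eqs. (7)-(9)."""
    xs, ss, _ = forward(weights, x)
    L, n_out = len(weights), weights[-1].shape[0]
    rows = []
    for r in range(n_out):
        delta = np.zeros(n_out)
        delta[r] = 1.0                                   # seed: d(eps_r)/d(s^{(L)}) for a linear output
        grads = [None] * L
        for l in reversed(range(L)):
            grads[l] = np.outer(delta, xs[l])            # Eq. (9): delta_i^{(lr)} * x_j^{(l)}
            if l > 0:
                eps = weights[l][:, 1:].T @ delta        # Eq. (7): backpropagated error
                delta = eps * (1.0 - np.tanh(ss[l - 1]) ** 2)  # Eq. (8) with f = tanh
        rows.append(np.concatenate([g.ravel() for g in grads]))
    return np.array(rows)                                # shape (N_L, nw)
```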

In the classic LMA all weights of the neural network are stored as a single vector. The \(\mathbf{{S}}\left( {\mathbf{{w}}(n)} \right) \) factor from Eq. (5) is given by

$$\begin{aligned} \mathbf{{S}}\left( {\mathbf{{w}}(n)} \right) = {\sum \limits _{t = 1}^Q {\sum \limits _{r = 1}^{{N_L}} {\varepsilon _r^{\left( L \right) }\left( t \right) } \nabla ^2 } }\varepsilon _r^{\left( L \right) }\left( t \right) . \end{aligned}$$
(10)

In the Gauss-Newton method it is assumed that \(\mathbf{{S}}\left( {\mathbf{{w}}(n)} \right) \approx 0\), so Eq. (3) can be simplified to

$$\begin{aligned} \varDelta \left( {\mathbf{{w}}(n)} \right) = - {\left[ {{\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{J}}\left( {\mathbf{{w}}(n)} \right) } \right] ^{ - 1}}{\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{\varepsilon }}\left( {\mathbf{{w}}(n)} \right) . \end{aligned}$$
(11)

For the needs of the Levenberg-Marquardt algorithm it is assumed that \(\mathbf{{S}}\left( {\mathbf{{w}}(n)} \right) = \mu \mathbf I\), so Eq. (3) takes the form

$$\begin{aligned} \varDelta \left( {\mathbf{{w}}(n)} \right) = - {\left[ {{\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{J}}\left( {\mathbf{{w}}(n)} \right) + \mu \mathbf{{I}}} \right] ^{ - 1}}{\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{\varepsilon }}\left( {\mathbf{{w}}(n)} \right) . \end{aligned}$$
(12)

Let

$$\begin{aligned} \begin{array}{c} \mathbf{{A}}\left( n \right) = - \left[ {{\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{J}}\left( {\mathbf{{w}}(n)} \right) + \mu \mathbf{{I}}} \right] \\ \\ \mathbf{{h}}\left( n \right) = {\mathbf{{J}}^T}\left( {\mathbf{{w}}(n)} \right) \mathbf{{\varepsilon }}\left( {\mathbf{{w}}(n)} \right) \end{array} \end{aligned}$$
(13)

so Eq. (12) can be written as

$$\begin{aligned} \varDelta \left( {\mathbf{{w}}(n)} \right) = \mathbf{{A}}{\left( n \right) ^{ - 1}}{} \mathbf{{h}}\left( n \right) . \end{aligned}$$
(14)

At this stage the QR decomposition can be used to solve Eq. (14), which yields the desired weight update vector \(\varDelta \left( {\mathbf{{w}}(n)} \right) \):

$$\begin{aligned} {\mathbf{{Q}}^T}\left( n \right) \mathbf{{A}}\left( n \right) \varDelta \left( {\mathbf{{w}}(n)} \right) = {\mathbf{{Q}}^T}\left( n \right) \mathbf{{h}}\left( n \right) , \end{aligned}$$
(15)
$$\begin{aligned} \mathbf{{R}}\left( n \right) \varDelta \left( {\mathbf{{w}}(n)} \right) = {\mathbf{{Q}}^T}\left( n \right) \mathbf{{h}}\left( n \right) . \end{aligned}$$
(16)

For the results presented in Sect. 4 the QR decomposition based on Givens rotations has been used. The Levenberg-Marquardt teaching algorithm can be summarized in five steps (a minimal code sketch of this loop follows the list):

1. Calculate the network outputs and errors across all teaching samples and evaluate the goal criterion.
2. Run the backpropagation method and calculate the Jacobian matrix.
3. Perform the QR decomposition in order to obtain \(\varDelta \left( {\mathbf{{w}}(n)} \right) \).
4. Apply \(\varDelta \left( {\mathbf{{w}}(n)} \right) \) to the neural network and calculate the goal criterion once again. If the new error is smaller than the original one, commit the weights, divide \(\mu \) by \(\beta \) and proceed to step 1. If the new error is bigger than the original one, multiply \(\mu \) by \(\beta \) and go back to step 3.
5. The algorithm is deemed finished once the gradient falls below the accepted threshold or the network error satisfies the predefined error goal.
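The sketch below puts the five steps together. It reuses the forward, jacobian_rows and lm_update helpers from the earlier sketches; the flatten/unflatten helpers and the stopping constants are illustrative assumptions rather than part of the original algorithm description.

```python
import numpy as np

def flatten(weights):
    return np.concatenate([W.ravel() for W in weights])

def unflatten(vec, like):
    out, k = [], 0
    for W in like:
        out.append(vec[k:k + W.size].reshape(W.shape))
        k += W.size
    return out

def train_lm(weights, X, D, mu=0.01, beta=2.0, goal=1e-3, grad_tol=1e-6, max_epochs=1000):
    """Five-step LM loop; X and D are lists of input and target vectors."""
    for _ in range(max_epochs):
        # Step 1: outputs, errors and the goal criterion of Eq. (1).
        eps = np.concatenate([forward(weights, x)[2] - d for x, d in zip(X, D)])
        error = 0.5 * float(eps @ eps)
        if error < goal:
            break                                        # Step 5: error goal reached
        # Step 2: Jacobian of Eq. (6) via backpropagation.
        J = np.vstack([jacobian_rows(weights, x) for x in X])
        if np.linalg.norm(J.T @ eps) < grad_tol:
            break                                        # Step 5: gradient small enough
        # Steps 3-4: retry the QR-based update, adapting mu until the error drops.
        improved = False
        while mu < 1e10:
            trial = unflatten(flatten(weights) + lm_update(J, eps, mu), weights)
            eps_new = np.concatenate([forward(trial, x)[2] - d for x, d in zip(X, D)])
            if 0.5 * float(eps_new @ eps_new) < error:
                weights, mu = trial, mu / beta           # commit, relax damping
                improved = True
                break
            mu *= beta                                   # reject, increase damping
        if not improved:
            break                                        # damping exhausted
    return weights
```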

3 Parallel Modification

As shown in the previous section (especially in Eq. (6)), the classic LMA requires computing a Jacobian, which is a \(\left[ no \cdot np\right] \times \left[ nw\right] \) matrix of error derivatives, where no stands for the number of network outputs, np is the number of teaching samples and nw is the total number of weights in the considered network. Then Eq. (14) needs to be solved, which involves the inversion of a \(\left[ nw\right] \times \left[ nw\right] \) matrix. The core of the presented modification is to create a set of Jacobians, one for each neuron of the network. The computation can then be done in parallel for all neurons. In such a case the size of each subproblem is bounded by the largest weight vector of a single neuron, which significantly reduces the algorithm's complexity. A rough sketch of such a parallel dispatch is given below.
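The sketch illustrates the parallel dispatch only: it assumes that the per-neuron Jacobian J_n and error vector eps_n have already been assembled for every neuron as the LMP method prescribes (the container per_neuron_systems is hypothetical), and it solves each small damped system of Eq. (12) independently in a worker pool.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def solve_neuron(args):
    """One small system of Eq. (12), restricted to a single neuron's weight vector."""
    J_n, eps_n, mu = args
    A = -(J_n.T @ J_n + mu * np.eye(J_n.shape[1]))
    h = J_n.T @ eps_n
    Q, R = np.linalg.qr(A)
    return np.linalg.solve(R, Q.T @ h)                   # this neuron's weight update

def parallel_lmp_updates(per_neuron_systems, mu, workers=8):
    """per_neuron_systems: list of (J_n, eps_n) pairs, one entry per neuron."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        jobs = [(J_n, eps_n, mu) for J_n, eps_n in per_neuron_systems]
        return list(pool.map(solve_neuron, jobs))        # updates returned in neuron order
```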

Fig. 1. The considered neural network. One of the neurons with the biggest weight vector and its connections are highlighted.

The neural network shown in Fig. 1 is taken into further consideration. The network contains 2 inputs and 3 layers. The two hidden layers consist of 100 neurons each, while the output layer contains a single neuron. To simplify the calculations, the teaching set is assumed to contain 100 samples. In the classic Levenberg-Marquardt algorithm (from now on also called LMC) the Jacobian size is the following

$$\begin{aligned} \left[ no \cdot np\right] \times \left[ \sum \limits _{l = 1}^{L}{N_l \cdot \left( N_{l-1}+1\right) } \right] . \end{aligned}$$
(17)

Substituting the considered network's values into the above equation, the Jacobian size is

$$\begin{aligned} \left[ 1 \cdot 100\right] \times \left[ 100 \cdot 3 + 100 \cdot 101 + 1 \cdot 101\right] = \left[ 100\right] \times \left[ 10501\right] . \end{aligned}$$
(18)

In the parallel approach to the Levenberg-Marquardt algorithm (from now on also called LMP) a separate Jacobian matrix is created for each neuron of the network, so its size can be expressed as

$$\begin{aligned} \left[ np\right] \times \left[ \max {\left( N_{l-1}+1\right) } \right] , \end{aligned}$$
(19)

which in the considered network translates into the biggest Jacobian of size

$$\begin{aligned} \left[ 100\right] \times \left[ 101\right] . \end{aligned}$$
(20)

It is easy to see that the smaller the Jacobian matrix, the faster the QR decomposition completes. Table 1 shows the comparison of Jacobian sizes across all layers in the considered network for both LMC and LMP. Table 2 presents the average times of solving Eq. (14) for matrices of size \(\left[ 10501\right] \times \left[ 10502\right] \) and \(\left[ 101\right] \times \left[ 102\right] \). An Intel Core i7 CPU running at 4.40 GHz has been used for the computation. A short sketch reproducing these sizes is given after the table captions.

Table 1. Comparison of Jacobian sizes in LMC and LMP (LMP is divided into layers).
Table 2. Average times (in milliseconds) for solving Eq. (14) for matrices of size \(\left[ 10501\right] \times \left[ 10502\right] \) and \(\left[ 101\right] \times \left[ 102\right] \).
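The sizes quoted in Eqs. (18) and (20) and summarized in Table 1 can be reproduced with a few lines of arithmetic; the helper below simply evaluates Eqs. (17) and (19) for the 2-100-100-1 network of Fig. 1 with 100 teaching samples.

```python
def lmc_jacobian_size(layer_sizes, n_inputs, n_samples):
    """Eq. (17): rows = no * np, columns = total number of weights (biases included)."""
    dims = [n_inputs] + layer_sizes
    n_weights = sum(dims[l + 1] * (dims[l] + 1) for l in range(len(layer_sizes)))
    return layer_sizes[-1] * n_samples, n_weights

def lmp_jacobian_size(layer_sizes, n_inputs, n_samples):
    """Eq. (19): rows = np, columns = the largest single-neuron weight vector."""
    dims = [n_inputs] + layer_sizes
    return n_samples, max(dims[l] + 1 for l in range(len(layer_sizes)))

print(lmc_jacobian_size([100, 100, 1], n_inputs=2, n_samples=100))   # (100, 10501), Eq. (18)
print(lmp_jacobian_size([100, 100, 1], n_inputs=2, n_samples=100))   # (100, 101),   Eq. (20)
```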

4 Results

4.1 Approximating \(y = 4x(1 - x)\) Function

The first teaching problem is the approximation of the logistic curve given by the formula \(y = 4x(1 - x)\). A fully connected 1-5-1 network has been used. The teaching set consists of 11 samples covering the argument range \(x \in \left[ 0, 1\right] \). A maximum error of 0.001 has been set as the teaching goal. The LMP algorithm achieved its best teaching results with narrow initial weight values \(w_{init} \in \left[ -0.5, 0.5\right] \) and a small value of the \(\beta \in \left[ 1.2, 4\right] \) factor. Figure 2 compares the outputs of the networks trained by LMC and LMP with the expected curve. Figure 3 shows the error for all teaching samples after training the neural network with the LMP algorithm. A sketch of this experimental setup is given below.
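The sketch recreates the setup under a few assumptions: the evenly spaced sampling of [0, 1], the random seed and the uniform weight initialisation are chosen only to obtain a runnable example and are not specified in the paper beyond the ranges quoted above. The resulting data could be fed to a routine such as the train_lm sketch from Sect. 2.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 11)            # 11 teaching samples covering [0, 1]
d = 4.0 * x * (1.0 - x)                  # logistic-curve targets
X = [np.array([xi]) for xi in x]         # network inputs
D = [np.array([di]) for di in d]         # expected responses

topology = [1, 5, 1]                     # fully connected 1-5-1 network
rng = np.random.default_rng(0)           # fixed seed (assumption)
weights = [rng.uniform(-0.5, 0.5, size=(topology[l + 1], topology[l] + 1))
           for l in range(len(topology) - 1)]   # w_init in [-0.5, 0.5]

# teaching goal: max epoch error < 0.001; beta chosen from [1.2, 4]
```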

Fig. 2. Comparison of network outputs for the \(y = 4x(1 - x)\) function approximation problem. Teaching target is set as max epoch error \(< 0.001\).

Fig. 3. Network error for the \(y = 4x(1 - x)\) function approximation problem trained by LMP. Teaching target is set as max epoch error \(< 0.001\).

Fig. 4. Comparison of network outputs for the \(f\left( x\right) = \sin x \cdot \log x\) function approximation problem. Teaching target is set as max epoch error \(< 0.001\).

Fig. 5. Network error for the \(f\left( x\right) = \sin x \cdot \log x\) function approximation problem trained by LMP. Teaching target is set as max epoch error \(< 0.001\).

4.2 Approximating \(y = \sin {x} \cdot \log {x}\) Function

The second teaching problem is the approximation of the curve given by the formula \(y = \sin {x} \cdot \log {x}\). A fully connected 1-15-1 network has been used. The teaching set consists of 40 samples covering the argument range \(x \in \left[ 0.1, 4\right] \). A maximum error of 0.001 has been set as the teaching goal. The LMP algorithm achieved its best teaching results with narrow initial weight values \(w_{init} \in \left[ -0.5, 0.5\right] \) and a small value of the \(\beta \in \left[ 1.2, 4\right] \) factor. Figure 4 compares the outputs of the networks trained by LMC and LMP with the expected curve. Figure 5 shows the error for all teaching samples after training the neural network with the LMP algorithm.

5 Conclusion

Parallelization of teaching algorithms seems to be a good direction for optimizing neural network training. This topic has been approached multiple times by many researchers, e.g. [5, 18, 19, 20]. A parallel approach to the popular Levenberg-Marquardt neural network teaching algorithm has been presented in this paper. In its classic form the LMA is characterized by a very high computational complexity which grows with the square of the network size. The presented parallel modification of the Levenberg-Marquardt algorithm seems to overcome this limitation while maintaining the reasonable accuracy and performance of the original. As shown in Sect. 4, in both approximation problems the networks trained by LMP achieved the assumed requirements. The LMP method seems to be a good direction for optimizing the Levenberg-Marquardt algorithm.

In the near future additional work on this matter can be undertaken. First, a complete parallel implementation of the LMP algorithm can be developed, e.g. using the CUDA platform and GPU capabilities. Then, more complex approximation problems can be trained, e.g. higher-dimensional functions. A deeper analysis of the influence of the \(\beta \) parameter and the initial network weights can also be performed. Finally, the LMP implementation can be compared to similar parallel optimizations of neural network training algorithms, e.g. [4, 6].