1 Introduction

According to ANSI, software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment [1, 2]. Over the last four decades, many linear prediction models, such as Software Reliability Growth Models (SRGMs), have been designed for tasks including software reliability prediction, weather forecasting, and cost and time estimation. Parametric models are typically based on the Non-Homogeneous Poisson Process (NHPP). However, no single parametric linear model predicts efficiently in all circumstances. Nonparametric prediction models, such as artificial neural networks like the Feed Forward Neural Network (FFNN), can predict different software reliability metrics, including availability, time between failures and cumulative failures. Ensemble prediction models have been reported to outperform parametric linear models and to offer better prediction capability [35].

This work presents a non-parametric model for software reliability forecasting based on ensemble techniques. It predicts software reliability data more accurately than a single neural network approach and a linear parametric model, the Duane growth model [6].

The remainder of the paper is organized as follows. Section 2 presents related work on forecasting cumulative software failure data. Section 3 discusses the proposed ensemble approach for software reliability forecasting. The experimental results and discussion are presented in Sect. 4. Finally, Sect. 5 concludes the paper.

2 Literature Survey

This section briefly discusses some related works based on various types of artificial neural network models for prediction of software reliability.

Karunanithi et al. [5] presented a connectionist model based on feed forward and recurrent neural networks. They observed that this model works better than some analytical models across different types of software reliability datasets.

Su et al. [4] proposed an artificial neural network modeling approach, the Dynamically Weighted Combinational Model (DWCM), for predicting software failure history data and estimating software reliability. They compared the proposed model with several mathematical models and reported that it predicts more accurately than the linear mathematical models.

Sitte [7] used two approaches to software reliability growth prediction: an FFNN and recalibration of analytical growth models. The author claimed that both approaches have better prediction capability on common datasets in all circumstances.

Tian and Noore [8] proposed an evolutionary connectionist approach for forecasting cumulative software failure data. They used a neural network architecture with multiple delayed inputs and a single output, and observed that it performs more efficiently than other prediction models.

Cai et al. [9] examined the effectiveness of artificial neural networks for software reliability prediction from software failure history data. They found that the neural network approach handles software failure data with smooth trends better than traditional models, and that its training results fit better than those of linear models. They also observed that the neural network approach is quantitatively poor at predicting software defect proneness but qualitatively good at classifying software modules.

The related work reveals that most prediction models are SRGMs, whereas the use of ensemble techniques is limited. This work therefore focuses on an ensemble of three standard artificial neural network models, with the goal of obtaining efficient and accurate prediction.

3 Proposed Work

This section describes the proposed model for predicting software failure data, which uses an ensemble of artificial neural networks.

3.1 Ensemble Model

The prediction model based on the ensemble technique is depicted in Fig. 1. The ensemble model is a three-layer, single-input, single-output architecture: an input layer, a component layer consisting of feed forward and radial basis function neural networks, and an output layer that averages the outputs of the components. The component layer has three components, each using a different activation function. The cumulative software failure data are organized in pairs \( \{T_{i}, N_{i}^{'}\} \), where \( T_{i} \) is the execution time, used as the input of the ensemble model, and \( N_{i}^{'} \) is the cumulative number of software failures, produced as the output of the ensemble model. The output of the ensemble model is the mean of the outputs \( N_{i1} \), \( N_{i2} \) and \( N_{i3} \) of the three neural network components and is defined as follows.

Fig. 1 The architecture of the ensemble model

$$ N_{i}^{'} = \frac{{N_{ i1} + N_{i2} + N_{i3} }}{3} $$
(1)
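As a minimal sketch of Eq. (1), the averaging step can be written in a few lines of Python. The function name `ensemble_output` and the sample values are hypothetical and chosen only for illustration; they do not come from the experiments.

```python
def ensemble_output(component_outputs):
    """Return the unweighted mean of the component predictions, as in Eq. (1)."""
    if not component_outputs:
        raise ValueError("at least one component output is required")
    return sum(component_outputs) / len(component_outputs)


# Hypothetical normalized predictions of FFNN1, FFNN2 and RBFN for one input T_i.
n_i1, n_i2, n_i3 = 0.42, 0.45, 0.39
print(ensemble_output([n_i1, n_i2, n_i3]))
```

Because every component receives the same weight, a single poorly trained component shifts the combined prediction by one third of its own error.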

3.2 FFNN Component

Two FFNNs, FFNN1 and FFNN2, are used in our ensemble model. The FFNN model is shown in Fig. 2.

Fig. 2 FFNN model

The value of a node in the FFNN model is computed as the weighted sum of its inputs plus a bias value, which is then passed through an activation function, as defined in Eq. (2)

$$ \begin{aligned} a_{i} & = \sum\limits_{j = 1}^{n} {w_{ij} x_{j} + b_{i} } \\ y_{i} & = f_{i} \,(a_{i} ) \\ \end{aligned} $$
(2)

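where \( a_{i} \) is the linear combination of the input data \( x_{j} \) and the bias value \( b_{i} \), \( w_{ij} \) are the weights of the FFNN model, and \( y_{i} \) is the node output obtained by applying the activation function \( f_{i} \).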
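The node computation of Eq. (2) can be sketched as follows. The function names `node_activation` and `node_output` are illustrative assumptions, and the transfer function is passed in as a parameter so that the same code can serve either FFNN component.

```python
def node_activation(weights, inputs, bias):
    """Compute a_i = sum_j w_ij * x_j + b_i for a single node (Eq. 2, first line)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def node_output(weights, inputs, bias, transfer):
    """Compute y_i = f_i(a_i) for a single node (Eq. 2, second line)."""
    return transfer(node_activation(weights, inputs, bias))


# Hypothetical weights, inputs and bias; the identity is used as a placeholder transfer function.
print(node_output([0.5, -1.0], [2.0, 1.0], 0.25, lambda a: a))
```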

For FFNN1, the log-sigmoid transfer function is used as the activation function, as defined in Eq. (3)

$$ f(n)\text{ = }\frac{1}{{1\text{ + e}^{ - n} }} $$
(3)

For FFNN2, the tan-sigmoid transfer function is used as the activation function, as defined in Eq. (4)

$$ f(n)\text{ = } \frac{2}{{1\text{ + e}^{ - 2*n} }} - 1 $$
(4)
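The two transfer functions of Eqs. (3) and (4) can be sketched directly from their definitions. The names `logsig` and `tansig` are labels chosen here for illustration. The sketch also checks numerically that Eq. (4) agrees with the hyperbolic tangent, to which it is algebraically identical.

```python
import math


def logsig(n):
    """Log-sigmoid of Eq. (3); maps any real n into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def tansig(n):
    """Tan-sigmoid of Eq. (4); maps any real n into (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


# Note: math.exp overflows for very large negative n; the inputs here are assumed
# to be moderate, as with data normalized to [0, 1].
for n in (-1.0, 0.0, 0.7):
    print(n, logsig(n), tansig(n), math.tanh(n))
```

The different output ranges, (0, 1) for FFNN1 and (-1, 1) for FFNN2, are one way the two FFNN components can differ even with the same number of hidden neurons.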

3.3 RBFN Component

Instead of the sigmoid function used in the FFNN, the RBFN component uses a radial basis function as the transfer function in the hidden layer [10, 11]. Its output is defined as follows

$$ f(y) = \sum\limits_{i = 1}^{k} {w_{i} \,{\emptyset }(x - c_{i} )} $$
(5)

where k is the number of neurons in the hidden layer, x is the software failure data given as input, \( w_{i} \) is the weight of neuron i and \( c_{i} \) is the centroid vector of neuron i.

The radial basis function \( {\emptyset }(x - c_{i} ) \) is given by

$$ {\emptyset }(x - c_{i} ) = \sqrt {\sum\limits_{i = 1}^{\text{k}} {(x - c_{i} )^{2} } } $$
(6)
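A minimal sketch of Eqs. (5) and (6) is given below. It assumes that the sum in Eq. (6) runs over the components of x and \( c_{i} \), so that the basis function is the Euclidean distance between the input and the centroid; this reading is an interpretation and is not stated explicitly in the text. The function names and sample values are hypothetical.

```python
import math


def radial_distance(x, center):
    """Euclidean distance between input vector x and a centroid (reading of Eq. 6)."""
    if len(x) != len(center):
        raise ValueError("x and center must have the same length")
    return math.sqrt(sum((xj - cj) ** 2 for xj, cj in zip(x, center)))


def rbfn_output(x, centers, weights):
    """Weighted sum of the basis function values over the hidden neurons (Eq. 5)."""
    if len(centers) != len(weights):
        raise ValueError("one weight is required per centroid")
    return sum(w * radial_distance(x, c) for w, c in zip(weights, centers))


# Hypothetical one-dimensional input with k = 2 hidden neurons.
print(rbfn_output([0.3], centers=[[0.0], [1.0]], weights=[0.5, 0.25]))
```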

3.4 Performance Measures

Two performance measures are used to compare the reliability prediction error of the ensemble model with that of the competing models. The proposed ensemble model is trained on one part of the software failure data, and the rest is used for testing. The two measures, Relative Error (RE) and Average Error (AE), are defined as

$$ RE_{i} = \frac{{(\hat{y}_{i} - y_{i} )}}{{y_{i} }} * 100 $$
(7)
$$ AE\text{ = }\frac{1}{n}\sum\limits_{i = 1}^{n} {RE_{i} } $$
(8)

where n is the total number of data samples, \( \hat{y}_{i} \) is the predicted value and \( y_{i} \) is the actual value.
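The two measures can be sketched as follows, following Eqs. (7) and (8) as printed. Note that Eq. (8) averages signed relative errors, so positive and negative errors can cancel; the sketch keeps that behaviour rather than taking absolute values. Function names and sample values are hypothetical.

```python
def relative_error(predicted, actual):
    """Relative error in percent for one sample (Eq. 7); actual must be non-zero."""
    return (predicted - actual) / actual * 100.0


def average_error(predicted_values, actual_values):
    """Mean of the per-sample relative errors (Eq. 8)."""
    if len(predicted_values) != len(actual_values) or not actual_values:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [relative_error(p, a) for p, a in zip(predicted_values, actual_values)]
    return sum(errors) / len(errors)


# Hypothetical predicted and actual cumulative failure counts.
print(average_error([52.0, 60.0, 71.0], [50.0, 61.0, 70.0]))
```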

4 Experimental Results and Comparison

In our experiments, the model is trained and tested on three benchmark datasets, DS1, DS2 and DS3 [1], for software reliability forecasting. DS1 comprises 21,700 assembly instructions and 136 failures, collected from a real-time command and control application. DS2 comprises 10,000 assembly instructions and 118 failures, collected from a flight dynamics application. DS3 comprises 22,500 assembly instructions and 180 failures, collected from a flight dynamics application. All datasets are normalized to the range [0, 1] using the min-max formula.
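The min-max normalization mentioned above can be sketched as follows. The function name `min_max_normalize` and the sample values are hypothetical; the formula itself is the standard one, mapping the smallest value to 0 and the largest to 1.

```python
def min_max_normalize(values):
    """Scale values linearly so that min(values) -> 0 and max(values) -> 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical cumulative failure counts.
print(min_max_normalize([2.0, 4.0, 6.0]))
```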

The ensemble model is trained with 60, 60 and 55 % of the data for DS1, DS2 and DS3, respectively. The remaining data of each software failure history dataset is used for testing. Different training ratios are used for the different datasets to obtain better prediction results.
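One way to realize such a split is sketched below. It assumes that the first portion of each failure history, in time order, is used for training and the remainder for testing; the text does not state how the training samples were selected, so this ordering is an assumption. The function name `split_by_percent` is hypothetical.

```python
def split_by_percent(samples, train_percent):
    """Split a time-ordered sequence into leading training and trailing test parts."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100 (exclusive)")
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(samples) * train_percent // 100
    return samples[:cut], samples[cut:]


# Hypothetical failure history of ten (T_i, N_i) pairs, split 60/40.
history = [(t, n) for t, n in zip(range(10), range(1, 11))]
train, test = split_by_percent(history, 60)
print(len(train), len(test))
```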

We compare the proposed ensemble model with a single artificial neural network, the FFNN, and with a linear parametric model, the Duane growth model [6]. The Duane growth model is given by \( \beta (t) = bt^{a} \), with \( b > 0 \) and \( a > 0 \), where a is the shape parameter of the growth curve and b is the size parameter of the curve.
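The text does not say how the Duane parameters were estimated. One common option, shown here purely as an assumption, is ordinary least squares on the log-log form \( \ln \beta (t) = \ln b + a\ln t \), which is linear in \( \ln t \). The function names and data below are hypothetical, and the fit requires strictly positive times and counts.

```python
import math


def duane(t, a, b):
    """Duane growth model: beta(t) = b * t**a."""
    return b * t ** a


def fit_duane(times, cumulative_failures):
    """Estimate (a, b) by least squares on ln(beta) = ln(b) + a * ln(t).

    All times and failure counts must be strictly positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(n) for n in cumulative_failures]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                        # slope of the log-log line
    b = math.exp(mean_y - a * mean_x)    # intercept, mapped back from log space
    return a, b


# Synthetic data generated from a = 0.5, b = 2, used only to exercise the fit.
times = [1.0, 4.0, 9.0, 16.0]
failures = [duane(t, 0.5, 2.0) for t in times]
print(fit_duane(times, failures))
```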

4.1 Performance Comparison

For the ensemble model, we chose three components, FFNN1, FFNN2 and RBFN, each with five neurons in the hidden layer. The standalone FFNN model also has five neurons in the hidden layer. The prediction results of the ensemble model for DS1, DS2 and DS3 are depicted in Fig. 3. The REs of the different models on the three datasets are depicted in Fig. 4, and the AEs are shown in Table 1. The ensemble approach shows better performance than the FFNN model and the linear parametric model. On all datasets, the linear parametric Duane model performs worse than the other models. These results indicate that the ensemble model is better than the alternative traditional growth models.

Fig. 3 Prediction results of the ensemble model for (a) DS1, (b) DS2 and (c) DS3

Fig. 4 Relative error of different models for (a) DS1, (b) DS2 and (c) DS3

Table 1 AEs for different models

5 Conclusion

In this paper, a non-parametric prediction approach, an ensemble of three different artificial neural networks, has been proposed for software reliability prediction. The proposed ensemble model predicts better than a single artificial neural network model and a traditional parametric mathematical model. The experimental results show that the proposed ensemble approach performs best and yields lower prediction error than the competing models.