1 Introduction

It is often observed that software developers now place greater emphasis on object-oriented development methodologies because of their inherent advantages over traditional development approaches, expressed through properties such as cohesion, coupling, and inheritance. To assess the quality of software developed with an object-oriented methodology, different software metrics are used. The usefulness of these metrics lies in their ability to predict the quality of the developed software. The software quality attributes identified by ISO/IEC 9126 (Jung et al. 2004) are efficiency, functionality, portability, maintainability, reliability, and usability. In recent years, maintainability has been considered an important quality parameter for achieving considerable success in a software system (Misra 2005; Zhou and Baowen 2008; Malhotra and Chug 2014). Software maintainability is the capability of a software system or component to be corrected when faults occur, to have its performance improved, or to be adapted to changes in its environment. A good number of researchers have concluded that a maintainability prediction model can be developed using source code metrics (Misra 2005; Chen and Huang 2009; Chidamber and Kemerer 1994; Li and Henry 1993; Basili et al. 1996; Damaševičius and Štuikys 2010; Misra 2007; Baski and Misra 2011; Misra and Akman 2008; Misra et al. 2011), which measure the internal structure of a software system, i.e., its complexity, coupling, cohesion, inheritance, and size. Several source code metrics have been proposed by different researchers, such as Abreu's MOOD metrics suite (Abreu and Carapuca 1994; Kang and Bieman 1995; Briand et al. 2000; Halstead 1977; Henderson-Sellers 1996; Li and Henry 1993; McCabe 1976; Lorenz and Kidd 1994) and the CK metrics suite (Chidamber and Kemerer 1994), to measure the internal structure of software systems. In this work, only those source code metrics that have a strong relationship with software maintainability are selected. We have therefore considered the CK metrics suite, consisting of depth of inheritance tree (DIT), weighted methods per class (WMC), number of children (NOC), response for a class (RFC), and lack of cohesion in methods (LCOM); the Li and Henry metrics, consisting of data abstraction coupling (DAC), message passing coupling (MPC), and number of local methods (NOM); and size metrics [traditional lines of code (SIZE1) and total number of attributes and methods of a class (SIZE2)] to develop a model for predicting the maintainability of object-oriented software systems. These source code metrics mostly emphasize quality aspects of a class.

The performance of a maintainability prediction model depends on choosing the right set of object-oriented source code metrics. Feature selection is the process of selecting a suitable subset of object-oriented metrics from the available ones. In our work, rough set analysis (RSA) and principal component analysis (PCA) are considered for finding the right subset of source code metrics (Kumar and Rath 2015). RSA identifies a subset of the original features that collectively improves predictive capability, whereas PCA takes the full set of feature data and builds derived (transformed) features from it.

A number of mechanisms have been employed in the literature for software maintainability prediction. Some of the extensively used approaches are regression-based models, association rule mining, clustering, neural networks, Bayesian networks, SVM, etc. (Malhotra and Chug 2014; Zhou and Leung 2007; Kumar and Rath 2015; Kumar et al. 2015). Developing an accurate model to predict the maintainability of a class is still a challenging task in the software engineering discipline. In this study, a hybrid approach combining a neural network with fuzzy logic (the Neuro-Fuzzy approach) is used to develop a model for predicting the maintainability of object-oriented software. Parallel computing is used to accelerate the training procedure of the neural network model (Kumar and Rath 2015). Parallel training algorithms fall into two subclasses: node parallelism and training dataset parallelism. In node parallelism, neurons are mapped onto different computing nodes so that processing can be pipelined. In training dataset parallelism, a complete copy of the neural network is assigned to each computing node, and each node carries out the entire computation for its portion of the training data. In this study, training dataset parallelism is used to accelerate the neural network training procedure.

The generic steps followed to predict the maintainability of any object-oriented software are shown in Fig. 1. Initially, the classes of the software are identified from the class diagram; next, the different metric values of each class are extracted using tools available in the literature such as CKJM and the LOCMetric analyzer (see Footnote 1). Then, different feature selection techniques are applied to find the right sets of features. Finally, these metric sets are given as input to the developed model in order to predict the maintainability of each individual class in the software.

Fig. 1 Flow chart for maintainability prediction

This study intends to focus on:

  • Identification of a suitable set of source code metrics for maintainability prediction.

  • Development of a maintainability prediction model using Neuro-Fuzzy approach.

  • Accelerating the Neuro-Fuzzy approach training procedure.

The rest of the paper is organized as follows: Sect. 2 reviews the existing literature on maintainability prediction. Section 3 describes the experimental dataset used to develop the maintainability prediction model. The feature selection techniques and research methodology are described in Sects. 4 and 5, respectively. Section 6 presents the performance parameters used for evaluating the models. Section 7 presents the research framework and the results obtained for maintainability prediction using the Neuro-Fuzzy approach. In Sect. 8, various threats to validity are discussed, and Sect. 9 concludes the paper with the scope for future work.

Table 1 Summary of empirical literature on maintainability

2 Related work

This section presents a review of the literature on software metrics and their application to maintainability prediction. In the literature, different combinations and subsets of software metrics have been analyzed and relationships derived between object-oriented metrics and maintainability, as summarized in Table 1. In Table 1, the first column indicates the author and the year in which the work was carried out. The second column indicates the subset of object-oriented metrics considered to develop a model for predicting the maintainability of object-oriented software. The last column lists the techniques used to develop the maintainability prediction models.

From Table 1, it is observed that in all the studies made by different authors, the independent variables are different subsets of object-oriented metrics and the dependent variable is the maintainability of the object-oriented software. This shows that the performance of a maintainability prediction model depends on the software metrics given as input when developing the model. Selecting the right set of features is an important data preprocessing task in many applications of data mining and machine learning (Huang and Chow 2005; Kabir et al. 2010). In this work, two different feature selection techniques, i.e., principal component analysis (PCA) and rough set analysis (RSA), are considered to find the right subset of software metrics (Pawlak 1982). The effectiveness of these feature selection techniques is evaluated using the Neuro-Fuzzy approach (Adeli and Hung 1994).

Table 1 also suggests that regression-based analysis and its different variants are commonly used by various authors, but comparatively little work has been carried out on using neural network models for maintainability prediction. Neural network models have seen an explosion of interest over the years and have been applied across a wide range of problem domains. In this paper, the Neuro-Fuzzy approach is used to develop the maintainability prediction model (Adeli and Hung 1994).

3 Experimental dataset and setup

The following subsections describe the model used for predicting the quality parameter of interest, i.e., maintainability, for different case studies. The data are normalized to obtain better accuracy, and then the dependent and independent variables are chosen for maintainability estimation.

Fig. 2 Source code metrics

3.1 Experimental dataset

In this paper, two commercial object-oriented software systems published by Li and Henry are used as case studies (Li and Henry 1993). The quality evaluation system (QUES) and the user interface system (UIMS) are chosen for maintainability prediction. These case studies were chosen mainly because many researchers have recently used them to evaluate the performance of their maintainability prediction models (Kumar and Rath 2015; Zhou and Leung 2007; Koten and Gray 2006; Elish and Elish 2009; Al-Jamimi et al. 2012; Aljamaan et al. 2013; Kumar and Rath 2015), and hence this study can compare the performance of the proposed model with that of published models. The QUES and UIMS systems have 71 and 39 classes, respectively. Both systems were developed in the Classic-Ada programming language. Classic-Ada is an object-oriented programming language that adds the capability of object-oriented programming to Ada by providing object-oriented constructs in addition to the Ada constructs (Li and Henry 1993).

3.2 Dependent variable: maintainability

A number of definitions and metrics for software maintainability have been proposed by different authors (Li and Henry 1993; Zhou and Leung 2007; Banker et al. 1993). From the literature, it is observed that most authors have used the 'maintainability index (MI)' and 'change metrics' as the factors that determine the maintainability of software. In this work, maintainability is defined as the amount of change (usually a number) made to the code throughout the maintenance period. A line change is counted as an 'addition' or 'deletion' of lines of code in a class during a 3-year maintenance period (Malhotra and Chug 2014; Li and Henry 1993; Zhou and Leung 2007; Aggarwal et al. 2005; Riaz et al. 2009, 1997). Figure 3 shows boxplots of maintainability for the QUES and UIMS datasets. The line in the middle of each box represents the median of the data, the lower and upper quartiles show the 25 and 75% values of the data, and the lower and upper extremes show the minimum and maximum values. From Fig. 3, it is observed that the median value (middle line of the boxplot) of CHANGE for QUES is higher than that for UIMS. This suggests that the UIMS software is more maintainable.

Fig. 3 Descriptive statistics of maintainability

Table 2 Descriptive statistics of classes for UIMS and QUES

3.3 Predictor variables: source code metrics

A number of software metrics have been proposed for different applications such as fault prediction, maintenance effort estimation, and cost estimation. In this paper, ten different object-oriented static source code metrics are considered: depth of inheritance tree (DIT), weighted methods per class (WMC), response for a class (RFC), data abstraction coupling (DAC), lack of cohesion in methods (LCOM), number of children (NOC), message passing coupling (MPC), number of local methods (NOM), traditional lines of code (SIZE1), and total number of attributes and methods of a class (SIZE2). A detailed description of these metrics is given in Li and Henry (1993). Figure 2 shows boxplots of the source code metrics for the QUES and UIMS datasets. Table 2 displays descriptive statistics, i.e., Min, 25%, Median, Mean, 75%, Max, and standard deviation, for all the metrics across both projects.
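As a minimal sketch of how such descriptive statistics and boxplots can be reproduced (assuming the Li and Henry metric data are available as a CSV file with one column per metric plus the CHANGE column; the file name and column names are hypothetical):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file: one row per class, one column per source code metric plus CHANGE.
metrics = pd.read_csv("ques_metrics.csv")

# Descriptive statistics in the style of Table 2 (min, quartiles, mean, max, std).
summary = metrics.describe().T[["min", "25%", "50%", "mean", "75%", "max", "std"]]
print(summary)

# Boxplots of the source code metrics in the style of Fig. 2.
metrics.drop(columns=["CHANGE"]).boxplot()
plt.show()
```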

From Fig. 2 and Table 2, the following observations are made:

  1. In both systems, the DIT metric of the classes has a low median value. This low value shows that both systems make limited use of inheritance.

  2. In the QUES software, all classes have NOC values of zero. This indicates that there are no immediate sub-classes, and hence NOC is not considered in computing maintenance effort.

  3. The median values of the coupling metrics (RFC and MPC) in QUES are higher than those in UIMS. This suggests that the QUES software has higher coupling between classes.

  4. The median value of the cohesion metric, i.e., LCOM, is similar for both systems. This suggests that both systems have similar cohesion.

  5. Similar median values of NOM and SIZE2 are found in both systems, suggesting that both systems have similar class sizes at the design level. However, the two systems differ significantly in SIZE1.

Fig. 4 Correlation between source code metrics

3.4 Cross correlation analysis

In this study, Pearson's correlation (the correlation coefficient r) is used to measure the linear relationship between different source code metrics. Pearson's correlation measures the direction and strength of the linear relationship between two attributes. Figure 4 shows the Pearson's correlations among all source code metrics for the UIMS and QUES software systems. The sign of the correlation coefficient defines the direction of the relationship, i.e., negative or positive. A positive value of r indicates that the two metrics grow together linearly (when one metric increases, the other also increases, and vice versa), while a negative value of r indicates that the two metrics are inversely related (when one metric increases, the other decreases, and vice versa). For the sake of simplicity, the graphs are represented using the following symbols:

  • Black circle (●): r value between 0.7 and 1.0 indicates a strong positive linear relationship.

  • White circle (○): r value between 0.3 and 0.7 indicates a weak positive linear relationship.

  • Black square (■): r value between −1 and −0.7 indicates a strong negative linear relationship.

  • White square (□): r value between −0.7 and −0.3 indicates a weak negative linear relationship.

  • Blank (no symbol): no appreciable linear relationship.

From Fig. 4, it is observed that there are strong positive linear relationships between some metrics; for example, WMC is highly correlated with MPC, RFC, LCOM, NOM, SIZE1, and CHANGE. These high correlation values between pairs of metrics show that, although the metrics measure different features of class design, there is a significant statistical reason to believe that classes with low (or high) values of one metric also have low (or high) values of other highly correlated metrics. In this paper, principal component analysis (PCA) is used to extract a new set of metrics from the original metric set that has low mutual correlation.
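A minimal sketch of this cross-correlation analysis, assuming the metric data are loaded into a pandas DataFrame as in the earlier sketch (the file and column names remain hypothetical):

```python
import pandas as pd

metrics = pd.read_csv("ques_metrics.csv")  # hypothetical file of per-class metrics

# Pairwise Pearson correlation coefficients (r) between all metrics.
r = metrics.corr(method="pearson")

# Map each coefficient to the symbol scheme used in Fig. 4.
def symbol(v: float) -> str:
    if v >= 0.7:
        return "●"   # strong positive
    if v >= 0.3:
        return "○"   # weak positive
    if v <= -0.7:
        return "■"   # strong negative
    if v <= -0.3:
        return "□"   # weak negative
    return " "       # no appreciable linear relationship

print(r.round(2))
print(r.applymap(symbol))
```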

3.5 Effectiveness of metrics

In this study, three different set of metrics (one containing all metrics, selected metrics using RSA, extracted feature using PCA) are considered as input to develop maintainability prediction model. The dependent and independent variables of the model are shown in Table 3.

Table 3 Effectiveness of metrics
Fig. 5 Framework of PCA calculation

4 Feature extraction and selection using PCA and RSA

Since the performance of the maintainability prediction model is highly influenced by the quality of the maintainability dataset, which consists of software metrics and maintainability information, the selection of the right set of software metrics is an important step of the maintainability prediction process. In this study, two different feature selection techniques are considered to select, from all the available object-oriented metrics, the subset that is able to predict the maintainability of object-oriented software with higher accuracy.

4.1 Principal component analysis (PCA)

The concept of principal component analysis (PCA) was first developed by Karl Pearson in 1901 (Kumar and Rath 2015). It is a statistical technique used to transform data from a high-dimensional space to a feature space of lower dimension that retains the most significant features. The detailed steps of PCA are described in Fig. 5.

From Fig. 4, it is clear that many pairs of source code metrics have high (>0.7) correlation values. PCA is used to extract, from the raw metrics, new derived metrics that have low mutual correlation; we call these the principal components. In this work, only those components whose eigenvalue is greater than 1 are retained. Table 4 shows the rotated component matrix of the raw data. The rotation is performed to maximize the sum of the variances of the squared coefficients within each eigenvector. Values greater than 0.7 (shown in bold in Table 4) identify the metrics used to interpret each principal component. Table 4 also shows the eigenvalue, the variance percentage, and the cumulative percentage.
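A minimal sketch of this step, using standardized metrics and the eigenvalue-greater-than-one rule, is given below. This is an illustrative reimplementation, not the exact tooling used in the study, and the data file is hypothetical; a varimax rotation of the retained loadings would still have to be applied before interpreting them as in Table 4.

```python
import numpy as np
import pandas as pd

metrics = pd.read_csv("ques_metrics.csv")            # hypothetical per-class metric data
names = metrics.drop(columns=["CHANGE"]).columns
X = metrics[names].to_numpy(float)

# Standardize each metric, then compute the correlation matrix of the z-scores.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
corr = np.cov(Z, rowvar=False)

# Eigendecomposition; keep components with eigenvalue > 1 (Kaiser criterion).
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
keep = eigvals > 1.0

print("eigenvalues:", np.round(eigvals, 2))
print("variance %:", np.round(100 * eigvals / eigvals.sum(), 1))

# Loadings of the retained components (before rotation).
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
print(pd.DataFrame(loadings, index=names))
```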

Table 4 Rotated principal component

The interpretations of principal component for QUES software (similar for UIMS software) are given as follows:

  1. 1.

    PC1: WMC, RFC, LOCM, NOM, DAC, SIZE1, and SIZE2 are size, coupling, and cohesion metrics. PC1 contains the size, cohesion, and coupling metrics.

  2. 2.

    PC2: PC2 contains coupling metrics i.e., MPC that counts the number of send statements in class.

  3. 3.

    PC3: PC3 contains inheritance metrics i.e., DIT that measure the depth of inheritance tree of a class.

4.2 Rough set analysis

Pawlak described rough set analysis as a formal approximation of a conventional (crisp) set (Pawlak 1982). The lower and upper approximations of the raw data are used to represent this formal approximation. It is applied to the analysis of various data types, especially when dealing with inexact, vague, or uncertain data. Figure 6 shows the steps followed to obtain the reduced attribute set.

Fig. 6 Framework of rough set theory

In this study, RSA is used as a feature selection technique to select the right set of metrics for improving the performance of the maintainability prediction model. The data need to be classified into discrete groups before RSA can be applied; in this work, K-means clustering is used to classify the data. The approach followed in K-means clustering is shown in Fig. 7. The distance from each object to a centroid is computed using the Euclidean distance.

Fig. 7 Flow chart of the K-means clustering algorithm

Equation 4 defines the Euclidean distance used for this purpose.

$$\begin{aligned} d(x,y)=\sqrt{\sum _{i=1}^{p} (x_{i}-y_{i})^{2}} \end{aligned}$$
(4)

The cluster centers of each source code metric for our case studies are shown in Table 5.

Table 5 Cluster centers of source code metrics

After computing the cluster centers, each source code metric is categorized into three groups. Table 6 shows the value range of each source code metric in each group. The reduced set of source code metrics obtained using rough set analysis, which is used as input to the maintainability prediction model, is tabulated in Table 7 (a minimal sketch of the discretization step is given below Table 7).

Table 6 Group range
Table 7 Reduced attribute set
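A minimal sketch of the discretization step, assuming each metric is clustered independently into three groups with K-means (scikit-learn is used here instead of a hand-rolled implementation, and the data file is hypothetical):

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

metrics = pd.read_csv("ques_metrics.csv")  # hypothetical per-class metric data

groups = {}
for name in metrics.columns.drop("CHANGE"):
    values = metrics[[name]].to_numpy(float)
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(values)
    centers = np.sort(km.cluster_centers_.ravel())   # cluster centers as in Table 5
    # Assign each class to the nearest center: labels 0/1/2 act as low/medium/high groups.
    groups[name] = np.digitize(values.ravel(), bins=(centers[:-1] + centers[1:]) / 2)
    print(name, "centers:", np.round(centers, 2))

# The discretized table is the categorized form fed to the rough set reduct computation.
discretized = pd.DataFrame(groups)
```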

5 Research methodology: Neuro-Fuzzy approach

The Neuro-Fuzzy approach builds a hybrid architecture of a neural network and fuzzy logic that maps fuzzy inputs to a crisp output (Adeli and Hung 1994). In this approach, a fuzzy back-propagation (Fuzzy BP) architecture is used to train the network, as shown in Fig. 8.

Fig. 8 Neuro-Fuzzy architecture

In Fig. 8, the input vector I is represented as: \(I=(I_{0}, I_{1}, I_{2}, I_{3},\ldots , I_{n})\) and weight W vector is represented as \(W=(W_{0}, W_{1}, W_{2}, W_{3},\ldots , W_{n})\). In Neuro-Fuzzy approach, output O is computed as:

$$\begin{aligned} O=f({\textit{NET}})=f\left( {\textit{CE}}\left( \sum _{i=0}^{n}W_{i}I_{i}\right) \right) \end{aligned}$$
(5)

where n is the number of inputs, and I and W represent the input and weight vectors, respectively. NET is computed as:

$$\begin{aligned} {\textit{NET}}={\textit{CE}}({\textit{net}}) \end{aligned}$$
(6)

where net is defined as:

$$\begin{aligned} {\textit{net}}=\sum _{i=0}^{n}W_{i}I_{i} \end{aligned}$$
(7)

The CE function computes the centroid of a triangular fuzzy number. It maps the fuzzy weighted summation to a crisp value, an operation called defuzzification. If \({\textit{net}}=({\textit{net}}_{m},{\textit{net}}_{\alpha },{\textit{net}}_{\beta })\), then \({\textit{CE}}({\textit{net}})\) is defined as:

$$\begin{aligned} {\textit{CE}}({\textit{net}})={\textit{net}}_{m}+\dfrac{1}{3}\times ({\textit{net}}_{\alpha }-{\textit{net}}_{\beta })={\textit{NET}} \end{aligned}$$
(8)

In this paper, the sigmoid function is used as the output function, mapping the input to the output non-linearly. Accordingly, Eq. 5 can be written as:

$$\begin{aligned} {\textit{Output}}={\textit{fun}}({\textit{NET}})=\dfrac{1}{1+e^{(-{\textit{NET}})}} \end{aligned}$$
(9)

In fuzzy neurons, both the input and the weight vectors are represented by triangular left-right (LR-type) fuzzy numbers, a representation proposed by Dubois and Prade (Dubois and Prade 1979). A fuzzy number M is of LR-type if there exist reference functions L (for left) and R (for right) and scalars \(\alpha>0, \beta >0\) with

$$\begin{aligned} \mu _{M}(x)= \left\{ \begin{array}{lll} L\left( \frac{m-x}{\alpha }\right) &{} {\textit{for}} &{} x\le m\\ \\ R\left( \frac{x-m}{\beta }\right) &{} {\textit{for}} &{} x>m \end{array} \right. \end{aligned}$$
(10)

where m is the mean value of the fuzzy number ‘M’, and \(\alpha\) and \(\beta\) are referred to as the left and right spreads, as shown in Fig. 9.

Here \(\mu _{M}\) is the membership function of the fuzzy number ‘M’. An LR-type fuzzy number ‘M’ can thus be written as \((m,\alpha ,\beta )\). For the input vector \(I=(I_{0}, I_{1}, I_{2}, I_{3}, \ldots , I_{n})\), a single input \(I_{i}\) is represented as \(I_{i}=(I_{mi}, I_{\alpha i}, I_{\beta i})\); similarly, for the weight vector \(W=(W_{0}, W_{1}, W_{2}, W_{3},\ldots , W_{n})\), a single weight \(W_{i}\) is represented as \(W_{i}=(W_{mi}, W_{\alpha i}, W_{\beta i})\).
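As a small illustrative sketch (not the authors' implementation), the crisp output of a fuzzy neuron per Eqs. 5, 8, and 9 can be computed as follows. Each LR-type fuzzy number is represented as an (m, α, β) triple, and the fuzzy weighted sum of Eq. 7 is assumed to have already been aggregated into a single LR triple (the fuzzy multiplication and addition of the \(W_{i}I_{i}\) terms is omitted here):

```python
import math
from typing import NamedTuple

class LR(NamedTuple):
    """Triangular LR-type fuzzy number (m, alpha, beta): mean, left and right spreads."""
    m: float
    alpha: float
    beta: float

def ce(net: LR) -> float:
    # Defuzzification by the centroid operator, following Eq. 8 as printed.
    return net.m + (net.alpha - net.beta) / 3.0

def output(net: LR) -> float:
    # Crisp neuron output, Eqs. 5 and 9: sigmoid of the defuzzified weighted sum.
    return 1.0 / (1.0 + math.exp(-ce(net)))

# Example: a fuzzy weighted sum already aggregated into one LR triple.
net = LR(m=0.8, alpha=0.2, beta=0.1)
print(output(net))
```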

Fig. 9 Symmetric triangular LR-type fuzzy number

Fuzzy back-propagation architecture. Fuzzy BP is a three-layered feed-forward architecture consisting of an input, a hidden, and an output layer. The execution of the Fuzzy BP model is a two-stage process:

  1. Learning or training: The learning procedure of Fuzzy BP follows the gradient descent method for minimizing error. The mean square error (MSE) over the training patterns p is:

    $$\begin{aligned} E_{p}=\sum _{p=1}^{n}\dfrac{1}{2}\left( O_{p}''-O_{p}\right) ^{2} \end{aligned}$$
    (11)

    where \(O_{p}\) and \(O_{p}''\) are the actual and expected outputs, respectively.

    During the learning phase, the weights are updated using the following equation:

    $$\begin{aligned} W_{k+1}=W_{k}+\varDelta W_{k} \end{aligned}$$
    (12)

    where \(\varDelta W_{k}\) is computed using the following equation:

    $$\begin{aligned} \varDelta W_{k}=-\eta \bigtriangledown E_{ik} + \alpha W_{k} \end{aligned}$$
    (13)

    where \(\bigtriangledown E_{ik}\) is given by

    $$\begin{aligned} \bigtriangledown E_{ik}=\frac{\partial E_{ik}}{\partial W}=\left( \frac{\partial E_{ik}}{\partial W_{m}}, \frac{\partial E_{ik}}{\partial W_{\alpha }}, \frac{\partial E_{ik}}{\partial W_{\beta }}\right) \end{aligned}$$
    (14)

    and the weight vector W is represented as: \(W=(W_{m}, W_{\alpha }, W_{\beta })\).

  2. Inference: After the Fuzzy BP model has been trained on the training dataset, it is ready for inference. Consider a set of patterns \(\bar{F_{p}}\) to be inferred, where \(\bar{F_{p}}=(\bar{F_{p1}}, \bar{F_{p2}}, \bar{F_{p3}},\ldots , \bar{F_{pn}})\) and each \(\bar{F_{pi}}\) is an LR-type fuzzy number given by \(\bar{F_{pi}}=(\bar{F_{pmi}}, \bar{F_{p\alpha i}}, \bar{F_{p\beta i}})\). The objective is to obtain the output \(O_{p}\) for the corresponding pattern \(\bar{F_{p}}\). Algorithm 1 describes the inference procedure of Fuzzy BP.

Algorithm 1 Inference procedure of Fuzzy BP
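A compact sketch of the weight update of Eqs. 12–13, applied componentwise to the (m, α, β) parts of a fuzzy weight, is given below. This is illustrative only: the gradient ∇E is assumed to have been computed by the Fuzzy BP back-propagation step, the example values are hypothetical, and the momentum term follows Eq. 13 as printed (a conventional BP momentum term would instead use the previous update \(\varDelta W_{k-1}\)):

```python
import numpy as np

def update_weight(w_k: np.ndarray, grad_E: np.ndarray,
                  eta: float = 0.1, alpha: float = 0.01) -> np.ndarray:
    """One Fuzzy BP weight update.

    w_k and grad_E are arrays of shape (3,) holding the (m, alpha, beta)
    components of a weight and of dE/dW (Eq. 14), respectively.
    """
    delta_w = -eta * grad_E + alpha * w_k   # Eq. 13, momentum term as printed
    return w_k + delta_w                    # Eq. 12

# Hypothetical example values for one fuzzy weight and its gradient.
w = np.array([0.5, 0.1, 0.1])
g = np.array([0.3, 0.05, -0.02])
print(update_weight(w, g))
```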

6 Performance evaluation parameters

The accuracy of software maintainability estimation for a model designed using AI techniques is determined using three different performance parameters: mean absolute error (MAE), mean magnitude of relative error (MMRE), and standard error of the mean (SEM) (Menzies et al. 2006). They are defined as:

$$\begin{aligned} {\textit{MAE}}= & {} \frac{1}{n} \sum _{i=1}^{n} \left( |X'_{i}-X_{i}|\right) \end{aligned}$$
(18)
$$\begin{aligned} {\textit{MMRE}}= & {} \frac{1}{n} \sum _{i=1}^{n} \frac{|X_{i}-X'_{i}|}{X_{i}+0.05} \end{aligned}$$
(19)

In Eq. 19, a small constant (0.05) is added to the denominator in order to avoid division by zero. In Eqs. 18 and 19, \(X'_{i}\) and \(X_{i}\) denote the estimated and actual values, respectively.

$$\begin{aligned} {\textit{SEM}}= \frac{{\textit{SD}}}{\sqrt{n}} \end{aligned}$$
(20)

where SD is the sample standard deviation, and n is the number of samples.
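A minimal sketch of these three performance parameters (Eqs. 18–20), assuming the actual and estimated CHANGE values are available as arrays (the interpretation of SD as the standard deviation of the prediction residuals is an assumption, and the example numbers are hypothetical):

```python
import numpy as np

def mae(actual: np.ndarray, estimated: np.ndarray) -> float:
    return float(np.mean(np.abs(estimated - actual)))                    # Eq. 18

def mmre(actual: np.ndarray, estimated: np.ndarray) -> float:
    return float(np.mean(np.abs(actual - estimated) / (actual + 0.05)))  # Eq. 19

def sem(actual: np.ndarray, estimated: np.ndarray) -> float:
    residuals = estimated - actual            # SD of the residuals is assumed here
    return float(np.std(residuals, ddof=1) / np.sqrt(len(residuals)))    # Eq. 20

# Hypothetical example: actual vs. estimated maintenance changes for five classes.
y_true = np.array([10.0, 25.0, 3.0, 40.0, 7.0])
y_pred = np.array([12.0, 20.0, 5.0, 35.0, 6.0])
print(mae(y_true, y_pred), mmre(y_true, y_pred), sem(y_true, y_pred))
```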

7 Research framework and experimental results

Figure 10 shows our research framework, which consists of several steps. The first step is to compute the source code metrics and maintainability for the UIMS and QUES software systems. The source code metrics serve as the predictor (independent) variables.

Fig. 10 Framework of proposed work

As shown in Fig. 10, we conduct experiments with three different sets of source code metrics. The first set consists of all ten source code metrics. The second set uses principal component analysis (PCA) as a preprocessing step for dimensionality reduction; PCA reduces dimensionality by deriving a smaller set of components that preserves as much of the information present in the original variables as possible. The third set is obtained with rough set analysis, which is used to remove features that have little or no effect on the dependent variable. These three sets of source code metrics are validated using the hybrid neural network and fuzzy logic approach, i.e., the Neuro-Fuzzy approach. Before applying the Neuro-Fuzzy approach, the data are normalized to the range [0, 1] using the Min–Max normalization technique (Kumar and Rath 2014; Kaur et al. 2010). Normalization is required to adjust the range of the input attributes and to avoid saturation of the neurons. We also use fivefold cross-validation to create different partitions of training and testing data and to generalize the results of our analysis. Cross-validation is a statistical learning method used to compare and evaluate models by separating the data into two groups: one group is used for learning or training the model and the other for validating it. The basic form is K-fold cross-validation, in which the data are separated into K equal or nearly equal folds. For each model, \(K-1\) folds are used for training and the remaining fold for validation, so that each data point is used for both training and validation.
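A minimal sketch of this preprocessing and validation setup using scikit-learn (an illustrative assumption; the original study does not state which implementation was used, and the data file is hypothetical):

```python
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.preprocessing import MinMaxScaler

data = pd.read_csv("ques_metrics.csv")            # hypothetical per-class metric data
X = data.drop(columns=["CHANGE"]).to_numpy(float)
y = data["CHANGE"].to_numpy(float)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    # Fit the Min-Max scaler on the training fold only, then apply it to both folds.
    scaler = MinMaxScaler(feature_range=(0, 1)).fit(X[train_idx])
    X_train, X_test = scaler.transform(X[train_idx]), scaler.transform(X[test_idx])
    y_train, y_test = y[train_idx], y[test_idx]
    # model.fit(X_train, y_train); predictions = model.predict(X_test)  # Neuro-Fuzzy model here
    print(f"fold {fold}: {len(train_idx)} training and {len(test_idx)} test classes")
```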

In this study, different subsets of static source code metrics are considered as input to develop a maintainability prediction model using the Neuro-Fuzzy approach. Mean absolute error (MAE), mean magnitude of relative error (MMRE), and standard error of the mean (SEM) are taken as performance parameters to compare the models. True error (e) and the estimate of true error (\(\hat{e}\)) are used for evaluating models under the cross-validation approach.

7.1 Performance evaluation

In this paper, a three-layered Fuzzy BP architecture is considered. The software metrics are used as input data to train the network using Fuzzy BP, and each input is represented as an LR-type fuzzy number. Figure 11 shows the triangular membership functions, in which \(C_{1}, C_{2}, C_{3}\) are the cluster centers found using the K-means clustering algorithm. Table 9 contains the cluster centers \(C_{1}, C_{2}, C_{3}\) of the software metrics suite for UIMS and QUES.

Table 8 LR-type fuzzy number equivalents for fuzzy sets associated with software metrics
Fig. 11 Symmetric triangular LR-type fuzzy number

Fig. 12 MSE versus number of iterations (epoch)

Table 9 Cluster centers of software metrics

Table 8 shows the LR-type fuzzy number equivalents of the fuzzy terms for UIMS and QUES. In this paper, fivefold cross-validation is used on both QUES and UIMS for comparing the models. Table 10 shows the obtained performance metrics for the UIMS and QUES software products.

Table 10 Performance matrix

From Table 10, it may be concluded that the performance in estimating software maintainability is best when PCA is used (analysis A2) for both UIMS and QUES. Figure 12 shows how the mean square error varies with the number of iterations (epochs) for UIMS and QUES.

In this work, Pearson residual boxplots are used for visual comparison of all developed models. Figure 13 shows the Pearson residual boxplots for all developed models. The boxplot diagrams make it possible to compare the performance of all developed maintainability prediction models in a single diagram. The top and bottom of each box represent the 75th and 25th percentiles of the Pearson residuals, and the line in the middle of each box represents their median. The first and second sub-figures of Fig. 13 show the Pearson residuals for the UIMS and QUES case studies, respectively. From Fig. 13, it may be observed that, in both case studies, analysis A2 has the narrowest box and the smallest whiskers, as well as the fewest outliers. Based on these boxplots, it is evident that analysis A2 gives the best estimation accuracy compared with the other two analyses, A1 and A3. Hence, the model developed using the set of features extracted by PCA yields the best maintainability prediction accuracy.

Fig. 13 Residual boxplot for UIMS and QUES

Fig. 14 Training time versus number of computing nodes

7.2 Parallel computing concepts

A Core i5 processor with 4 GB of RAM and a 250 GB hard disk was used for this study. Here, a computing node is a process that runs on a physical core. Various numbers of computing nodes are considered, ranging from one to five. Fivefold cross-validation is used for both case studies, and the folds are distributed over the computing nodes as follows (a minimal sketch of this distribution is given after the list):

  • With five computing nodes, one fold of data is assigned to each computing node.

  • With four computing nodes, two folds of data are assigned to the first computing node and one fold to each of the remaining computing nodes.

  • With three computing nodes, two folds of data are assigned to each of the first two computing nodes and one fold to the remaining computing node.

  • With two computing nodes, the first three folds of data are assigned to the first computing node and the remaining two folds to the second computing node.

  • Finally, with a single computing node, all five folds of data are assigned to that node.
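The following sketch illustrates the training-dataset-parallelism idea with Python's multiprocessing module. It is an illustrative assumption rather than the authors' implementation: `train_on_folds` stands in for the Fuzzy BP training routine, the folds are synthetic, and the round-robin assignment only approximates the distribution described above.

```python
import numpy as np
from multiprocessing import Pool

def train_on_folds(folds):
    """Placeholder for the Fuzzy BP training routine: each computing node (process)
    holds a complete copy of the network and trains it on the folds assigned to it."""
    for X_train, y_train in folds:
        pass  # fit the Neuro-Fuzzy model on this fold's data

def distribute(folds, n_nodes):
    # Assign folds to nodes round-robin, mirroring the scheme listed above
    # (e.g. with four nodes the first node receives two folds, the others one each).
    chunks = [[] for _ in range(n_nodes)]
    for i, fold in enumerate(folds):
        chunks[i % n_nodes].append(fold)
    return chunks

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Five synthetic (X, y) folds standing in for the fivefold partitions of the dataset.
    folds = [(rng.random((14, 10)), rng.random(14)) for _ in range(5)]
    n_nodes = 4   # number of computing nodes, varied from 1 to 5 in the study
    with Pool(processes=n_nodes) as pool:
        pool.map(train_on_folds, distribute(folds, n_nodes))
```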

Figure 14 shows the time taken for training the model as a function of the number of computing nodes. From Fig. 14, it can be inferred that the normal (sequential) process takes less time than parallel execution on a single node. Furthermore, it is observed that the training time is reduced by 21.14% on average for QUES and by 18.70% for UIMS when the number of computing nodes is increased.

7.3 Comparison of models

From the literature, it is observed that authors such as Zhou and Leung (2007), Koten and Gray (2006), Elish and Elish (2009), Al-Jamimi et al. (2012), Aljamaan et al. (2013), Kumar and Rath (2015) and Kumar and Rath (2015) used the same case studies for maintainability prediction. They considered the performance parameter MMRE to compare the performance of maintainability prediction models. Table 11 shows the MMRE values of the proposed work and of the work done by Zhou and Leung (2007), Koten and Gray (2006), Elish and Elish (2009), Al-Jamimi et al. (2012), Aljamaan et al. (2013), and Kumar and Rath (2015). From Table 11, it can be observed that the MMRE values are almost the same for QUES, but the proposed model obtains better performance for maintainability prediction on UIMS. The proposed technique also reduces the computational time of Kumar and Rath (2015) by approximately 50%.

Table 11 Performance based on MMRE for UIMS and QUES

8 Threats to validity

For the sake of completeness, it is necessary to indicate some of the threats to the validity of the proposed work. In the literature, some of these limitations are common to any empirical study of this kind (Malhotra and Chug 2014; Zhou and Leung 2007; Koten and Gray 2006; Elish and Elish 2009; Al-Jamimi et al. 2012; Chandra 2012; Aljamaan et al. 2013). Like most of the existing works, the proposed work suffers from the following threats:

  1. All the case studies are written in the Ada language. However, the models designed in this study are likely to be valid for other object-oriented programming languages. Further research can extend the models to other programming paradigms as well.

  2. In this work, only the ten source code metrics described in Sect. 3.3 are considered for developing the maintainability prediction models. Other static source code metrics used for object-oriented software can be further considered for maintainability prediction.

  3. A number of psychological factors also affect the reliability of software. In this study, factors such as the development history of the system, its stakeholders, the differing levels of expertise of the developers, and the standards to which the software was developed are not considered.

9 Conclusion

In this paper, an effort has been made to design a model for predicting the maintainability of object-oriented systems by considering software metrics as input. The Neuro-Fuzzy approach, a hybrid of neural networks and fuzzy logic, is used to develop the maintainability prediction model for two commercial object-oriented software systems. The training-data-parallelism concept was used to accelerate the training (learning) procedure of the Neuro-Fuzzy approach, and the effect of varying the number of computing nodes was explored. The performance of the Neuro-Fuzzy model was assessed and compared with that of the work carried out by several researchers (Kumar and Rath 2015; Zhou and Leung 2007; Koten and Gray 2006; Elish and Elish 2009; Al-Jamimi et al. 2012; Aljamaan et al. 2013; Kumar and Rath 2015). The results show that the Neuro-Fuzzy model can effectively predict the maintainability of object-oriented software systems. The analysis also shows that the training time is reduced by 21.14% on average for QUES and by 18.70% for UIMS when the number of computing nodes is increased. Our results further suggest that the subsets of metrics selected using the feature selection techniques, i.e., PCA and RSA, predict maintainability with higher accuracy.

In this work, the analysis is based on two commercial software datasets, UIMS and QUES, both developed in a single language, i.e., Ada. Future work may extend this study to other development paradigms. Another interesting direction for future work is to improve the performance of the prediction models by coupling the neural network with other techniques such as the clonal selection algorithm and particle swarm optimization.