1 Introduction

One of the long-standing challenges faced by software project managers is predicting software development effort and cost (Boehm 1981). Accurate and reliable software effort estimation is the foundation of successful project management. Generally, software project managers need sufficient information regarding resource distribution to make correct decisions at early development stages (Conte et al. 1986). Allocating appropriate resources and planning reasonable schedules based on the effort estimate then become necessary. Furthermore, the software development process usually includes debugging and testing phases. Without foresight of development effort, projects may fall short of the level of quality demanded and may even fail. For example, underestimating the effort needed for software development may leave insufficient time for the product to be tested and consequently force programmers to sacrifice software quality.

On the other hand, software quality cannot be viewed in isolation. In the past, Boehm's advanced cost model was usually tied to a quality model. A report based on a sample of 63 completed projects showed that a reduction in overall costs and improved productivity can come from applying formal methods or measurement activities (Boehm 1981; Boehm et al. 1995). In addition, a recent survey by Agrawal and Chari (2007) found that many CMM level 5 projects incorporate estimation methods to determine software effort, quality control, and cycle time in the software development process. On average, these estimation methods predicted effort and cycle time to within about 12% of the actual values, and defects to within about 49%. From this evidence, we can note that an accurate and reliable effort estimation technique is important to both the software development process and software quality assurance (Ejiogu 2005; Fenton and Pfleeger 1998).

Many studies have been conducted to investigate different kinds of effort estimation techniques (Jørgensen and Shepperd 2007). Expert judgment, algorithmic models, and similarity-based methods are the main categories of software effort prediction (Boehm 1981; Conte et al. 1986). Generally, a similarity-based method relies on a similarity comparison (usually Euclidean distance) between project features and software development effort (Marir and Watson 1994; Shepperd and Schofield 1997). The nearest neighbour algorithm is generally used to find the most similar project. However, similarity-based methods still have some drawbacks in practice, and many studies have aimed to improve their estimation performance (Chiu and Huang 2007; Jørgensen et al. 2003; Leung 2002; Li et al. 2007a).

In recent years, grey relational analysis (GRA), one of the similarity-based methods, has been used extensively in many scientific fields (Liu and Lin 2006). Nevertheless, GRA has rarely been applied to estimating software development effort. The similarity measure of GRA relates the distance between project features to the maximal and minimal distance differences (Deng 1989; Song et al. 2005; Wen et al. 2006). We find that most of the existing GRA-based software effort estimation methods only adopt nonweighted (or equally weighted) similarity of project features (Hsu and Huang 2006; Li et al. 2007a; Song et al. 2005). In fact, the relevant features should be given more influential and significant weights in similarity computations; weighting all features equally may degrade the similarity computations (Auer et al. 2006; Huang and Chiu 2006; Keung and Kitchenham 2007). By contrast, improper weights assigned to irrelevant features may bias the similarity determination and thereby affect the estimation performance (Li and Ruhe 2006; Li et al. 2009b). As a result, how to appropriately determine the weight for each feature becomes a research problem when using weighted GRA.

Therefore, in this paper, we propose six weighted methods to be integrated into the conventional GRA. Numerical examples based on four datasets and several comparative criteria are used to demonstrate the estimation performance. Furthermore, a sensitivity analysis between the parametric settings and the analogous numbers is discussed, and other estimation techniques and published results are then used as comparisons. Finally, we also present some guidelines and management metrics for using weighted GRAs. The following propositions are addressed in this paper. (1) The weighted alterations introduced into the GRA methods may improve the estimation accuracy and reliability. (2) The different parametric settings and analogous numbers of the GRA methods may be factors that affect the predicted results. (3) Weighted GRA may be an alternative and feasible method for predicting software development effort in the software development life cycle.

The remainder of this paper is structured as follows. In Sect. 2, we provide a survey of software effort estimation and basic concepts of the GRA method. After that, the proposed methods and experimental procedures are presented in Sect. 3. The explorative studies and numerical results will be demonstrated by the comparative criteria and sensitivity analysis in Sect. 4. Finally, a concluding discussion is described in Sect. 5.

2 Literature review

2.1 Software effort estimation survey

Over the past three decades, a variety of techniques have been proposed in the field of software development effort estimation (Jørgensen and Shepperd 2007). To begin with, Boehm presented two parametric software effort models: the constructive cost model (COCOMO I) (1981) and COCOMO II (1995), both of which are widely applied in practice (Benediktsson et al. 2003; Li et al. 2007b). The effort multipliers of COCOMO I and COCOMO II are used to capture characteristics of the software development that affect the effort to complete the project. If developers want to estimate the effort of a specific software project, they need to carefully examine their development process and assign an appropriate rating to each effort multiplier.

Subsequently, similarity-based methods, such as case-based reasoning (CBR) (Marir and Watson 1994; Mendes et al. 2002a) or analogy (Shepperd and Schofield 1997), were developed to estimate effort from similar historical projects. Several studies then tried to improve the performance of analogy by adjusting its similarity measure (Chiu and Huang 2007; Li et al. 2007a). Later, attention turned to software effort estimation techniques beyond the parametric models, including neural networks (NN), classification and regression trees (CART) (Srinivasan and Fisher 1995), genetic algorithms (GA) (Huang and Chiu 2006), and fuzzy theory (Lima Júnior et al. 2003). These studies show that software effort estimation is an important issue in the software development process.

Although GRA has been used in many other areas, it has hardly been applied to software effort estimation. Song et al. (2005) first introduced a GRA-based method called GRACE to predict software development effort and reported promising results. Later, Li et al. (2007a) adopted Song's method as a comparative method for evaluating the accuracy of different techniques. In our previous work (Hsu and Huang 2006; Hsu and Huang 2007), we also proposed an improved grey method to enhance the predicted results, but the parametric settings, analogous numbers, and sensitivity analyses were not fully considered. On the other hand, most of the existing GRA-based methods only adopt nonweighted (or equally weighted) similarity of project features. In fact, nonweighted similarity measures may cause biased predictions, because each project feature may have a different degree of relevance to the development effort. All these problems motivated our continuing research on GRA-based methods.

More recently, some studies have improved the traditional similarity-based methods by attaching weights to project features for similarity computation. Mendes et al. (2002a, b) developed a weighted CBR for estimating web hypermedia development effort and compared it with several regression methods. They claimed that using the weighted CBR at the implementation stage to predict hypermedia development effort was more accurate than using the nonweighted CBR. Li and Ruhe (2006) evaluated weighted heuristics in the analogy method; the results indicated that the weighted heuristic produced better effort estimates than the equally weighted heuristic. Huang and Chiu (2006) integrated GA into analogy to determine the weighted similarity of software effort features and suggested that weighted analogy is a feasible approach to improve the accuracy of software effort estimates. Auer et al. (2006) proposed a brute-force approach applied to analogy to determine the optimal weights for each project feature. Li et al. (2009b) combined a project selection technique and feature weighting with analogy to improve the prediction performance. These studies provide a good basis for introducing weights into similarity-based approaches.

2.2 Conventional GRA

Deng introduced grey theory in 1982, and it has since been applied to a wide range of applications (Deng 1982; Liu and Lin 2006). In grey theory, the greyness of a system is absolute, whereas the fuzziness of a system is relative. "Grey" refers to information between "black" and "white" (Deng 1989): "black" means that the required information is totally unknown or unclear, while "white" indicates that the required information is fully explored. Incomplete information and limited data availability bring great difficulties to system analysis. Grey theory provides a helpful mechanism for seeking the intrinsic information of a system without assuming a specific relationship (Song et al. 2005). Since most software projects have incomplete information and uncertain relations between project features and required development effort, grey theory may be suitable to be introduced into software effort estimation.

GRA is a quantitative technique that can be used to analyse the similarity among objects (e.g., software projects). This similarity is a measure of the relative distance between pairs of object features. According to this definition, if the basic relationship between the features of two objects is close, their similarity will be high (Deng 2000). Before introducing the detailed computations of the GRA method, a matrix of multi-index sequences should be defined:

$$ X = \begin{bmatrix} X_{1} \\ X_{2} \\ \vdots \\ X_{i} \end{bmatrix} = \begin{bmatrix} X_{1}(1) & X_{1}(2) & \ldots & X_{1}(k) & X_{1}(Dep) \\ X_{2}(1) & X_{2}(2) & \ldots & X_{2}(k) & X_{2}(Dep) \\ \vdots & \vdots & & \vdots & \vdots \\ X_{i}(1) & X_{i}(2) & \ldots & X_{i}(k) & X_{i}(Dep) \end{bmatrix} = \begin{bmatrix} X(1) & X(2) & \ldots & X(k) & X(Dep) \end{bmatrix}, $$
(1)

where i = 1, 2, …, N, and N is the total number of projects; k = 1, 2, …, M, and M is the total number of features of a project. Each sequence X_i represents a project consisting of M features, and each X(k) represents the kth feature of dataset X. These features can be numeric or categorical values. X_i(Dep) stands for a dependent variable that denotes the known effort of the ith project.

Next, a new sequence, whose development effort we want to predict, is regarded as the observed project and is compared for similarity with the other projects. The other sequences in dataset X are taken as comparative projects, whose known efforts serve as the basis for deriving an estimated effort for the observed project. The degree of similarity can be calculated by comparing these two kinds of sequences:

$$ X_{0} = \begin{bmatrix} X_{0}(1) & X_{0}(2) & \ldots & X_{0}(k) \end{bmatrix}, $$
(2)

and

$$ X_{i} = \begin{bmatrix} X_{i}(1) & X_{i}(2) & \ldots & X_{i}(k) \end{bmatrix}, $$
(3)

where X_0 is the observed project with k features, and each X_i is a comparative project. The similarity measure between the features of the observed project and those of a comparative project is defined as the grey relational coefficient (GRC) (Deng 2000; Wen et al. 2006):

$$ \gamma \left( X_{0}(k), X_{i}(k) \right) = \frac{\min \Delta_{0i} + \zeta \max \Delta_{0i}}{\Delta_{0i}(k) + \zeta \max \Delta_{0i}}, $$
(4)

where

$$ \Delta_{0i}(k) = \begin{cases} \left| X_{0}(k) - X_{i}(k) \right|, & \text{if } X_{0}(k) \text{ and } X_{i}(k) \text{ are numerical} \\ 1, & \text{if } X_{0}(k) \text{ and } X_{i}(k) \text{ are categorical and } X_{0}(k) \ne X_{i}(k) \\ 0, & \text{if } X_{0}(k) \text{ and } X_{i}(k) \text{ are categorical and } X_{0}(k) = X_{i}(k) \end{cases} $$
(5)

and

$$ \min \Delta_{0i} = \min_{\forall i} \min_{\forall k} \left| X_{0}(k) - X_{i}(k) \right|, $$
(6)
$$ \max \Delta_{0i} = \max_{\forall i} \max_{\forall k} \left| X_{0}(k) - X_{i}(k) \right|. $$
(7)

Notice that ζ stands for a distinguishing coefficient whose value is restricted to the range from 0 to 1. In Eq. (4), the GRC takes both the global maximum difference and the global minimum difference into account. Thus, this similarity can be seen as a measurement distinct from those of traditional similarity-based methods.

Finally, the grey relational grade (GRG) between the observed project X 0 and the comparative project X i can be quantified by giving an average value of the GRCs as follows (Liu and Lin 2006):

$$ \Gamma_{0i} = \frac{1}{M} \sum_{k=1}^{M} \gamma \left( X_{0}(k), X_{i}(k) \right). $$
(8)

The GRG value can be treated as a comparable basis and applied to similarity judgment. For instance, if the similarity order is Γ_{0,a} > Γ_{0,b}, the comparative project X_a is closer to the observed project X_0 than the project X_b is.
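
To make Eqs. (4)–(8) concrete, the following minimal sketch (in Python, with hypothetical function and variable names) computes the nonweighted GRG of one observed project against a set of comparative projects. It assumes all features are numerical and already normalized to [0, 1]; categorical features would require the Δ of Eq. (5) instead of the absolute difference.

```python
import numpy as np

def grey_relational_grade(x0, X, zeta=0.5):
    """GRG of Eq. (8) between an observed project x0 and each comparative
    project in X, using the GRC of Eq. (4) with equal feature weights.

    x0   : array of shape (M,)   -- normalized features of the observed project
    X    : array of shape (N, M) -- normalized features of the comparative projects
    zeta : distinguishing coefficient, restricted to (0, 1]
    """
    delta = np.abs(X - x0)                    # Eq. (5), numerical features only
    d_min, d_max = delta.min(), delta.max()   # Eqs. (6) and (7): global extrema
    grc = (d_min + zeta * d_max) / (delta + zeta * d_max)   # Eq. (4)
    return grc.mean(axis=1)                   # Eq. (8): average over the M features

# toy example: three comparative projects with two features each
X = np.array([[0.2, 0.4], [0.8, 0.9], [0.1, 0.3]])
x0 = np.array([0.15, 0.35])
print(grey_relational_grade(x0, X))           # the largest grade marks the most similar project
```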

3 Weighted GRA

3.1 Weighted GRA

The relative importance between the project features and development effort should be considered within the similarity measure. When the weighted similarity of the GRA method is taken into account, Eq. (8) can be modified as follows:

$$ \Gamma_{0i} = \sum_{k=1}^{M} \beta_{k} \, \gamma \left( X_{0}(k), X_{i}(k) \right), $$
(9)

where

$$ \sum_{k=1}^{M} \beta_{k} = 1. $$
(10)

Notice that β_k is a stationary weight given to the kth feature. Because the relationship between project features and development effort is still an open issue (Dolado 2001), applying a weighted GRA poses the difficult problem of determining an appropriate weight for each feature. Previous studies (Jørgensen and Shepperd 2007) suggest that human judgment or expert opinion could be one way to assign feature weights. However, experts may be reluctant to set the weights manually because of the additional effort required to analyse project features, and expert opinion is somewhat subjective. Therefore, we propose six weighted methods based on statistical techniques, as follows.

3.1.1 Nonweight (or equal weight)

In the general case, the nonweight or equal weight can be defined as:

$$ \beta_{k} = \frac{1}{M}, $$
(11)

where M is the total number of features, meaning that each feature has an equal impact on similarity computations. Obviously, Eq. (8) is the special case of Eq. (9) obtained by using the equal weights of Eq. (11). This method is used as the baseline method in our experiments.

3.1.2 Distance-based weight

The distance measurement captures the dissimilarity between each feature and the dependent variable (i.e., the known effort) (Freedman et al. 1997). The distance-based weight can be defined as:

$$ \beta_{k} = \frac{1/\text{Distance}(k)}{\sum\nolimits_{k=1}^{M} 1/\text{Distance}(k)}, $$
(12)

where

$$ \text{Distance}(k) = \sqrt{\sum_{i=1}^{N} \left( X_{i}(k) - X_{i}(Dep) \right)^{2}}. $$
(13)

Equation (13) is a kind of Euclidean distance (Marir and Watson 1994). Accordingly, features with a close distance to the dependent variable should be assigned a higher weight.
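
As a rough illustration of Eqs. (12) and (13), the following sketch (hypothetical names; features and known effort assumed to be normalized to a comparable scale) computes the distance-based weights:

```python
import numpy as np

def distance_based_weights(F, dep):
    """Distance-based weights of Eqs. (12) and (13).

    F   : array of shape (N, M) -- normalized feature values of N projects
    dep : array of shape (N,)   -- normalized known effort of the same projects
    """
    dist = np.sqrt(((F - dep[:, None]) ** 2).sum(axis=0))   # Eq. (13): one distance per feature
    inv = 1.0 / dist                                         # closer features get larger weights
    return inv / inv.sum()                                   # Eq. (12): weights sum to 1
```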

3.1.3 Correlative weight

Similarly, we can also use correlation analysis to determine feature weights. The correlation coefficient measures the strength of the relationship between two variables. If a feature's correlation coefficient with the dependent variable is large, the feature and the dependent variable exhibit a strong relationship. The correlative weight is defined as:

$$ \beta_{k} = \frac{\left| \text{Correlation}(k) \right|}{\sum\nolimits_{k=1}^{M} \left| \text{Correlation}(k) \right|}, $$
(14)

where Correlation(k) denotes a Pearson correlation coefficient between the kth feature and the dependent variable (Hogg and Craig 1995). If a correlation coefficient is negative, the absolute value is taken.
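
A corresponding sketch for the correlative weight of Eq. (14) (again with hypothetical names) might look as follows:

```python
import numpy as np

def correlative_weights(F, dep):
    """Correlative weights of Eq. (14): absolute Pearson correlation of each
    feature with the known effort, normalized so the weights sum to 1."""
    corr = np.array([np.corrcoef(F[:, k], dep)[0, 1] for k in range(F.shape[1])])
    abs_corr = np.abs(corr)          # negative correlations contribute their absolute value
    return abs_corr / abs_corr.sum()
```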

3.1.4 Linear weight

The linear weight assumes that there is a linear relationship between the dependent variable and the independent variables (i.e., project features). The linear function can be defined as follows (Dolado 2001; Huang and Chiu 2006; Jørgensen et al. 2003):

$$ X(Dep) = \sum_{k=1}^{M} a_{k} X(k) + c, $$
(15)
$$ \beta_{k} = \frac{\left| a_{k} \right|}{\sum\nolimits_{k=1}^{M} \left| a_{k} \right|}, $$
(16)

where a_k is the coefficient corresponding to the kth feature and c is a constant. The coefficients a_k describe how the independent variables jointly affect the dependent variable in a linear manner. Thus, they reflect each feature's degree of relevance to effort and can be translated into the corresponding weight β_k of the kth feature.
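
One plausible way to obtain these coefficients is an ordinary least-squares fit, as in the following sketch (hypothetical names):

```python
import numpy as np

def linear_weights(F, dep):
    """Linear weights of Eqs. (15) and (16): fit X(Dep) = sum_k a_k X(k) + c by
    least squares and normalize the absolute coefficients."""
    A = np.column_stack([F, np.ones(len(F))])        # append the constant term c
    coef, *_ = np.linalg.lstsq(A, dep, rcond=None)   # solve for a_1 .. a_M and c
    a = np.abs(coef[:-1])                            # drop c and take |a_k|
    return a / a.sum()                               # Eq. (16)
```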

3.1.5 Nonlinear weight

By contrast, the functional form between the software development effort and project features may also be assumed to be a nonlinear relationship (Dolado 2001; Huang and Chiu 2006; Jørgensen et al. 2003). The nonlinear function can be defined as follows:

$$ X(Dep) = \sum_{k=1}^{M} a_{k} X(k)^{b_{k}} + c, $$
(17)

where b_k is an exponent of the kth feature. This nonlinear relationship adjusts the independent variables more dramatically than the linear relationship does. The coefficients a_k can also be translated into corresponding weights β_k, as in Eq. (16). Note that the coefficients of the linear and nonlinear equations can be solved computationally (Freedman et al. 1997; Hogg and Craig 1995).
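
One way to fit Eq. (17) is generic curve fitting, as in the sketch below (hypothetical names). It assumes the features are normalized to [0, 1] and clips them away from zero so the power terms stay well defined; the fit may be sensitive to the starting values.

```python
import numpy as np
from scipy.optimize import curve_fit

def nonlinear_weights(F, dep):
    """Nonlinear weights from Eq. (17): fit X(Dep) = sum_k a_k X(k)^{b_k} + c and
    normalize the absolute a_k as in Eq. (16)."""
    F = np.clip(F, 1e-6, None)      # avoid 0 raised to a negative power during the search
    N, M = F.shape

    def model(X, *p):
        a, b, c = np.array(p[:M]), np.array(p[M:2 * M]), p[-1]
        return (a * X.T ** b).sum(axis=1) + c

    p0 = np.concatenate([np.ones(M), np.ones(M), [0.0]])   # neutral starting point
    params, _ = curve_fit(model, F.T, dep, p0=p0, maxfev=10000)
    a = np.abs(params[:M])
    return a / a.sum()              # Eq. (16) applied to the nonlinear coefficients
```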

3.1.6 Maximal weight

In an extreme case, we may consider only the feature with the maximum similarity and assign it the entire weight; the weights of all other features, whose similarities are smaller than the maximum, are set to zero. This weight assignment follows the assumption that features with minor similarity may not greatly affect the retrieval of the most similar case. The maximal weight can be defined as:

$$ \beta_{k} = \begin{cases} 1, & \text{if } \gamma \left( X_{0}(k), X_{i}(k) \right) = \max \gamma \left( X_{0}(k), X_{i}(k) \right) \text{ and } X_{i}(k) \text{ is numeric} \\ 0, & \text{otherwise} \end{cases} $$
(18)

where

$$ \max \gamma \left( X_{0}(k), X_{i}(k) \right) = \max_{\forall i} \max_{\forall k} \gamma \left( X_{0}(k), X_{i}(k) \right). $$
(19)

Consequently, this weighted method effectively reduces the number of features used for similarity to one. This method is also treated as a comparative method in the following experiments.
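
For completeness, a small sketch of this maximal-weight rule (Eqs. 18 and 19) is given below, assuming all features are numeric and that the GRC values of one observed project against all comparative projects are already available (hypothetical names):

```python
import numpy as np

def maximal_weights(grc_matrix):
    """Maximal weight of Eqs. (18) and (19): put the whole weight on the single
    feature that attains the globally maximal GRC.

    grc_matrix : array of shape (N, M) -- GRC of the observed project against
                 every comparative project, feature by feature
    """
    _, k_max = np.unravel_index(np.argmax(grc_matrix), grc_matrix.shape)
    beta = np.zeros(grc_matrix.shape[1])
    beta[k_max] = 1.0               # Eq. (18): all other features get zero weight
    return beta
```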

3.2 Implementation

In order to evaluate the weighted GRAs, the experimental procedure is illustrated in Fig. 1.

Fig. 1 Experimental procedure

The description of each step is presented as follows:

  • Step 1 Collect data from past projects in order to estimate the effort for development in the new project. In this paper, we use public datasets and choose the first project to start the procedure.

  • Step 2 Set the chosen project as an observed project.

  • Step 3 Normalize project sequences to range from 0 to 1 (Hsu and Huang 2006; Liu and Lin 2006; Song et al. 2005).

  • Step 4 Calculate GRC between the comparative projects and the observed project and set the distinguishing coefficient.

  • Step 5 Assign an appropriate weight to each project feature and rank the most similar projects from the dataset.

  • Step 6 Choose a suitable number of analogous projects to predict the development effort for the observed project. If all projects in the dataset have been estimated, this procedure goes to Step 7. Otherwise, the procedure will choose the next project and go back to Step 2.

  • Step 7 Evaluate the estimated performance by using comparison criteria.

As shown in Fig. 1, the experimental procedure is an iterative loop. Consider N projects in the dataset. In each loop, one project is chosen from the dataset as the observed project, which serves as a testing example whose software development effort is to be estimated. The other N − 1 projects are treated as the training base (i.e., comparative projects) for the similarity comparison. This procedure runs N times until the last project in the dataset has been estimated. This validation scheme is known as "leave-one-out cross-validation" (also called jackknife validation) (Huang and Chiu 2006; Shepperd and Schofield 1997; Song et al. 2005).
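
The whole procedure of Fig. 1 can be summarized in a short leave-one-out loop. The sketch below (hypothetical names) uses min-max normalization for Step 3, the weighted GRG of Eq. (9) for Steps 4 and 5, and the mean-effort adaptation of Eq. (20) for Step 6; any of the six weighting schemes can be plugged in through the `weights` vector.

```python
import numpy as np

def normalize(F):
    """Min-max normalize each feature column to [0, 1] (Step 3)."""
    lo, hi = F.min(axis=0), F.max(axis=0)
    return (F - lo) / np.where(hi > lo, hi - lo, 1.0)

def loocv_estimates(features, effort, weights, S=3, zeta=0.5):
    """Leave-one-out estimation (Steps 2-6): each project in turn is the observed
    project and the remaining N-1 projects form the training base.

    weights : vector of beta_k summing to 1 (any of the six weighting schemes)
    S       : number of analogous projects used for effort adaptation
    """
    F = normalize(features)
    N = len(F)
    estimates = np.empty(N)
    for i in range(N):
        train = np.delete(np.arange(N), i)
        delta = np.abs(F[train] - F[i])
        d_min, d_max = delta.min(), delta.max()
        grc = (d_min + zeta * d_max) / (delta + zeta * d_max)   # Eq. (4)
        grg = grc @ weights                                      # Eq. (9)
        nearest = train[np.argsort(grg)[::-1][:S]]               # S most similar projects
        estimates[i] = effort[nearest].mean()                    # Eq. (20)
    return estimates
```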

Several settings for the experiments need to be discussed: (1) datasets and feature selection; (2) the similarity measure and distinguishing coefficient; (3) effort adaptation and the number of analogous projects; (4) evaluation criteria and statistical tests.

3.2.1 Datasets and feature selection

Four real datasets were used to conduct the experiments. The detailed information of these datasets is shown in Table 1. These datasets are widely used as a comparative standard in many other studies (Huang and Chiu 2006; Jeffery et al. 2000; Li and Ruhe 2006; Liu et al. 2008; Mendes et al. 2005; Samson et al. 1997; Shepperd and Schofield 1997; Srinivasan and Fisher 1995).

Table 1 Datasets information

The raw data of the ISBSG repository originally contained 2027 projects (ISBSG 2006; Liu et al. 2008). Therefore, a suitable project subset is derived with the following selection criteria. Projects with a data quality rating of "A" or "B" (as assigned by the ISBSG reviewers) are selected, and the count approach is restricted to the "IFPUG" standard. In addition, the implementation date after the year 2000 is filtered out. Finally, projects that have missing values on the selected features are excluded, and two outlying observations are removed. This preprocessing results in 127 projects in the ISBSG dataset.

Feature selection is of primary importance for software effort estimation (Cuadrado-Gallego et al. 2006); its purpose is to find the features that have a significant impact on development effort. According to Conte's study (1986), effort factors can generally be classified into four categories: people, process, product, and computer. Additionally, Boehm (1981) delineates several cost drivers, which can be grouped into four categories: product, computer, personnel, and project attributes. Following these classifications, we tentatively select six candidate features for the Kemerer dataset, nine features for the Desharnais dataset, and eight features for the COCOMO and ISBSG datasets. We then use statistical methods to analyse these candidates and select the most representative features, listed in Table 2. Note that the categorical features are analysed with one-way ANOVA (Freedman et al. 1997) and the ratio features with Pearson's correlation test (Hogg and Craig 1995).

Table 2 Selected features for each dataset

3.2.2 Similarity measure and distinguishing coefficient

The similarity measure is used to measure the degree of similarity between projects. After normalization, Eqs. (9) and (10) together with the weighted methods of Eqs. (11)–(19) are adopted to compute similarity. However, the distinguishing coefficient in Eq. (4) can decrease the effect of max Δ_{0i} and thus change the magnitude of the GRC significantly. Therefore, this coefficient should be carefully determined in advance (Liu and Lin 2006; Song et al. 2005; Wen et al. 2006). In the later experiments, we use a sensitivity analysis to adjust the distinguishing coefficient and observe its influence on estimation accuracy.

3.2.3 Effort adaptation and number of analogous projects

From the similarity order, the development effort of an observed project can be estimated based on the known effort of the most similar projects. Here, we adopt the mean value of the closest projects to derive an estimated effort, which is given as (Huang and Chiu 2006; Mendes et al. 2002a, b):

$$ E_{X_{0}}^{*} = \frac{1}{S} \sum_{X_{i} \in \text{similarity order}}^{S} E_{X_{i}}, $$
(20)

where the X_i are the projects selected from the similarity order; S is the number of analogous projects we decide to choose; E^{*}_{X_0} is the predicted effort of the observed project X_0; and E_{X_i} is the known effort of each comparative project X_i.

In Eq. (20), a difficult problem arises concerning the decision regarding how many analogous projects to use when generating the predicted effort. Essentially, selecting suitable analogies in the similarity-based methods may be a factor that affects the predicted results (Hsu and Huang 2006; Li et al. 2007a; Song et al. 2005; Walkerden and Jeffery 1999). However, most studies only investigate a small range of analogous projects. In later experiments, we will also consider this factor and use sensitivity analysis to explore its influence on estimation accuracy.

3.2.4 Evaluation criteria and statistical tests

In an attempt to examine the experimental results, several predefined criteria are depicted as follows (Conte et al. 1986). A common criterion for evaluating the accuracy of effort estimations is defined in terms of the Mean Magnitude of Relative Error (MMRE or Mean MRE):

$$ \text{MMRE} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| E_{X_{i}} - E_{X_{i}}^{*} \right|}{E_{X_{i}}}, $$
(21)

where N is the total number of observations and E^{*}_{X_i} is the prediction of the known effort E_{X_i}. In general, MMRE measures the average relative error over a group of estimates.

Another commonly used criterion is the Prediction (PRED) threshold, which can be defined as follows:

$$ {\text{PRED}}(l) = \frac{k}{N}, $$
(22)

where k is the number of observations whose MRE is less than or equal to the level l. This criterion may not precisely show the improvement of estimates: if the level l is larger, the criterion becomes more sensitive to accuracy improvements, but less confidence in the accuracy estimates is obtained (Korte and Port 2008; Port and Korte 2008). In our experiments, we use the level l = 0.25, because it allows an easy comparison with published results of other models (Song et al. 2005; Srinivasan and Fisher 1995).
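
Both criteria are straightforward to compute once the leave-one-out estimates are available; a minimal sketch (hypothetical names) is:

```python
import numpy as np

def mmre(actual, predicted):
    """Mean Magnitude of Relative Error, Eq. (21)."""
    mre = np.abs(actual - predicted) / actual
    return mre.mean()

def pred(actual, predicted, level=0.25):
    """PRED(l), Eq. (22): fraction of estimates whose MRE is at most `level`."""
    mre = np.abs(actual - predicted) / actual
    return (mre <= level).mean()
```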

However, MMRE and PRED may on occasions appear to give inconsistent results (Foss et al. 2003). For this reason, the variance of relative error and the boxplot of MRE can be complementarily used to evaluate the performance of prediction models (Auer et al. 2006; Chiu and Huang 2007; Jeffery et al. 2001; Li et al. 2009a). The variance can be used as a measure of the estimation method’s reliability, whereas the boxplot is a type of graph that is used to display the shape of MRE distribution, central value, outlier values, extreme values, quartiles, and interquartile ranges. It is noted that the outlier denotes the value between 1.5 and 3 box lengths from the upper or lower edge of the box, and the extreme denotes a value more than 3 box lengths from the upper or lower edge of the box, where the box length is the interquartile range (Freedman et al. 1997). In general, a boxplot with a small box length or fewer outliers and extreme values usually has a reliable prediction capacity.

For determining the statistical significance among the methods, we further apply a sign test and confidence interval. Since the MRE values are positively skewed and nonnormal, a nonparametric method called Wilcoxon Signed Rank Sum Test (Briand et al. 2000; Jeffery et al. 2000, 2001) is used to conduct the sign test. On the other hand, the confidence interval of PRED can be determined by the standard error (Korte and Port 2008; Port and Korte 2008):

$$ \text{SE}_{\text{PRED}(0.25)} \approx \frac{\text{SD}_{\text{MRE} \le 0.25}}{\sqrt{N}}, $$
(23)

where SE denotes the standard error, and SD denotes the standard deviation whose MREs are less than or equal to 0.25. For both the sign test and the confidence interval, the significance level is set at α = 0.05.
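
The sketch below illustrates one way these two checks could be carried out with SciPy (hypothetical names). Note that Eq. (23) does not fully specify which standard deviation is meant; the sketch assumes it is the standard deviation of the indicator MRE ≤ 0.25, which is one plausible reading.

```python
import numpy as np
from scipy import stats

def compare_methods(actual, pred_a, pred_b, level=0.25, alpha=0.05):
    """Wilcoxon signed rank test on the paired MREs of two methods, plus an
    approximate confidence interval for PRED(0.25) of the first method."""
    mre_a = np.abs(actual - pred_a) / actual
    mre_b = np.abs(actual - pred_b) / actual
    _, p_value = stats.wilcoxon(mre_a, mre_b)        # paired, nonparametric sign test

    hits = (mre_a <= level).astype(float)            # indicator of MRE <= 0.25
    se = hits.std(ddof=1) / np.sqrt(len(hits))       # Eq. (23), under the stated assumption
    z = stats.norm.ppf(1 - alpha / 2)                # 1.96 for alpha = 0.05
    ci = (hits.mean() - z * se, hits.mean() + z * se)
    return p_value, ci
```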

4 Experiments and discussions

In the following, two experiments are used to demonstrate the estimated performance. The first experiment investigates the sensitivity analysis between distinguishing coefficients and the number of analogous projects. The second experiment compares the weighted GRAs with other estimation methods and published results. Tabular and graphic results using the four datasets are presented.

4.1 Experiment 1: sensitivity analysis between distinguishing coefficients and analogous numbers

The first experiment can further be divided into two parts. First, a pilot experiment is used to separately observe the effect of distinguishing coefficients and analogous numbers on estimated accuracy. Second, a sensitivity analysis of changing both the distinguishing coefficients and the analogous numbers is investigated.

4.1.1 Comparison of accuracy with distinguishing coefficients and analogous numbers

In this experiment, the distinguishing coefficients are increased by an increment of 0.1. A summary of MMRE against four datasets is shown in Table 3.

Table 3 MMRE criterion of increasing distinguishing coefficients

From Table 3, there is a downward tendency in the MMRE criterion when the value of the distinguishing coefficient increases from 0.1 to 0.5. Thereafter, the values of MMRE become stable in the four datasets. We find that the most accurate MMRE values lie in the range from 0.5 to 1 in three out of four datasets. Hence, the effect of choosing an appropriate distinguishing coefficient should be considered in GRA methods.

Next, a pilot experiment with different analogous numbers is performed to observe the effect on estimated accuracy. To begin with, we decide to choose analogous numbers “1”, “3”, and “5” as a small scale experiment, similar to some past studies (Jeffery et al. 2001; Mendes et al. 2002a, b). The estimated accuracy of the four datasets is shown in Table 4, and Figs. 2, 3, 4 and 5.

Table 4 Accuracy of analogous numbers
Fig. 2 Boxplot of MRE with Kemerer dataset

Fig. 3 Boxplot of MRE with COCOMO dataset

Fig. 4 Boxplot of MRE with Desharnais dataset

Fig. 5 Boxplot of MRE with ISBSG dataset

In Table 4, the results show an accuracy improvement when the analogous number increases from "1" to "3" in terms of the MMRE criterion for the Kemerer and COCOMO datasets, and from "1" to "5" in terms of the MMRE and PRED criteria for the ISBSG dataset; for the Desharnais dataset, however, the analogous number "1" gives the most accurate MMRE and PRED. Additionally, the boxplots in Figs. 2, 3, 4 and 5 indicate that the analogous number "3" presents the least variation in the MRE distribution for the Kemerer and COCOMO datasets, the analogous number "1" has the smallest outlier and extreme MRE values for the Desharnais dataset, and the analogous number "5" shows the most accurate MREs for the ISBSG dataset. Hence, we can also observe that different analogous numbers may have a great impact on the estimation accuracy.

4.1.2 Sensitivity analysis between analogous numbers and distinguishing coefficients

In the following, a sensitivity analysis is performed by increasing both the distinguishing coefficient and the analogous number. The graphic results are shown in Figs. 6, 7, 8, 9, 10, 11, 12 and 13 with respect to four datasets. In the end, the distribution of accuracies for these figures is illustrated in Table 5.

Fig. 6 MMRE of Kemerer dataset

Fig. 7 PRED of Kemerer dataset

Fig. 8 MMRE of COCOMO dataset

Fig. 9 PRED of COCOMO dataset

Fig. 10 MMRE of Desharnais dataset

Fig. 11 PRED of Desharnais dataset

Fig. 12 MMRE of ISBSG dataset

Fig. 13 PRED of ISBSG dataset

Table 5 Distribution of accuracies over four datasets

From Figs. 6 and 7, the most precise MMRE is located at the analogous number “3”, and the most precise PRED is found at the analogous number “1”. If we only look at the different values of distinguishing coefficients, the MMRE and PRED values seem not to change too much with the same analogous number. This finding substantially agrees with the previous pilot experiments that the analogous number is a more influential factor than the distinguishing coefficient. In Figs. 8 and 9, the analogous number “4” is the most accurate, but the results with PRED fluctuate. However, we can still roughly see that PRED with the analogous numbers “18” to “36” seems to be the most accurate. Similarly, in Figs. 10 and 11, the most accurate area of MMRE and PRED lies in the analogous numbers from “1” to “3”, and from Figs. 12 and 13 the best value of MMRE and PRED is located around the analogous number “25”.

In short, both the distinguishing coefficient and the analogous number may have different impacts on estimation accuracy. Figures 6, 7, 8, 9, 10, 11, 12, and 13 illustrate that tuning the analogous number refines the accuracy much more than tuning the distinguishing coefficient. On the other hand, we can also see that selecting a small analogous number tends to be more accurate in terms of the MMRE and PRED criteria. Indeed, many studies also suggest that a small number of analogies yields acceptable accuracy for software effort prediction (Huang and Chiu 2006; Jeffery et al. 2000; Li et al. 2009b). For example, Song et al. (2005) chose analogous numbers "1–5" to aggregate the estimated effort for small datasets, and Li et al. (2007) adopted analogous numbers "3", "24–25", and "74–75" to predict effort for the ISBSG dataset. Consequently, we suggest that "5" may be the best number for most small datasets. As for the ISBSG dataset, the analogous number should not be greater than half of the project cases.

4.2 Experiment 2: comparison of accuracy with weighted GRA

In the second experiment, the performance of weighted GRA and nonweighted GRA is compared first. Other estimation techniques and published results are then used as comparisons.

4.2.1 Comparison of accuracy between weighted and nonweighted GRA

Six weighted GRAs, including nonweight (NW), distance-based weight (DW), correlative weight (CW), linear weight (LW), nonlinear weight (NLW), and maximal weight (MW), are used to demonstrate the performance of the proposed methods. The comparative criteria are shown in Tables 6 and 7. It is noted that comparisons are made up of four analogous numbers (“3”, “5”, “10”, and “20”) based on previous experiments. The statistical test between the nonweighted and weighted GRA is presented in Table 8. Finally, boxplots of MRE with four datasets are shown in Figs. 14, 15, 16 and 17.

Table 6 Comparison of accuracy between weighted and nonweighted GRA
Table 7 Comparison of variance between weighted and nonweighted GRA
Table 8 Wilcoxon signed rank sum test between weighted and nonweighted GRA
Fig. 14 Boxplot of MRE with Kemerer dataset (analogous number = 5)

Fig. 15 Boxplot of MRE with COCOMO dataset (analogous number = 10)

Fig. 16 Boxplot of MRE with Desharnais dataset (analogous number = 5)

Fig. 17 Boxplot of MRE with ISBSG dataset (analogous number = 20)

For the Kemerer dataset in Table 6, the CW gives the best results compared with the NW in terms of the MMRE and PRED criteria, and the other three weighted methods, LW, NLW, and MW, offer similar values on the MMRE criterion. Besides, we can find that the analogous number "3" is preferable to the number "5" in the Kemerer dataset. In Table 7, the CW also has the smallest variance of all weighted methods. Further, in Table 8 four weighted methods (CW, LW, NLW, and MW) show a significant difference from the NW GRA for both analogous numbers "3" and "5". Finally, the boxplot of Fig. 14 shows that the five weighted methods (DW, CW, LW, NLW, and MW) have a smaller box length in the MRE distribution than the NW GRA. Therefore, for the Kemerer dataset, all of these experimental results indicate that the estimates obtained by the weighted GRA, especially by the CW GRA, are more accurate and reliable than those of the NW GRA. In terms of the largest improvements, the CW improves the MMRE by 11.23% and the PRED by 66.5% over the NW.

For the COCOMO dataset in Table 6, we can clearly see that the LW gives the most accurate predictions in the MMRE and PRED criteria, and that the DW, CW, and NLW obtain close results. In particular, in Table 7 the LW GRA further improves the variance compared with the other weighted methods. In Table 8, the p-values of DW, CW, and LW provide enough evidence that they perform better than the NW for the analogous numbers "3" and "5". On the other hand, Fig. 15 shows that four weighted methods (DW, CW, LW, and NLW) have fewer extreme MRE values than the NW and MW methods. However, based on our findings, the MW gives the least accurate results of all methods in the COCOMO dataset. The basic concept of MW is to assign the closest feature a maximal weight while neglecting the effects of the other minor features. This may bias the similarity order and further influence the estimation accuracy; when the number of features increases, the information from the minor features becomes important in determining the similarity order.

Similarly, for the Desharnais dataset in Table 6, we can see that the LW is the most accurate in both the MMRE and PRED criteria, and that the CW and NLW provide values second only to the LW for all analogous numbers. Besides, in Table 7 we can see that the LW GRA greatly decreases the variance of estimates relative to the NW GRA for the analogous numbers "3", "5", and "10". In Table 8, the statistical results show that the CW, LW, and NLW GRAs provide more accurate estimates than the NW GRA. Moreover, in Fig. 16 the CW, LW, and NLW show a smaller range of MRE distribution and fewer extreme values than the other methods. As for the improvement percentages of LW in the Desharnais dataset, the MMRE and PRED criteria are improved by 48.29% and 36.22%, respectively.

For the ISBSG dataset in Table 6, four weighted methods (DW, CW, LW, and NLW) improve accuracy relative to the NW GRA, and the LW seems to offer the most accurate predictions in terms of MMRE, with the exception of PRED at the analogous number "3". Likewise, although MW provides an improvement in MMRE, the accuracy of PRED is not further increased for any analogous number. In Table 7, both the LW and NLW GRAs show a reduction in variance compared with the NW GRA for all analogous numbers. In Table 8, we find that DW, CW, LW, and NLW are significantly different from NW for most analogous numbers, but there is no significant difference for MW at the analogous numbers "5" and "20". Additionally, Fig. 17 shows that the MRE distributions of NW and MW are irregular, with a large interquartile range at the analogous number "20".

As for the improvement percentages in the ISBSG dataset, the weighted methods show only a small improvement compared with the other datasets. The reason may be related to the dataset properties. In fact, the projects in the ISBSG dataset are voluntarily provided by a broad range of industries around the world and therefore may come from different application domains (ISBSG 2006; Liu et al. 2008; Mendes et al. 2005). Consequently, the subset of the ISBSG dataset is highly heterogeneous (refer to Table 1). Some studies have shown that underlying dataset characteristics may favor or inhibit different estimation techniques (Chen et al. 2005; Jørgensen et al. 2003). Additionally, other studies show that adopting the ISBSG dataset with different estimation techniques may lead to lower accuracy (Huang and Chiu 2006; Jeffery et al. 2000; Jeffery et al. 2001).

4.2.2 Comparison of accuracy with other methods

In order to further evaluate the performance of weighted GRA, we compare four estimation methods in this experiment: analogy (Shepperd and Schofield 1997), linear regression (LR) (Freedman et al. 1997), nonlinear regression (NLR) (i.e., polynomial form) (Hogg and Craig 1995), and basic COCOMO model without effort multipliers (Boehm 1981; Boehm et al. 1995; Li et al. 2007b). From previous experiments, we choose NW GRA and LW GRA as comparisons, because NW is a baseline method and LW is stable in most datasets.

Before beginning the following experiments, each model setup is introduced. In the analogy method, all features are normalized into the interval [0, 1] for the similarity computation; the effort adaptation and analogous number then adopt the same settings as GRA. For both the linear and nonlinear regressions, all features are transformed to a natural logarithmic scale in order to approximate a normal distribution, and a linear or nonlinear equation with a constant term is used to construct the estimation model. We adopt a stepwise variable selection method to solve the regression coefficients (Freedman et al. 1997; Hogg and Craig 1995). For the COCOMO model, because three of the datasets used in our experiments do not provide information on effort multipliers, it is hard to determine the values of the effort multipliers manually. In order to compare the COCOMO model on the same basis, we decided to use the basic COCOMO model without effort multipliers (Korte and Port 2008; Port and Korte 2008). The estimated performance of these methods is presented in Tables 9 and 10, and the statistical test and confidence interval are summarized in Tables 11 and 12.

Table 9 Accuracy with other methods
Table 10 Variance with other methods
Table 11 Wilcoxon signed rank sum test between other methods
Table 12 Confidence interval of PRED with other methods

For the Kemerer dataset in Table 9, we can see that the MMRE criterion of LW GRA can be improved by 10.3% from NW GRA, 16.35% from analogy, 9.04% from the basic COCOMO model, 12.46% from LR, and 14.13% from NLR. In Table 10, the variance of LW GRA can be decreased by 40.01% from NW GRA, by 50.08% from analogy, by 7.06% from the basic COCOMO model, by 15.7% from LR, and by 28.42% from NLR, respectively. In Table 11, the LW GRA shows a significant difference from the analogy, LR, and NLR. Similarly, in Table 9 the PRED criterion of LW GRA can be increased by 66.5% from NW GRA, 66.5% from analogy, and 25.18% from the basic COCOMO model, but is the same as those of LR and NLR. From Table 12, the confidence interval of LW GRA overlaps the upper limit of LR and NLR methods, indicating an insignificant improvement in the PRED criterion. This may be related to setting the PRED threshold at 25% and the sample size of the Kemerer dataset (Korte and Port 2008; Port and Korte 2008).

Similarly, for the COCOMO and Desharnais datasets in Tables 9 and 10, the LW GRA obtains the most accurate criteria of all the comparison methods. Moreover, we can observe the maximum improvement percentages of LW in these two datasets: the MMRE criterion is improved by 66.46 and 66.82% relative to analogy and the basic COCOMO model, respectively; the PRED criterion is improved by 301.58 and 84.07% relative to LR and the basic COCOMO model, respectively; and the variance is improved by 74.37 and 90.55% relative to analogy and the basic COCOMO model, respectively. Also, from Table 11 the p-value of LW GRA is statistically significant when compared with analogy, the basic COCOMO, LR, and NLR. From Table 12 the confidence interval of LW GRA only covers that of the basic COCOMO model in the COCOMO dataset. This reveals that LW GRA can generally improve the prediction accuracy and variance in both the COCOMO and Desharnais datasets.

Finally, for the ISBSG dataset in Tables 9 and 10, the NW and LW GRAs appear to be the most accurate in the MMRE and PRED criteria, and the LW GRA greatly reduces the variance of the estimates relative to the NW GRA. That is, the MMRE criterion of LW GRA is improved by 29.18% relative to LR, the PRED criterion of NW GRA is increased by 103.38% relative to the basic COCOMO model, and the variance of LW GRA is decreased by 30.74% relative to NW GRA. Also, in Table 11 the p-values provide enough evidence that the NW and LW GRAs perform better than most of the other methods. Furthermore, in Table 12 we find that the confidence intervals of the NW and LW GRAs only overlap with that of analogy, indicating that NW and LW significantly improve on most of the other methods in the ISBSG dataset.

MRE boxplots for the four datasets are displayed in Figs. 18, 19, 20 and 21. It is noted that in Figs. 19 and 21 the scale over 30 is truncated in order to enlarge the axis interval. Consequently, seven extreme MREs of the LR are omitted from Fig. 19. One extreme MRE of the analogy and three extreme MREs of the NLR are individually excluded from Fig. 21.

Fig. 18 Boxplot of MRE with Kemerer dataset

Fig. 19 Boxplot of MRE with COCOMO dataset (seven extreme MREs over 30 are omitted from LR)

Fig. 20 Boxplot of MRE with Desharnais dataset

Fig. 21 Boxplot of MRE with ISBSG dataset (one and seven extreme MREs over 30 are separately excluded from analogy and NLR)

For the Kemerer dataset in Fig. 18, the NW, LW, and NLR obtain similar prediction accuracy. However, the LW GRA has the smallest quartiles and interquartile range, indicating that most of its predictions are more accurate than those of NW, analogy, and NLR. For the COCOMO dataset in Fig. 19, we observe that the LW GRA has fewer extreme values and a smaller upper quartile than the other methods; it also greatly reduces the variability of MREs compared with LR. As for the Desharnais dataset in Fig. 20, the LW GRA has the smallest MRE distribution, indicating that it can significantly improve the performance of effort prediction. Finally, for the ISBSG dataset in Fig. 21, the LW GRA provides much better accuracy than the NW GRA and LR. By contrast, the analogy and NLR methods show a dispersed MRE distribution, since they contain some extreme MREs outside the plotted scale. In summary, Figs. 18, 19, 20, and 21 are generally consistent with the results of Tables 9 and 10: compared with the NW GRA and the other methods, the MRE distribution of the LW GRA is relatively stable and contains few extreme estimates across the four datasets.

4.2.3 Comparison of accuracy with published results

Other published results using the same dataset sources can also be compared with the NW and LW GRAs. The methods collected here include GRACE, NN, CART, regression, analogy, and the COCOMO model; they are quite diverse and commonly used in software effort estimation. Notice that GRACE is one of the GRA-based methods (Song et al. 2005); the NN includes the Albus perceptron (Samson et al. 1997) and the back-propagation neural network (Srinivasan and Fisher 1995); the regression includes OLS regression (Huang and Chiu 2006) and stepwise regression (Mendes et al. 2005); the analogy includes CBR (Kadoda et al. 2000), traditional analogy (Shepperd and Schofield 1997), and weighted analogy (Auer et al. 2006; Huang and Chiu 2006); and the COCOMO model includes OLS-calibrated basic COCOMO models with or without effort multipliers (Korte and Port 2008; Port and Korte 2008). The comparison of accuracy among published results is shown in Table 13.

Table 13 Comparison of accuracy between published results

In the Kemerer dataset, NW and LW GRAs with analogous number “3” are more accurate than other published results in terms of MMRE criterion. Specifically, the MMRE criterion of LW GRA is the most accurate in the Kemerer dataset, and the PRED criterion of LW GRA is close to that of analogy. In the COCOMO dataset, NW GRA outperforms NN, CART, and regression in MMRE criterion. Further, LW GRA is significantly better than NW GRA and GRACE in terms of MMRE and PRED criteria. However, it is noted that the COCOMO model with effort multipliers is much more accurate than NW and LW GRAs in the COCOMO dataset. This may be due in part to the arbitrary effect of assigning effort multipliers. In fact, Boehm (1981) and Boehm et al. (1995) reported that the effort multipliers are to some extent plausible determinants for software development effort and suggested that the effort multipliers should be deliberately considered (i.e., the product of effort multipliers may range from 0.09 to 72.38). By contrast, if we only compare the COCOMO model without effort multipliers, LW GRAs still have better accuracy in the COCOMO dataset.

For the Desharnais dataset, the NW and LW GRAs are generally better than the other published results in the MMRE and PRED criteria. For the ISBSG dataset, although the MMRE criterion of the NW and LW GRAs is slightly worse than that of regression and analogy, our estimated results are still close to most of the other published results and even outperform the NN and CART. In summary, using the same data sources, the proposed methods achieve acceptable accuracy compared with other studies. In particular, the LW GRA can further enhance the prediction performance of the NW GRA. Therefore, we think that the weighted GRAs may be an alternative method in the field of software effort estimation.

4.3 Discussions

There are some factors that may affect the validity of our experiment, including adopted datasets, experimental process, and comparative criteria. First, the quality of datasets is an important factor for constructing prediction models. In this study, four publicly available datasets are adopted. The COCOMO and Desharnais datasets belong to a well-known PROMISE repository (Korte and Port 2008; Port and Korte 2008), and the ISBSG dataset is maintained by an international software benchmarking standards group (ISBSG 2006). These datasets consist of various application types, and the sample size varies from small to large. In addition, the data preprocessing and feature selection are fully explained in this paper. Hence, we believe these datasets are representative and reliable in quality.

This study focuses on weighted project features to determine a similarity measure. In fact, many studies have also noticed that each project feature has a different degree of influence on software development effort and considered weighted analogy methods (Huang and Chiu 2006; Li and Ruhe 2006; Mendes et al. 2002b). Thus, the originality of our proposed models is aligned with these studies. Six weighted approaches are integrated into GRA for software effort estimation, all of which are based on formal statistical methods to derive the corresponding weights for each feature. In order to assess the performance of weighted models, the leave-one-out cross-validation is then carefully implemented to evaluate estimated results. This validation is commonly used in the studies (Huang and Chiu 2006; Shepperd and Schofield 1997; Song et al. 2005). Therefore, with the above techniques, this experiment can be replicated for further improvement and comparison.

Because different criteria may reflect different attributes of model performance, it is better to compare more than one criterion in terms of reducing the risk of only trusting one criterion. In this paper, MMRE, PRED, variance, and boxplots of MRE are alternately used to demonstrate the estimated performance of the proposed models between other prediction techniques and published results. For these criteria, MMRE and PRED are used to measure the estimation model’s accuracy, whereas variance and boxplot are used to show the reliability of estimates. Generally, the experiments obtain consistent results in these criteria. Furthermore, the statistical test and confidence interval are conducted to verify the difference among methods. We are then able to confirm that the assessment of experimental results is trustworthy and not due to any individual experiment or dataset.

According to the framework of CMMI maturity level 2 (CMMI Product Team 2002), projects of the organization have to be executed such that software development processes are planned, performed, measured, and controlled. Generally, software development effort can be viewed as a basic measure for software development cost and software quality assurance (Ejiogu 2005; Jeffery et al. 2001). After estimating software development effort, some management metrics related to software cost and quality can be derived. In Tables 14 and 15, an estimation example is presented. For demonstration purposes, here we select four software projects from the datasets and let APC = $1000, Schedule = 24 months, and Fault = 100. As a result, for Project A we can obtain software development cost = $74,280, full-time software person = 3.1 FSP, and average cost per size = $1857/KLOC. This information can help to analyse cost expenditure, development schedule, and personnel distribution of software projects. Additionally, for Project B and C the software productivities are 0.22 KDSI/MM and 0.24 FP/MM, respectively. The productivity can provide a baseline for performance evaluation and control of software projects. If the productivity of a project team is far below a defined baseline, the project manager should take some improvement activities such as software reuse, outsourcing, adjusting the staff skill mix, or introduction of automatic development tools. Similarly, for Project D debugging effort = 13.58 MM/Fault, debugging cost = $13580/Fault, and fault density = 0.38 Fault/FP. These three metrics are commonly used to evaluate software quality. A benchmarking figure provided by Fenton (1998) reported that a fault density of below 2 Fault/KLOC or 1.75 Fault/FP is considered to be good quality. Hence, we can see that Project D may have better testing efficiency in software development. In the testing and debugging or maintenance phases (Leung 2002), software managers can track these metrics in determining the amount of debugging effort expenditure, debugging cost, and releasing time policy. If the fault density of a development project is accepted at a specific level, the software product can be released; otherwise, testing processes or code inspections should be restarted (Myers 2004). In practice, all of the above-mentioned metrics are very useful for software managers. As these metrics can provide analytic information for managers, the software development process can be improved.
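
The Project A figures quoted above can be reproduced with simple arithmetic. The following sketch back-derives them under the hypothetical assumptions that Project A's estimated effort is 74.28 man-months and its size is 40 KLOC, and that APC denotes the average personnel cost per man-month ($1000), with a 24-month schedule:

```python
effort_mm = 74.28                       # assumed estimated development effort (man-months)
apc = 1000                              # assumed average personnel cost per man-month ($)
schedule_months = 24
size_kloc = 40                          # assumed product size

cost = effort_mm * apc                  # $74,280 development cost
fsp = effort_mm / schedule_months       # about 3.1 full-time software persons
cost_per_size = cost / size_kloc        # about $1,857 per KLOC
print(cost, round(fsp, 1), round(cost_per_size))
```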

Table 14 Some useful metrics
Table 15 Estimation Example*

5 Conclusions

In this paper, six weighted methods, including nonweighted, distance-based, correlative, linear, nonlinear, and maximal weights, are proposed for integration into the conventional GRA. Using four public datasets, the performance of the weighted GRAs is validated by comparing them with other techniques and published results. In addition, we also adopt sensitivity analyses and statistical tests to demonstrate the improvement of our proposed methods. The experimental results yield several encouraging findings. First, the weighted GRAs perform better than the nonweighted GRA; in particular, the linearly weighted GRA most notably improves the accuracy and reliability of the estimates. Second, increasing the distinguishing coefficient and choosing smaller analogous numbers can further enhance the accuracy of the prediction results, but the analogous number is much more influential than the distinguishing coefficient. Third, the performance of the weighted GRAs is generally better than or close to other estimation techniques and published results. From the viewpoint of software practitioners, because there is no universally applicable method for all cases, they may still need more than one method, or may adopt a series of methods simultaneously, to make correct decisions. In summary, we recommend GRA as an alternative and applicable method for software effort estimation.

Our proposed methods also share some of the advantages of traditional similarity-based methods. The usability of similarity-based methods is generally acceptable to software practitioners; that is, the proposed methods are easy to calibrate and implement in the early stages of the software development life cycle. Further, the proposed methods can support decision making for effort adaptation and can flexibly accommodate different similarity measures. Although our proposed methods require a few extra computations for weight assignment, this part can be solved automatically. In future work, we plan to develop a GRA-based CASE tool with the proposed weight assignments and a software project database for software project management. With the developed tool, we will be able to collect more real data and further analyse the benefits of the proposed methods in industry. Finally, when more industrial data are available, cost-effectiveness analysis and productivity improvement will be investigated in the near future.