
1 Introduction

To date, the ISO has recognized five functional size measurement (FSM) methods for software as compliant with ISO 14143-1:

  • Four are considered 1st generation FSM methods: MKII: ISO 20968, IFPUG: ISO 20926, NESMA: ISO 24570, and FISMA: ISO 29881.

  • One is referred to as a 2nd generation FSM method: COSMIC – ISO 19761 [6].

These FSM methods work best when the information to be measured – the functional user requirements – is fully known. However, this is most often not the case in the early phases of software development projects, when only non-detailed information is commonly available [4, 13, 14]: approximation techniques are then necessary to tackle this lack of detail and to come up with a relevant range of candidate functional sizes.

Similarly, as pointed out in [8], “a rapid size measurement will be acceptable if it can be produced faster and still can deliver a reliable approximation of the detailed size measurement”. As observed by Desharnais et al. [11], when the software documentation is lacking it is not possible to apply all the detailed measurement rules and the measurers must then fall back on approximation techniques for sizing the requirements without enough details. While the Desharnais et al. [11] research work focused on the IFPUG FSM [12] method, their key findings are relevant to all FSM methods. For instance, in [11] a number of contexts were identified where the detailed measurement rules cannot be used, such as:

  • The documentation is not precise enough for the application of the detailed measurement rules.

  • The amount of work required to apply the detailed measurement rules to obtain precise measures of the software, and the work required subsequently to update the detailed measurement results, is perceived by management as being too expensive.

Santillo [13] further states that the “functional size of software to be developed can be measured precisely [only] after the functional specification stage: this stage is often completed relatively late in the development process.”

A few researchers have developed approximation approaches [21] for measuring software functional size by analyzing historical data from completed projects; however, few of them have investigated the performance of such approximation techniques in contexts with missing information, as encountered in the early phases of software projects.

In previous works, a fuzzy logic-based EPCU approach for approximating functional size in COSMIC was proposed by Valdés et al., using as a reference the Equal Size Bands Approach defined by Vogelezang et al. [8]; doing so did not require access to the details of the underlying dataset. For organizations that do not have historical data, this fuzzy logic-based model could be useful to approximate functional size early in the development process.

This paper aims to improve the EPCU approach for approximating functional size without historical data [21, 24], considering the dataset used by Vogelezang et al. [8] to define the Equal Size Bands Approach. More specifically, this paper focuses on the output variable domain. In the previous works, the dataset was not accessible, and the assumption about the largest category (Very Large) was that its average value (16.4 CFP) adequately represented the full quartile, which meant that most of the largest sizes were close to this average. Later access to the detailed dataset and personal feedback from the first author of Vogelezang et al. [8] provided additional insights about the dataset, clarifying in particular that the average value (cutoff) for the largest category did not take the larger functional processes into account.

The rest of this paper is organized as follows. Section 2 describes the related works. Section 3 presents an analysis of the dataset used to generate the Equal Size Bands Approximation Approach, and the validation of using a continuous range of possible values from 2 CFP with a “natural” upper boundary, or cutoff, stated at 16.4 CFP for the output variable in the EPCU model approximation approach. Section 4 describes the improvement applied to the EPCU model approximation approach. Section 5 presents the results gathered from the application of the EPCU model approximation approach using the same input data as in the case study defined in [24] and the comparison with the previous work. Section 6 presents the conclusions.

2 Related Works

In 1997, Meli [18] proposed two techniques for two distinct types of size approximation, but did not report on their performance:

  • Early Function Points (EFP), a faster version of the IFPUG 4.0 approximation method, and

  • Extended Function Points (XFP), derived from the EFP after the application of three correction factors.

In 2003, Desharnais et al. [11] analyzed two approximation techniques used in the industry, Function Points Simplified (FPS) [15] and Backfiring [16], using two verification criteria selected from ISO 14143-3: accuracy and convertibility. They reported that, in the organizational context of their study, the FPS technique performed better with an accuracy range of 5 %.

In 2004, Conte et al. [14] extended the Early & Quick (E&Q) technique to the COSMIC FSM method, and indicated that further tests would be needed to make adjustments to the proposal, or to confirm it. This E&Q technique is based on (direct) analogy and (derived) analysis: it is a human-based size approximation method, and is impacted by the approximator's ability to “recognize” the components of the system as belonging to the proposed classes.

In 2007, Vogelezang et al. [8] proposed a size approximation technique based on size bands using the quartile approach and reported on a study of 50 projects for the identification of such size bands. They also investigated the influence of distinct factors in approximate sizing and reported that, within their sample of 50 projects, the only factor that had a substantial influence on the size of an average functional process in each of the quartiles was the number of functional processes [8].

In 2007, the COSMIC Group published the ‘Advanced and Related Topics’ [5] document which describes two types of sizing approximation:

  • Early Sizing: for use early in the life cycle of a project, before the Functional User Requirements (FUR) are detailed and specified.

  • Rapid Sizing: for use when there is not enough time to measure using the standard method.

These two types of sizing approximation can be considered in the early phases of a development project. In general, in any approach to approximate sizing, a scaling factor for the type(s) of artifact(s) of the FUR of the software to be measured must be defined locally [5], requiring, for instance, that an average size of the artifacts to be measured be established locally – see Fig. 1.

Fig. 1. Scaling of sizes between different levels of granularity [5]

This scaling factor represents the size expected to be measured when the functional user requirements are at the level of detail where an accurate measurement can be made because all the necessary details are available [8]. This solution needs historical data in order to produce an adequate scaling factor.

In [5], four approaches to approximate sizing of new ‘whole’ sets of requirements are presented. Each approach is based on two main assumptions:

  • There exist historical data to determine the scaling factor (average, or size bands).

  • The whole set of requirements is described, or at least there is a commitment, defined by the requirements, about the scope of the software to be developed.

The four approaches described in [5] are:

  1. The Average Functional Process approach. The approximate size of the new piece of software is estimated to be equal to (Number of Functional Processes × Average Size from historical data).

  2. The Fixed Size Classification approach. A statement of requirements is analyzed to identify the functional processes and to classify each of them into one of three or more size classes, called, for instance: Small, Medium, and Large. A corresponding scaling factor is next assigned to each functional process, from historical data.

  3. The Equal Size Bands Approach. The functional processes are first classified into a small number of size bands. In the next step, the average sizes of each band are calculated (preferably calibrated locally), and then these average functional sizes are multiplied by the number of functional processes of the new piece of software, in each band respectively, to obtain the total approximate size.

  4. The Average Use Case approach. This example extends (1) to a higher level of granularity.
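As a minimal sketch, approaches (1) and (3) reduce to simple arithmetic. The function names and the functional process counts below are hypothetical and serve only as an illustration; the band averages are the quartile values reported by Vogelezang et al. [8], which would preferably be recalibrated locally.

```python
def average_fp_size(num_functional_processes, average_size_cfp):
    """Approach (1): number of functional processes times the historical average size."""
    return num_functional_processes * average_size_cfp


def equal_size_bands_size(counts_per_band, band_average_cfp):
    """Approach (3): sum over bands of (number of FP in the band x band average size)."""
    return sum(count * band_average_cfp[band] for band, count in counts_per_band.items())


# Band averages in CFP as reported in [8]; not locally calibrated.
BAND_AVERAGE_CFP = {"Small": 4.8, "Medium": 7.7, "Large": 10.7, "Very Large": 16.4}

# Hypothetical classification of 20 functional processes (illustrative only).
counts = {"Small": 10, "Medium": 5, "Large": 3, "Very Large": 2}

print(round(equal_size_bands_size(counts, BAND_AVERAGE_CFP), 1))  # 151.4
print(average_fp_size(20, 8.0))  # 160.0, assuming a historical average of 8 CFP per FP
```

Both functions need historical data as input (an average size or a set of band averages), which is the dependency the approaches in [5] share.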

In 2011, Santillo [13] proposed the use of the Analytic Hierarchy Process [17], a technique that provides a means for making choices among sizing alternatives, particularly when a number of concurrent objectives have to be satisfied.

In 2012, Valdés et al. [21] proposed a solution using the fuzzy logic model from [24], referred to as the Estimation of Projects in a Context of Uncertainty (EPCU) model. This study, as in [11], analyzed the performance of an approximation technique using fuzzy logic [7, 9, 19, 20] in an early phase context. For comparison purposes, the experiment was also carried out with the Equal Size Bands approach from Vogelezang et al. [8], which led to a mean magnitude of relative error (MMRE) of 11 % and a standard deviation of the MRE (SDMRE) of 9 %. In this experiment, which used a reference software system [10] with a full set of stable requirements and its stated measured functional size, the Equal Size Bands approach provided the better approximation results.

In 2013, Almakadmeh [23] designed a framework to assign scaling factors for identifying the level of granularity of functional requirements specifications of software. In [23], two variants of the criteria to assess the levels of granularity were defined: the first considers a software functional component and the second considers the elements of the UML use-case model. In order to rank the levels of granularity identified, the scaling factors used in [8] were selected; next, the scaling factor assignment is based on an analogy-based comparison with similar pieces of software whose functional size has been accurately measured using the COSMIC measurement method.

A workshop discussion on approximate COSMIC FSM at the IWSM/MENSURA 2013 conference reported that “the approximation methods described in the in-progress COSMIC Guideline on Approximation rely on a common principle, namely that the only precisely defined level of granularity of functional user requirements is the functional process level of granularity” [22]. It also mentioned that the approximation methods were based on two approximation principles or a combination of them: Scaling and Classification, concepts that had first been identified in [23] as scaling factors and levels of granularity, respectively.

Also in 2013, De Marco et al. [26] investigated to what extent some COSMIC-based approximate sizing could be useful to project managers for early effort estimation for Web applications: an empirical analysis was reported employing data from 25 Web applications to assess whether the two approximate sizes (number of COSMIC Functional Processes (FP) or the Average Functional Process approach) can be exploited to get accurate effort estimates. The conclusion is that COSMIC-based approximate counts are a suitable approach for early effort estimates, although the estimates obtained with the approximate sizes are worse than those achieved with the size obtained from the application of the standard COSMIC method.

In 2014, De Vito et al. [27] considered the need for a simplified and rapid COSMIC measurement that avoids the use of scaling factors, since incorrect calibration of the scaling factors can lead to inaccurate approximations. They proposed a simplified measurement process (Quick/Early) that can be applied to use case models and aims to reduce the measurement time. The precision of this Quick/Early approximation approach is directly proportional to the level of granularity of the analyzed use case model: this means that the use cases have to be at least stable requirements, which does not happen often in the very early stages; still, they conclude that the accuracy of Quick/Early is good.

Also in 2014, Valdés et al. [24] reported on a case study of a simulation of the early approximation step using the EPCU model for an industry project for which only the names of the use cases were made available to the participants, without any additional documentation. This case study confirmed that the EPCU Size Approximation approach does not require local calibration and is useful when there are no historical data available; in addition it is less expensive than the calibration of the equal size band approach which requires historical data.

Table 1. Approximation techniques analysis highlights

For the case study with a real industrial project, the EPCU Size Approximation approach came up with better results, with an MMRE of 45 % in comparison with an MMRE of 63 % for the Equal Size Bands Approach, while both approaches led to underestimated results.

In order to integrate the highlights of the literature review in [22] and after, Table 1 was adapted from [24]. This table shows that the validity of most approximation techniques depends on the representativeness of the samples with respect to the software being approximated: said differently, to date most of the approximation methods need to be calibrated locally, and this requires local historical data. However, in practice, most organizations do not have such data. As previously pointed out by Morgenshtern: “Algorithmic models need historical data, and many organizations do not have this information. Additionally, collecting such data may be both expensive and time consuming” [1]. Approximation techniques based on historical data are therefore of little use for organizations without such data, and alternatives must be developed for such contexts of approximation.

3 Analyzing the Functional Process Sizes in the Quartile Analysis from the Equal Size Bands Approximation Approach Dataset

3.1 Data Set Description

The Vogelezang dataset has been used to generate the equal size bands approximation approach [5]: it includes 47 projects related to four sectors (Banking, Government, Insurance, Logistics). See Table 2. More specifically:

  • For the Banking sector, the project size range goes from 11 CFP to 2743 CFP, the project average functional size is 476 CFP, the number of Functional Processes (FP) is 1345, and the total size of all projects is 12375 CFP.

  • For the Government sector, the project size range goes from 64 CFP to 2364 CFP, the project average functional size is 481 CFP, the number of FP is 838, and the total size of all projects is 3845 CFP.

  • For the Insurance sector, the project size range goes from 84 CFP to 1311 CFP, the project average functional size is 551 CFP, the number of FP is 342, and the total size of all projects is 3305 CFP.

  • For the Logistics sector, the project size range goes from 193 CFP to 1164 CFP, the project average functional size is 538 CFP, the number of FP is 321, and the total size of all projects is 3766 CFP.
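The sector figures above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check derived from the numbers quoted in the text: the dictionary layout and the function name are hypothetical, and the implied project counts are an inference (total size divided by average project size), not values taken from Table 2.

```python
# Figures quoted in the sector descriptions above.
SECTORS = {
    "Banking":    {"total_cfp": 12375, "num_fp": 1345, "avg_project_cfp": 476},
    "Government": {"total_cfp": 3845,  "num_fp": 838,  "avg_project_cfp": 481},
    "Insurance":  {"total_cfp": 3305,  "num_fp": 342,  "avg_project_cfp": 551},
    "Logistics":  {"total_cfp": 3766,  "num_fp": 321,  "avg_project_cfp": 538},
}


def derived_figures(sector):
    """Return the average FP size and the project count implied by the quoted totals."""
    data = SECTORS[sector]
    return {
        "avg_fp_cfp": data["total_cfp"] / data["num_fp"],
        "implied_projects": round(data["total_cfp"] / data["avg_project_cfp"]),
    }


for name in SECTORS:
    figures = derived_figures(name)
    print(name, round(figures["avg_fp_cfp"], 1), figures["implied_projects"])

# The implied project counts add up to the 47 projects of the dataset.
print(sum(derived_figures(name)["implied_projects"] for name in SECTORS))  # 47
```

Under these assumptions the average functional process size differs noticeably between sectors (roughly 4.6 CFP for Government against 11.7 CFP for Logistics), which is consistent with the observation in [8] that there is no universal average functional process.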

Table 2. Dataset characterization

The dataset contains two general analyses, labeled Q-Size and Q-Number, and eight specific analyses by sector, labeled Q-Size (sector_i) and Q-Number (sector_i). This new study considers the integrated (all-sector) analyses; the concept behind each of them is described below.

  • For Q-Size, the total measured size [CFP] is divided into four quartiles and the average FP size is calculated from there - see Table 3.

  • For Q-Number, the total number of functional processes is divided into four quartiles and the average FP size is calculated from there - see Table 4.

Table 3. Q-Size considering four sectors
Table 4. Q-Number considering four sectors

In Table 3, it can be observed that:

  • Q1 (Small FP) contains most of the FP (55 %), with sizes of up to 6 CFP and an average of 3.7 CFP.

  • Q2 (Medium FP) contains 26 % of the total FP with a range of functional size from 6 to 10 CFP with an average of 7.7 CFP.

  • Q3 (Large FP) contains 14 % of FP with an FP average size of 14.6 CFP and the range going from 10 CFP to 25 CFP.

  • Q4 (Very Large FP) contains 5 % of the total FP (142 FP with an average of 44.1 CFP) and defines a range larger than 25 CFP.

In Table 4, each quartile contains 25 % of the FP. For Q1 (Small FP), the average size is 2.7 CFP and the range is up to 4 CFP. Q2 (Medium FP) defines a range of functional size from 4 to 6 CFP, with an average of 4.3 CFP. In Q3 (Large FP), the average FP size is 7.1 CFP and the range goes from 6 CFP to 8 CFP. Q4 (Very Large FP) defines the range from 8 CFP upwards, with an average FP size of 18.6 CFP.
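The difference between the two analyses can be illustrated from the percentages and averages quoted above. The sketch below is a hedged illustration, not a reproduction of Tables 3 and 4: the data layout and function names are hypothetical, and the FP shares are the rounded percentages given in the text.

```python
# (quartile, share of all FP, average FP size in CFP), as quoted in the text.
Q_SIZE = [("Q1", 0.55, 3.7), ("Q2", 0.26, 7.7), ("Q3", 0.14, 14.6), ("Q4", 0.05, 44.1)]
Q_NUMBER = [("Q1", 0.25, 2.7), ("Q2", 0.25, 4.3), ("Q3", 0.25, 7.1), ("Q4", 0.25, 18.6)]


def weighted_mean_fp_size(quartiles):
    """Overall average FP size implied by the per-quartile shares and averages."""
    return sum(share * average for _, share, average in quartiles)


def share_of_total_size(quartiles):
    """Fraction of the total measured size (CFP) carried by each quartile."""
    total = weighted_mean_fp_size(quartiles)
    return {name: share * average / total for name, share, average in quartiles}


print(round(weighted_mean_fp_size(Q_SIZE), 2))    # 8.29
print(round(weighted_mean_fp_size(Q_NUMBER), 2))  # 8.18
print({name: round(value, 2) for name, value in share_of_total_size(Q_SIZE).items()})
print({name: round(value, 2) for name, value in share_of_total_size(Q_NUMBER).items()})
```

Both analyses imply nearly the same overall average FP size (about 8.2 CFP). In Q-Size each quartile carries roughly a quarter of the total size, whereas in Q-Number more than half of the total size sits in Q4, which is why the Q4 averages of the two analyses differ so much.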

Table 5. Quartiles closeness

In Table 5, the analysis of the differences between the average sizes for the Q-Size and the Q-Number analyses shows that, for the Q-Number, the average sizes of the quartiles are closer to one another than in the Q-Size approach.

3.2 Comparison of the 2014 and 2015 Studies

In the 2012 and 2014 case studies [21, 24], the output variable in the EPCU model was defined using a continuous range of possible values from 2 CFP up to a “natural” upper boundary, or cutoff, set at 16.4 CFP. This rested on the assumption that the average value of the Very Large category (16.4 CFP) adequately represents the full quartile, that is, that most of the sizes in that quartile are close to the average, as described in [5].

In 2015, analyzing the dataset for the Q-Size (Banking) analysis, we found that the 16.4 CFP average corresponds to Q3 (Large FP), which includes 14 % of the FP with a range from 10 to 31 CFP, and that there is another quartile, Q4 (Very Large FP), with an average of 51.6 CFP. This means that 16.4 CFP is not a relevant value to use as the cutoff for the Banking sector, or for the other sectors described in the dataset used to define the Equal Size Bands approximation approach [5].

4 Improving the EPCU Model Approximation Approach

4.1 Redefining the Output Variable

To tackle the lack of historical data discussed in the previous studies [21, 24], and considering that there is no universal average functional process from which a scaling factor for early size measurement can be derived [8], the Equal Size Bands, or Quartile, approach (Example 3) defined by Vogelezang et al. [8] was selected in the previous work [21, 24] as the basis for the COSMIC approximate sizing task using the EPCU model approach for business applications.

Vogelezang et al. [8] used measurements on business application development projects, each having a total size greater than 100 CFP. The quartile values from this dataset were as follows: Small = 4.8 CFP, Medium = 7.7 CFP, Large = 10.7 CFP, and Very Large = 16.4 CFP [8] – see Fig. 2.

Fig. 2. Quartile size values of Functional Processes (FP)

As discussed in Sect. 3.2, the value of 16.4 CFP is not a relevant value to be used as the cutoff for all the sectors described in the dataset used to define the Equal Size Bands approximation approach.

Because this paper is concerned with functional size rather than with the number of FP, the quartiles from the Q-Size analysis are used – see Table 3.

Under this choice, the average for each quartile is Q1 = 3.7 CFP, Q2 = 7.7 CFP, Q3 = 14.6 CFP, and Q4 = 44.1 CFP, so the range defined by the quartile averages for the Q-Size approach is [3.7 CFP to 44.1 CFP]. Consequently, the range for the output variable goes from 2 CFP (the minimum functional size of a FP in COSMIC) to 44 CFP, with four linguistic values (fuzzy sets) defined: Low, Average, High, and Very High - see Fig. 3.

In Fig. 3, it can be observed that the range is continuous, but the differences between the quartile averages make the High and Very High fuzzy sets wider than the Low and Average ones - see Table 5.

Fig. 3. Output variable schema
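To make the idea of a continuous output range with four fuzzy sets concrete, the following sketch builds membership functions over the 2 to 44 CFP range. It is an illustration under stated assumptions, not the EPCU model itself: the exact shapes used in Fig. 3 are not described in the text, so triangular sets peaking at the Q-Size quartile averages (with flat shoulders at both ends) are assumed here, and the function name is hypothetical.

```python
# Assumed peak positions in CFP: Q-Size quartile averages, with the last one
# clipped to the 44 CFP cutoff. The shapes are an assumption of this sketch.
PEAKS = [("Low", 3.7), ("Average", 7.7), ("High", 14.6), ("Very High", 44.0)]
LOWER_CFP, UPPER_CFP = 2.0, 44.0


def memberships(size_cfp):
    """Degree of membership of a size (CFP) in each of the four assumed fuzzy sets."""
    if not LOWER_CFP <= size_cfp <= UPPER_CFP:
        raise ValueError("size must lie within the output variable range")
    degrees = {name: 0.0 for name, _ in PEAKS}
    first_name, first_peak = PEAKS[0]
    last_name, last_peak = PEAKS[-1]
    if size_cfp <= first_peak:  # left shoulder: fully Low
        degrees[first_name] = 1.0
        return degrees
    if size_cfp >= last_peak:  # right shoulder: fully Very High
        degrees[last_name] = 1.0
        return degrees
    # Between two neighbouring peaks, membership is split linearly.
    for (left_name, left_peak), (right_name, right_peak) in zip(PEAKS, PEAKS[1:]):
        if left_peak <= size_cfp <= right_peak:
            weight = (size_cfp - left_peak) / (right_peak - left_peak)
            degrees[left_name] = 1.0 - weight
            degrees[right_name] = weight
            break
    return degrees


print(memberships(7.7))   # fully "Average"
print(memberships(20.0))  # partly "High", partly "Very High"
```

With these assumed peaks, the sets anchored at 14.6 and 44 CFP span a much larger interval than those anchored at 3.7 and 7.7 CFP, which mirrors the widening of the upper fuzzy sets noted for Fig. 3.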

5 Application of the Improved EPCU Model Approximation Approach

The 2012 case study [21] used a reference system with its full set of stable requirements and its stated measured functional size, while the 2014 case study [24] used an industry project for which only the names of the use cases were shared with the participants through a survey form: no other information was shared with the participants, who had to determine the size of the project functional requirements through their own evaluation of the input variables.

To assess the improvement made to the EPCU context used to approximate functional size, the same data as in [24] were used, because that study is considered a more realistic simulation of the early approximation step using the EPCU model.

5.1 The Measurement Reference: Software System ALFA

The requirements within the scope of the ALFA software system were stated in a set of 14 use case descriptions. To establish the measurement reference against which the approximations are compared, one of the researchers, certified as a COSMIC measurer (CCFL), measured the COSMIC functional size from the complete detailed documentation of these 14 use cases.

Table 6 presents, for each use case, the detailed COSMIC measurement results, including the data movement types and their functional size in COSMIC CFP units.

Table 6. COSMIC size of the use cases in the ALFA project

The total functional size for the ALFA software requirements is 250 CFP (bottom line of Table 6), distributed across 14 use cases with a mean of 17.9 CFP per use case, a median of 16 CFP, and a standard deviation of 13 CFP [24].

5.2 Participants Tasks in the Experiment

In the 2014 study, the detailed use case information relative to the ALFA project requirements was not made available to the practitioners.

Furthermore, the practitioners participating in the experiment:

  • were not familiar with the COSMIC method,

  • had no historical data for approximating the FSM using COSMIC,

  • did not participate in the definition of the EPCU context and did not know the EPCU model.

The only information the 2014 participants had access to was the list of use cases identified and their own experience with the business process related to the project. Only a form listing the 14 use cases identified in the case study (with their real names) was sent by email to 12 practitioners from this organization – see Table 7; only eight sets of answers were received.

The participants were asked to perform the following (full data shown in Appendix A):

  1. Classify each of the 14 use cases using the linguistic values: Small, Medium, Large and Very Large.

  2. Classify the number of objects of interest for each of the 14 use cases using the linguistic values: Few, Average, and Many.

  3. Assign values for the two input variables previously defined in the EPCU context (the functional process size and the quantity of objects of interest related to the functional process [24]) for each of the 14 use cases. Each value is assigned within the real-valued range 0 to 5, taking into account the subjective classification of the relative functional size of the use case and the subjective classification of the number of objects of interest in it.

The input variable values assigned by the practitioners were next fed into the refined EPCU fuzzy logic model.

5.3 Data Analysis

In Table 8, the first column gives the practitioner’s ID, the second column gives the 250 CFP reference size for the ALFA system, the third column gives the functional size calculated using the Equal Size Bands Approach, and the fourth column gives the related magnitude of relative error (MRE). Columns five and six show the functional size calculated using the EPCU approximation approach with the cutoff at 16.4 CFP (as established in [24]) and the related MRE, respectively. Column seven shows the functional size calculated using the improved EPCU approximation approach developed in this paper (Sect. 4), i.e., with the cutoff at 44 CFP, and the rightmost column gives the related MRE for each practitioner.

Table 7. Experiment information request form
Table 8. Experiment results using EPCU size approximation approach

Comparing results using the Equal Size Bands approach against the reference functional size.

As mentioned in [24], when the Equal Size Bands approach is compared against the reference size of 250 CFP, the mean magnitude of relative error (MMRE) for this dataset is 63 % and the standard deviation of the MRE (SDMRE) is 5 % – see Table 8 and Fig. 4.

The maximum MRE is 67 % (Practitioners 1 and 6) and the minimum is 54 % (Practitioner 8) – see Fig. 4.
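For reference, the three accuracy indicators used in this section can be computed as sketched below. The function names are hypothetical, the list of approximated sizes is purely illustrative (it is not the data of Table 8), and the use of the sample standard deviation for the SDMRE is an assumption, since the text does not state which variant was used.

```python
from statistics import mean, stdev


def mre(reference_cfp, approximated_cfp):
    """Magnitude of relative error of one approximation against the reference size."""
    return abs(reference_cfp - approximated_cfp) / reference_cfp


def mmre(reference_cfp, approximations):
    """Mean of the MRE values over all approximations."""
    return mean([mre(reference_cfp, value) for value in approximations])


def sdmre(reference_cfp, approximations):
    """Sample standard deviation of the MRE values (assumed variant)."""
    return stdev([mre(reference_cfp, value) for value in approximations])


REFERENCE_CFP = 250  # measured size of the ALFA system

# Illustrative approximated sizes only; not taken from Table 8.
illustrative_sizes = [100.0, 200.0, 300.0]

print(round(mmre(REFERENCE_CFP, illustrative_sizes), 3))   # 0.333
print(round(sdmre(REFERENCE_CFP, illustrative_sizes), 3))  # 0.231
print(round(mre(REFERENCE_CFP, 115), 2))                   # 0.54
```

Because the MRE is an absolute value, it does not show the direction of the error: an underestimate of 115 CFP and an overestimate of 385 CFP both give an MRE of 54 % against the 250 CFP reference.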

Fig. 4. Case study results for each practitioner

Comparing results using an EPCU Size Approximation approach (using a cutoff = 16.4 CFP) against the Equal Size Bands Approach.

In [24], the results show that, in terms of MMRE, the functional size approximated with the EPCU model (MMRE = 45 %) is more accurate than the approximation using the “Equal Size Bands Approach” [8] (MMRE = 63 %): the difference between the two MMRE values is 18 percentage points. In [24], all the practitioners, with both approximation approaches, obtained results below the reference size of 250 CFP - see Table 8 and Fig. 4.

Table 8 also shows that the “Equal Size Bands Approach” has a smaller dispersion (SDMRE = 5 %) than the EPCU model approach (SDMRE = 18 %): the difference between the two SDMRE values is 13 percentage points - see Table 8 and Fig. 4.

Comparing results using an EPCU Improved Size Approximation approach (using a cutoff = 44 CFP) against the reference functional size.

Considering the data in Table 8, the MMRE and SDMRE of the EPCU improved size approximation for all 8 practitioners are presented in the two bottom lines of Table 8 in the columns seven and eight:

  • the MMRE with the EPCU model is 43 %,

  • the SDMRE is 34 %,

  • the maximum MRE value with the EPCU model is 97 % (Practitioner 8) and the minimum value is 4 % (Practitioners 7 and 2) – see Fig. 5.

Fig. 5. EPCU improved approximation approach results for each practitioner

Comparing results using an EPCU Size Approximation approach (using a cutoff = 16.4 CFP) against EPCU Improved Size Approximation approach (using a cutoff = 44 CFP).

In the bottom lines of Table 8, it can be observed that the MMRE for the EPCU Size Approximation approach (using a cutoff = 16.4 CFP) is 45 % with an SDMRE of about 18 %. For the EPCU Improved Size Approximation approach (using a cutoff = 44 CFP), the MMRE is 43 % and the SDMRE is 34 %.

The EPCU Size Approximation approach (using a cutoff = 16.4 CFP) shows less dispersion (SDMRE of 18 % versus 34 %), while the EPCU Improved Size Approximation approach (using a cutoff = 44 CFP) shows slightly better accuracy, with a lower MMRE (43 % versus 45 %).

From the data in Table 8 and Fig. 4, it can be seen that the use of EPCU Size Approximation approach in the early phases, considering the kind of information that is usually available at this phase, presents better results than the use of the un-calibrated “Equal Size Bands Approach”.

Figure 5 shows that, with a cutoff of about 16.4 CFP, the approximation underestimates the functional size; with a cutoff of about 44 CFP, the results fall both above and below the real value, in line with De Marco's remark that “An estimation is a prediction that is equally likely to be above or below the actual result” [25].
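One possible reading of this behavior is structural. The sketch below assumes (this is not stated in the source) that the approximated total is the sum of one output value per use case, and that each output value lies within the output variable range, from 2 CFP up to the cutoff. The function names are hypothetical.

```python
REFERENCE_CFP = 250   # measured size of the ALFA system
NUM_USE_CASES = 14    # use cases in the ALFA project
LOWER_CFP = 2.0       # lower boundary of the output variable


def total_size_bounds(cutoff_cfp):
    """Smallest and largest total reachable if every use case output lies in [2, cutoff]."""
    return NUM_USE_CASES * LOWER_CFP, NUM_USE_CASES * cutoff_cfp


def reference_reachable(cutoff_cfp):
    """True if the reference size lies within the reachable totals for this cutoff."""
    lowest, highest = total_size_bounds(cutoff_cfp)
    return lowest <= REFERENCE_CFP <= highest


for cutoff in (16.4, 44.0):
    lowest, highest = total_size_bounds(cutoff)
    print(cutoff, round(lowest, 1), round(highest, 1), reference_reachable(cutoff))
```

Under these assumptions, a cutoff of 16.4 CFP caps the total at about 229.6 CFP, below the 250 CFP reference, so every approximation is bound to underestimate; with a cutoff of 44 CFP the reachable totals extend to 616 CFP and results on both sides of the reference become possible.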

An important feature of the EPCU Size Approximation approach is that the context does not have to be calibrated: it does not use bands, but rather a continuous range of real values, which is represented by a membership function up to the defined upper boundary.

In summary, this 2015 case study reports a better performance with respect to the Equal Size Bands approach than that reported in the earlier 2012 and 2014 case studies [21, 24].

6 Conclusions

This research aimed to improve the EPCU approach for approximate functional size without historical data [21, 24], considering the dataset used by Vogelezang et al. [8] to define the Equal Size Bands Approach.

In this paper, the improvement made to the EPCU Functional Size Approximation Approach consisted in defining, for the output variable, a continuous range of possible values with a “natural” upper boundary, or cutoff, at 44 CFP, which is the average functional size of the FP in Q4 of the Q-Size approach - see Table 3.

As in [24], the EPCU Size Approximation approach does not require local calibration and is useful when there are no historical data available; in addition it is less expensive than the calibration of the equal size band approach which requires historical data.

For the experiment with a real industrial project, the 2015 EPCU Improved Size Approximation approach (cutoff at 44 CFP) presented better results, with an MMRE of 43 %, in comparison with the Equal Size Bands Approach (MMRE = 63 %) and the EPCU Size Approximation approach with a cutoff at 16.4 CFP (MMRE = 45 %).

For the last two approaches, an underestimate of the functional size can be observed; on the other hand, using the EPCU Improved Size Approximation approach (cutoff at 44 CFP), the results fall both above and below the real value.

In summary, in this 2015 case study, a cutoff at 44 CFP presents more realistic results, because it takes into account FP with a larger functional size (covering a wide range of FP from the dataset), which is not the case with the cutoff at 16.4 CFP.

Planned further work includes the collection of a set of projects with their use cases or their functional processes identified, in order to conduct a more in-depth analysis of the EPCU Improved Size Approximation approach.

This will include a comparison of the behavior of the EPCU Improved Size Approximation approach, considering the output variable range using the defined quartiles for each sector (Q-Size (sector_i)), and a more in-depth analysis considering specific projects for each sector.