1 Introduction

Software systems are essential elements of modern society and have been widely applied in numerous safety-critical domains (Ivanov et al. 2018; De Melo and Sanchez 2008; Fiondella et al. 2013; Özdamar and Alanya 2001; Zachariah 2015). The emphasis of modern software development has changed significantly over the years. As identified by Bosch and Bosch-Sijtsema (2010), the latest trends in the software industry increase the complexity and dependency of software development. The first trend is the build-up of software product lines. Bosch and Bosch-Sijtsema (2010) defined software product lines as consisting of platforms that can be used by many products in the organization. Each development team selects and configures components from the platforms to build a reliable and consistent product with the required functionality. The adoption of software product lines helps with cost and time management, but it also introduces extra dependencies into the product and organization, which adds complexity (Bosch and Bosch-Sijtsema 2010; Clements and Northrop 2002). The second trend is global software development across several organizations in different countries. Many software companies have established sites globally or partnered with remote companies, mostly located in India and China (Garcia-Crespo et al. 2010; Garg et al. 2014; Carmel and Agarwal 2001; Herbsleb and Moitra 2001; Sangwan et al. 2006). Global software development offers substantial advantages but also faces many challenges, such as cultural differences, time zones and the varying maturity of software engineering practices, all of which raise the complexity of dependency management to a new level (Bosch and Bosch-Sijtsema 2010; Cascio and Shurygailo 2003). The third trend is the establishment of software ecosystems. In recent years, software development has transformed from a solo activity within the organization into highly collaborative, globally distributed ecosystems (Storey et al. 2017). Such ecosystems allow new functionality to be developed outside the organization, but they blur project and task boundaries (Singer et al. 2013; Harman et al. 2012; Ghazawneh and Henfridsson 2013; Basole and Karla 2011). Hence, software ecosystems also elevate the level of dependency between products and organizations (Bosch and Bosch-Sijtsema 2010).

Since software systems are embedded in most areas of human activity, software quality assurance and software reliability prediction are critical in various industries (Condori-Fernandez and Lago 2018; El-Sebakhy 2009). However, delivering high-quality and reliable software products is not easy (Ponnurangam and Uma 2005; Zhu and Pham 2018a, b). Despite being widely studied and of great interest to the global market, assuring software quality remains a complex and costly task for researchers and practitioners. One of the fundamental software quality characteristics is reliability. Software reliability is described in Lyu (2007) as “the probability of failure-free software operation for a specified period of time in a specified environment”.

To estimate the remaining faults in a software program, predict software reliability and the software failure rate at a time of interest, and plan the release time, a great number of nonhomogeneous Poisson process (NHPP) software reliability growth models (SRGMs) have been developed. However, most SRGMs have not addressed the random effect of application environments. Only a few studies have incorporated the random effect of environments or the fault reduction factor in SRGMs. For instance, Teng and Pham (2006) assumed that the random effects were represented by a unit-free environment factor. A generalized SRGM with the unit-free environment factor was developed to represent both the software testing and operation phases. This unit-free environment factor was modeled as a beta or gamma distribution in order to propose two specific SRGMs. The fault reduction factor (FRF) is the ratio of the number of faults removed to the number of failures experienced (Musa 1980), which can be affected by other factors, e.g., imperfect debugging, delayed debugging, etc. Hsu et al. (2011) considered the FRF as a time-variable function and incorporated it in the SRGM to improve the accuracy of failure prediction. Pham (2014) incorporated the uncertainty of the operating environment into a software Vtub-shaped fault detection rate model. Specifically, the software fault detection rate follows a Vtub-shaped function and the uncertainty of the operating environment is represented by a random variable modeled by a gamma distribution. Chang et al. (2014) incorporated the idea of uncertainty of operating environments into a testing-coverage SRGM. Minamino et al. (2017) proposed a two-dimensional SRGM based on a CES-type time function, which is a generalized form of the Cobb–Douglas function with testing-time and testing-effort factors. Inoue et al. (2016) proposed a bivariate SRGM with the uncertainty of the change of the software failure-occurrence phenomenon at the change-point. Zhu and Pham (2018c) incorporated a single factor and the impact of this factor in the SRGM. Recently, Qiu et al. (2019) proposed a stress testing method in which influencing factors cause systems to work under certain stress and explored the mathematical relationship between the mean time to failure and the influencing factors.

The motivations of this paper are as follows. First, although some studies have incorporated the uncertainty of environments or a single factor in the model development, they do not represent a generalized SRGM with random application environments. Given the great changes in software development, such a complicated and human-centered software development process needs to be addressed more appropriately. Second, environmental factors (EFs), such as the amount of programming effort, programmer organization, human nature, testing environment, program complexity and design methodology, were first defined by Zhang and Pham (2000) from the perspectives of software complexity, human nature, team collaboration and the interaction with hardware systems. Recent survey investigations (Zhu et al. 2015; Teng and Pham 2017) have also revealed the significant impacts of EFs on software reliability and provided an updated ranking of the importance of EFs in software development. Thus, how to incorporate multiple EFs and the associated randomness induced by these EFs into the development of an SRGM is essential yet challenging.

Therefore, we aim to propose a generalized SRGM incorporating multiple EFs and the associated randomness induced by these EFs under the martingale framework, from which researchers and practitioners are able to obtain a specific SRGM according to their individual application environments. The martingale framework, specifically Brownian motion, is introduced to reflect the associated randomness. We consider the associated randomness to be reflected in the software fault detection process. Section 2 discusses the importance of EFs and introduces two specific EFs from recent studies (Zhu et al. 2015; Teng and Pham 2017), the percentage of reused modules (PoRM) and the frequency of program specification change (FoPSC). Section 3 first introduces the martingale framework and reviews the related work. Next, we propose a generalized framework of the multiple-environmental-factors NHPP (MEF-NHPP) SRGM and further develop a specific MEF-NHPP SRGM incorporating the two specific EFs, PoRM and FoPSC. Section 4 first discusses parameter estimation and comparison criteria and then illustrates two numerical examples with real-world Open Source Software (OSS) project data sets to demonstrate the predictive power of the proposed generalized framework of the MEF-NHPP SRGM. Section 5 draws the conclusions and describes future research directions.

2 Environmental factors

Thirty-two EFs were first identified by Zhang and Pham (2000) from the four phases of software development and the interactions with hardware subsystems. For example, one of the EFs, named program complexity, is defined to measure the program size in terms of kilolines of code. Another EF, requirements analysis, is used to verify the understanding of the requirements gathered from customers. Testing environment is the specific environment set up in the testing phase to simulate the operational environment and detect software faults. Testing effort can be measured by testing expenditures, test cases or years of work. The definitions and detailed discussion of all EFs can be found in Zhang and Pham (2000) and Zhu and Pham (2017).

Fifteen years later, Zhu et al. (2015) reinvestigated the impact of these EFs on software reliability, aiming to provide an updated ranking of the EFs, examine the correlations between factors, reduce the dimensionality of the EFs and compare the findings with the previous studies (Zhang and Pham 2000; Zhang et al. 2001). Most EFs in the top-ten group of the previous studies (Zhang and Pham 2000; Zhang et al. 2001) remain in the top-ten group of the latest investigation (Zhu et al. 2015). The latest top ten EFs in developing single-release software are FoPSC, testing effort, relationship of detailed design to requirements, testing environment, testing coverage, program complexity, programmer skill, PoRM, testing methodologies and domain knowledge. Later, Zhu and Pham (2017) launched another survey study to examine the impact of the EFs on software reliability in developing multiple-release software. The top ten EFs in developing multiple-release software are PoRM, amount of programming effort, requirements analysis, FoPSC, level of programming technologies, testing effort, relationship of detailed design to requirements, testing coverage, program workload and program complexity.

As demonstrated by the previous studies (Zhang and Pham 2000; Zhu et al. 2015; Zhu and Pham 2017; Zhang et al. 2001), EFs have significant impacts on reliability in software development; hence, it is reasonable to incorporate multiple EFs into the software reliability model to improve prediction accuracy. Considering the rankings and significance levels of the EFs and practical applications, we develop a specific MEF-NHPP SRGM in Sect. 3 with two specific EFs, PoRM and FoPSC, to illustrate the effectiveness of the proposed generalized framework of the MEF-NHPP SRGM. In the following Sects. 2.1 and 2.2, we explain the reasons for selecting these two EFs and their corresponding distributions based on the collected data.

2.1 PoRM

PoRM (Zhu and Pham 2018c; Zhang and Pham 2000) is defined as follows

$$ PoRM = \frac{{S_{0} }}{{S_{N} + S_{0} }} $$
(1)

where \( S_{0} \) represents the kilolines of code in the existing modules and \( S_{N} \) denotes the kilolines of code in the new modules (Zhu and Pham 2018c).

The PoRM data was collected from various industries such as manufacturing, high technology, online retailing, IT services and research institutions (Zhu and Pham 2018c). The participants held different positions, including managers, testing engineers, programmers and other roles contributing to software development. To ensure valid and reliable responses, all survey participants were working in a software development-related area or an IT department during the data collection period. The collected PoRM data (Zhu and Pham 2018c) is shown in Fig. 1. Following the results obtained in Zhu and Pham (2018c), a gamma distribution is employed to model PoRM with parameters \( \gamma_{1} \) and \( \theta_{1} , \) expressed as \( {\text{PoRM}}\sim {\text{Gamma}}\left( {6.487, 14.726} \right) \).

Fig. 1 The collected PoRM data
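As a brief illustration of how such survey responses can be fitted, the following sketch (not the authors' code) fits a gamma distribution to an assumed, illustrative PoRM sample; the actual data is shown in Fig. 1, and the paper reports \( {\text{PoRM}}\sim {\text{Gamma}}\left( {6.487, 14.726} \right) \). Note that scipy parameterizes the gamma distribution with a scale, so the rate \( \theta_{1} \) used here corresponds to 1/scale.

```python
# Sketch: fit a gamma distribution to PoRM survey responses.
# The sample below is illustrative only, not the data in Fig. 1.
import numpy as np
from scipy import stats

porm_sample = np.array([0.30, 0.35, 0.42, 0.48, 0.51, 0.38, 0.44, 0.55,
                        0.29, 0.47, 0.40, 0.36, 0.52, 0.45, 0.33, 0.49])

shape, loc, scale = stats.gamma.fit(porm_sample, floc=0.0)  # fix location at 0
gamma1, theta1 = shape, 1.0 / scale                         # theta_1 is a rate (1/scale)
print(gamma1, theta1)   # the paper reports Gamma(6.487, 14.726) for the real data
```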

2.2 FoPSC

Lehman (1980) summarized the Program Evolution Laws. The first law, continuing change, states that a large program is never completed and will continue to evolve. Specification changes occur from initial development until product delivery; they increase the risk of extra software cost but can also add value and improve software reliability (McGee and Greer 2010). As studied by Harker et al. (1993), specification changes are mostly due to reasons such as fluctuations within the organization or market, consequences of system usage, customer migration issues, increased understanding of requirements and adaptation issues. Later, many studies have also discussed the importance of specification changes from perspectives such as product strategy, hardware/software environment/interaction, testability and functionality enhancement (Nurmuliani et al. 2004a, b; Carlshamre 2002; Shi et al. 2013).

Meanwhile, FoPSC is one of the significant EFs on the top-ten lists affecting software reliability in both survey investigations (Zhu et al. 2015; Zhu and Pham 2017). We define FoPSC as the total number of times the specifications have been changed across all historical versions during software development. In this study, we use the percentage of all changes in a project to estimate the parameters. We employ the data sets provided in Shi et al. (2013) and Loconsole and Borstler (2005) to estimate the distribution of FoPSC. The collected FoPSC data is illustrated in Fig. 2.

Fig. 2 The collected FoPSC data

Considering the definition of FoPSC, either a gamma distribution or a beta distribution is appropriate for modeling FoPSC. We compare the log-likelihood values of the gamma and beta distributions for the collected data and conclude that the beta distribution is a better fit for FoPSC. The parameters of the beta distribution are also estimated from the collected data, stated as follows: \( {\text{FoPSC }}\sim Beta \left( {1.411, 7.409} \right) \).
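The following sketch illustrates this kind of log-likelihood comparison on an assumed, illustrative FoPSC sample (the actual collected data is shown in Fig. 2); the paper reports \( Beta \left( {1.411, 7.409} \right) \) as the better fit.

```python
# Sketch: compare gamma vs. beta fits for FoPSC by log-likelihood.
# The sample below is illustrative only, not the data in Fig. 2.
import numpy as np
from scipy import stats

fopsc_sample = np.array([0.05, 0.12, 0.22, 0.08, 0.31, 0.15, 0.10, 0.27,
                         0.19, 0.06, 0.14, 0.24, 0.09, 0.18, 0.11, 0.20])

a_g, loc_g, scale_g = stats.gamma.fit(fopsc_sample, floc=0.0)
a_b, b_b, loc_b, scale_b = stats.beta.fit(fopsc_sample, floc=0.0, fscale=1.0)

ll_gamma = np.sum(stats.gamma.logpdf(fopsc_sample, a_g, loc_g, scale_g))
ll_beta = np.sum(stats.beta.logpdf(fopsc_sample, a_b, b_b, loc_b, scale_b))
print(ll_gamma, ll_beta)   # the larger log-likelihood indicates the better fit
```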

3 A generalized MEF-NHPP SRGM

3.1 Some related work

The underlying assumptions of NHPP SRGMs are that software fault detection follows an NHPP and that the software failure intensity is proportional to the software fault detection rate and the number of remaining faults in the program. Most NHPP SRGMs are proposed in terms of the following equation (Zhu and Pham 2018a, b, c; Teng and Pham 2006; Musa 1980; Hsu et al. 2011; Pham 2014; Goel and Okumoto 1979; Pham 2007).

$$ \frac{d}{dt}m\left( t \right) = h\left( t \right)\left[ {N\left( t \right) - m\left( t \right)} \right] $$
(2)

where \( m\left( t \right) \) is the expected number of software failures detected by time \( t \), \( N\left( t \right) \) represents the fault content function, and \( h\left( t \right) \) is the software fault detection rate per unit of time. Depending on the model assumptions, \( N\left( t \right) \) and \( h\left( t \right) \) are modeled as constants or time-dependent functions. For example, the Goel–Okumoto model (Goel and Okumoto 1979) assumes that \( h\left( t \right) = b \) and \( N\left( t \right) = a. \) The inflection S-shaped model (Pham 2007) assumes that \( h\left( t \right) = \frac{b}{{1 + \beta e^{ - bt} }} \) and \( N\left( t \right) = a. \) The PNZ model (Pham 2007) assumes that \( h\left( t \right) = \frac{b}{{1 + \beta e^{ - bt} }} \) and \( N\left( t \right) = a\left( {1 + \alpha t} \right) \).
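As a quick illustration of Eq. (2), the sketch below (with assumed, illustrative parameter values) solves the differential equation numerically for the Goel–Okumoto choices \( h\left( t \right) = b \) and \( N\left( t \right) = a \) and compares the result with the well-known closed form \( m\left( t \right) = a\left( {1 - e^{ - bt} } \right) \).

```python
# Sketch: numerically integrate dm/dt = h(t)[N(t) - m(t)] (Eq. 2) for the
# Goel-Okumoto case and compare with its closed form. Parameters are illustrative.
import numpy as np
from scipy.integrate import solve_ivp

a, b = 140.0, 0.08          # assumed fault content and detection rate

def dm_dt(t, m):
    h = b                   # constant detection rate (Goel-Okumoto)
    N = a                   # constant fault content
    return h * (N - m)

t_grid = np.linspace(0.0, 40.0, 81)
sol = solve_ivp(dm_dt, (0.0, 40.0), y0=[0.0], t_eval=t_grid)

m_numeric = sol.y[0]
m_closed = a * (1.0 - np.exp(-b * t_grid))
print(np.max(np.abs(m_numeric - m_closed)))   # should be close to 0
```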

In order to capture the random development/application environments or the impact of an EF on software reliability, a stochastic software fault detection rate is adopted by replacing \( h\left( t \right) \) with \( h\left( {t,\eta } \right) \), where \( \eta \) represents the random environment effect or the EF. Equation (2) is then reformulated as follows

$$ \frac{d}{dt}m\left( {t,\eta } \right) = h\left( {t,\eta } \right)\left[ {N\left( t \right) - m\left( {t,\eta } \right)} \right] $$
(3)

As an illustration, Pham (2014) considered \( h\left( {t,\eta } \right) = h\left( t \right) \eta \), named the dynamic multiplicative noise model, and \( N\left( t \right) = N \) in the model, in which \( \eta \) is a random variable. Pham and Pham (2019) considered \( h\left( {t,w} \right) = h\left( t \right) + \dot{M}\left( {t,w} \right) \) as a dynamic additive noise model in the software reliability model, where \( \dot{M}\left( {t,w} \right) \) denotes the derivative of \( M \) with respect to time \( t \) and \( M\left( t \right) \) is defined as a martingale with respect to the filtration (\( {\mathcal{F}}_{t} , t \ge 0 \)).

Later, inspired by Pham (2014) and Pham and Pham (2019), Zhu and Pham (2018c) considered both the dynamic multiplicative model and the additive noise model in the SRGM, described as \( h\left( {t,\eta } \right) = h\left( t \right) + \lambda_{0} G\left( {t,\eta } \right) + \dot{B}\left( t \right) \). A software reliability model considering a single EF and its impact was developed by Zhu and Pham (2018c), given as follows

$$ \frac{d}{dt}m\left( {t,\eta } \right) = \left[ {h\left( t \right) + \lambda_{0} G\left( {t,\eta } \right) + \dot{B}\left( t \right)} \right]\left[ {N\left( t \right) - m\left( {t,\eta } \right)} \right] $$
(4)

where \( \eta \) represents the EF, PoRM; \( G\left( {t,\eta } \right) \) is a time-dependent function; \( \lambda_{0} \) is the coefficient associated with \( G\left( {t,\eta } \right) \); and standard Gaussian white noise is represented by \( \dot{B}\left( t \right) \), in which

$$ \frac{d}{dt}B\left( t \right) = \dot{B}\left( t \right) $$
(5)

where \( B\left( t \right) \) denotes Brownian motion.

Brownian motion is a martingale as well (Mikosch 1998; Mörters and Peres 2010). Mikosch (1998) showed that { \( B\left( t \right):t \ge 0 \) } and { \( B^{2} \left( t \right) - t:t \ge 0 \) } are both martingales with respect to the natural filtration (\( {\mathcal{F}}_{t} , t \ge 0 \)), where { \( B\left( t \right):t \ge 0 \) } denotes Brownian motion. One of the martingale properties that can be applied to Eq. (4) is

$$ E\left[ {\mathop \smallint \limits_{v}^{t} h\left( {s,w} \right)ds} \right] = \mathop \smallint \limits_{v}^{t} h\left( s \right)ds $$
(6)

Meanwhile, \( \dot{B}\left( t \right) \) is a standard Gaussian process with the covariance structure shown as follows

$$ E\left[ {\dot{B}\left( t \right)\dot{B}\left( u \right)} \right] = \delta \left( {u - t} \right), 0 < t < u $$
(7)

where \( \delta \) is the Dirac delta measure. Thus, the general solution of Eq. (4) obtained by Zhu and Pham (2018c) is given as follows

$$ m\left( {t,\eta } \right) = N\left( t \right) - N\left( 0 \right)e^{{ - \mathop \smallint \limits_{0}^{t} \left[ {h\left( s \right) + \lambda_{0} G\left( {s,\eta } \right) + \dot{B}\left( s \right)} \right]ds}} - \mathop \smallint \limits_{0}^{t} e^{{ - \mathop \smallint \limits_{u}^{t} \left[ {h\left( s \right) + \lambda_{0} G\left( {s,\eta } \right) + \dot{B}\left( s \right)} \right]ds}} N^{\prime}\left( u \right)du $$
(8)

However, Zhu and Pham (2018c) only considered a single EF in the software reliability model, while many other EFs also have significant impacts on software reliability in software development (Zhu et al. 2015; Zhu and Pham 2017). Thus, we propose a theoretical framework of the MEF-NHPP SRGM incorporating multiple EFs and the associated randomness in the following section.

3.2 A generalized MEF-NHPP SRGM

Considering the significant impacts of EFs on software reliability in software development in recent survey investigations (Zhu et al. 2015; Zhu and Pham 2017) and the great changes in the complicated and human-centered software development process, we develop a generalized MEF-NHPP SRGM incorporating multiple EFs and the associated randomness induced by these EFs under the martingale framework.

The assumptions of the proposed MEF-NHPP SRGM are described below.

(1) Software fault detection follows an NHPP.

(2) The software failure intensity is proportional to the remaining faults in the program.

(3) The manifestation of software failures is due to the remaining faults in the program.

(4) Software faults are independent.

(5) Multiple EFs are considered in the proposed model. All EFs are independent; the correlation between EFs is not considered in this study.

(6) The randomness induced by the impact of EFs is imposed on the software fault detection rate and modeled by a martingale process, specifically Brownian motion.

Hence, a theoretical MEF-NHPP software reliability model is proposed as follows

$$ \frac{d}{dt}m\left( {t,\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right) = \left[ {h\left( t \right) + \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {t,\eta_{i} } \right) + \dot{B}\left( t \right)} \right]\left[ {N\left( t \right) - m\left( {t,\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right)} \right] $$
(9)

where \( m\left( {0,\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right) = 0. \) \( \left( {\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right) \) represents the n-dimensional vector. \( \eta_{i} \) is a random variable and represents \( {\text{EF}}_{i} , \) \( i = 1, 2, \ldots , n. \) \( m\left( {t,\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right) \) represents the expected number of software failures detected by time t considering multiple EFs. \( h\left( t \right) \) represents the software fault detection rate per unit of time without the impact of EFs. \( G_{i} \left( {t,\eta_{i} } \right) \) is a time-dependent function, which also denotes the effect brought by \( {\text{EF}}_{i} , i = 1, 2, \ldots , n \), on the software fault detection rate per unit of time. \( \lambda_{i} \) denotes the coefficient associated with \( G_{i} \left( {t,\eta_{i} } \right) \). \( N\left( t \right) \) is the fault content function. \( \dot{B}\left( t \right) \) is a standard Gaussian white noise, as presented in Eq. (5).

By applying the martingale property and the general solution from Zhu and Pham (2018c) and Pham and Pham (2019), the mean value function of the proposed MEF-NHPP SRGM is obtained as

$$ m\left( {t,\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right) = N\left( t \right) - N\left( 0 \right)e^{{ - \mathop \smallint \limits_{0}^{t} \left( {h\left( s \right) + \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {s,\eta_{i} } \right) + \dot{B}\left( s \right)} \right)ds}} - \mathop \smallint \limits_{0}^{t} e^{{ - \mathop \smallint \limits_{u}^{t} \left( {h\left( s \right) + \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {s,\eta_{i} } \right) + \dot{B}\left( s \right)} \right)ds}} N^{\prime}\left( u \right)du $$
(10)

By applying Eqs. (6) and (7) to \( h\left( t \right) + \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {t,\eta_{i} } \right) + \dot{B}\left( t \right) \), we can obtain the following equation

$$ \mathop \smallint \limits_{0}^{t} \left[ {h\left( s \right) + \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {s,\eta_{i} } \right) + \dot{B}\left( s \right)} \right]ds = \mathop \smallint \limits_{0}^{t} \left[ {h\left( s \right) + \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {s,\eta_{i} } \right)} \right]ds + B\left( t \right) $$
(11)

By substituting Eq. (11) into Eq. (10) and taking the expectation with respect to the Brownian motion, the mean value function is thus obtained

$$ \bar{m}\left( {t,\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} } \right) = N\left( t \right) - N\left( 0 \right)e^{{ - \mathop \smallint \limits_{0}^{t} h\left( s \right)ds}} e^{{\frac{t}{2}}} e^{{ - \mathop \smallint \limits_{0}^{t} \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {s, \eta_{i} } \right)ds}} - \mathop \smallint \limits_{0}^{t} e^{{ - \mathop \smallint \limits_{u}^{t} h\left( s \right)ds}} e^{{\frac{t - u}{2}}} e^{{ - \mathop \smallint \limits_{u}^{t} \mathop \sum \limits_{i = 1}^{n} \lambda_{i} G_{i} \left( {s,\eta_{i} } \right)ds}} N^{\prime}\left( u \right)du $$
(12)
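As a brief note on the derivation, the factors \( e^{{\frac{t}{2}}} \) and \( e^{{\frac{t - u}{2}}} \) in Eq. (12) arise from the moment generating function of the normal distribution, since \( B\left( t \right)\sim N\left( {0,t} \right) \) and \( B\left( t \right) - B\left( u \right)\sim N\left( {0,t - u} \right) \):

$$ E\left[ {e^{ - B\left( t \right)} } \right] = e^{{\frac{t}{2}}} ,\quad E\left[ {e^{{ - \left( {B\left( t \right) - B\left( u \right)} \right)}} } \right] = e^{{\frac{t - u}{2}}} $$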

Let

$$ G_{i} \left( {t,\eta_{i} } \right) = \eta_{i} v_{i} \left( t \right) $$
(13)

where \( v_{i} \left( t \right) \) is a time-dependent function, which also represents the effect of time on \( EF_{i} \), \( i = 1, \ldots , n \).

As discussed in the model assumptions, all EFs are independent in this study, and each EF is represented by a random variable. To obtain an explicit solution of Eq. (12), we take the expectation of both sides of Eq. (12) with respect to \( \eta_{1} , \eta_{2} , \ldots , \) \( {\text{and}} \,\eta_{n} \). Hence, the mean value function of the proposed MEF-NHPP SRGM can be expressed as

$$ \bar{m}_{{\eta_{1} ,\eta_{2} , \ldots ,\eta_{n} }} \left( t \right) = N\left( t \right) - N\left( 0 \right)e^{{ - \mathop \smallint \limits_{0}^{t} h\left( s \right)ds}} e^{{\frac{t}{2}}} \left[ {\mathop \prod \limits_{i = 1}^{n} \mathop \smallint \limits_{0}^{\infty } e^{{ - \mathop \smallint \limits_{0}^{t} \lambda_{i} \eta_{i} v_{i} \left( s \right)ds}} f\left( {\eta_{i} } \right)d\eta_{i} } \right] - \mathop \smallint \limits_{0}^{t} N^{\prime}\left( u \right)e^{{ - \mathop \smallint \limits_{u}^{t} \left( {h\left( s \right) - \frac{1}{2}} \right)ds}} \left[ {\mathop \prod \limits_{i = 1}^{n} \mathop \smallint \limits_{0}^{\infty } e^{{ - \mathop \smallint \limits_{u}^{t} \lambda_{i} \eta_{i} v_{i} \left( s \right)ds}} f\left( {\eta_{i} } \right)d\eta_{i} } \right]du $$
(14)

Equation (14) is the generalized mean value function in consideration of multiple EFs and the associated randomness. If the distribution of each EF is known, a closed-form expression of Eq. (14) can most likely be obtained by applying the Laplace transform of each probability density function.

3.3 A specific MEF-NHPP SRGM

As discussed above, to demonstrate the performance of the proposed MEF-NHPP SRGM, we incorporate two specific EFs, PoRM and FoPSC, into the proposed model. The mean value function of the specific MEF-NHPP SRGM is thus obtained as follows

$$ \begin{aligned} \bar{m}_{{\eta_{1} ,\eta_{2} }} \left( t \right) & = N\left( t \right) - N\left( 0 \right)e^{{ - \mathop \smallint \limits_{0}^{t} h\left( s \right)ds}} e^{{\frac{t}{2}}} \left[ {\mathop \prod \limits_{i = 1}^{2} \mathop \smallint \limits_{0}^{\infty } e^{{ - \mathop \smallint \limits_{0}^{t} \lambda_{i} \eta_{i} v_{i} \left( s \right)ds}} f\left( {\eta_{i} } \right)d\eta_{i} } \right] \\ & \quad - \mathop \smallint \limits_{0}^{t} N^{\prime}\left( u \right)e^{{ - \mathop \smallint \limits_{u}^{t} \left( {h\left( s \right) - \frac{1}{2}} \right)ds}} \left[ {\mathop \prod \limits_{i = 1}^{2} \mathop \smallint \limits_{0}^{\infty } e^{{ - \mathop \smallint \limits_{0}^{t} \lambda_{i} \eta_{i} v_{i} \left( s \right)ds}} f\left( {\eta_{i} } \right)d\eta_{i} } \right]du \\ \end{aligned} $$
(15)

where \( \eta_{1} \) denotes PoRM, \( \eta_{2} \) denotes FoPSC. \( \lambda_{1} \) and \( \lambda_{2} \) are the coefficients associated with the function \( G_{1} \left( {t,\eta_{1} } \right) \) and \( G_{2} \left( {t,\eta_{2} } \right) \), respectively. \( v_{1} \left( t \right) \) and \( v_{2} \left( t \right) \) represent the time-dependent function and reflect the effect of time on PoRM and FoPSC, respectively.

A gamma distribution with parameters \( \gamma_{1} \) and \( \theta_{1} \) is applied to model PoRM. The probability density function (PDF) of PoRM is given as follows

$$ f\left( {\eta_{1} } \right) = \frac{{\theta_{1}^{{\gamma_{1} }} \eta_{1}^{{\gamma_{1} - 1}} e^{{ - \theta_{1} \eta_{1} }} }}{{\varGamma \left( {\gamma_{1} } \right)}} $$
(16)

A beta distribution with parameters \( \beta_{1} \) and \( \beta_{2} \) is applied to model FoPSC. The PDF of FoPSC is given as follows

$$ f\left( {\eta_{2} } \right) = \frac{{\varGamma \left( {\beta_{1} + \beta_{2} } \right)\eta_{2}^{{\beta_{1} - 1}} (1 - \eta_{2} )^{{\beta_{2} - 1}} }}{{\varGamma \left( {\beta_{1} } \right)\varGamma (\beta_{2} )}} $$
(17)

The Laplace transform of a probability density function \( f\left( x \right) \) is defined as \( F^{*} \left( s \right) = \mathop \smallint \nolimits_{0}^{\infty } e^{ - sx} f\left( x \right)dx \), which satisfies

$$ \mathop \smallint \limits_{0}^{\infty } xe^{ - sx} f\left( x \right)dx = - \frac{{dF^{*} \left( s \right)}}{ds} $$
(18)

By applying Eq. (18), the Laplace transform of Eq. (16) is given as follows

$$ F_{{\eta_{1} }}^{*} \left( s \right) = \left[ {\frac{{\theta_{1} }}{{\theta_{1} + s}}} \right]^{{\gamma_{1} }} $$
(19)

By applying Eq. (18), the Laplace transform of Eq. (17) is given as follows (Teng and Pham 2006)

$$ F_{{\eta_{2} }}^{*} \left( s \right) = e^{ - s} \times HG\left( {\left[ {\beta_{2} } \right],\left[ {\beta_{1} + \beta_{2} } \right],s} \right) $$
(20)

where \( {\text{HG}}\left( {\left[ {\beta_{2} } \right],\left[ {\beta_{1} + \beta_{2} } \right],s} \right) \) is the generalized hypergeometric function defined as

$$ HG\left( {\left[ {a_{1} ,a_{2} , \ldots ,a_{m} } \right],\left[ {b_{1} ,b_{2} , \ldots ,b_{n} } \right],s} \right) = \mathop \sum \limits_{j = 0}^{\infty } \left[ {\frac{{s^{j} \mathop \prod \nolimits_{i = 1}^{m} \frac{{\varGamma \left( {a_{i} + j} \right)}}{{\varGamma \left( {a_{i} } \right)}}}}{{\mathop \prod \nolimits_{i = 1}^{n} \frac{{\varGamma \left( {b_{i} + j} \right)}}{{\varGamma \left( {b_{i} } \right)}}j!}}} \right] $$
(21)

Therefore, the Laplace transform of beta distribution can be further written as

$$ \begin{aligned} F_{{\eta_{2} }}^{*} \left( s \right) & = e^{ - s} \left[ {\mathop \sum \limits_{j = 0}^{\infty } \frac{{\varGamma \left( {\beta_{1} + \beta_{2} } \right)\varGamma \left( {\beta_{2} + j} \right)s^{j} }}{{\varGamma (\beta_{2} )\varGamma \left( {\beta_{1} + \beta_{2} + j} \right)j!}}} \right] \\ & = \mathop \sum \limits_{j = 0}^{\infty } \frac{{\varGamma \left( {\beta_{1} + \beta_{2} } \right)\varGamma \left( {\beta_{2} + j} \right) \times Poisson\left( {j,s} \right)}}{{\varGamma (\beta_{2} )\varGamma \left( {\beta_{1} + \beta_{2} + j} \right)}} \end{aligned} $$
(22)

where \( Poisson\left( {j,s} \right) = \frac{{s^{j} e^{ - s} }}{j!} \).
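As a quick sanity check (not from the paper), the following sketch verifies numerically that the truncated series of Eq. (22) reproduces the Laplace transform of the beta density, using the fitted FoPSC parameters from Sect. 2.2 and an arbitrary illustrative value of \( s \).

```python
# Sketch: verify that the series in Eq. (22) matches the Laplace transform of
# the Beta(beta1, beta2) density computed by direct numerical integration.
import numpy as np
from scipy import integrate, stats
from scipy.special import gammaln

beta1, beta2, s = 1.411, 7.409, 2.0      # fitted FoPSC parameters, illustrative s

# direct numerical Laplace transform of the beta PDF
direct, _ = integrate.quad(lambda x: np.exp(-s * x) * stats.beta.pdf(x, beta1, beta2), 0.0, 1.0)

# truncated series of Eq. (22)
j = np.arange(80)
series = np.exp(gammaln(beta1 + beta2) + gammaln(beta2 + j)
                - gammaln(beta2) - gammaln(beta1 + beta2 + j)
                + j * np.log(s) - s - gammaln(j + 1)).sum()

print(direct, series)    # the two values should agree closely
```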

Substituting Eqs. (19) and (22) into Eq. (15), the mean value function of the specific MEF-NHPP SRGM is thus obtained as follows

$$ \bar{m}_{{\eta_{1} ,\eta_{2} }} \left( t \right) = N\left( t \right) - \left[ {\frac{{\theta_{1} }}{{\theta_{1} + \mathop \smallint \nolimits_{0}^{t} \lambda_{1} v_{1} \left( s \right)ds}}} \right]^{{\gamma_{1} }} \left[ {\mathop \sum \limits_{j = 0}^{\infty } \frac{{\varGamma \left( {\beta_{1} + \beta_{2} } \right)\varGamma \left( {\beta_{2} + j} \right) \times Poisson\left( {j, \mathop \smallint \nolimits_{0}^{t} \lambda_{2} v_{2} \left( s \right)ds} \right)}}{{\varGamma (\beta_{2} )\varGamma \left( {\beta_{1} + \beta_{2} + j} \right)}}} \right]\left[ {N\left( 0 \right)e^{{ - \mathop \smallint \limits_{0}^{t} h\left( s \right)ds}} e^{{\frac{t}{2}}} + \mathop \smallint \limits_{0}^{t} N^{\prime}\left( u \right)e^{{ - \mathop \smallint \limits_{u}^{t} \left( {h\left( s \right) - \frac{1}{2}} \right)ds}} du} \right] $$
(23)

where \( Poisson\left( {j,\mathop \smallint \nolimits_{0}^{t} \lambda_{2} v_{2} \left( s \right)ds} \right) = \frac{{\left( {\mathop \smallint \nolimits_{0}^{t} \lambda_{2} v_{2} \left( s \right)ds} \right)^{j} e^{{ - \mathop \smallint \nolimits_{0}^{t} \lambda_{2} v_{2} \left( s \right)ds}} }}{j!} \).

Different formulations of \( h\left( t \right), \) \( v_{i} \left( t \right) \) and \( N\left( t \right) \), reflecting different testing scenario assumptions, can be substituted into Eq. (23) to obtain the final solution. As an example, let \( h\left( t \right) = \frac{b}{{1 + ce^{ - bt} }}, \) \( v_{1} \left( t \right) = e^{{ - a_{1} t}} , \) \( v_{2} \left( t \right) = e^{{ - a_{2} t}} \) and \( N\left( t \right) = \frac{1}{k}e^{kt} , \) where \( b, c, a_{1} , a_{2} , \) and \( k \) are the coefficients of \( h\left( t \right), v_{1} \left( t \right), v_{2} \left( t \right), \) and \( N\left( t \right), \) respectively. Substituting \( h\left( t \right), v_{1} \left( t \right), v_{2} \left( t \right) \) and \( N\left( t \right) \) into Eq. (23), the mean value function of the proposed specific MEF-NHPP SRGM with the selected functions is obtained as follows

$$ \bar{m}_{{\eta_{1} ,\eta_{2} }} \left( t \right) = \frac{1}{k}e^{kt} - \frac{{e^{{\frac{t}{2}}} }}{{c + e^{bt} }}\left[ {\frac{{\theta_{1} }}{{\theta_{1} + \frac{{\lambda_{1} }}{{a_{1} }}\left( {1 - e^{{ - a_{1} t}} } \right)}}} \right]^{{\gamma_{1} }} \left[ {\mathop \sum \limits_{j = 0}^{\infty } \frac{{\varGamma \left( {\beta_{1} + \beta_{2} } \right)\varGamma \left( {\beta_{2} + j} \right) \times Poisson\left( {j,\frac{{\lambda_{2} }}{{a_{2} }}\left( {1 - e^{{ - a_{2} t}} } \right)} \right)}}{{\varGamma (\beta_{2} )\varGamma \left( {\beta_{1} + \beta_{2} + j} \right)}}} \right]\left[ {\frac{c + 1}{k} - \frac{c}{{k - \frac{1}{2}}}e^{{\left( {k - \frac{1}{2}} \right)t}} - \frac{1}{{b + k - \frac{1}{2}}}e^{{\left( {b + k - \frac{1}{2}} \right)t}} + \frac{c}{{k - \frac{1}{2}}} + \frac{1}{{b + k - \frac{1}{2}}}} \right] $$
(24)

where \( Poisson\left( {j,\frac{{\lambda_{2} }}{{a_{2} }}\left( {1 - e^{{ - a_{2} t}} } \right)} \right) = \frac{{\left[ {\frac{{\lambda_{2} }}{{a_{2} }}\left( {1 - e^{{ - a_{2} t}} } \right)} \right]^{j} e^{{ - \frac{{\lambda_{2} }}{{a_{2} }}\left( {1 - e^{{ - a_{2} t}} } \right)}} }}{j!} \).
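A minimal numerical sketch of Eq. (24) is given below; it truncates the hypergeometric series and uses the fitted PoRM and FoPSC distribution parameters from Sect. 2 together with assumed, illustrative values for the remaining coefficients (these are not the estimates reported later in Tables 3 and 5).

```python
# Sketch: evaluate the mean value function of Eq. (24) with a truncated series.
# Only the PoRM/FoPSC distribution parameters come from Sect. 2; all other
# coefficients are illustrative placeholders.
import numpy as np
from scipy.special import gammaln

gamma1, theta1 = 6.487, 14.726      # PoRM ~ Gamma(gamma1, theta1)
beta1, beta2 = 1.411, 7.409         # FoPSC ~ Beta(beta1, beta2)
b, c, k = 0.9, 2.0, 0.05            # assumed coefficients of h(t) and N(t)
lam1, a1 = 0.6, 0.5                 # assumed coefficients for the PoRM term
lam2, a2 = 0.5, 0.5                 # assumed coefficients for the FoPSC term

def mvf(t, j_max=80):
    """Evaluate Eq. (24) with the hypergeometric series truncated at j_max terms."""
    s1 = lam1 / a1 * (1.0 - np.exp(-a1 * t))     # integral of lam1*v1(s) over [0, t]
    s2 = lam2 / a2 * (1.0 - np.exp(-a2 * t))     # integral of lam2*v2(s) over [0, t]
    gamma_factor = (theta1 / (theta1 + s1)) ** gamma1
    j = np.arange(j_max + 1)
    log_terms = (gammaln(beta1 + beta2) + gammaln(beta2 + j)
                 - gammaln(beta2) - gammaln(beta1 + beta2 + j)
                 + j * np.log(s2 + 1e-300) - s2 - gammaln(j + 1))
    beta_factor = np.exp(log_terms).sum()        # truncated series in Eq. (24)
    bracket = ((c + 1.0) / k
               - c / (k - 0.5) * np.exp((k - 0.5) * t)
               - 1.0 / (b + k - 0.5) * np.exp((b + k - 0.5) * t)
               + c / (k - 0.5) + 1.0 / (b + k - 0.5))
    return (np.exp(k * t) / k
            - np.exp(t / 2.0) / (c + np.exp(b * t)) * gamma_factor * beta_factor * bracket)

print([round(float(mvf(t)), 2) for t in (1, 5, 10, 20)])
```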

4 Applications

Recently, the increasing adoption of OSS by individuals, software companies and government-supported organizations has promoted the wide application of OSS in our modern society. We employ two Apache OSS project data sets, named Whirr and Juddi, to elucidate the effectiveness and performance of the proposed generalized MEF-NHPP SRGM. As an illustration, we compare the failure prediction performance of the specific MEF-NHPP SRGM with the other SRGMs given in Table 1. Note that only the single-environmental-factor (SEF) model (Zhu and Pham 2018c) considers an EF.

Table 1 NHPP SRGMs

4.1 Parameter estimation and comparison criteria

Least squares estimation (LSE) and maximum likelihood estimation are commonly applied to estimate the unknown parameters. LSE finds the optimal parameter values by minimizing \( S \), described as follows

$$ S = \mathop \sum \limits_{i = 1}^{n} \left[ {m\left( {t_{i} } \right) - y_{i} } \right]^{2} $$
(25)

where \( y_{i} \) is the observed number of software failures at time \( t_{i} , \) \( i = 1, 2, \ldots , n, \) and \( m\left( {t_{i} } \right) \) is the expected number of software failures at time \( t_{i} , i = 1, 2, \ldots , n \). LSE is employed to estimate the unknown parameters in this study. Since the distribution parameters of PoRM and FoPSC are estimated from the real data, as seen in Sect. 2, only the other seven unknown parameters in Eq. (24), namely \( k \), \( b \), \( c, \) \( \lambda_{1} , \) \( \lambda_{2} , \) \( a_{1} , \) and \( a_{2} , \) need to be estimated. A genetic algorithm is applied to minimize Eq. (25) and estimate the unknown parameters.
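The sketch below illustrates this least-squares fitting step in a self-contained way. The paper uses a genetic algorithm to minimize Eq. (25) for the seven parameters of Eq. (24); here, scipy's differential_evolution serves as a stand-in evolutionary optimizer, and a two-parameter Goel–Okumoto mean value function with synthetic data keeps the example compact and runnable.

```python
# Sketch: minimize S in Eq. (25) with an evolutionary optimizer.
# The Goel-Okumoto model and synthetic data are illustrative stand-ins for
# Eq. (24) and the DS-I/DS-II data.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
t_obs = np.arange(1, 25, dtype=float)                     # training time units
y_true = 150.0 * (1.0 - np.exp(-0.08 * t_obs))            # assumed "true" curve
y_obs = y_true + rng.normal(scale=3.0, size=t_obs.size)   # noisy observations

def m(t, a, b):
    return a * (1.0 - np.exp(-b * t))

def sse(params):
    a, b = params
    return np.sum((m(t_obs, a, b) - y_obs) ** 2)           # S in Eq. (25)

bounds = [(1.0, 1000.0), (1e-4, 1.0)]                      # search ranges for a, b
result = differential_evolution(sse, bounds, seed=1, tol=1e-10)
print("estimates:", result.x, "SSE:", result.fun)
```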

Four comparison criteria (Pham 2007; Huang and Kuo 2002; Li et al. 2012), mean squared error (MSE), predictive-ratio risk (PRR), predictive power (PP) and Variation, are employed to evaluate the model performance, listed as follows.

$$ MSE = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left[ {m\left( {t_{i} } \right) - y_{i} } \right]^{2} }}{n - N} $$
(26)
$$ PRR = \mathop \sum \limits_{i = 1}^{n} \left[ {\frac{{m\left( {t_{i} } \right) - y_{i} }}{{m\left( {t_{i} } \right)}}} \right]^{2} $$
(27)
$$ PP = \mathop \sum \limits_{i = 1}^{n} \left[ {\frac{{m\left( {t_{i} } \right) - y_{i} }}{{y_{i} }}} \right]^{2} $$
(28)
$$ Variation = \sqrt {\frac{1}{n - 1}\mathop \sum \limits_{i = 1}^{n} \left[ {y_{i} - m\left( {t_{i} } \right) - Bias} \right]^{2} } $$
(29)

where \( Bias = \frac{1}{n}\sum _{i = 1}^{n} \left[ {m\left( {t_{i} } \right) - y_{i} } \right]. \) n denotes the total number of observations, and N denotes the number of unknown parameters in each model. MSE evaluates the distance of the failure prediction from the observed data. PRR evaluates the distance of the predicted failures from the observed failures against the predicted failures, while PP evaluates the distance of the predicted failures from the observed failures against the observed failures.

We recognize that a model tends to give better fits as the number of unknown parameters increases; that is why we employ the above four criteria to compare the models from different perspectives. For all four comparison criteria, smaller values indicate better prediction power.
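A minimal sketch of the four criteria in Eqs. (26)–(29) is given below; the arrays passed in the example call are illustrative, and n_params corresponds to the number of unknown parameters N of the evaluated model.

```python
# Sketch: compute MSE, PRR, PP and Variation as defined in Eqs. (26)-(29).
import numpy as np

def criteria(m_pred, y_obs, n_params):
    m_pred, y_obs = np.asarray(m_pred, float), np.asarray(y_obs, float)
    n = y_obs.size
    resid = m_pred - y_obs
    mse = np.sum(resid ** 2) / (n - n_params)                            # Eq. (26)
    prr = np.sum((resid / m_pred) ** 2)                                  # Eq. (27)
    pp = np.sum((resid / y_obs) ** 2)                                    # Eq. (28)
    bias = np.mean(resid)
    variation = np.sqrt(np.sum((y_obs - m_pred - bias) ** 2) / (n - 1))  # Eq. (29)
    return mse, prr, pp, variation

print(criteria([11.5, 17.8, 24.9], [12, 18, 24], n_params=2))   # illustrative call
```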

4.2 Numerical example I

In the first numerical example, the Whirr OSS project data, collected from September 2010 to April 2013, is applied to demonstrate the model performance. Whirr is a set of libraries for running cloud services. The Whirr OSS project data is denoted as data set I (DS-I) in this paper. Table 2 describes the time unit \( t_{i} \), the software failures between \( t_{i - 1} \) and \( t_{i} \), denoted as \( y_{i} - y_{i - 1} \), and the cumulative software failures by \( t_{i} \), denoted as \( y_{i} \), as used in Eqs. (25)–(29). For example, the observed number of software failures between time units \( t_{1} \) and \( t_{2} \) is 6, and the cumulative number of software failures by time unit \( t_{2} \) is 12. The observed number of software failures between time units \( t_{2} \) and \( t_{3} \) is 6, and the cumulative number of software failures by time unit \( t_{3} \) is 18. The observed number of software failures between time units \( t_{31} \) and \( t_{32} \) is 3, and the cumulative number of software failures by time unit \( t_{32} \) is 136.

Table 2 DS-I

The first 24 observations of DS-I, \( y_{1} , \ldots , y_{24} \), are used as the training set to estimate the parameters of the selected SRGMs. The remaining observations, \( y_{25} , \ldots , y_{32} \), are used as the testing set. Table 3 presents the comparison of the selected criteria and the estimated parameters of the SRGMs based on the training set obtained by applying LSE. Compared with the other SRGMs that do not consider EFs (Pham 2007) and the SEF model that considers a single EF (Zhu and Pham 2018c), the proposed MEF-NHPP SRGM has the smallest values for all four criteria, as seen in Table 3. As an illustration of the model comparison, we only present Figs. 3 and 4 in this section. Figure 3 compares the actual failures with the failures predicted by the SEF model. Figure 4 compares the actual failures with the failures predicted by the proposed MEF-NHPP SRGM.

Table 3 Parameter estimates and model comparisons of DS-I
Fig. 3 DS-I comparison of actual failures with failure prediction by SEF model

Fig. 4 DS-I comparison of actual failures with failure prediction by the proposed MEF-NHPP SRGM

Based on the research outcomes obtained in Teng and Pham (2006), Musa (1980), Hsu et al. (2011), Pham (2014) and Zhu and Pham (2018c), software reliability models that consider random environments perform better in terms of failure prediction and reliability estimation. Hence, we only present the criteria comparison of the SEF model and the proposed MEF-NHPP SRGM for the testing set. The MSE, PRR, PP and Variation values of the SEF model for the testing set are 273.750, 0.176, 0.131 and 34.903, respectively. The corresponding values of the proposed MEF-NHPP SRGM for the testing set are 161.750, 0.099, 0.078 and 26.445, respectively. The proposed MEF-NHPP SRGM has smaller values for all four criteria on the testing set compared with the SEF model. We thus conclude that the proposed MEF-NHPP SRGM has the best fitting performance.

The given data set, as seen in Table 2, describes software failures from time unit \( t_{1} \) to \( t_{32} \). One of the great advantages of SRGMs is the ability to estimate software failures at a time of interest and determine the optimal release time of the software product. Software practitioners and researchers are also interested in the software failure prediction after time unit \( t_{32} . \) Thus, we estimate the software failures after time unit \( t_{32} \) based on the proposed MEF-NHPP SRGM. Figure 5 shows the predicted software failures between time units \( t_{i} \) and \( t_{i + 1} , i = 32, 33, \ldots , 41, \) which provides a practical reference for the software development team to decide when to stop testing and how much testing resource to allocate to the project.

Fig. 5 DS-I software failure prediction between time units \( t_{i} \) and \( t_{i + 1} , i = 32, 33, \ldots , 41 \)

4.3 Numerical example II

The second numerical example applies the Apache Juddi OSS project data, collected from February 2009 to February 2014. The Juddi OSS project data is denoted as data set II (DS-II) in this paper. Table 4 describes the time unit \( t_{i} \), the software failures between \( t_{i - 1} \) and \( t_{i} \), denoted as \( y_{i} - y_{i - 1} \), and the cumulative software failures by \( t_{i} \), denoted as \( y_{i} \), as used in Eqs. (25)–(29). For example, the observed number of software failures between time units \( t_{1} \) and \( t_{2} \) is 2, and the observed number of software failures between time units \( t_{2} \) and \( t_{3} \) is 8. The cumulative number of software failures by time unit \( t_{2} \) is 9. The observed number of software failures between time units \( t_{32} \) and \( t_{33} \) is 0, and the cumulative number of software failures by time unit \( t_{33} \) is 185.

Table 4 DS-II

The first 28 observations of DS-II, \( y_{1} , \ldots , y_{28} \), are used as the training set to estimate the parameters of the selected SRGMs in the second numerical example. The remaining observations, \( y_{29} , \ldots , y_{33} \), are used as the testing set. Based on the LSE method, we estimate the unknown parameters of the selected SRGMs from the training set. Accordingly, the model comparisons between the proposed MEF-NHPP SRGM and the other SRGMs based on the selected criteria can be obtained. As listed in Table 5, the proposed MEF-NHPP SRGM has the smallest values of MSE, PP and Variation compared with the other SRGMs. In terms of the comparison criterion PRR, the PNZ model has the smallest PRR value; however, the PNZ model also has a much larger MSE value than the proposed MEF-NHPP SRGM. PRR evaluates the distance of the predicted failures to the observed failures against the predicted failures; in other words, it assigns a larger penalty to models that underestimate the failures. MSE is generally considered the primary criterion since it penalizes larger prediction errors more heavily than the others. Hence, the proposed model is concluded to be the best fit since it has the lowest values of MSE, PP and Variation for the training set, compared with the other SRGMs incorporating a single EF (Zhu and Pham 2018c) or not considering EFs (Pham 2007). As an illustration, Fig. 6 displays the comparison between the failures predicted by the SEF model and the actual failures. Figure 7 displays the comparison between the failures predicted by the proposed generalized MEF-NHPP SRGM and the actual failures.

Table 5 Parameter estimates and model comparisons of DS-II
Fig. 6 DS-II comparison of actual failures with failure prediction by SEF model

Fig. 7 DS-II comparison of actual failures with failure prediction by the proposed MEF-NHPP SRGM

Moreover, the criteria comparison of the SEF model and the proposed MEF-NHPP SRGM for the testing set is as follows. The MSE, PRR, PP and Variation values of the SEF model for the testing set are 736.800, 0.162, 0.115 and 50.793, respectively. The corresponding values of the proposed MEF-NHPP SRGM for the testing set are 289.800, 0.056, 0.045 and 37.357, respectively. The proposed MEF-NHPP SRGM has smaller values for all four criteria on the testing set. Therefore, the proposed MEF-NHPP SRGM is concluded to be the best fit.

Software failure prediction can be calculated based on the proposed MEF-NHPP SRGM. Indeed, we provide the software failure prediction after time unit \( t_{33} \) for DS-II as well. Figure 8 shows the software failure prediction between time units \( t_{i} \) and \( t_{i + 1} , i = 33, 34, \ldots , 42, \) based on the proposed generalized MEF-NHPP SRGM. Software failure prediction can be of great help for testing resource allocation, the planning of multiple software releases and the determination of the optimal software release time.

Fig. 8 DS-II software failure prediction between time units \( t_{i} \) and \( t_{i + 1} , i = 33, 34, \ldots , 42 \)

4.4 Reliability prediction

Software reliability within \( \left( {t, t + x} \right) \) can be determined once the unknown parameters are estimated by LSE. Software reliability is calculated by the following equation

$$ R\left( {x |t} \right) = e^{{ - \left[ {m\left( {t + x} \right) - m\left( t \right)} \right]}} $$
(30)

None of the other models considers EFs, except the SEF model. Since OSS projects are significantly impacted by EFs (Zhu and Pham 2017), we only compare the reliability predictions of the SEF model and the proposed MEF-NHPP SRGM. Given time units \( t = 32 \) and \( t = 33 \) for DS-I and DS-II, respectively, and varying \( x \) from time unit 0 to 1.2 in Eq. (30), Figs. 9 and 10 illustrate the comparison of the reliability predictions calculated by the SEF model and the proposed MEF-NHPP SRGM for the two data sets. As seen from Figs. 9 and 10, the reliability predicted by the proposed MEF-NHPP SRGM is lower than that predicted by the SEF model for both data sets.
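A minimal sketch of Eq. (30) is shown below. The mean value function mvf here is only a Goel–Okumoto stand-in with assumed parameters so that the snippet runs on its own; in the actual comparison it would be replaced by the fitted SEF model or the fitted Eq. (24).

```python
# Sketch: compute R(x|t) = exp(-[m(t+x) - m(t)]) as in Eq. (30).
import numpy as np

def mvf(t, a=150.0, b=0.08):          # illustrative stand-in mean value function
    return a * (1.0 - np.exp(-b * t))

def reliability(x, t):
    """Software reliability within (t, t + x), Eq. (30)."""
    return np.exp(-(mvf(t + x) - mvf(t)))

print(reliability(x=1.0, t=32.0))
```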

Fig. 9 Reliability prediction comparison of DS-I

Fig. 10 Reliability prediction comparison of DS-II

5 Conclusions and future research

Given the great changes in software development, such a complicated and human-centered software development process needs to be addressed well. Meanwhile, recent survey investigations (Zhu et al. 2015; Zhu and Pham 2017) have revealed the significant impacts of EFs on software reliability and provided an updated ranking of the importance of EFs in software development. Hence, how to incorporate multiple EFs and the randomness caused by these EFs into the development of a software reliability model is essential yet challenging.

We first develop a generalized MEF-NHPP SRGM with multiple EFs and the associated randomness. Each EF is modeled as a random variable, and the randomness induced by the EFs is captured by the martingale framework. We then incorporate a stochastic software fault detection process into the model to account for the associated randomness. From the proposed generalized MEF-NHPP SRGM, software practitioners and researchers are able to obtain a specific MEF-NHPP SRGM according to their individual application environments. In order to elucidate the effectiveness of the proposed MEF-NHPP SRGM, we select two EFs, PoRM and FoPSC, from recent studies (Zhu et al. 2015; Zhu and Pham 2017) to further develop a specific MEF-NHPP SRGM. Lastly, two OSS data sets are employed to demonstrate the predictive power of the proposed generalized MEF-NHPP SRGM in terms of software failures and reliability.

Future research can proceed in several directions. First, the dependencies between EFs and the impact of such dependencies on software reliability can be further investigated. Second, in this study the impact of EFs is reflected in the software fault detection process; the investigation of such an impact on the total fault content can also be conducted.