Abstract
Computer systems play an essential role in modern science and technology, where hardware and software must work together reliably. In this paper, the performance of a computer system is analyzed under hardware repair, software upgradation, and load recovery, using the Weibull distribution for all random variables with different scale parameters and a common shape parameter. For this purpose, a reliability model is developed with the help of the regenerative point technique and semi-Markov approach. The system may fail due to hardware, software, or load failure. The Weibull distribution is widely used in reliability and life data analysis due to its versatility. It is assumed that all failure and repair rates follow the Weibull distribution and that a single repairman attends to all kinds of failures. Numerical results for mean time to system failure, availability, and profit function highlight the importance of the study.
1 Introduction
Without computers, industry, engineering, medical science, and academia would be severely handicapped in this era. The performance of a computer depends on the quality of its hardware and software, so high-quality hardware and software are required to improve performance and to complete tasks within a specified time. As the use of computer systems grows, so does the risk of system failure. Hardware, software, or load failure can cause the system to fail and lead to economic loss. Overcoming such problems requires a perfect server to carry out the repair, upgrade, and recovery processes of the system.
Scientists and engineers have long sought to enhance the reliability of computer systems by considering cold standby redundancy and other relevant policies, and several studies have addressed system reliability and its applications. Gopalan and Nagarwalla (1985) analyzed a two-unit redundant system with a single server under repair and preventive maintenance with age replacement. Malik and Pawar (2010) examined the economic aspects of a redundant system subjected to inspection for online repair and no repair under an abnormal environment, and Pawar and Malik (2011) studied reliability measures for single-unit systems with distinct failure types operating in abnormal environmental conditions. Gupta et al. (2013) studied the performance of a redundant system of two dissimilar units using Weibull failure and repair laws.
Barak et al. (2014) evaluated a redundant system with inspection and a single server subject to normal and abnormal weather conditions. Kumar and Saini (2014) used the Weibull distribution for failure and repair to investigate the reliability of a single-unit stochastic system with preventive maintenance. Kumar et al. (2015) described the behavior of a cold standby stochastic system subject to maximum repair time. Kumar et al. (2016a) examined the performance of a redundant system using the Weibull distribution for failure and repair, and Kumar et al. (2016b) studied a two-unit redundant system with priority and Weibull distributions for failure and repair. Kumar and Goel (2016) examined the availability and profit of a two-unit redundant system with inspection and preventive maintenance under general distributions. Yadav and Barak (2016) studied a redundant system of identical units with server failure, and Barak et al. (2017a) investigated a cold standby system with a single server subject to inspection and having a refreshment facility.
Gahlot et al. (2018) used copula linguistics and a series configuration to explore repairable system performance under different types of failure and repair policies. Kumar et al. (2018b) studied non-identical redundant stochastic systems subject to priority, preventive maintenance, and Weibull failure and repair distributions to meet the customer's economic requirements. Kumar et al. (2018a) evaluated the economic aspects of a warm standby system with a single server. Kadyan et al. (2020) analyzed a non-identical repairable stochastic system of three units with priority for operation and a cold standby facility. Kumar et al. (2020) examined reliability measures that improve the performance of a soft water treatment and supply plant. Gupta et al. (2021) studied the reliability measures of a generator in a steam turbine power plant and its use in various sectors. Kumar et al. (2021) evaluated the reliability measures of a cold standby system subject to refreshment. Waziri and Yakasai (2022) assessed some proposed replacement models involving moderate fix-up. Aikhuele (2022) developed a statistical reliability-based model for the estimation and optimization of a spur gear system. Maihulla et al. (2022) analyzed the reliability and performance of a series–parallel system using the Gumbel–Hougaard family copula.
The studies above did not examine load recovery and software upgrade in depth. With this in mind, this manuscript investigates the performance of a computer system under hardware repair, software upgrade, and load recovery, using Weibull distributions for all random variables with different scale parameters and a common shape parameter, as shown in Fig. 1. The Weibull distribution is widely used in reliability and life data analysis due to its versatility: depending on the values of its parameters, it can model a variety of life behaviors. The shape and scale parameters determine such characteristics of the distribution as the shape of the density curve, the reliability, and the failure rate. A reliability model is developed for this purpose using the regenerative point technique and a semi-Markov approach. The system may fail due to hardware, software, or load failure. It is assumed that all failure and repair rates follow the Weibull distribution and that a single repairman attends to all failures. Numerical results for mean time to system failure, availability, and profit function illustrate the value of the study.
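As a small illustration of how the shape parameter governs the failure rate, the sketch below evaluates the reliability \(e^{-\alpha t^{\eta}}\) and the failure (hazard) rate \(\alpha \eta t^{\eta - 1}\) implied by the Weibull density form used in this paper. The function names and the sample time points are illustrative assumptions, not part of the original study.

```python
import math

def weibull_reliability(t, rate, shape):
    """R(t) = exp(-rate * t**shape), the survival function that goes with
    the density rate * shape * t**(shape - 1) * exp(-rate * t**shape)."""
    return math.exp(-rate * t ** shape)

def weibull_hazard(t, rate, shape):
    """Failure (hazard) rate h(t) = f(t) / R(t) = rate * shape * t**(shape - 1), t > 0."""
    return rate * shape * t ** (shape - 1)

rate = 0.002  # illustrative scale-type parameter (hypothetical choice)
for shape in (0.5, 1.0, 2.0):
    # Hazard falls with t for shape < 1, is constant for shape = 1, rises for shape > 1.
    hazards = [weibull_hazard(t, rate, shape) for t in (1.0, 10.0, 100.0)]
    survival = [weibull_reliability(t, rate, shape) for t in (1.0, 10.0, 100.0)]
    print(shape, hazards, survival)
```

With \(\eta = 1\) the density reduces to the exponential case, which is why the failure rate is constant there.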
2 System's assumptions
(a) The system (see Fig. 1) consists of two units, one of which is operational and the other of which is on cold standby.
(b) When the operative unit fails, the cold standby unit comes online and starts working.
(c) The system fails as a result of hardware, software, or load failure.
(d) All failure situations are attended to by a single repairman.
(e) If the system fails due to hardware failure, the repairman repairs it.
(f) If the system fails owing to a software malfunction, the repairman upgrades the software.
(g) If the system fails due to load failure, the repairman restores it.
(h) All failure, repair, and upgrade rates follow the Weibull distribution.
3 System's notations
\(R\) | Set of regenerative states (S0, S1, S2, S3, S4, S16) |
\(O/\,Cs\) | Operative unit/cold standby unit |
\(HFur/\,HFUR\) | Failure of hardware under repair/continuously repair from the previous stage |
\(WHf/\,WHF\) | Failure of hardware waiting for repair/continuously waiting for repair from the previous state |
\(Sup/\,SUP\) | Software upgradation/continuously upgradation from the previous state |
\(WSup\,/\,WSUP\) | Software upgradation waiting for repair/continuously waiting for upgradation from the previous state |
\(Ldf/\,LDF\) | Load failure/continuously failure from the previous state |
\(f_{1} (t) = \alpha \eta t^{\eta - 1} e^{{ - \alpha t^{\eta } }}\) | The hardware failure rate of the unit |
\(f_{2} (t) = \beta \eta t^{\eta - 1} e^{{ - \beta t^{\eta } }}\) | The software failure rate of the unit |
\(f_{3} (t) = \gamma \eta t^{\eta - 1} e^{{ - \gamma t^{\eta } }}\) | The load failure rate of the unit |
\(g_{1} (t) = k\eta t^{\eta - 1} e^{{ - kt^{\eta } }}\) | Hardware repair rate of the unit |
\(g_{2} (t) = l\eta t^{\eta - 1} e^{{ - lt^{\eta } }}\) | Software upgradation rate of the unit |
\(g_{3} (t) = m\eta t^{\eta - 1} e^{{ - mt^{\eta } }}\) | The recovery rate of the unit is due to load failure |
\(w(t) = h\eta t^{\eta - 1} e^{{ - ht^{\eta } }}\) | Waiting time for server arrival purpose |
\(\mu_{i}\) | With the system failure time denoted by \(T\), the mean sojourn time in the state \(S_{i}\) is \(\mu_{i} = \int\limits_{0}^{\infty } {P(T > t){\text{d}}t}\) |
\(q_{ij} (t)/Q_{ij} (t)\) | pdf/cdf of direct transition time from \(S_{i} \in R\) to \(S_{j} \in R\) without visiting any other regenerative state |
\(q_{ij.\,k} (t)/Q_{ij.\,k} (t)\) | pdf/cdf of first passage time from \(S_{i} \in R\) to \(S_{j} \in R\) or a failed state \(S_{j}\) with visiting state \(S_{k}\) once in (0,t] |
\(\begin{gathered} q_{ij.k(r,s)} (t)/ \hfill \\ Q_{ij.k(r,s)} (t) \hfill \\ \end{gathered}\) | pdf/cdf of first passage time from \(S_{i} \in R\) to \(S_{j} \in R\) or to a failed state \(S_{j}\) with visiting states \(S_{k}\), \(S_{r}\) once in (0,t] |
\(M_{i} (t)\) | The probability that the system, initially up in the regenerative state \(S_{i} \in R\), remains up until time \(t\) without passing through any other regenerative state |
\(W_{i} (t)\) | The probability that the repairman is busy in the state \(S_{i}\) up to time \(t\) without making a transition to any other regenerative state or returning to the same state via one or more non-regenerative states |
\(\oplus / \otimes\) | Laplace convolution notation/Laplace Stieltjes convolution notation |
\(* / * * /^{\prime}\) | Laplace transform notation (LT)/Laplace Stieltjes transform notation (LST)/function’s derivative notation |
| Indicated up state/failed state/regenerative state respectively |
See Fig. 1.
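The densities in the notation table all have the form \(c\,\eta\, t^{\eta - 1} e^{-c t^{\eta}}\), with distribution function \(1 - e^{-c t^{\eta}}\). The sketch below inverts that distribution function to draw random times, and converts the paper's parameter \(c\) to the scale of the more common form \(e^{-(t/\lambda)^{\eta}}\). The function names are hypothetical helpers, and the values 1.5 and 2.0 are only examples.

```python
import math
import random

def weibull_quantile(u, rate, shape):
    """Solve 1 - exp(-rate * t**shape) = u for t, with u in [0, 1)."""
    return (-math.log(1.0 - u) / rate) ** (1.0 / shape)

def standard_scale(rate, shape):
    """Scale lam such that exp(-(t / lam)**shape) equals exp(-rate * t**shape)."""
    return rate ** (-1.0 / shape)

def sample_time(rng, rate, shape):
    """Inverse-transform sample of one Weibull-distributed time."""
    return weibull_quantile(rng.random(), rate, shape)

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_time(rng, 1.5, 2.0) for _ in range(5)]
print(draws, standard_scale(1.5, 2.0))
```

The conversion matters when calling library routines that expect the scale \(\lambda\) rather than the coefficient \(c\) used here.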
4 Transition probabilities
There are the following possible transition probabilities:
\(p_{01} = \frac{\alpha }{(\alpha + \beta + \gamma )}\), \(p_{03} = \frac{\beta }{(\alpha + \beta + \gamma )}\), \(p_{0,15} = \frac{\gamma }{(\alpha + \beta + \gamma )}\), \(p_{12} = \frac{h}{(h + \alpha + \beta + \gamma )}\), \(p_{19} = \frac{\beta }{(h + \alpha + \beta + \gamma )}\), \(p_{1,\,11} = \frac{\alpha }{(h + \alpha + \beta + \gamma )}\), \(p_{1,\,20} = \frac{\gamma }{(h + \alpha + \beta + \gamma )}\), \(p_{20} = \frac{k}{(k + \alpha + \beta + \gamma )}\) \(p_{2,13} = \frac{\alpha }{(k + \alpha + \beta + \gamma )}\), \(p_{2,14} = \frac{\beta }{(k + \alpha + \beta + \gamma )}\),\(p_{2,19} = \frac{\gamma }{(k + \alpha + \beta + \gamma )}\), \(p_{34} = \frac{h}{(h + \alpha + \beta + \gamma )}\), \(p_{35} = \frac{\alpha }{(h + \alpha + \beta + \gamma )}\), \(p_{36} = \frac{\beta }{(h + \alpha + \beta + \gamma )}\), \(p_{3,21} = \frac{\gamma }{(h + \alpha + \beta + \gamma )}\), \(p_{40} = \frac{l}{(l + \alpha + \beta + \gamma )}\), \(p_{47} = \frac{\alpha }{(l + \alpha + \beta + \gamma )}\), \(p_{48} = \frac{\beta }{(l + \alpha + \beta + \gamma )}\), \(p_{4,22} = \frac{\gamma }{(l + \alpha + \beta + \gamma )}\)
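It is easily verified that the probabilities listed above sum to one for each originating state: \(p_{01} + p_{03} + p_{0,15} = 1\), \(p_{12} + p_{19} + p_{1,11} + p_{1,20} = 1\), \(p_{20} + p_{2,13} + p_{2,14} + p_{2,19} = 1\), \(p_{34} + p_{35} + p_{36} + p_{3,21} = 1\), and \(p_{40} + p_{47} + p_{48} + p_{4,22} = 1\).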
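Because all the Weibull variables share the same shape parameter, the probability that a given event occurs first depends only on the ratio of the scale-type parameters, for example \(p_{01} = \alpha /(\alpha + \beta + \gamma )\). The following sketch checks this for the transitions out of \(S_{0}\) by simulating three competing failure times. The function names, sample size, and seed are assumptions made for illustration; the rate values are those quoted in the Discussion.

```python
import math
import random

def closed_form(alpha, beta, gamma):
    """Probabilities that hardware, software, or load failure occurs first."""
    total = alpha + beta + gamma
    return alpha / total, beta / total, gamma / total

def simulate_first_failure(alpha, beta, gamma, shape, n, seed=0):
    """Monte Carlo estimate of the same probabilities with a common Weibull shape."""
    rng = random.Random(seed)
    rates = (alpha, beta, gamma)
    counts = [0, 0, 0]
    for _ in range(n):
        # 1.0 - rng.random() lies in (0, 1], so the logarithm is always defined.
        times = [(-math.log(1.0 - rng.random()) / r) ** (1.0 / shape) for r in rates]
        counts[times.index(min(times))] += 1
    return tuple(c / n for c in counts)

exact = closed_form(0.002, 0.003, 0.001)
estimate = simulate_first_failure(0.002, 0.003, 0.001, shape=0.5, n=60000)
print(exact, estimate)
```

Changing `shape` should leave the estimates close to the closed-form values, since the common shape cancels out of the ratio.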
5 Mean sojourn time
Let the system’s failure time be denoted by \(T\); the mean sojourn time in the state \(S_{i}\) is:
\(\mu_{{0}} = \int\limits_{0}^{\infty } {P(T > t)dt} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha + \beta + \gamma )^{1/\eta } }}\), \(\mu_{{1}} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha + \beta + \gamma + h)^{1/\eta } }}\), \(\mu_{{2}} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha + \beta + \gamma + k)^{1/\eta } }}\), \(\mu_{{3}} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha + \beta + \gamma + h)^{1/\eta } }}\), \(\mu_{{4}} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha + \beta + \gamma + l)^{1/\eta } }}\), \(\mu_{5} = \mu_{6} = \mu_{7} = \mu_{8} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(l)^{1/\eta } }}\).
\(\mu_{9} = \mu_{11} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(h)^{1/\eta } }}\), \(\mu_{10} = \mu_{12} = \mu_{14} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(k)^{1/\eta } }}\), \(\mu_{13} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha )^{1/\eta } }}\), \(\mu_{{{15}}} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(\alpha + \beta + \gamma + m)^{1/\eta } }}\), \(\mu_{16} = \mu_{17} = \mu_{18} = \mu_{19} = \mu_{20} = \mu_{21} = \mu_{22} = \frac{{\Gamma \left( {1 + \frac{1}{\eta }} \right)}}{{(m)^{1/\eta } }}\)
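The closed forms above are straightforward to evaluate numerically. The sketch below computes \(\Gamma (1 + 1/\eta )/c^{1/\eta }\) for the states \(S_{0}\) to \(S_{4}\), using the parameter values quoted in the Discussion; the value \(k = 0.5\) is an arbitrary pick from the stated range 0.1 to 1.0, and the function and variable names are illustrative.

```python
import math

def mean_sojourn(total_rate, shape):
    """Gamma(1 + 1/shape) / total_rate**(1/shape): the mean of a time whose
    survival function is exp(-total_rate * t**shape)."""
    return math.gamma(1.0 + 1.0 / shape) / total_rate ** (1.0 / shape)

# Values quoted in the Discussion; k = 0.5 is an assumed point in the 0.1-1.0 range.
alpha, beta, gamma_, h, k, l = 0.002, 0.003, 0.001, 0.002, 0.5, 1.5
failure = alpha + beta + gamma_
exit_rates = {
    "S0": failure,
    "S1": failure + h,
    "S2": failure + k,
    "S3": failure + h,
    "S4": failure + l,
}

for shape in (0.5, 1.0, 2.0):
    print(shape, {s: round(mean_sojourn(r, shape), 4) for s, r in exit_rates.items()})
```

For these values the mean sojourn time in \(S_{0}\) shrinks as \(\eta\) goes from 0.5 to 2.0, because \(\alpha + \beta + \gamma\) is much smaller than one.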
6 Mean time to system failure (MTSF)
Let \(\phi_{i} (t)\) be the cumulative distribution function of the first passage time from \(S_{i} \in R\) to a failed state. Using the semi-Markov process and regenerative point technique, and treating the failed state as an absorbing state, the recursive relations for \(\phi_{i} (t)\) are:
Taking the LST of relations (4) and solving for \(\phi_{0}^{**} (s)\), we have
The reliability of the system model is then obtained by taking the inverse LT of Eq. (5). We have
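For orientation, and assuming \(\phi_{0} (t)\) is the distribution function of the time to first system failure starting from \(S_{0}\) with \(\phi_{0} (0) = 0\), the standard relations linking it to the reliability \(R(t) = 1 - \phi_{0} (t)\) and to the MTSF can be written as follows. This is a sketch of the general identities only, not a reconstruction of the paper's Eqs. (5) and (6).

\[
R^{*} (s) = \frac{1 - \phi_{0}^{**} (s)}{s}, \qquad \text{MTSF} = \int\limits_{0}^{\infty } R(t)\,{\text{d}}t = \lim_{s \to 0} R^{*} (s) = \lim_{s \to 0} \frac{1 - \phi_{0}^{**} (s)}{s}.
\]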
7 Steady-State availability
Let \(A_{i} (t)\) be the probability that the system is in the up state at time \(t\), given that the system entered \(S_{i} \in R\) at \(t = 0\). Using the semi-Markov process and regenerative point technique, the recursive relations for \(A_{i} (t)\) are:
where \(M_{0} (t) = e^{{ - (\alpha + \beta + \gamma )t^{\eta } }}\), \(M_{1} (t) = e^{{ - (\alpha + \beta + \gamma + h)t^{\eta } }}\), \(M_{2} (t) = e^{{ - (\alpha + \beta + \gamma + k)t^{\eta } }}\), \(M_{3} (t) = e^{{ - (\alpha + \beta + \gamma + h)t^{\eta } }}\), \(M_{4} (t) = e^{{ - (\alpha + \beta + \gamma + l)t^{\eta } }}\), \(M_{15} (t) = e^{{ - (\alpha + \beta + \gamma + m)t^{\eta } }}\).
Taking the LT of relations (7) and solving for \(A_{0}^{*} (s)\), the steady-state availability is given by
where \(N_{A} = N_{A1} + N_{A2} + N_{A3}\)
8 Busy period of the server due to repair of the failed unit
Let \(B_{i} (t)\) be the probability that the repairman is busy repairing the failed unit at time \(t\), given that the system entered \(S_{i} \in R\) at \(t = 0\). Using the semi-Markov process and regenerative point technique, the recursive relations for \(B_{i} (t)\) are:
where \(W_{1} (t) = e^{{ - (\alpha + \beta + \gamma + h)t^{\eta } }}\), \(W_{2} (t) = e^{{ - (\alpha + \beta + \gamma + k)t^{\eta } }}\), \(W_{3} (t) = e^{{ - (\alpha + \beta + \gamma + h)t^{\eta } }}\), \(W_{4} (t) = e^{{ - (\alpha + \beta + \gamma + l)t^{\eta } }}\), \(W_{15} (t) = e^{{ - (\alpha + \beta + \gamma + m)t^{\eta } }}\).
Taking the LT of relations (10) and solving for \(B_{0}^{R*} (s)\), the time for which the server is busy due to repair is given by
where \(N_{B} = N_{B1} + N_{B2} + N_{B3}\)
Also, \(D^{\prime}\) is earlier defined by Eq. (9).
9 Expected number of visits by the server
Let \(V_{i} (t)\) be the expected number of visits by the repairman for repair in (0, t], given that the system entered \(S_{i} \in R\) at \(t = 0\). Using the semi-Markov process and regenerative point technique, the recursive relations for \(V_{i} (t)\) are:
Taking the LST of relations (12) and solving for \(V_{0}^{**} (s)\), the expected number of visits by the server can be obtained as
where \(V_{r} = [p_{20} (p_{40} + p_{47} ) + p_{2,\,14} p_{40} ] \times \{ (1 - p_{3,21} )(1 - p_{15,18} )(1 - p_{1,20} )\} \,\)
and \(D^{\prime}\) is earlier defined by Eq. (9).
10 Particular cases
where
Z1, Z2, and Z3 are defined earlier.
11 Profit analysis
The profit of the system can be analyzed using the profit function:
where \(E_{0} = 5000\) (Revenue per unit uptime of the system), \(E_{1} = 500\) (Charge per unit time for which server is busy due to repair), \(E_{2} = 200\) (Charge per unit visit made by the server).
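The sketch below shows how such a profit function is typically evaluated. It assumes the usual linear form, revenue from uptime minus the cost of server busy time minus the cost of server visits, that is \(P = E_{0} A_{0} - E_{1} B_{0} - E_{2} V_{0}\); this form is an assumption rather than a quotation of the paper's equation, and the input values in the example call are hypothetical.

```python
def profit(availability, busy_fraction, visits_per_unit_time,
           revenue=5000.0, busy_cost=500.0, visit_cost=200.0):
    """Assumed linear profit: E0 * A0 - E1 * B0 - E2 * V0.

    The defaults are the E0, E1, E2 values stated in the text; the three
    inputs are steady-state measures supplied by the caller.
    """
    return (revenue * availability
            - busy_cost * busy_fraction
            - visit_cost * visits_per_unit_time)

# Hypothetical steady-state measures, chosen only to exercise the function.
example = profit(availability=0.9, busy_fraction=0.1, visits_per_unit_time=0.01)
print(example)
```

Under this form, each unit of availability is worth ten times a unit of busy time in magnitude, so availability dominates the profit for the stated cost values.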
12 Discussion
The numerical behavior of the MTSF, availability, and profit function of the system is presented in Tables 1, 2, and 3, respectively, for hardware repair rates in the range 0.1–1.0. According to these tables, all the reliability measures increase with the hardware repair rate when the other parameters are held fixed at hardware failure rate α = 0.002, software failure rate β = 0.003, load failure rate γ = 0.001, software upgradation rate l = 1.5, server arrival rate h = 0.002, and load recovery rate m = 2.5. The first and second, third and fourth, and fifth and sixth columns of these tables show the effect of raising the software upgradation rate from l = 1.5 to 2.0 with the shape parameter fixed at η = 0.5, η = 1.0, and η = 2.0, respectively. Comparing the first, third, and fifth columns of Tables 1, 2, and 3 shows that the shape parameter has a stronger effect on the reliability measures of the system than the software upgradation rate.
Tables 4 and 5 show the behavior of the availability and profit function when the software upgradation rate l varies over the range 0.1–1.0: both increase with l and decrease as the shape parameter (η) increases, with the other parameters fixed at α = 0.002, β = 0.003, γ = 0.001, k = 1.5, h = 0.002, m = 2.5. The system's availability and profit function also improve when the hardware repair rate (k) is increased from 1.5 to 2 while the other parameters remain unchanged (Tables 4, 5).
The availability and profit function exhibit an increasing trend when the load recovery rate (m) is in the range [0.1–1.0], and declining trends when the shape parameter (η) is increased from 0.5, 1.0 to 2.0 while keeping other parameters constant such as hardware failure rate α = 0.002, software failure rate β = 0.003, load failure rate γ = 0.001, hardware repair rate k = 1.5, waiting for server arrival rate h = 0.002, and software upgradation rate l = 2.5, respectively, as shown in Tables 6 and 7.
The system's availability and profit function improve when the hardware repair rate (k) is increased from 2.5 to 3 while all other parameters remain unchanged, and the shape parameter has three different values 0.5, 1.0, and 2.0. Hence, the load recovery rate has a meaningful impact on the reliability measures of the system with different values of the shape parameter.
13 Conclusion
This research focuses on a two-unit cold standby redundant computer system subject to hardware repair, software upgrade, and load recovery. When the hardware repair rate (k) rises, the system MTSF, availability, and profit function rise as well; however, when the shape parameter (η) rises, these measures fall. Likewise, when the software upgradation rate (l) rises, the MTSF, availability, and profit function rise while the other parameters remain constant. The main finding of the study is that because software upgrades and load recovery may be costly and time-consuming, hardware repairs are a cost-effective and lucrative way to improve the system's availability and profitability.
14 Future Scope
Generally, hardware repair facilities enhance system performance, availability, and profit because a hardware failure of the system or a component is visible and easier to repair than software upgradation and load recovery. The study may prove useful in industry, geophysical sciences, environmental science, green energy, and clean energy systems that follow this type of mathematical model.
References
Aikhuele D (2022) Development of a statistical reliability-based model for the estimation and optimization of a spur gear system. J Comput Cogn Eng 1(1):1–6. https://doi.org/10.47852/bonviewJCCE2202153
Barak AK, Barak MS, Malik SC (2014) Reliability analysis of a single-unit system with inspection subject to different weather conditions. J Stat Manag Syst 17(2):195–206
Barak MS, Yadav D, Barak SK (2017a) Cost-benefit analysis of a redundant system with the server having a refreshment facility subject to inspection. Int J Adv Eng Res Sci 4(6):24–31
Barak MS, Yadav D, Barak SK (2017b) Stochastic analysis of a cold standby system with conditional failure of the server. Int J Stat Reliab Eng 4(1):65–69
Gahlot M, Singh VV, Ayagi HI, Goel CK (2018) Performance assessment of repairable system in the series configuration under different types of failure and repair policies using copula linguistics. Int J Reliab Saf 12(4):348–363
Gopalan MN, Nagarwalla HE (1985) Cost-benefit analysis of a one-server two-unit cold standby system with repair and age replacement. Microelectron Reliab 25(5):977–990
Gupta R, Kumar P, Gupta A (2013) Cost-benefit analysis of a two dissimilar unit cold standby system with Weibull failure and repair laws. Int J Syst Assur Eng Manage 4(4):327–334
Gupta N, Kumar A, Saini M (2021) Reliability and Maintainability Investigation of Generator in Steam Turbine Power Plant using RAMD analysis. J Phys Conf Ser 1714(1):012009
Kadyan S, Barak MS, Gitanjali (2020) Stochastic analysis of a non-identical repairable system of three units with priority for operation and simultaneous working of cold standby units. Int J Stat Reliab Eng 7(2):269–274
Kumar J, Goel M (2016) Availability and profit analysis of a two-unit cold standby system for general distribution. Cogent Math 3(1):1262937
Kumar A, Saini M (2014) Cost-benefit analysis of a single-unit system with preventive maintenance and Weibull distribution for failure and repair activities. J Appl Math Stat Inf 10(2):5–19
Kumar A, Barak MS, Devi K (2016a) Performance analysis of a redundant system with Weibull failure and repair laws. Investig Oper 37(3):247–257
Kumar A, Saini M, Devi K (2016b) Analysis of a redundant system with priority and Weibull distribution for failure and repair. Cogent Math 3(1):1135721
Kumar A, Pawar D, Malik SC (2018a) Economic analysis of a warm standby system with a single server. Int J Math Stat Invent (IJMSI) 6(5):01–06
Kumar A, Saini M, Devi K (2018b) Stochastic modeling of non-identical redundant systems with priority, preventive maintenance, and Weibull failure and repair distributions. Life Cycle Reliab Saf Eng 7(2):61–70
Kumar A, Singh R, Saini M, Dahiya O (2020) Reliability, Availability, and maintainability analysis to improve the operational performance of soft water treatment and supply plant. J Eng Sci Technol Rev 13(5):183–192
Kumar A, Garg R, Barak MS (2021) Reliability measures of a cold standby system are subject to refreshment. Int J Syst Assur Eng Manage. https://doi.org/10.1007/s13198-021-01317-2
Maihulla AS, Yusuf I, Bala SI (2022) Reliability and performance analysis of a series-parallel system using gumbel–hougaard family copula. J Comput Cogn Eng 1(2):74–82. https://doi.org/10.47852/bonviewJCCE2022010101
Malik SC, Pawar D (2010) Reliability and economic measures of a system with inspection for online repair and no repair activity in abnormal weather. Bull Pure Appl Sci 29(2):355–368
Pawar D, Malik SC (2011) Performance measures of a single–unit system are subject to different failure modes with operation in abnormal weather. Int J Eng Sci Technol 3(5):4084–4089
Pourreza H, Jamkhaneh EB, Deiri E, Harish G (2022) Estimating the parametric functions and reliability measures for exponentiated lifetime distributions family. Gazi Univ J Sci 35:1665–1684
Waziri TA, Yakasai BM (2022) Assessment of some proposed replacement models involving moderate fix-up. J Comput Cogn Eng 1(1):1–10. https://doi.org/10.47852/bonviewJCCE2202150
Yadav D, Barak MS (2016) Stochastic analysis of a cold standby system with server failure. Int J Math Stat Invent 4(6):18–22
Ethics declarations
Conflict of interest
All the authors contributed equally to this manuscript, and there is no conflict of interest between them.
About this article
Cite this article
Kumar, A., Garg, R. & Barak, M.S. Performance analysis of computer systems with Weibull distribution subject to software upgrade and load recovery. Life Cycle Reliab Saf Eng 12, 51–63 (2023). https://doi.org/10.1007/s41872-022-00211-5
DOI: https://doi.org/10.1007/s41872-022-00211-5