1 Introduction and Motivation

Currently, telecommunication high-altitude platforms (THAP) implemented on autonomous unmanned aerial vehicles (UAV) are widely developed and used in various fields of human activity [1, 2]. The main disadvantage of UAVs is their limited operating time, which is due to the short battery life of UAVs with electric motors or to the limited fuel supply of UAVs with internal combustion engines. For this reason, such UAVs cannot be used effectively in systems that require a long operating time. Long-term operation can be ensured by tethered THAP, in which the engines and payload equipment are powered from ground-based energy sources [3,4,5]. The ability to transmit high power (10–15 kW) through a cable from the ground to the THAP makes it possible to lift a telecommunication payload to altitudes of 100–200 m and hold it there for a long time, limited only by the reliability characteristics of the platform [6,7,8,9]. High reliability of the tethered unmanned module is achieved in the following ways: 1) the choice of propulsion systems with a large mean time between failures; 2) redundancy of individual elements of the control system; 3) the use of a multi-rotor architecture (for example, in a quadcopter the failure of one engine leads to a complete cessation of operation, whereas an eight-rotor copter may continue to operate after the failure of two motors), and so on.

The reliability of such complex systems is effectively investigated using a mathematical model of the k-out-of-n system [10]. Such a system has broad practical applications in various industries: telecommunications and robotics [11, 12], oil and gas [13], subsea pipeline monitoring systems [14], cryptography [15], etc. This model has been widely studied under many assumptions about the structure of such a model, for example, the dependence and independence of the system elements, the shape of life and repair times distributions, different recovery scenarios, and others. To study various k-out-of-n systems, both analytical methods based on multidimensional Markov processes and simulation are used [11, 16,17,18].

Sensitivity analysis is a significant research stage, especially for redundant systems such as the k-out-of-n system. In stochastic systems, stability is often understood as the insensitivity or low sensitivity of their output characteristics to the shape of some input distributions. The term “sensitivity” can be defined differently in other areas, for example, in civil engineering [19]. In queuing theory, the first results on sensitivity were obtained by Sevastyanov, Kovalenko, Gnedenko, Soloviev, and others. For some of the latest studies, see [18] and the references therein.

An additional research method considered in this paper is machine learning (ML). In queuing and reliability theories, ML methods are usually used for studying various probabilistic and time characteristics of complex systems. They are also useful in those cases when it is impossible to obtain results either analytically or using simulation [20]. The application of ML techniques for analyzing the reliability of an unmanned high-altitude module is due to the following factors.

  1. From a practical point of view, the system service time is often estimated by its average value, while the shape of the lifetime distribution is unknown and can only be assumed based on some statistical data. An ML model can operate on the mean value without considering a specific distribution function of the lifetime of the system elements.

  2. Some parameters inside the system can significantly impact its reliability. However, in practice this information may also be absent. Sensitivity analysis helps to identify these weaknesses, after which they are included in the ML model.

  3. A model built and trained using ML techniques can predict the system reliability characteristics faster than a simulation model. In addition, it can make accurate predictions for many data points simultaneously, while simulation gives a similar result only after many iterations.

  4. A trained model can be used by engineers at the development stage of such modules for several purposes: to determine a highly reliable system architecture (parameters k, n), to select module components whose characteristics support the reliability and long-term operation of THAP (mean lifetime a and coefficient of variation v), and to predict how long the unmanned module will operate with a satisfactory level of reliability.

There are many machine learning techniques. In this article, we consider supervised learning for several types of regression and for neural networks, using the Python programming language [21]. The Scikit-learn [22] and TensorFlow [23] libraries are used for this purpose.

This paper continues studies related to reliability and sensitivity and considers a hot standby non-repairable system using analytical and simulation methods. The current paper aims to study the reliability of tethered THAP using the k-out-of-n:G system and ML methods, which make it possible to determine a satisfactory level of module reliability at different initial parameters with high accuracy.

The article is organized as follows. The next section introduces the problem setting and some notation. In Sect. 3, the reliability function of a homogeneous k-out-of-n:G system is studied. Subsections 3.1 and 3.2 contain analytical results for a simple homogeneous k-out-of-n:G system and for a homogeneous \(k^{*}\)-out-of-n:G system, the failure of which depends on the location of the failed elements. A numerical example and sensitivity analysis of the considered systems are presented in Subsect. 3.3. In Sect. 4, various ML techniques for predicting the level of reliability of the unmanned module are discussed. Subsection 4.1 describes the methods and data used in this research, which are applied in Subsects. 4.2 and 4.3. The paper ends with a conclusion and a description of some open problems.

2 Problem Setting

Since the high-altitude module has a multi-rotor architecture with n identical engines, consider a homogeneous k-out-of-n:G system. Such a system consists of n elements and remains operational if and only if at least k out of n elements are operational. Denote by \(A_i,\quad i = \overline{1,n}\), the lifetimes of the system elements. Suppose that these random variables are independent and identically distributed with the cumulative distribution function \(A(t) = {\textbf {P}} \lbrace A_i \le t \rbrace \). Suppose also that instantaneous failures are impossible and that the mean lifetime is finite:

$$\begin{aligned} A(0) = 0, \quad a = \int _0^{\infty }(1 - A(t))dt < \infty . \end{aligned}$$

For the system study, introduce the random process \(J =\{J(t), \,\, t\ge 0\}\) with

$$ J(t) = \text{number of working components at time } t $$

with the set of states \(E = \{j = \overline{0, n}\}\), where j is the number of working units.

Denote also by T the time to the first system failure, \(T = \inf \lbrace t: J(t) \in E_0 \rbrace \), where \( E_0 = \{j = \overline{0, k-1}\}\) is the set of DOWN states and \( E_1 = \{j = \overline{k, n}\}\) is the set of UP states. Thus, we are interested in calculating the reliability function

$$\begin{aligned} R(t) = {\textbf {P}} \lbrace T > t \rbrace , \end{aligned}$$

and the mean time to system failure (MTTF)

$$ m = \int _0^\infty R(t) dt. $$
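The definitions of T, R(t), and m can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes exponentially distributed lifetimes with mean 1 purely for illustration, and the function names, sample size, and seed are hypothetical. It uses the fact that a k-out-of-n:G system fails at the moment of its (n - k + 1)-th element failure.

```python
import random

def sample_failure_time(n, k, draw_lifetime, rng):
    """Draw one realization of T for a k-out-of-n:G system.

    The system is up while at least k elements work, so it goes down
    at the (n - k + 1)-th smallest element lifetime.
    """
    lifetimes = sorted(draw_lifetime(rng) for _ in range(n))
    return lifetimes[n - k]  # 0-based index of the (n - k + 1)-th failure

def estimate_reliability(n, k, draw_lifetime, t, runs=20000, seed=1):
    """Return the empirical estimates of R(t) = P{T > t} and of m = E[T]."""
    rng = random.Random(seed)
    samples = [sample_failure_time(n, k, draw_lifetime, rng) for _ in range(runs)]
    r_hat = sum(s > t for s in samples) / runs
    m_hat = sum(samples) / runs
    return r_hat, m_hat

# Illustrative assumption: exponential lifetimes with mean a = 1.
r_hat, m_hat = estimate_reliability(6, 4, lambda rng: rng.expovariate(1.0), t=0.5)
```

For exponential elements the mean time to failure has the closed form \(m = a\sum _{i=k}^{n} 1/i\), which gives a convenient check of the estimate.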

3 Analytical Models and Sensitivity Analysis

3.1 Reliability Function of Homogeneous k-out-of-n:G System

Consider a homogeneous k-out-of-n:G system, \(A_i(t)=A(t)\) (\(i=\overline{1,n}\)). It is well known that the probability that exactly i of the n elements of the system are in a working state at time t has the form

$$\begin{aligned} {\textbf {P}} \lbrace J(t) = i \rbrace = C^i_n(1-A(t))^i A(t)^{n-i}. \end{aligned}$$

Thus, the reliability function of such a system (the probability of the system operating for a certain time t without failure) is

$$\begin{aligned} R(t) = {\textbf {P}} \lbrace T>t \rbrace = \sum _{i = k} ^ n C^i_n(1-A(t))^i A(t)^{n-i}. \end{aligned}$$
(1)
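Formula (1) translates directly into code. The sketch below is illustrative only: the function names are hypothetical, and the exponential CDF is an assumption made so that the example needs nothing beyond the standard library; any other A(t) can be passed in its place.

```python
import math

def reliability_k_out_of_n(k, n, cdf_value):
    """Formula (1): probability that at least k of n elements still work.

    cdf_value is A(t), the probability that a single element has failed by t.
    """
    survive = 1.0 - cdf_value
    return sum(
        math.comb(n, i) * survive**i * cdf_value ** (n - i)
        for i in range(k, n + 1)
    )

def exponential_cdf(t, mean):
    """Illustrative element lifetime CDF (an assumption for this example)."""
    return 1.0 - math.exp(-t / mean)

# Reliability of a 4-out-of-6:G system at t = 0.5 with mean lifetime a = 1.
r = reliability_k_out_of_n(4, 6, exponential_cdf(0.5, mean=1.0))
```

Two limiting cases are useful sanity checks: for k = n the sum reduces to \((1-A(t))^n\) (a series system), and for k = 1 it reduces to \(1 - A(t)^n\) (a parallel system).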

3.2 Reliability Function of Homogeneous k-out-of-n:G System Taking into Account the Location of the Failed Units

To investigate the reliability function of a more complex homogeneous system, the failure of which depends on the location of its failed components, introduce a vector description of the state of the system \({\textbf {j}} = (j_1, j_2, ..., j_n)\), where \(j_i = 0\) if the i-th component has failed and \(j_i = 1\) if it works. Then the probability of state \({\textbf {j}}\) at time t equals

$$\begin{aligned} p_{{\textbf {j}}} (t) = \prod _{1 \le i \le n}(1-A(t))^{j_i} A(t)^{1-j_i}. \end{aligned}$$

The probabilities of the operable and failure states of the system at the time t take the forms

$$\begin{aligned} {\textbf {P}}(UP) = \sum _{{{\textbf {j}}} \in E_1} p_{{\textbf {j}}} (t), \quad {\textbf {P}}(DOWN) = \sum _{{{\textbf {j}}} \in E_0} p_{{\textbf {j}}} (t). \end{aligned}$$

Thus, the system reliability function is

$$\begin{aligned} R(t) = \sum _{{{\textbf {j}}} \in E_1} \prod _{1 \le i \le n}(1-A(t))^{j_i} A(t)^{1-j_i}. \end{aligned}$$
(2)
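For small n, formula (2) can be evaluated by enumerating all \(2^n\) state vectors. The following sketch is an illustration under assumptions: the function names are hypothetical, and the UP set is passed as a predicate `is_up` so that any location-dependent failure rule can be plugged in. With the rule "at least k elements work" it must reproduce formula (1), which the last lines verify.

```python
import itertools
import math

def reliability_by_enumeration(n, cdf_value, is_up):
    """Formula (2): sum the probabilities of all state vectors in the UP set.

    cdf_value is A(t); is_up(state) returns True for operable states, where
    state is a tuple of 0/1 flags (1 means the element works).
    """
    survive = 1.0 - cdf_value
    total = 0.0
    for state in itertools.product((0, 1), repeat=n):
        if is_up(state):
            working = sum(state)
            total += survive**working * cdf_value ** (n - working)
    return total

def at_least_k_working(k):
    """UP rule of the plain k-out-of-n:G system."""
    return lambda state: sum(state) >= k

# Consistency check against formula (1) for a 4-out-of-6:G system, A(t) = 0.3.
by_states = reliability_by_enumeration(6, 0.3, at_least_k_working(4))
by_formula_1 = sum(math.comb(6, i) * 0.7**i * 0.3 ** (6 - i) for i in range(4, 7))
```

The cost grows as \(2^n\), so this direct approach is practical only for a moderate number of elements.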

3.3 Numerical Examples and Sensitivity Analysis

As a numerical example, consider the case of a 4-out-of-6:G system. It is supposed that the lifetimes of the system’s units have the following distributions:

  • Gamma \(\left[ \varGamma \left( 1 / v^2, a v^2 \right) \right] \);

  • Gnedenko-Weibull \(\left[ GW \left( \mu , \frac{a}{\varGamma (1 + 1/\mu )} \right) \right] \);

  • Log-normal \(\left[ LnN \left( \ln {\frac{a}{\sqrt{1 + v^2}}}, \sqrt{\ln \left( { 1 + v^2}\right) } \right) \right] \),

where a is the mean lifetime of the system components and v is the coefficient of variation of the lifetime. The shape parameter \(\mu \) of the GW distribution is selected based on the value of v.
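The three parameterizations above can be written as small helper functions. This is a sketch under assumptions: the function names are hypothetical, and since the text does not state how \(\mu \) is selected from v, the bisection on the Weibull coefficient of variation used here is only one possible way to do it.

```python
import math

def gamma_params(a, v):
    """Shape and scale of the Gamma law with mean a and coefficient of variation v."""
    return 1.0 / v**2, a * v**2

def lognormal_params(a, v):
    """Parameters of the underlying normal law for mean a and coefficient of variation v."""
    return math.log(a / math.sqrt(1.0 + v**2)), math.sqrt(math.log(1.0 + v**2))

def weibull_cv(shape):
    """Coefficient of variation of a Weibull law; it depends on the shape only."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / g1**2 - 1.0)

def weibull_params(a, v, lo=0.05, hi=50.0):
    """Shape mu (found by bisection, an assumption) and scale a / Gamma(1 + 1/mu)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weibull_cv(mid) > v:  # the CV decreases as the shape grows
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return mu, a / math.gamma(1.0 + 1.0 / mu)
```

For v = 1 the Gamma and Gnedenko-Weibull laws both reduce to the exponential distribution (shape 1), which is a quick check of the helpers.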

In our experiments we choose \(a = 1\) and \(v = [0.1, 0.5, 1, 5, 10]\). First, consider the simple case of homogeneous 4-out-of-6:G system.

Fig. 1. Reliability function R(t) of homogeneous 4-out-of-6:G system

Figure 1 shows the dependence of the system reliability function on the time t, calculated by formula (1). Black, red, and blue colors correspond to the \(\varGamma \), GW, and LnN distributions, respectively. As can be seen from the curves, for a fixed mean and a coefficient of variation \(v \le 1\) the reliability function of the system is asymptotically insensitive to the shape of the lifetime distribution. For \(v > 1\), however, this insensitivity disappears and the system loses its reliability very quickly. We can conclude that the system behavior depends on the value of v.

Next, consider the reliability of the 4-out-of-6:G system taking into account the location of the failed units. Denote such a system as a \(4^{*}\)-out-of-6:G system. Suppose that the system is operational as long as at least 4 out of 6 engines are running and no two failed motors are located next to each other. In other words, the system fails when two adjacent motors stop operating or when any three engines fail.

Due to the time and computational complexity of calculating the reliability function by formula (2) for arbitrary A(t), k, and n, we apply simulation modeling here. A numerical example for the case of exponentially distributed lifetimes of the system elements can be found in [11].

The simulator was built in Python. The constructed simulation model is shown graphically as a process flowchart (Fig. 2). The algorithm yields the empirical reliability function \(\hat{R}(t)\) and the MTTF.

Fig. 2. Flowchart of the simulation model of a \(k^{*}\)-out-of-n:G system
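A simulation of this kind can be sketched in a few lines. The code below is not the authors' simulator: it is a minimal illustration that assumes the n engines are arranged on a ring (so that the first and the last engines are neighbours), uses Gamma lifetimes with shape \(1/v^2\) and scale \(a v^2\), and introduces hypothetical function names, run count, and seed.

```python
import random

def failure_time_with_adjacency(lifetimes, k):
    """Time to failure of a k*-out-of-n:G system on a ring of n elements.

    The system fails when two neighbouring elements are both down or when
    n - k + 1 elements in total are down, whichever happens first.
    """
    n = len(lifetimes)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    failed = [False] * n
    order = sorted(range(n), key=lambda i: lifetimes[i])  # failure order
    for count, i in enumerate(order, start=1):
        failed[i] = True
        neighbour_down = failed[(i - 1) % n] or failed[(i + 1) % n]
        if neighbour_down or count >= n - k + 1:
            return lifetimes[i]

def simulate(n, k, a, v, t_grid, runs=10000, seed=1):
    """Return the empirical reliability on t_grid and the empirical MTTF."""
    rng = random.Random(seed)
    shape, scale = 1.0 / v**2, a * v**2  # Gamma law with mean a and CV v
    times = [
        failure_time_with_adjacency(
            [rng.gammavariate(shape, scale) for _ in range(n)], k
        )
        for _ in range(runs)
    ]
    r_hat = [sum(s > t for s in times) / runs for t in t_grid]
    return r_hat, sum(times) / runs

r_hat, m_hat = simulate(6, 4, a=1.0, v=0.5, t_grid=[0.0, 0.5, 1.0])
```

Sorting the element lifetimes once and scanning them in order avoids an explicit event queue, which is sufficient for a non-repairable system.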

Figure 3 shows the results of the simulation for the same system and parameters as before.

Fig. 3. Reliability function R(t) of homogeneous \(4^{*}\)-out-of-6:G system

As can be seen from the numerical examples, the behavior of 4-out-of-6:G and \(4^{*}\)-out-of-6:G system reliability functions is very similar. To see the difference between them, consider corresponding MTTF (Table 1).

Table 1. Mean lifetime m of 4-out-of-6:G/\(4^{*}\)-out-of-6:G systems

The results of the calculation of m confirm the conclusions of the sensitivity analysis. Moreover, the 4-out-of-6:G system, whose failure does not depend on the location of the failed elements, remains operational longer than the \(4^{*}\)-out-of-6:G system.

4 Machine Learning Methods and Their Application to the Task

This section presents the results of predicting THAP reliability using ML methods.

4.1 Methods and Data

As ML methods [21], we consider the following from the scikit-learn (for regressions) and TensorFlow (for the neural network) libraries:

  • Linear regression (LinReg),

  • Polynomial regression (degree \(= 4\)) (PolyReg),

  • K-nearest neighbors regression (n_neighbors = 5) (KNN),

  • Multi-output regression with cross-validation (scoring = MSE) based on Ridge regression (MultiReg),

  • Artificial neural network with three hidden layers (optimizer = RMSprop(1e-3), loss = MSE, batch size = 96) (ANN).
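To make the KNN entry of the list concrete, the following dependency-free sketch shows what a K-nearest neighbors regressor with five neighbours computes. It assumes uniform weights and the Euclidean distance, which are the defaults of scikit-learn's `KNeighborsRegressor`; the source does not state whether the defaults were kept. The function name and the toy data are hypothetical.

```python
import math

def knn_predict(train_x, train_y, query, n_neighbors=5):
    """Average the targets of the n_neighbors training points nearest to query."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),  # Euclidean distance
    )
    nearest = ranked[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)

# Toy data (hypothetical): one input feature, target equal to the feature.
toy_x = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (10.0,)]
toy_y = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0]
prediction = knn_predict(toy_x, toy_y, (2.0,))
```

Because the prediction is a local average of observed targets, the method needs no assumption about the functional form linking the inputs to the outputs.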

As noted in the introduction, the purpose of applying machine learning is to predict the reliability and time characteristics of a tethered unmanned module. Therefore, the output parameters are R, t, and m (Table 2). The choice of parameters and their ranges is motivated as follows. The previous section shows that some hidden parameters of the system, namely the coefficient of variation, have a significant impact on its behavior and performance. Moreover, the system is insensitive to the shape of the lifetime distribution for \(v < 1\). In addition, from a practical point of view, we assume that the system is at a satisfactory level of reliability if \(R(t) \ge 0.5\).

Table 2. Variables for machine learning models and their ranges

We have generated two datasets for training the models.

  1. To train the model that describes the behavior of THAP by a homogeneous k-out-of-n:G system, the dataset was generated using formula (1), in which \(A(t) \sim \varGamma \).

  2. For the second case, in which the system failure depends on the location of the failed elements, simulation results were used; here also \(A(t) \sim \varGamma \). This dataset assumes that a system failure occurs when either 2 adjacent elements or any (\(n - k + 1\)) elements have failed.

The architectures of the selected ML models differ. Some can predict several outputs simultaneously, while others can operate with only one output. The whole process contains two phases: training and testing. Before training, we divide the initial dataset into train and test sets with a ratio of \(70\%\) and \(30\%\), respectively. The learning process for LinReg, PolyReg, and KNN is structured as follows. The first step is to predict the reliability R from the parameters n, k, a, v, t. Next, the model is trained to predict t from the parameters n, k, a, v, R. The last cycle ends with a forecast of m based on the set n, k, a, v, R, t. After each round, the accuracy of the trained model is assessed, and testing begins on a new data sample. For MultiReg and ANN, there is one training cycle, in which the model predicts R, t, and m simultaneously based on n, k, a, v. These models provide an additional phase for monitoring training, the so-called cross-validation. In this case, the initial set is divided into \(70\%\), \(20\%\), and \(10\%\) for training, validation, and the final test, respectively.
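The two splitting schemes can be sketched with one helper. This is an illustration only: the function name, the seed, and the stand-in rows are hypothetical, and the source does not say how the split was actually implemented.

```python
import random

def split_dataset(rows, fractions, seed=0):
    """Shuffle rows and cut them into consecutive parts of the given fractions."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    parts, start = [], 0
    for fraction in fractions[:-1]:
        stop = start + round(fraction * len(shuffled))
        parts.append(shuffled[start:stop])
        start = stop
    parts.append(shuffled[start:])  # the last part takes the remainder
    return parts

rows = list(range(1000))  # stand-in for dataset rows (hypothetical)
train, test = split_dataset(rows, (0.7, 0.3))                  # LinReg, PolyReg, KNN
train3, valid3, test3 = split_dataset(rows, (0.7, 0.2, 0.1))   # MultiReg, ANN
```

Shuffling before cutting matters here because a dataset generated on a regular grid of n, k, a, v, t would otherwise leave whole parameter ranges out of the training part.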

4.2 Training and Testing Results for k-out-of-n:G System

We now move on to the results of applying ML techniques to the reliability analysis of a tethered unmanned high-altitude module. First, consider the k-out-of-n:G system. Table 3 shows the mean squared error (MSE) of the predicted values on the training set. The table shows that the smallest prediction error was achieved using PolyReg and KNN. The greatest error corresponds to MultiReg. Among all methods, the closest prediction in the training phase was made for the MTTF m.

Table 3. Accuracy of training
Table 4. Accuracy of testing

Table 4 shows the MSE, the mean absolute error (MAE), and the coefficient of determination (\(R^2\)) for the test set. Analyzing the results obtained, we can note that the MSE lies in an acceptable interval in all cases. The MAE characterizes the average magnitude of the prediction error. In our task, MAE \(\ge \) 0.05 is considered unsatisfactory. Therefore, among all the considered cases, only the K-nearest neighbors regression achieves a satisfactory accuracy. The \(R^2 \) estimate indicates how well the constructed model describes the initial data. The best result for this indicator is again shown by the KNN method. Note that all methods are suitable for predicting the mean time m. The MSE and MAE estimates are quite small, and \(R^2\) is high, which confirms the strong dependence between the input and output parameters.
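For reference, the three accuracy measures can be computed as follows. This is a plain-Python sketch with hypothetical function names and toy values; it follows the standard definitions of these metrics.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values (hypothetical), chosen so the results are easy to verify by hand.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
scores = (mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Since R takes values in [0, 1], an MAE of 0.05 for R corresponds to an average deviation of five percentage points of reliability.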

Consider the prediction results on the test set graphically. Figures 4, 5, 6, 7 and 8 show the scatter diagrams for the ML methods described above. For each of these figures, 500 samples were taken at random, while the full test sample contains about 200,000 values. LinReg and MultiReg demonstrate similar results for all predicted parameters, but their accuracy is quite low. PolyReg and ANN show acceptable prediction accuracy for m. For the other two outputs, R and t, their prediction error is too high, which suggests that these models do not reflect the relationship between the input and output data. Predictions of R and m using KNN are close enough to their exact values. For t, the prediction is less accurate. Nonetheless, KNN gives the most accurate predictions for all metrics among the considered ML techniques.

Fig. 4. Scatter plots for LinReg

Fig. 5. Scatter plots for PolyReg

Fig. 6. Scatter plots for KNN

Fig. 7. Scatter plots for MultiReg

Fig. 8. Scatter plots for ANN

4.3 Training and Testing Results for \(k^{*}\)-out-of-n:G System

The application of machine learning techniques to the task at hand has shown that KNN most accurately predicts the reliability of an unmanned high-altitude module that fails after the failure of \((n - k + 1)\) of its elements. Therefore, for the second case, in which the system failure depends on the location of the failed elements, we consider only the KNN method. Consider the training accuracy results (Table 5). The MSE is small enough and takes the desired value.

Table 5. Accuracy of training (MSE)

The results on the test set are presented in Table 6 and Fig. 9. The prediction accuracy takes acceptable values: the MSE and MAE are small enough, and the coefficient of determination \(R^2\) is high.

Table 6. Accuracy of testing

The graphical results show a prediction accuracy similar to that for the k-out-of-n:G system. The KNN model accurately reflects the dependence of R and m on the initial data, while the prediction of t is less accurate, with MAE \(\approx 5\%\).

Fig. 9. Scatter plots for KNN

5 Conclusion

The paper investigates the reliability of an unmanned high-altitude module based on a mathematical model of the k-out-of-n system and machine learning methods. Two scenarios of the dependence of the system failure on the location of the failed elements were considered. Analytical results and sensitivity analysis demonstrated the dependence of the system reliability on the coefficient of variation of the lifetime for both scenarios. The application of machine learning methods showed that K-nearest neighbors regression describes the system reliability best. As a direction for future research, we plan to improve the chosen ML model to achieve more accurate predictions and to consider other methods and models.