1 Introduction

The management of a complex manufacturing enterprise requires great attention to all interrelated business processes [1]. Such processes are characterized by a high level of complexity, and additionally, by massive volumes of aggregated information [2].

The relationships between production processes are often non-obvious, so the decision-maker does not always recognize or understand the correlations between them [3].

The current approach to production management is based on a standard industrial methodology adopted for various enterprises. The industrial methodology contains algorithms and coefficients accumulated from statistics for the whole industry [4]. The principal disadvantage of this approach is a strong discrepancy between real production indicators and the indicators described in the methodology. The industrial methodology is inflexible and cannot be promptly modified in response to changes in the problem area. Therefore, a governance framework for data quality and for the traceability of information used in decision-making needs to be developed [5]. High-quality data management is critical in this context, but it can also be very time-consuming to implement.

Thus, the limitations of the industry methodology in the decision-making process are:

  • the complexity of calculation of statistical coefficients based on production indicators analysis;

  • the impossibility of dynamically adapting the calculations to periods that are shorter than the forecast horizon;

  • the lack of adaptation of the methodology to a specific production environment.

Enterprise management tasks must be solved using new approaches to the automated and intelligent analysis and forecasting of production indicators [6, 7].

To address this problem, the following tasks must be solved:

  • definition of input data;

  • creation of models that reflect the state of production processes;

  • development of algorithms for linguistic summarization of the current state of the enterprise.

Analysis of the industrial methodology showed that the production indicators (for example, staff time, staff performance, equipment performance, and depreciation of equipment) are aggregated and averaged, which decreases the accuracy of decision-making. The information systems of the enterprise contain all the information necessary to evaluate the state of production. The production processes can be represented by discrete time series extracted from these information systems. The time series are modeled with type 2 fuzzy sets to account for the fuzziness of the real world [8]. The fuzzy approach allows the creation of models that can improve the quality of decision-making [9,10,11]. The fuzzy approach and ontology engineering methods are used in this research. The hybridization of these approaches makes it possible to analyze the data about production processes and to produce a linguistic summarization of the production state in the process of decision-making.

2 Types of Enterprise Time Series

The main point of the proposed approach is to detect changes in the values of production process indicators and to react to these changes promptly. Time series models are used to track these changes. The industrial methodology for production management is based on a set of statistical coefficients. These coefficients cannot always be given by an expert or calculated; however, each of them can be extracted from the enterprise information systems.

The following types of time series were extracted from enterprise information systems:

  • unit productivity dynamics as a sum of employees' labor productivities;

  • the number of equipment in units grouped by type and type of work;

  • the number of employees grouped by type and type of work;

  • area usage as a sum of equipment sizes;

  • utilization of equipment as a planned load in the framework of the production program.

The number of points in each time series is in the range of 12–40 because the data were collected for a past period of one to three years. The time series are discrete, with a monthly average value at each point. Finding the seasonality and the local and global tendencies of a time series is critical for the quality of time series analysis.

Let us consider the models for smoothing, extracting, and forecasting production process time series that are proposed in this research.

3 Time Series Model Based on Type 2 Fuzzy Sets

Time series modeling based on type 2 fuzzy sets makes it possible to build a model that reflects the uncertainty in the choice of values of coefficients or indicators determined by an expert. An interval time series was chosen as the representation of the modeling object. The algorithm for constructing a model for interval time series is described in [12].

The formal model of the interval time series can be represented as the following expression:

$$ TS = \lbrace ts_i \rbrace , i \in N, $$

where \(ts_i=[t_i, \tilde{A}_{t_i}]\) is an element of the time series consisting of the moment of time \(t_i\) and a value in the form of a type 2 fuzzy set \(\tilde{A}_{t_i}\). The universe of type 2 fuzzy sets is defined as \(U = (\tilde{A}_1, \dots , \tilde{A}_l), \tilde{A}_{t_i} \in U, l \in N,\) for the entire time series, where l is the number of fuzzy sets in the universe. Since \(\tilde{A}_{t_i}\) is a type 2 fuzzy set, a type 1 fuzzy set is assigned to it as a value.

A triangular shape is used for the fuzzy sets because of its low computational complexity.

Type 2 fuzzy sets \(\tilde{A}\) in the universe U can be defined using a type 2 membership function. A type 2 fuzzy set can be represented as:

$$ \tilde{A} = \lbrace ((x, u), \mu _{\tilde{A}}(x, u)) \mid \forall x \in U, \forall u \in J_x \subseteq [0, 1] \rbrace $$

where \(x \in U\) and \(u \in J_x \subseteq [0, 1]\) in which \(0 \le \mu _{\tilde{A}} (x, u) \le 1\). The primary membership lies in the range from 0 to 1, so the fuzzy set can also be written as:

$$ \tilde{A} = \int _{x \in U} \int _{u \in J_x} \mu _{\tilde{A}}(x, u)/(x, u), \quad J_x \subseteq [0, 1] $$

where the operator \( \int \int \) denotes the union over all admissible x and u.

Time series modeling requires the definition of interval fuzzy sets and their shape. Fig. 1 shows the appearance of the sets.

Fig. 1. The shape of the upper and lower membership functions

Triangular fuzzy sets are defined as follows:

$$ \tilde{A}_i = (\tilde{A}_i^U, \tilde{A}_i^L) = ((a_{i1}^u, a_{i2}^u, a_{i3}^u, h(\tilde{A}_i^U)), (a_{i1}^l, a_{i2}^l, a_{i3}^l, h(\tilde{A}_i^L))), $$

where \(\tilde{A}_i^U\) and \(\tilde{A}_i^L\) are triangular type 1 fuzzy sets, \(a_{i1}^u, a_{i2}^u, a_{i3}^u, a_{i1}^l, a_{i2}^l, a_{i3}^l\) are the reference points of the type 2 interval fuzzy set \(\tilde{A}_i\), and h is the value of the membership function at the element \(a_i\) (for the upper and lower membership functions, respectively).
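The definition above can be illustrated with a minimal Python sketch. It assumes that each of the upper and lower sets is a triangle with three reference points and a single height h; the class names `Triangle` and `IntervalType2Triangle` and the numeric values are hypothetical and serve only as an example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Triangle:
    """Type 1 triangular fuzzy set with reference points a1 <= a2 <= a3 and height h."""
    a1: float
    a2: float
    a3: float
    h: float = 1.0

    def membership(self, x: float) -> float:
        # The peak a2 has the full height h.
        if x == self.a2:
            return self.h
        # Points outside the open support (a1, a3) have zero membership.
        if x <= self.a1 or x >= self.a3:
            return 0.0
        # Linear rise on (a1, a2), linear fall on (a2, a3).
        if x < self.a2:
            return self.h * (x - self.a1) / (self.a2 - self.a1)
        return self.h * (self.a3 - x) / (self.a3 - self.a2)


@dataclass(frozen=True)
class IntervalType2Triangle:
    """Interval type 2 set described by an upper and a lower membership function."""
    upper: Triangle
    lower: Triangle

    def membership_interval(self, x: float) -> Tuple[float, float]:
        # Returns the (lower, upper) bounds of the membership degree of x.
        return self.lower.membership(x), self.upper.membership(x)


# Hypothetical set: the lower triangle is narrower and lower than the upper one.
medium = IntervalType2Triangle(
    upper=Triangle(0.0, 5.0, 10.0, h=1.0),
    lower=Triangle(2.0, 5.0, 8.0, h=0.8),
)
print(medium.membership_interval(4.0))
```

The gap between the two returned bounds is what expresses the uncertainty of an expert's choice of the membership degree.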

An operation for combining type 2 fuzzy sets is required when working with a rule base built from the values of a time series. The combining operation is defined as follows:

$$\begin{aligned} \begin{array}{c} \displaystyle \tilde{A}_1 \oplus \tilde{A}_2 = (\tilde{A}_1^U, \tilde{A}_1^L) \oplus (\tilde{A}_2^U, \tilde{A}_2^L)\\ \displaystyle = ((a_{11}^u + a_{21}^u, a_{12}^u + a_{22}^u, a_{13}^u + a_{23}^u;\\ \displaystyle \min (h_1(\tilde{A}_1^U), h_1(\tilde{A}_2^U)), \min (h_2(\tilde{A}_1^U), h_2(\tilde{A}_2^U))),\\ \displaystyle (a_{11}^l + a_{21}^l, a_{12}^l + a_{22}^l, a_{13}^l + a_{23}^l;\\ \displaystyle \min (h_1(\tilde{A}_1^L), h_1(\tilde{A}_2^L)), \min (h_2(\tilde{A}_1^L), h_2(\tilde{A}_2^L)))). \end{array} \end{aligned}$$
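As a hedged illustration, the combining operation can be sketched in Python as follows. The sketch assumes that reference points are added component-wise and that heights are combined with the minimum; heights are stored as a tuple so that either one or two height values per set can be used. The names `Tri`, `IT2`, `add_tri`, and `combine` and the sample values are hypothetical.

```python
from typing import NamedTuple, Tuple


class Tri(NamedTuple):
    points: Tuple[float, ...]   # reference points (a1, a2, a3)
    heights: Tuple[float, ...]  # one or more height values h


class IT2(NamedTuple):
    upper: Tri  # upper membership function
    lower: Tri  # lower membership function


def add_tri(p: Tri, q: Tri) -> Tri:
    # Reference points are summed; heights are combined with min.
    points = tuple(a + b for a, b in zip(p.points, q.points))
    heights = tuple(min(hp, hq) for hp, hq in zip(p.heights, q.heights))
    return Tri(points, heights)


def combine(x: IT2, y: IT2) -> IT2:
    # The operation is applied separately to the upper and lower sets.
    return IT2(add_tri(x.upper, y.upper), add_tri(x.lower, y.lower))


a1 = IT2(Tri((1.0, 2.0, 3.0), (1.0,)), Tri((1.5, 2.0, 2.5), (0.8,)))
a2 = IT2(Tri((2.0, 4.0, 6.0), (0.9,)), Tri((3.0, 4.0, 5.0), (0.7,)))
result = combine(a1, a2)
print(result.upper)  # points (3.0, 6.0, 9.0), heights (0.9,)
print(result.lower)  # points (4.5, 6.0, 7.5), heights (0.7,)
```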

4 Algorithm for Smoothing and Forecasting of Time Series

The main principle of the proposed algorithm is closely related to the nature of the time series. Type 2 fuzzy sets are used for modeling in the process of smoothing and forecasting because the time series has an interval nature [13].

The proposed algorithm can be represented as a sequence of the following steps:

  1. Step 1.

    Determination of the universe of observations: \(U = [U_{min}, U_{max}]\), where \(U_{min}\) and \(U_{max}\) are the minimum and maximum values of the time series, respectively.

  2. Step 2.

    Definition of membership functions for a time series \(M = \{ \mu _1, ..., \mu _l\}, l \ll n\), where l is the number of membership functions of fuzzy sets and n is the length of the time series. The number of membership functions and, accordingly, the number of fuzzy sets is chosen to be relatively small. The motivation for this choice is the multi-level approach to modeling a time series: to decrease the dimension of the set of relations, it is necessary to reduce the number of fuzzy sets at each level. Obviously, this decreases the approximation accuracy of a time series. However, creating sets of membership functions at the second and higher levels increases the approximation accuracy as the number of levels grows.

  3. Step 3.

    Definition of fuzzy sets for a time series. Here the superscript denotes the type of the fuzzy sets: \(A^1 = \{A^1_1, ..., A^1_l\}, A^2 = \{A^2_1, ..., A^2_m\}\), where l is the number of type 1 fuzzy sets and m is the number of type 2 fuzzy sets.

  4. Step 4.

    Fuzzification of a time series by type 1 sets. \(\forall x_i\) \(\tilde{y}_i = Fuzzy(x_i)\)

  5. Step 5.

    Fuzzification of a time series by type 2 sets.

  6. Step 6.

    Creation of relations. The rules for the creation of relations are represented in the form of pairs of fuzzy sets in terms of antecedents and consequents, for example: \(A^1_1A^2_1... \longrightarrow A^1_2A^2_1\).

  7. Step 7.

    Forecasting for the first and second levels based on a set of rules. The forecast is calculated by the centroid method, first on type 1 fuzzy sets \(A^1 = \{A ^ 1_1, \dots , A ^ 1_l \} \), then on type 2 fuzzy sets.

  8. Step 8.

    Evaluation of forecasting errors.
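The first steps of the algorithm can be sketched in Python for the type 1 level only. This is a simplified illustration under stated assumptions, not the authors' implementation: the universe is covered by l evenly spaced, overlapping isosceles triangles, each value is assigned to the set with the highest membership degree, and the peak of that set is used as the smoothed value. The function names and the sample series are hypothetical.

```python
from typing import List, Tuple

Triangle = Tuple[float, float, float]  # (left, peak, right)


def make_partition(series: List[float], l: int) -> List[Triangle]:
    """Steps 1-3: cover U = [U_min, U_max] with l overlapping triangles (l >= 2)."""
    u_min, u_max = min(series), max(series)
    step = (u_max - u_min) / (l - 1)  # assumes u_max > u_min
    return [(u_min + (k - 1) * step, u_min + k * step, u_min + (k + 1) * step)
            for k in range(l)]


def membership(tri: Triangle, x: float) -> float:
    left, peak, right = tri
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(x: float, partition: List[Triangle]) -> int:
    """Step 4: index of the fuzzy set with the highest membership degree for x."""
    degrees = [membership(t, x) for t in partition]
    return max(range(len(partition)), key=degrees.__getitem__)


series = [10.0, 12.0, 15.0, 11.0, 18.0, 20.0]
partition = make_partition(series, l=3)
labels = [fuzzify(x, partition) for x in series]
smoothed = [partition[k][1] for k in labels]  # peak of the selected set
print(labels)    # [0, 0, 1, 0, 2, 2]
print(smoothed)  # [10.0, 10.0, 15.0, 10.0, 20.0, 20.0]
```

With only three sets the smoothed series is a coarse step function, which is the loss of accuracy that the higher levels are intended to compensate for.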

The prediction step should include trend analysis for the best results [14,15,16].

5 Experiments

5.1 Time Series Approximation

The experiment with the time series approximation method is needed to verify the hypothesis that an approximation based on a time series model with higher-order fuzzy sets has high accuracy when the optimal number and shape of fuzzy sets are selected.

The formation of an approximate representation of a time series based on higher-order fuzzy sets consists of the creation of a set of fuzzy sets at each level.

Additionally, the universe of values is determined for each level: for the whole time series at the first level, and then for the intervals of values of each of the sets at the previous level.
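One possible reading of this level-wise construction is sketched below in Python. The sketch assumes evenly spaced isosceles triangles at every level and takes the universe of the next level to be the support of the selected set, clipped to the current universe; both choices, as well as the function names, are assumptions made for illustration only.

```python
from typing import List, Tuple

Triangle = Tuple[float, float, float]  # (left, peak, right)


def partition(lo: float, hi: float, count: int) -> List[Triangle]:
    # Evenly spaced, overlapping isosceles triangles with peaks on [lo, hi].
    step = (hi - lo) / (count - 1)
    return [(lo + (k - 1) * step, lo + k * step, lo + (k + 1) * step)
            for k in range(count)]


def membership(tri: Triangle, x: float) -> float:
    left, peak, right = tri
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def approximate(x: float, lo: float, hi: float,
                count: int = 3, levels: int = 2) -> float:
    """Approximates x (assumed to lie in [lo, hi]) by the peak of the best set
    at the deepest level."""
    value = x
    for _ in range(levels):
        sets = partition(lo, hi, count)
        left, peak, right = max(sets, key=lambda t: membership(t, x))
        value = peak
        # Next-level universe: the interval of the selected set (assumption).
        lo, hi = max(left, lo), min(right, hi)
    return value


print(approximate(12.0, 10.0, 20.0, levels=1))  # 10.0
print(approximate(12.0, 10.0, 20.0, levels=2))  # 12.5
```

In this example the absolute error for the value 12 drops from 2 at the first level to 0.5 at the second level while each level still uses only three sets.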

The following parameters were determined in the experiment:

  • Fuzzification of the time series by type 1 and type 2 fuzzy sets.

  • The number of fuzzy sets of each type is 3.

  • The fuzzy sets have the shape of isosceles triangles.

Fig. 2 shows the approximation result for the time series of the unit productivity dynamics (production coefficient). A small number of fuzzy sets at the first level allows only a rough approximation of the time series. Using type 2 fuzzy sets improves the accuracy of the approximation. 80% of the time series was used as experimental data, and 20% was used to assess the quality of the proposed methods. The accuracy of the approximation is evaluated using the SMAPE criterion [17] (the lower, the better):

  • for type 1 fuzzy sets the SMAPE is 5.06%.

  • for type 2 fuzzy sets the SMAPE is 2.82%.
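For reference, a common form of the SMAPE criterion can be computed as in the Python sketch below. Several variants of SMAPE exist, and the exact variant used in [17] is not restated here, so the formula (mean of the absolute error divided by the average of the absolute actual and forecast values, in percent) is an assumption, as are the sample numbers.

```python
from typing import Sequence


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error in percent (one common variant)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        # A pair of zeros contributes no error instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return 100.0 * total / len(actual)


# Hypothetical values, not taken from the experiment.
print(round(smape([100.0, 200.0], [110.0, 180.0]), 2))  # 10.03
```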

Fig. 2. Approximation and forecasting of the time series of the production coefficient

For the time series of the number of employees (employee count) (see Fig. 3) the SMAPE scores are:

  • for type 1 fuzzy sets the SMAPE is 18.61%.

  • for type 2 fuzzy sets the SMAPE is 9.66%.

Fig. 3. Approximation and forecasting of the time series of the number of employees

Conclusions from the experiment with the approximation of the time series:

  1. The accuracy of the approximation depends on the number of fuzzy sets. The experiment has shown that a high approximation accuracy can be achieved with a small number of fuzzy sets.

  2. The boundaries of the fuzzy sets are also important. The approximation has low accuracy on the initial segments of both time series presented in the experiment. The problem of choosing the boundaries of the fuzzy set intervals remains relevant.

5.2 Time Series Forecasting

The time series forecasting experiment is conducted using the “If-Then” rule base.

Rules are extracted from the time series in the form of sequences of fuzzy sets acting as antecedents (number \(\ge 2\)) and a consequent value, which is also a fuzzy set. The resulting rule base makes it possible to obtain a one-point forecast.
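A minimal Python sketch of such a rule base is given below. It assumes that the time series has already been fuzzified into a sequence of set indices, that each rule has exactly two antecedents, and that the forecast is the mean of the peaks of all consequents observed for the matching antecedent; the fallback for an unseen antecedent, the function names, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Rules = Dict[Tuple[int, ...], List[int]]


def extract_rules(labels: Sequence[int], order: int = 2) -> Rules:
    """Collects rules of the form (A_{t-order}, ..., A_{t-1}) -> A_t."""
    rules = defaultdict(list)
    for t in range(order, len(labels)):
        rules[tuple(labels[t - order:t])].append(labels[t])
    return dict(rules)


def forecast_next(labels: Sequence[int], peaks: Sequence[float],
                  rules: Rules, order: int = 2) -> float:
    """One-point forecast from the last `order` fuzzy labels."""
    consequents = rules.get(tuple(labels[-order:]))
    if not consequents:
        # Assumption: with no matching rule, repeat the last observed set.
        return peaks[labels[-1]]
    # Centroid-like defuzzification over the consequents of the rule.
    return sum(peaks[c] for c in consequents) / len(consequents)


peaks = [10.0, 15.0, 20.0]            # peaks of three fuzzy sets
labels = [0, 0, 1, 0, 2, 2, 1, 0, 2]  # fuzzified training series
rules = extract_rules(labels)
print(rules[(1, 0)])                        # [2, 2]
print(forecast_next(labels, peaks, rules))  # 20.0
```

The fallback branch corresponds to the case where a test interval contains behavior that never occurred in the training data, so no rule describes it.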

The conditions of the experiment:

  • The forecast was made for the test interval that contains 10 points.

  • The forecast was made based on both type 1 and type 2 fuzzy sets.

The forecasting of the time series of the production coefficient (see Fig. 2) is complicated by the fact that the testing interval contains values that were not previously present, so the rules do not capture this specific behavior of the series.

The forecasting accuracy for the production coefficient is evaluated using the SMAPE criterion:

  • for type 1 fuzzy sets the SMAPE is 7.69%.

  • for type 2 fuzzy sets the SMAPE is 9.27%.

Fig. 3 shows the forecasting result for the time series of the number of employees.

The forecasting quality for the number of employees is evaluated using SMAPE:

  • for type 1 fuzzy sets the SMAPE is 39.41%.

  • for type 2 fuzzy sets the SMAPE is 27.91%.

The conclusions of the experiment with time series forecasting correlate with the conclusions of the experiment with approximation: the quality of modeling and forecasting of a time series depends on the number and boundaries of fuzzy sets.

5.3 Ontology-Oriented Linguistic Summarization of the Time Series Forecast Values

The subsystem for ontology-oriented linguistic summarization of the time series forecast values allows the decision-maker to react promptly to changes in the situation at the factory. The basic input data of the subsystem is the production program. The result of the subsystem is calculated from the values of production indicators and their dynamics. The use of natural language in the summarization process makes the resulting summaries easy for the decision-maker to understand [18]. An ontology [19, 20] with the following structure is used to organize the rule base of the linguistic summarization subsystem:

$$\begin{aligned} O = \langle V, D, R, F \rangle , \end{aligned}$$
(1)

where \(V = \lbrace v_{1}, v_{2}, \ldots , v_{k} \rbrace \) is the set of indicators that determine a state of production (for example, the number of equipment, equipment productivity, number of employees, etc.);

\(D = \lbrace d_{1}, d_{2}, \ldots , d_{i}, \ldots , d_{k} \rbrace \) is the set of indicator summarization variants. A \(d_{i}\) can be represented as the following expression:

$$\begin{aligned} d_{i} = \lbrace \langle s_{1}, t_{1} \rangle , \langle s_{2}, t_{2} \rangle , \ldots \langle s_{j}, t_{j} \rangle , \ldots \langle s_{l}, t_{l} \rangle \rbrace , \end{aligned}$$

where \(s_{j}\) is a characteristic of the value of the indicator described by \(d_{i}\) (for example, low, medium, high);

\(t_{j}\) is a linguistic representation of the state of the indicator described by \(d_{i}\) for a specific characteristic \(s_{j}\);

R is the set of ontology ties that define the relationship between an indicator and its summarization variants;

F is the algorithmically defined interpretation function that makes it possible to specify a subset \(\hat{D} \subseteq D \) of the summarization variants that correspond to the specified indicator values.

The \(\mathcal {ALCHF(D)}\) extension of the description logic \(\mathcal {ALC}\) [21,22,23] is used for the logical representation of the ontology O (Eq. 1) for the linguistic summarization of the time series forecast values. Using the description logic \(\mathcal {ALCHF(D)}\), the ontology O can be represented as:
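The role of the interpretation function F can be illustrated with a small Python sketch that is independent of any ontology tooling. The indicator name, the threshold values, and the summary texts are hypothetical and are not taken from the ontology described in this paper; the sketch only shows how indicator values select a subset of summarization variants.

```python
from typing import Dict, List, Tuple

# D: for each indicator, variants <s_j, t_j> plus a hypothetical upper bound
# of the value range in which the characteristic s_j applies.
VARIANTS: Dict[str, List[Tuple[str, str, float]]] = {
    "equipment_number": [
        ("low", "The number of equipment units is low.", 100.0),
        ("medium", "The number of equipment units is medium.", 200.0),
        ("high", "The number of equipment units is high.", float("inf")),
    ],
}


def interpret(values: Dict[str, float]) -> List[str]:
    """Plays the role of F: returns the texts of the matching variants."""
    summary = []
    for indicator, value in values.items():
        for _characteristic, text, upper in VARIANTS.get(indicator, []):
            if value < upper:
                summary.append(text)
                break  # one variant per indicator
    return summary


print(interpret({"equipment_number": 120.0}))
# ['The number of equipment units is medium.']
```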

$$\begin{aligned} O = TBox \cup ABox, \end{aligned}$$

where TBox is the terminological box;

ABox is the assertional box.

The TBox contains statements describing concept hierarchies and relations between them. The ABox contains axioms defined as a set of individuals and relations between individuals and concepts.

Terminological box TBox

$$\begin{aligned} \begin{aligned}&Values \sqsubseteq \top \\&Values \equiv \top \sqcap \exists hasValueEquipmentProductivity.Double \sqcap \\&\sqcap \exists hasValueEquipmentNumber.Double \, \sqcap \\&\sqcap \exists hasValueEmployeeNumber.Double \sqcap \exists hasResume.Resume \\&Resume \sqsubseteq \top \\&Values \sqsubseteq \lnot Resume \\&Resume \equiv \top \sqcap \exists hasTextDescription.String \\&EquipmentNumber \sqsubseteq Resume\\&EquipmentNumberLow \sqsubseteq EquipmentNumber \\&EquipmentNumberMiddle \sqsubseteq EquipmentNumber \\&EquipmentNumberHigh \sqsubseteq EquipmentNumber \\&EquipmentProductivity \sqsubseteq Resume \\&EquipmentProductivityLow \sqsubseteq EquipmentProductivity \\&EquipmentProductivityMiddle \sqsubseteq EquipmentProductivity \\&EquipmentProductivityHigh \sqsubseteq EquipmentProductivity \\&EmployeeNumber \sqsubseteq Resume \\&EmployeeNumberLow \sqsubseteq EmployeeNumber \\&EmployeeNumberMiddle \sqsubseteq EmployeeNumber \\&EmployeeNumberHigh \sqsubseteq EmployeeNumber \\ \end{aligned} \end{aligned}$$

where Values is the set of indicators V;

Resume is the set of indicator summarization variants D;

\(EquipmentNumber \sqsubseteq Resume, EquipmentNumberLow \sqsubseteq EquipmentNumber\) is the summarization variant of the “number of equipment” indicator for a low value of this indicator;

\(hasValue*\) is the name of the role “the indicator has a value”;

hasResume is the name of the role “the indicator has a summarization variant”;

hasTextDescription is the name of the functional role “the summarization variant has a text description”;

String is the concrete string domain;

Double is the concrete double domain.

Assertional box ABox

$$\begin{aligned} \begin{aligned}&currentValues :Values \\&\left( currentValues, valueEqP: Double \right) :hasValueEquipmentProductivity \\&\left( currentValues, valueEqN: Double \right) :hasValueEquipmentNumber \\&\left( currentValues, valueEmN: Double \right) :hasValueEmployeeNumber \\ \\&equipmentNumberLow :EquipmentNumberLow \\&\left( equipmentNumberLow, valueEqNLowDescr: String \right) :hasTextDescription \\&\ldots \\&employeeNumberHigh :EmployeeNumberHigh \\&\left( employeeNumberHigh, valueEmNHighDescr: String \right) :hasTextDescription \\ \end{aligned} \end{aligned}$$

Inference-Based Linguistic Summarization of the Time Series Forecast Values. A set of SWRL rules [24] is used to organize the inference-based linguistic summarization of the time series forecast values. Each SWRL rule assigns the indicator to a specific summarization variant, for example:

figure a

The following SQWRL query [25] is used to obtain results of the ontology-oriented linguistic summarization of the time series forecast values:

figure b

For example, if the value of the “equipment number” indicator is 171, the result of the SQWRL query will be:

figure c

6 Conclusion

An approach to the analysis and forecasting of time series of production indicators based on type 2 fuzzy sets was proposed in this article. This approach helps to avoid the problems of using the industrial methodology for production management by making decision-making more responsive.

The proposed methods for forecasting time series based on type 2 fuzzy sets improve the quality of management decisions because they make it possible to create a model of the real production processes of the factory.

The ontology-oriented linguistic summarization of the time series forecast values allows the decision-maker to react promptly to changes in the situation at the factory.

The SWRL rules used in the inference give the decision-maker deeper knowledge not only of the production processes but also of their interrelations.

These principles can help decision-makers improve the quality of the technological preparation of complex industries.