1 Introduction

The rapid development of science and technology has pushed humankind into an era of information explosion. In 2012, the Obama administration announced a high-profile plan called "Big Data Research and Development," and big data became a buzzword. Big data, also known as massive data, refers to collecting, storing, analyzing, and processing data of different structures and types in a more cost-effective and efficient manner, so that information with decision value can be extracted from it.

In recent years, with the development of society and the widespread application of the Internet of Things, big data has been applied in many fields because of its diverse types, fast acquisition, low acquisition and storage costs, and wide range of sources. It enables companies to achieve efficient and high-quality financial management: it increases the use value of financial data, improves the processing of financial data, and provides managers with useful information for decision-making.

The financial status of an enterprise is the focus of all its stakeholders, including operators, creditors, and investors. In fierce market competition, no enterprise can avoid risk. Early warning of financial crises and risk aversion should therefore be considered when making market management decisions, assigning credit ratings in the financial and insurance industries, and making investment decisions.

In the big data environment, the financial risk management of companies has changed in several ways. Broader access to information gives companies more references when making decisions, and faster access to information lets them focus on real-time information that affects financial risk. Cloud computing, mobile computing, and other technologies have improved financial analysis. Big data is also of great significance to the early warning of corporate financial risks. Companies use data mining to pick out valuable information from massive amounts of data and analyze it to predict potential risks. This approach has been applied to financial management, helping managers grasp the current status of their companies and take timely measures that reduce the likelihood of financial risks and minimize losses.

From the US subprime mortgage crisis to the European debt crisis, research on financial crisis early warning has become particularly important. In harsh market competition, the requirements of companies for risk management are increasing. The study of corporate financial risk originated in the 1930s, when Fitzpatrick [1] used single-variable analysis to predict crises. This use of a single financial indicator for prediction set a precedent for empirical research on early warning of financial crisis. In the 1960s, the American scholar Altman used a multivariate analysis model to study the early warning of corporate financial crisis. In particular, he established the Z-score model, a multivariate linear function, to predict financial crisis [2]. In the 1970s, Meyer and Pifer [3] used the LPM (linear probability model) to analyze financial crisis warning in the banking industry. The linear probability model is a special case of the multivariate analysis model that estimates the probability of business failure. Later, Laitinen et al. [4] applied the LPM to corporate financial crisis.

In 2002, Huaiyi Zhu and Yong Gao introduced an artificial neural system into crisis early warning. They designed a crisis early warning system for core competency strategy that detects and anticipates deviations in time, so that control can be implemented and the sustainability and efficiency of core competency strategies ensured [5]. Hua Ren and Xusong Xu used fuzzy optimization and a BP neural network to derive and design the data analysis subsystem and alarm subsystem, and constructed a crisis warning index system [6]. Yingyu Wu et al. established a corporate financial crisis identification system from financial and non-financial perspectives, using principal component analysis and neural network technology [7]. Yang et al. introduced Benford's law into the logistic model for financial risk early warning, adding effective variables that represent the quality of financial data to improve the prediction accuracy of the model [8]. Genetic algorithm models predict corporate bankruptcy by optimizing parameters under multiple constraints. Conditional ratios and qualitative variables of financial indicators can be used for conditional discrimination and rule extraction, and the resulting structure is clear [9].

Koyuncugil and Ozgulbas used neural network technology to predict financial risks. They established a financial risk warning model with five financial ratio indicators, and the accuracy of the prediction results improved significantly [10]. Banerjee, Arindam, Prachi et al. used three multivariable models, SVM and ANN, to study the financial risks of companies. The financial risk estimated by SVM was closest to the real result [11]. Cao introduced the variables of the option pricing model into the enterprise financial risk early-warning model, focusing on the option variables related to corporate financial risk [12]. Xiao, Yang, Pang et al. used a genetic algorithm and a neural network to construct a financial risk early warning system, which greatly improved the accuracy of the prediction results and gradually turned the system into a dynamic one [13].

Fang, Shyng, Lee et al. applied the rough set method to predicting corporate financial risk, which improves the speed of data analysis and gives companies time and room to choose how to resolve financial risks [14]. Maimon and Rokach revisited the concept of data mining as identifying, from massive data, information that is dense, timely, and easy to understand and that can support the decisions of enterprise managers. Selecting and storing information belongs to data mining in the broad sense [15].

Biao Song, Jianming Zhu, and Xu Li performed sentiment analysis on information collected through the Internet and established a financial risk early warning model on that basis. The financial risk predicted by this big-data-based model deviates only slightly from the actual risk [16]. Liang Zhang, Lingling Zhang, and Yibing Chen combined a logistic regression model and an SVM model to establish a financial risk early warning model based on information fusion. This method improves the feasibility of the early warning model, and its accuracy is far higher than that of either method alone [17].

With the rapid development of the capital market, the requirements of companies for risk management are increasing. Evaluating financial risks in enterprise management and providing timely warnings has become both a research hotspot and a difficult problem. The foregoing studies share several problems: they rely on many assumptions, cannot handle massive data, do not consider the time continuity of financial indicators, and do not track the fluctuations and trends of financial indicators. This work analyzed corporate financial risk based on fuzzy association rules and dynamic maintenance methods. First, for the time series generated by a complex system, we studied the correlation characteristics of the internal or local morphology of the series, and softened the division boundaries of the time-domain attribute universe through a fuzzy clustering algorithm. Then, an improved parallel mining algorithm for Boolean attribute association rules was used to find the frequent fuzzy attribute sets and determine the fuzzy association rules. On this basis, an enterprise financial risk analysis model was established using the fuzzy association rule mining algorithm.

2 Problem description

Association rule mining finds correlations between different items in the same event [18, 19]. The mining strategy consists of frequent itemset generation and rule generation. The former finds all frequent itemsets that meet the minimum support threshold, and the latter extracts rules with high confidence from the frequent itemsets. These rules are called strong rules. Association rules are an important topic in data mining and have been studied extensively [20]. At present, Apriori and FP-Growth are the representative association rule mining algorithms.

2.1 Apriori algorithm based on candidate pattern generation and testing

Based on the support-confidence framework, the Apriori algorithm [21] proposed by Agrawal et al. iteratively generates frequent pattern sets of all lengths. The algorithm exploits the anti-monotonicity of frequent patterns, and a lattice structure is often used to enumerate all possible itemsets. A data set containing d different items can generate up to \(2^{d} - 1\) non-empty candidate itemsets and R rules:

$$R = \sum_{k = 1}^{d - 1} \left[ \binom{d}{k} \times \sum_{j = 1}^{d - k} \binom{d - k}{j} \right] = 3^{d} - 2^{d + 1} + 1$$
(1)
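As a quick sanity check on Eq. (1), the double sum can be evaluated directly and compared with the closed form. The sketch below is only illustrative; the function names `rule_count` and `closed_form` are not from the original work.

```python
from math import comb


def rule_count(d: int) -> int:
    """Evaluate the double sum in Eq. (1): choose a k-item antecedent,
    then a non-empty consequent from the remaining d - k items."""
    return sum(
        comb(d, k) * sum(comb(d - k, j) for j in range(1, d - k + 1))
        for k in range(1, d)
    )


def closed_form(d: int) -> int:
    """Right-hand side of Eq. (1)."""
    return 3**d - 2 ** (d + 1) + 1


# The two expressions agree for small d; for d = 6 both give 602 rules.
for d in range(2, 10):
    assert rule_count(d) == closed_form(d)
print(rule_count(6))
```

Even for a modest number of items the rule space grows exponentially, which is why the pruning described next matters.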

A brute-force method for finding frequent itemsets is to determine the support count of every candidate itemset in the lattice. If an itemset is frequent, all of its subsets must also be frequent; this property is called the anti-monotonicity of frequent patterns. Conversely, if an itemset is infrequent, all of its supersets are infrequent, so the entire subgraph containing them can be pruned immediately. In rule generation, the support of a rule \(A \Rightarrow B\) is the probability that itemsets A and B occur together in the database, which gives the rule a certain statistical significance. The confidence is the conditional probability that B occurs given that A occurs, and it measures the strength of the rule.
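The two measures can be illustrated with a minimal sketch. The toy transactions and item labels such as `"ROE=1"` are hypothetical and are not taken from the data set used in this work.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as supp(A and B) / supp(A)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical transactions: each one is a set of discretized indicator grades.
transactions = [
    frozenset({"ROE=1", "DEBT=5"}),
    frozenset({"ROE=1", "DEBT=5", "CASH=1"}),
    frozenset({"ROE=1", "DEBT=3"}),
    frozenset({"ROE=4", "DEBT=2"}),
]

supp = support(transactions, {"ROE=1", "DEBT=5"})       # 2 of 4 transactions
conf = confidence(transactions, {"ROE=1"}, {"DEBT=5"})  # 2 of the 3 with ROE=1
```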

Let \(L_{k}\) and \(C_{k}\) denote the frequent pattern set and the candidate pattern set of length \(k\), respectively. The database is scanned to generate the candidate 1-itemsets \(C_{1}\), and candidates whose support count falls below the minimum support threshold are discarded, which yields the frequent 1-itemsets \(L_{1}\). The frequent 1-itemsets are joined with themselves to generate the candidate 2-itemsets \(C_{2}\), which are pruned using anti-monotonicity and then filtered by support count. The process is repeated for increasing \(k\) \((k \ge 1)\), producing \(L_{k}\) at each pass, until no more frequent itemsets are generated [22].
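The level-wise procedure can be sketched as follows. This is a minimal, unoptimized illustration of the join and prune steps, not the implementation used in this work; the function name `apriori` and the toy `baskets` are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # Scan the data once and keep candidates that reach min_support.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})  # C_1 -> L_1
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join: unite pairs of frequent (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: anti-monotonicity requires every (k-1)-subset to be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)  # C_k -> L_k
        result.update(level)
        k += 1
    return result


# Hypothetical toy data: all pairs are frequent, the triple is not.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
freq = apriori(baskets, min_support=0.5)
```

The loop stops at the first level that yields no frequent itemsets, which corresponds to the termination condition described above.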

2.2 Time series data mining

Time series data mining studies how things evolve by analyzing the temporal characteristics of data, so that knowledge is obtained from data with time characteristics. Large amounts of time series data are mined for latent, previously unknown rules that are closely related to time. These rules can be used to predict the short-term, medium-term, or long-term development trends of time data.

Let Y denote a time series, which can be represented by

$$Y = f(T,S,C,e)$$
(2)

where \(T\) is the long-term trend, which indicates that the predicted value steadily increases, decreases, or remains at a certain level according to a certain rule over time; \(S\) the seasonal change, which means the predicted value has a regular periodic change within a certain time; \(C\) the cyclical variation, which represents that the predicted value cyclically changes over a long period; \(e\) the random term, which indicates the impact of an unexpected and accidental factor on time series.

To reveal the regularity of the data, the time series needs to be smoothed and deseasonalized. The processing steps are as follows:

Step 1: Estimate the long-term trend term T to get the product \(Se = \frac{Y}{T}\) of the seasonal variation term and the error term. For monthly data, smooth the series with a centered 12-month moving average (six months on each side of \(t\)):

$$\hat{y}_{m6} = \frac{0.5y_{t - 6} + y_{t - 5} + y_{t - 4} + \ldots + y_{t + 4} + y_{t + 5} + 0.5y_{t + 6}}{12}$$
(3)

For quarterly data, use a centered 4-quarter moving average (two quarters on each side of \(t\)):

$$\hat{y}_{q2} = \frac{0.5y_{t - 2} + y_{t - 1} + y_{t} + y_{t + 1} + 0.5y_{t + 2}}{4}$$
(4)

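Dividing the original data by the moving average gives the detrended data

$$\frac{y}{{\hat{y}}} = S \times e$$
(5)

The moving average \(\hat{y}\) contains no seasonality, so the ratio retains the seasonal factor \(S\) together with the error term \(e\).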

Step 2: Remove the error term and estimate the seasonal term \(S\). The values corresponding to the different seasons are called seasonal factors, and they are standardized. The data \(\frac{y}{{\hat{y}}}\) obtained after removing the long-term trend contains the seasonal and random error terms. Averaging the data from the same season across different years removes the error term and leaves the seasonal term, denoted \(z_{i}\) for season \(i\). To ensure that the seasonal indices average to 1, the seasonal factors are then normalized.

Normalize the monthly data:

$$zb_{im} = \frac{{z_{i} \times 12}}{{\sum\nolimits_{j = 1}^{12} {z_{j} } }}$$
(6)

Normalize the quarterly data:

$$zb_{iq} = \frac{{z_{i} \times 4}}{{\sum\nolimits_{j = 1}^{4} {z_{j} } }}$$
(7)

Step 3: Remove the seasonal terms from original data to obtain the seasonally adjusted data.
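Steps 1 to 3 can be sketched in plain Python as below. The sketch assumes a multiplicative model, an even period (12 for monthly data, 4 for quarterly data), and a series whose first observation falls in season 0; the function names and the synthetic series are illustrative only.

```python
def centered_moving_average(y, period):
    """Centered moving average for an even period, as in Eqs. (3) and (4).

    Entries are None where the window does not fit inside the series.
    """
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        inner = sum(y[t - half + 1 : t + half])  # full-weight terms
        trend[t] = (0.5 * y[t - half] + inner + 0.5 * y[t + half]) / period
    return trend


def seasonal_factors(y, period):
    """Normalized seasonal factors, as in Eqs. (6) and (7)."""
    trend = centered_moving_average(y, period)
    ratios = [[] for _ in range(period)]
    for t, tr in enumerate(trend):
        if tr is not None:
            ratios[t % period].append(y[t] / tr)  # detrended value S * e
    z = [sum(r) / len(r) for r in ratios]  # averaging suppresses the error term
    return [zi * period / sum(z) for zi in z]


def seasonally_adjust(y, period):
    """Step 3: divide each observation by its seasonal factor."""
    s = seasonal_factors(y, period)
    return [value / s[t % period] for t, value in enumerate(y)]


# Hypothetical quarterly series: a linear trend times a fixed seasonal pattern.
true_s = [0.8, 1.1, 1.2, 0.9]
series = [(100 + 2 * t) * true_s[t % 4] for t in range(12)]
factors = seasonal_factors(series, 4)
adjusted = seasonally_adjust(series, 4)
```

On this synthetic series the estimated factors stay close to the pattern used to build it, and the adjusted series follows the underlying linear trend.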

2.3 Fuzzy FCM clustering

The FCM (fuzzy c-means) clustering algorithm was first proposed by Bezdek [23]. Compared with other clustering algorithms, it is effective and simple to compute, and it is widely used in industrial processes.

Given the data sample set \(\phi = \{ \phi (1),\phi (2), \ldots ,\phi (N)\}\), the FCM clustering algorithm obtains the membership matrix \({\mathbf{U}} = [\mu_{q,k} ]_{c \times N}\) and the cluster centers \({\mathbf{V}} = [v_{1} ,v_{2} , \ldots ,v_{c} ]\) by minimizing an objective function. When the number \(c\) of clusters is fixed, the objective function of the FCM clustering algorithm can be expressed as

$$J(U,V,\Phi ) = \sum\limits_{{{\text{q}} = 1}}^{{\text{c}}} {\sum\limits_{{{\text{k}} = 1}}^{{\text{N}}} {\mu_{q,k}^{{\text{w}}} ||\phi (kT) - v_{{\text{q}}} ||^{2} } }$$
(8)

where \(w \in (1,\infty )\) is the weighting exponent that controls the fuzziness of the membership matrix; it is often set to 2. The memberships \(\mu_{q,k}\) satisfy

$$\left\{ \begin{aligned} &\sum_{q = 1}^{c} \mu_{q,k} = 1, && k = 1,2, \ldots ,N \\ &0 < \sum_{k = 1}^{N} \mu_{q,k} < N, && q = 1,2, \ldots ,c \end{aligned} \right.$$
(9)

Minimizing the objective function in Eq. (8) subject to the constraints in Eq. (9) gives

$$v_{q} = \frac{\sum_{k = 1}^{N} \mu_{q,k}^{w} \phi (kT)}{\sum_{k = 1}^{N} \mu_{q,k}^{w}},\quad q = 1,2, \ldots ,c$$
(10)

$$\mu_{q,k} = \frac{1}{\sum_{j = 1}^{c} \left( \frac{||\phi (kT) - v_{q} ||}{||\phi (kT) - v_{j} ||} \right)^{2/(w - 1)}},\quad q = 1,2, \ldots ,c,\quad k = 1,2, \ldots ,N$$
(11)

Equations (10) and (11) are coupled, so they cannot be solved analytically. The FCM clustering algorithm instead iterates between them to approximately minimize the objective function.

Step 1: Give the data sample set \({{\varvec{\Phi}}}\), the number \(c\) of clusters, and an arbitrary initial membership matrix \({\mathbf{U}}_{0}\).

Step 2: Calculate the cluster center vectors \(v_{q} ,q = 1,2, \ldots ,c\) according to Eq. (10).

Step 3: Recalculate the membership matrix \({\mathbf{U}}\) from the centers \(v_{q}\) obtained in Step 2 using Eq. (11). If \(||\phi (kT) - v_{j} || = 0\) for some \(j\), set \(\mu_{j,k} = 1\) and \(\mu_{q,k} = 0,\forall q \ne j\).

Step 4: Repeat Steps 2 and 3 until the given convergence criterion is satisfied, for example \(||{\mathbf{U}}_{l} - {\mathbf{U}}_{l - 1} || \le \varepsilon\), where \(|| \bullet ||\) is a matrix norm, \(l\) is the iteration index, and \(\varepsilon\) is the termination threshold. Satisfactory accuracy can be achieved with \(\varepsilon = 0.01\).
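The four steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the Euclidean distance is used, the convergence norm is taken as the largest absolute change in any membership, and the function name `fcm`, its parameters, and the toy data are hypothetical rather than taken from this work.

```python
import random


def fcm(points, c, w=2.0, eps=0.01, max_iter=100, u0=None, seed=0):
    """Fuzzy c-means: returns (membership matrix U, cluster centers V)."""
    n, dim = len(points), len(points[0])
    # Step 1: arbitrary initial memberships, each column scaled to sum to 1.
    if u0 is None:
        rng = random.Random(seed)
        u0 = [[rng.random() for _ in range(n)] for _ in range(c)]
    u = [row[:] for row in u0]
    for k in range(n):
        col = sum(u[q][k] for q in range(c))
        for q in range(c):
            u[q][k] /= col
    centers = []
    for _ in range(max_iter):
        # Step 2: cluster centers from Eq. (10).
        centers = []
        for q in range(c):
            weights = [u[q][k] ** w for k in range(n)]
            total = sum(weights)
            centers.append([
                sum(weights[k] * points[k][d] for k in range(n)) / total
                for d in range(dim)
            ])
        # Step 3: memberships from Eq. (11).
        new_u = [[0.0] * n for _ in range(c)]
        for k in range(n):
            dist = [
                sum((points[k][d] - centers[q][d]) ** 2 for d in range(dim)) ** 0.5
                for q in range(c)
            ]
            zero = [q for q in range(c) if dist[q] == 0.0]
            if zero:  # the sample coincides with a center
                new_u[zero[0]][k] = 1.0
                continue
            for q in range(c):
                new_u[q][k] = 1.0 / sum(
                    (dist[q] / dist[j]) ** (2.0 / (w - 1.0)) for j in range(c)
                )
        # Step 4: stop once no membership changes by more than eps.
        change = max(abs(new_u[q][k] - u[q][k]) for q in range(c) for k in range(n))
        u = new_u
        if change <= eps:
            break
    return u, centers


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
u0 = [[0.9, 0.2, 0.7, 0.4, 0.6, 0.3],
      [0.1, 0.8, 0.3, 0.6, 0.4, 0.7]]
u, centers = fcm(points, c=2, u0=u0)
```

With this starting matrix the first three points end up with high membership in the first cluster and the last three in the second; a different \({\mathbf{U}}_{0}\) may swap the cluster labels.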

After iteration, the membership matrix \({\mathbf{U}}\) and cluster center \({\mathbf{V}}\) can be obtained. That is, with the given number \(c\) of clusters, the parameter \(v_{q}\) to be identified is determined, and \(s_{q}\) can be determined by the nearest neighbor heuristic algorithm:

$$s_{q} = \left[ \frac{1}{p}\sum_{l = 1}^{p} \left( (c_{q} - c_{l} )^{T} (c_{q} - c_{l} ) \right)^{1/2} \right],\quad q = 1,2, \ldots ,c$$
(12)

where \(p\) is the number of nearest neighbors of the \(q\)-th cluster, and \(c_{l} (l = 1,2, \ldots ,p)\) is the cluster center of each nearest neighbor of \(c_{q}\).

After the time series of length \(n\) is pre-processed according to the method in Sect. 2.2, the FCM clustering algorithm is used to partition its attributes. The monthly fluctuations relative to the average trend are taken as a new time series in place of the original one, and the above method is then used for fuzzy clustering. Each local subsequence is thereby softened into a representative form.

Let \(X = \left\{ {X_{1} ,X_{2} , \ldots ,X_{n} } \right\}\) be the new time series. A time window of width \(\omega\) is applied to \(X\), forming the subsequence \(M_{i} = \left\{ {X_{i} ,X_{i + 1} , \ldots ,X_{i + \omega - 1} } \right\}\) of length \(\omega\). Sliding the window one step at a time from the beginning to the end of \(X\) produces the subsequences \(M_{1} ,M_{2} , \ldots ,M_{n - \omega + 1}\). The set \(W(X,\omega ) = \left\{ {M_{i} |i = 1,2, \ldots ,n - \omega + 1} \right\}\) contains all subsequences obtained by sliding a window of width \(\omega\) over \(X\). \(W(X,\omega )\) is regarded as \(n - \omega + 1\) points in \(\omega\)-dimensional Euclidean space, and the FCM method is used to cluster them.
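The construction of the window set can be sketched in a few lines; the function name `sliding_windows` is illustrative. A series of length 10 with a window of width 3 yields 10 - 3 + 1 = 8 subsequences, each of which is one point in 3-dimensional space.

```python
def sliding_windows(x, width):
    """All contiguous subsequences of `x` with the given width, step 1."""
    return [tuple(x[i : i + width]) for i in range(len(x) - width + 1)]


# Hypothetical series of length 10 and window width 3.
windows = sliding_windows(list(range(10)), 3)
```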

3 Early warning model of corporate financial risk based on fuzzy association rules

Corporate financial risk analysis selects appropriate risk analysis models and risk analysis indicators [23, 24]. The degree of risk is described quantitatively, which defines the level of corporate financial risk [25, 26]. The analysis gives management theoretical and practical evidence for taking measures to control risk.

3.1 Selection of financial indicators and correlation analysis

To examine the impact of financial indicators on corporate financial risk, this work selected indicators closely related to financial risk from five groups: profitability, operating ability, growth ability, debt-paying ability, and cash flow. After a correlation analysis of these indicators, some highly correlated ones are eliminated to simplify the model. The correlation coefficient between two financial indicators is given by

$$r_{x,y} = \frac{n\sum x_{i} y_{i} - \sum x_{i} \sum y_{i}}{\sqrt{n\sum x_{i}^{2} - \left( \sum x_{i} \right)^{2}} \sqrt{n\sum y_{i}^{2} - \left( \sum y_{i} \right)^{2}}}$$
(13)

where \(x,y\) are two variables observed over \(n\) samples, and \(r_{x,y}\) is their correlation coefficient, satisfying \(- 1 \le r_{x,y} \le 1\). When \(\left| {r_{x,y} } \right| = 1\), \(x,y\) are perfectly linearly correlated: \(r_{x,y} = 1\) indicates perfect positive correlation and \(r_{x,y} = - 1\) perfect negative correlation. \(r_{x,y} = 0\) indicates that \(x,y\) are linearly uncorrelated. \(0 < \left| {r_{x,y} } \right| < 1\) indicates a partial linear relationship whose strength increases with \(\left| {r_{x,y} } \right|\).

Indicators with high positive or negative correlations are excluded to reduce the collinearity between financial indicators.
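The screening step can be sketched as below. The cutoff of 0.9 and the greedy keep-first strategy are assumptions made for illustration, since no specific threshold or elimination order is stated here; the indicator values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Correlation coefficient computed exactly as in Eq. (13)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx**2) * sqrt(n * syy - sy**2))


def drop_collinear(columns, threshold=0.9):
    """Keep an indicator only if |r| with every kept indicator is below
    `threshold` (an assumed cutoff, not a value from this work)."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical indicator columns: X2 is a multiple of X1, X3 is weakly related.
columns = {
    "X1": [1, 2, 3, 4, 5],
    "X2": [2, 4, 6, 8, 10],
    "X3": [5, 1, 4, 2, 3],
}
kept = drop_collinear(columns)
```

Here X2 is dropped because it is perfectly correlated with X1, while X3 is kept.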

3.2 Data preprocessing and reconstruction

In order to reduce the deviation of corporate financial risk analysis, it is necessary to clean up the collected sample data and eliminate all abnormal values of financial indicators. At the same time, in order to mine the association rules in the next step, the continuous financial indicator data needs to be discretized according to the financial risk grade.

William Weitzel et al. divided the process from recession to final death of enterprise into five stages: blind stage, sluggish stage, wrong-action stage, crisis stage and extinction stage. Therefore, all indicators can be divided into 5 stages according to the principle of enterprise life cycle, which are represented by 1, 2, 3, 4 and 5, respectively. According to the distribution of indicator values, the financial crisis early-warning indicators are divided into five sub-areas (See Fig. 1).

Fig. 1 Equal area partition of warning indicators of financial crisis

In the recession and metamorphosis periods, the enterprise must continuously reform to seek metamorphosis, or it will die out. Therefore, the establishment of appropriate early warning mechanisms is necessary for modern enterprises.

Database reconstruction discretizes the continuous data of the financial crisis warning indicators into financial indicator data suitable for association rule mining. Because the data set consists of the financial data of different companies, each financial indicator variable is approximately normally distributed. Equal area partition of the normal distribution is therefore used to discretize the continuous variables: each financial indicator variable is discretized into 5 grades according to the 1/5, 2/5, 3/5, and 4/5 quantiles of its distribution function.
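A minimal sketch of the equal area partition is given below. It assumes that a normal distribution is fitted to each indicator by its sample mean and standard deviation, and that grade 1 corresponds to the lowest values; whether low or high values signal more risk depends on the indicator and is not handled here. The function name and sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def equal_area_grades(values, levels=5):
    """Map each value to a grade 1..levels using the quantiles of a
    normal distribution fitted to `values`."""
    dist = NormalDist(mean(values), stdev(values))
    cuts = [dist.inv_cdf(i / levels) for i in range(1, levels)]
    return [1 + sum(v > cut for cut in cuts) for v in values]


# Hypothetical indicator values, symmetric around zero.
grades = equal_area_grades([-2.0, -0.5, 0.0, 0.5, 2.0])
```

For this symmetric sample the five values fall into grades 1 through 5 in order.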

3.3 FARM algorithm (fuzzy association rules big data mining algorithm)

The work used the Apriori algorithm, based on candidate pattern generation and testing, to determine the frequent pattern sets, and parallelized the counting of the candidate pattern sets. The candidates were counted on each processor after the data was partitioned, and the processors communicated through message passing. The time series was obtained after processing by the parallel algorithm. The continuous attributes were discretized to obtain new fuzzy-attribute data sets, which were partitioned across the processors. During a data scan, each processor calculated its local fuzzy support counts asynchronously and independently. The processors synchronized at the end of each scan, and every processor kept the same candidate sets. The inputs were the minimum fuzzy support \(\sup_{\min }\) and the minimum fuzzy confidence \(conf_{\min }\); the output was the association rule set \(S_{ar}\). The algorithm steps are as follows:

Step 1: Set up the parallel processor \(p_{1} ,p_{2} , \ldots ,p_{n}\).

Step 2: Divide the transaction database into multiple partitions and allocate them to each processor separately.

Step 3: On each processor, cluster the local partition using the FCM algorithm and transform it into a new data set. Use continuous-attribute discretization to transform the obtained time series into a new database, and construct a prefix tree according to \(\sup_{\min } ,conf_{\min }\).

Step 4: Perform a local count on each processor using the data set from Step 3. For each transaction in the local partition and each itemset in the candidate set, update the local count if the itemset is contained in the transaction. Propagate the local counts to the other processors.

Step 5: Calculate the global count, and generate a rule set.
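The counting scheme of Steps 2, 4, and 5 can be sketched as below. The sketch runs the "processors" sequentially in one process instead of using message passing, omits the FCM and prefix-tree parts of Step 3, and assumes that the fuzzy support of an itemset in a record is the minimum membership degree of its items, which is one common choice and is not specified in this work. All names and data are hypothetical.

```python
from collections import Counter


def partition(records, n_workers):
    """Step 2: deal the records out to the workers round-robin."""
    return [records[i::n_workers] for i in range(n_workers)]


def local_fuzzy_counts(part, candidates):
    """Step 4: fuzzy support count of each candidate on one partition."""
    counts = Counter()
    for record in part:
        for cand in candidates:
            # Assumed t-norm: minimum membership over the items of the candidate.
            counts[cand] += min(record.get(item, 0.0) for item in cand)
    return counts


def global_frequent(records, candidates, min_support, n_workers=2):
    """Step 5: add up the local counts and keep the frequent candidates."""
    total = Counter()
    for part in partition(records, n_workers):  # one pass per "processor"
        total.update(local_fuzzy_counts(part, candidates))
    n = len(records)
    return {c: total[c] / n for c in candidates if total[c] / n >= min_support}


# Hypothetical fuzzy records: item -> membership degree.
records = [
    {"A.low": 0.8, "B.high": 0.6},
    {"A.low": 0.4, "B.high": 0.9},
    {"A.low": 1.0},
    {"B.high": 0.5},
]
candidates = [
    frozenset({"A.low"}),
    frozenset({"B.high"}),
    frozenset({"A.low", "B.high"}),
]
frequent = global_frequent(records, candidates, min_support=0.3)
```

Because the global count is a plain sum of local counts, the result does not depend on how the records are split, which is what allows each processor to scan its partition independently.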

The rules are filtered according to the timing constraints that their antecedent and consequent attributes must satisfy, which yields the timing rules. The trend of the rules is used to judge the degree of enterprise crisis, giving a qualitative analysis of the financial crisis. Calculating the crisis coefficient then determines the enterprise crisis stage, which gives a quantitative analysis. If a rule has a low-grade antecedent and a high-grade consequent, the corporate crisis is worsening; otherwise, the crisis is easing. If the rules stay at the first stage, the enterprise crisis is relatively light; at the third stage, the crisis is moderate; at the fifth stage, the enterprise is on the verge of bankruptcy.

Crisis coefficient is introduced to calculate the specific degree of enterprise financial crisis:

$$F = F(\overline{x}_{i} ,\sup (x_{i} ),conf(x_{i} )) = \frac{1}{n}\sum\limits_{i = 1}^{n} {\overline{x}_{i} \sup (x_{i} ) + } \frac{1}{n}\sum\limits_{i = 1}^{n} {\overline{x}_{i} conf(x_{i} )}$$
(14)

where \(n\) is the number of rules, and \(\overline{x}_{i}\) the discretized variable data.
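Equation (14) can be evaluated with a short function. The sketch assumes that each rule is summarized by its grade \(\overline{x}_{i}\), its support, and its confidence; the function name and the two example rules are hypothetical.

```python
def crisis_coefficient(rules):
    """Eq. (14): mean of grade * support plus mean of grade * confidence.

    `rules` is a list of (grade, support, confidence) tuples.
    """
    n = len(rules)
    support_term = sum(g * s for g, s, _ in rules) / n
    confidence_term = sum(g * c for g, _, c in rules) / n
    return support_term + confidence_term


# Two hypothetical rules: grade 5 (supp 0.2, conf 0.9) and grade 3 (supp 0.3, conf 0.8).
F = crisis_coefficient([(5, 0.2, 0.9), (3, 0.3, 0.8)])
# (1.0 + 0.9) / 2 + (4.5 + 2.4) / 2 = 0.95 + 3.45 = 4.4
```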

4 Example simulation

A minor financial crisis may be just a temporary difficulty in capital turnover, while a serious financial crisis means business failure or bankruptcy liquidation. The path from financial crisis to corporate bankruptcy is a gradual process, and companies that take appropriate measures may resolve the crisis [27].

The work selected Chinese ST listed companies as the research object and their annual and quarterly statements from 2003 to 2018 as the data source, collecting a total of 32 financial indicators (see Table 1).

Table 1 Financial indicators

The data samples of the selected financial indicators were sorted out to remove outliers. Equation (13) was used to calculate the correlation coefficients between financial indicators, and within each group the indicators with high absolute correlation coefficients were excluded. The excluded indicators were return on net assets X3, net profit X5, main business income per share X10, receivables turnover days X12, inventory turnover days X14, current assets turnover days X16, principal business income growth rate X17, net asset growth rate X19, quick ratio X24, and cash flow ratio X32. The remaining indicators were discretized according to the financial risk level to obtain a reconstructed financial indicator database.

The 12 financial indicators of each enterprise over 12 quarters were summarized as one record for time series analysis. The discretized data set was coded at the granularity of each grade of each financial indicator in each quarter and taken as the input of the association rule mining algorithm. Mining was performed with a sliding window width of \(\omega = 3\) and 4 clusters. The proposed algorithm was then used to mine the rules, yielding the fuzzy association rule set. The association rules were also used to predict financial indicators and issue crisis warnings: the grade attribute set of the financial indicators of an analysis target is matched, in whole or in part, against the antecedents of the rules in the mined rule base to obtain the corresponding crisis warning information.

The financial index data of Chinese ST listed companies from 2013–2018 was selected, with a total of 720 data records for 30 companies. Each record contained 22 financial indicators after excluding high-correlation ones. The proposed algorithm and traditional Apriori algorithm were used to generate frequent pattern sets, and Fig. 2 shows the running time of algorithm.

Fig. 2 Performance comparison between Apriori and FARM algorithm

Coordinate X is the support threshold, with the variation range of 0.25–0.15, and the step size of 0.01. Coordinate Y is the running time for computing frequent pattern sets. With different support thresholds, the proposed algorithm has shorter running time and higher operating efficiency.

Figure 3 shows the number of rules under different support thresholds and confidence thresholds. Coordinate X is the support degree; coordinate Y the confidence degree; coordinate Z the rule number. When different support thresholds and confidence thresholds are selected, different association rules can be obtained according to financial-indicator database.

Fig. 3 Relation of association rules, support threshold and confidence threshold

Mining the financial indicators shows that when listed companies face financial risk, certain key financial indicators appear frequently (see Table 2). These key indicators are the main factors for judging whether an enterprise has financial risk, and their fluctuation determines the risk level of the enterprise.

Table 2 Key financial indicators

5 Conclusions

With the development of the economy and the arrival of the big data era, enterprises can collect and store all of their business activity data. The work proposed a fuzzy association rule mining algorithm for time series based on FCM clustering, in which the FCM clustering algorithm performs fuzzy discretization of the cleaned time data. A parallel association rule mining algorithm was used to obtain the frequent fuzzy attribute sets, and multiple processors generated in parallel the fuzzy association rules that satisfy the minimum fuzzy confidence. The rules between all financial indicators were mined to determine the more representative financial risk indicators. The big data mining algorithm of the Internet of Things was established based on fuzzy association rules to obtain a model for corporate financial risk analysis, and the rules found between financial indicators were used to predict corporate financial crisis. The method was verified by experiment, which showed that the fluctuation of the key indicators determines the degree of enterprise risk.