Keywords

1 Introduction

Data mining plays a vital role in today's applications, so researchers are paying increasing attention to new techniques that can improve it. It covers a large domain and is frequently applied in areas such as business, medicine, and biometrics. Fuzzy concepts have a great impact on data mining methodology, and various data warehouses are managed to store and use data efficiently across different domains. A time series is a collection of data points, each with a specific value at an instant of time, that varies with respect to time. This paper proposes an algorithm that induces fuzzy association rules with multiple minimum support values. Many algorithms have been proposed earlier, but they use a single minimum support for all items, whereas different itemsets may require different minimum supports. To illustrate this, we apply fuzzy concepts to time series data, specifically temperature data. Time series data falls under the category of sequence data, which carries some trend or pattern, so the algorithm can predict the near-term temperature using trend analysis. The proposed algorithm has two advantages. First, the results are easy to understand, because fuzzy theory is close to natural language. Second, it helps to detect sudden changes in the temperature of a place.

The remaining parts of this paper are organized as follows: a review of fuzzy set theory is given in Sect. 2. The related work is discussed in Sect. 3. The proposed algorithm is explained in Sect. 4. Experimental results are shown in Sect. 5. Finally, conclusion and future work are discussed in Sect. 6.

2 Fuzzy Set Theory

Fuzzy set theory was introduced in 1965 by Zadeh in his seminal paper entitled “Fuzzy sets”, and has played a vital role in modeling human thinking, particularly in the domains of pattern recognition, communication of information, and abstraction. Fuzzy set theory is built on fuzzy membership functions: a fuzzy set expresses the degree to which an element belongs to a set through its characteristic function. For a given crisp set B, the characteristic function assigns a value \(\mu _{\mathrm{B}}\)(x) to every x \(\in \) X such that

$$\begin{aligned} \mu _\text {B} \left( \text {x} \right) ={\left\{ \begin{array}{ll} 1 &{} \text {iff x}\in \text {B} \\ 0 &{} \text {iff x}\notin \text {B} \end{array}\right. } \end{aligned}$$

Assume that \(\text {x}_{1}\) to \(\text {x}_{\text {k}}\) are the elements in fuzzy set B, and \(\mu _{1}\) to \(\mu _{\text {k}}\) are their respective grades of membership in B. B is usually represented as follows:

$$\begin{aligned} \text {B }=\mu _1 /\text {x}_1 +\mu _2 /\text {x}_2 +\cdots +\mu _\text {k} /\text {x}_\text {k} \end{aligned}$$
(1)

3 Related Work

Data mining is frequently used to induce association rules from large itemsets. An association rule describes the effect of the presence or absence of an item in a transaction on other items, in terms of two measures: support and confidence.

Hong proposed an algorithm that induces association rules with multiple minimum supports, using maximum constraints on general items. Au and Chan proposed a fuzzy mining approach to find fuzzy rules for time series data. Das proposed a mining algorithm for time series data prediction, which uses a clustering method to extract basic shapes from the time series and then applies the Apriori method to induce association rules on them.

4 The Proposed Algorithm with Multiple Minimum Support

Input: A time series TS with n data points, a list of m membership functions for data points, a predefined minimum support threshold for each fuzzy item \(\text {ms}_{\text {i}}\), i \(=\) 1 to z, a predefined minimum confidence threshold \(\lambda \), and a sliding window size ws.

  1. Step 1:

    Convert the time series TS into a list of subsequences W(TS) according to the sliding-window size ws. That is, \(\text {W}\left( {\text {TS}} \right) =\{\text {s}_\text {b}|\text {s}_\text {b} =\left( {\text {d}_\text {b},\text { d}_{\text {b}+1}, \ldots , \text { d}_{\text {b}+\text {ws}-1} } \right) ,\text { b }={1 \text {to n}}-\text {ws}+1\}\), where \(\text {d}_{\text {b}}\) is the value of the b-th data point in TS.

  2. Step 2:

    Transform the k-th (k \(=\) 1 to ws) quantitative value \(v_{\text {bk}}\) in each subsequence \(\text {s}_{\text {b}}\) (b \(=\) 1 to n\(-\)ws\(+\)1) into a fuzzy set \(\text {f}_{\text {bk}}\) represented as \(\left( {\text {f}_{\text {bk1}} /\text {R}_{\text {k1}} +\text {f}_{\text {bk2}} /\text {R}_{\text {k2}} +\cdots +\text {f}_{\text {bkm}} /\text {R}_{\text {km}} } \right) \) using the given membership functions, where \(\text {R}_{\text {kl}}\) is the l-th fuzzy region of the k-th data point in each subsequence, m is the number of membership functions, and \(\text {f}_{\text {bkl}}\) is \(\text {v}_{\text {bk}}\)’s fuzzy membership value in region \(\text {R}_{\text {kl}}\). Each \(\text {R}_{\text {kl}}\) is called a fuzzy item.

  3. Step 3:

    Compute the scalar cardinality of each fuzzy item \(\text {R}_{\text {kl}}\) as

    $$\begin{aligned} \text {Count}_{\text {kl}} =\sum _{b=1}^{n-ws+1} {f_{\text {bkl}} } \end{aligned}$$
  4. Step 4:

    Check whether the support value (\(=\text {count}_{\text {kl}}/(\text {n}-\text {ws}+1)\)) of each \(\text {R}_{\mathrm{kl}}\) in \(\text {C}_{1}\) is greater than or equal to its predefined minimum support threshold \(\text {ms}_{\text {Rkl}}\). If \(\text {R}_{\text {kl}}\) satisfies this condition, collect it in the set of large 1-itemsets (\(\text {L}_{1}\)). That is:

    $$\begin{aligned} \text {L}_1 =\{\text {R}_{\text {kl}} |\text {count}_{\text {kl}} \ge \text {ms}_{\text {Rkl}},\; 1\le \text {k}\le \text {ws}\;{\text {and}}\;1\le l\le \text {m}\}. \end{aligned}$$
  5. Step 5:

    IF \(\text {L}_{1}\) is not null, then perform the next step; otherwise, terminate the algorithm.

  6. Step 6:

    Set t \(=\) 1, where t is used to represent the number of fuzzy items in the current itemsets to be processed.

  7. Step 7:

    Join the large t-itemsets \(\text {L}_{\text {t}}\) to obtain the candidate (t \(+\) 1)-itemsets \(\text {C}_{\text {t}+1}\) in the same way as in the Apriori algorithm, with two restrictions: two items obtained from the same order of data points in subsequences cannot coexist in an itemset of \(\text {C}_{\text {t}+1}\), and the support of each large t-itemset used in the join must be greater than or equal to the maximum of the minimum supports of the fuzzy items in that t-itemset.

  8. Step 8:

    Now, perform the following steps for each candidate itemset I in \(\text {C}_{\text {t}+1}\):

    1. (a)

      Compute the fuzzy value of I in each subsequence \(\text {s}_{\text {b}}\) as \(\text {f}_{\text {I}}^{\text {sb}} =\text {f}_{\text {I1}}^{\text {sb}} \wedge \text {f}_{\text {I2}}^{\text {sb}} \wedge \cdots \wedge \text {f}_{\text {I(t+1)}}^{\text {sb}}\), where \(\text {f}_{\text {Ik}}^{\text {sb}}\) is the membership value of fuzzy item \(\text {I}_{\text {k}}\) in \(\text {s}_{\text {b}}\). If the minimum operator is used for the intersection, then:

      $$\begin{aligned} \text {f}_{\text {I}}^{\text {sb}} =\mathop {\text {Min}}\nolimits _{\text {k}=1}^{\text {t}+1}\, \text {f}_{\text {Ik}}^{\text {sb}}. \end{aligned}$$
    2. (b)

      Compute the count of I in all the subsequences as:

      $$\begin{aligned} \text {count}_{\text {I}} =\sum _{b=1}^{n-ws+1} {f_{I}^{sb}} \end{aligned}$$
  9. Step 9:

    If the support (\(=\text {count}_{\text {I}}/(\text {n}-\text {ws}+1)\)) of I is greater than or equal to the maximum of the minimum support values of the fuzzy items in I, put I in \(\text {L}_{\text {t}+1}\):

    $$\begin{aligned} \text {L}_{\text {t}+1} =\{\text {I}\mid \text {count}_{\text {I}} \ge \mathop {\max }\nolimits _{\text {k}=1}^{\text {t}+1} \text {ms}_{\text {Ik}} \}. \end{aligned}$$
  10. Step 10:

    If \(\text {L}_{\text {t}+1}\) is null, then do the next step; otherwise, set t \(=\) t \(+\) 1 and repeat Steps 7–9.

  11. Step 11:

    Generate the association rules for each large h-itemset I with items (\(\text {I}_{1}, \text {I}_{2},\ldots , \text {I}_{\text {h}})\), \(\text {h}\ge \)2, using the following substeps:

    1. (a)

      Form each possible association rule as follows: \(\text {I}_{1} \wedge \cdots \wedge \text {I}_{\text {n}-1} \wedge \text {I}_{\text {n}+1} \wedge \cdots \wedge \text {I}_{\text {h}} \rightarrow \text {I}_{\text {n}}\), n \(=\) 1 to h.

    2. (b)

      Calculate the confidence values of all association rules by the following formula:

    $$\begin{aligned} \frac{\sum \nolimits _{b=1}^{n-ws+1} f_{I}^{sb}}{\sum \nolimits _{b=1}^{n-ws+1} \left( f_{I_1}^{sb} \wedge \cdots \wedge f_{I_{n-1}}^{sb} \wedge f_{I_{n+1}}^{sb} \wedge \cdots \wedge f_{I_h}^{sb} \right) } \end{aligned}$$

Output: A set of association rules that satisfy the condition of the maximum of the minimum supports.

5 An Example

This section explains the working of the proposed algorithm step by step and generates fuzzy association rules (Table 1).

Table 1 Set of data points
Fig. 1 Membership function used in this example

Assume the membership function used in the example as Fig. 1 (Table 2).

Table 2 Predefined minimum support value of all fuzzy itemset
  1. Step 1:

    The window size is assumed to be 5. Using the formula n \(-\) ws \(+\) 1, we get 15 \(-\) 5 \(+\) 1 \(=\) 11 subsequences.

  2. Step 2:

    The data values are then converted into fuzzy itemsets using the membership functions shown in Fig. 1.

  3. Step 3:

    Sum the membership values of each fuzzy region over all the subsequences; this sum is called its count. For example, for the fuzzy item \(\text {Q}_{1}\).Middle, the count is (\(0+0.33+1+0+0.33+1+0+0.2+0+1+0\)) \(=\) 3.86.

  4. Step 4:

    Now, compare the count of each fuzzy item with its individual predefined minimum support count. Fuzzy items whose count is greater than or equal to their own minimum support value are put into \(\text {L}_{1}\) (Table 3).

  5. Step 5:

    If \(\text {L}_{1}\) consists of fuzzy item, proceed to step 6, else terminate.

  6. Step 6:

    Candidate set \(\text {C}_{\text {t}+1}\) is generated from \(\text {L}_{\text {t}}\). The fuzzy items in \(\text {L}_{1}\) are (Q1.Low, Q1.Middle, Q2.Low, Q2.Middle, Q3.Low, Q3.Middle, Q4.Low, Q5.Low, Q5.High).

  7. Step 7:

    \(\text {L}_{1}\) is joined to generate \(\text {C}_{2}\). The fuzzy items in \(\text {C}_{2}\) are as follows: (Q1.Low, Q2.Mid), (Q1.Low, Q3.Mid), (Q1.Low, Q5.High), (Q1.Low, Q5.Low), (Q2.Low, Q3.Low), (Q2.Low, Q4.Low), (Q2.Low, Q5.High), (Q2.Low, Q1.Mid), (Q2.Low, Q3.Mid), (Q2.Low, Q5.Low), (Q3.Low, Q4.Low), (Q3.Low, Q5.High), (Q3.Low, Q1.Mid), (Q3.Low, Q2.Mid), (Q3.Low, Q5.Low), (Q4.Low, Q1.Mid), (Q4.Low, Q2.Mid), (Q4.Low, Q5.High), (Q4.Low, Q5.Low), (Q5.High, Q1.Mid), (Q5.High, Q2.Mid), (Q5.High, Q3.Mid).

  8. Step 8:

    Now compute the count of all the fuzzy items of \(\text {C}_{2}\).

  9. Step 9:

    Compare the count of each \(\text {C}_{2}\) itemset with the minimum support counts of its fuzzy items. \(\text {C}_{2}\) itemsets whose count is greater than or equal to the maximum of the minimum supports of their two fuzzy items are stored in \(\text {L}_{2}\).

  10. Step 10:

    Since \(\text {L}_{2}\) is not null, set t \(=\) t \(+\) 1 and repeat Steps 7–9 until \(\text {L}_{\text {t}}\) is null (Tables 4, 5, 6).

  11. Step 11:

    (a) In this example, only (Q3.Low, Q2.Mid) exists in \(\text {L}_{2}\). The association rules formed are: if Q3 \(=\) Low then Q2 \(=\) Mid, and if Q2 \(=\) Mid then Q3 \(=\) Low. (b) The confidence of (Q3.Low, Q2.Mid) is 3.34/3.34 \(=\) 1. This means that if the value of the data point at time 2 is mid, then the value of the data point at time 3 is low, with a confidence factor of 1.

Fig. 2 Temperature varying data

Table 3 Sequence generated, ws=5
Table 4 Converted fuzzy set
Table 5 Candidate set \(\text {C}_{2 }\)
Table 6 Fuzzy itemset \(\text {L}_{2}\)

6 Experimental Results

The proposed algorithm is implemented in the C programming language. The dataset consists of temperature data points from the years 2008–2012, taken from the National Data Center (NDC), US (Figs. 2 and 3).

Fig. 3 Membership function used in experiment

Fig. 4 Relation between support value and confidence

In Fig. 4, as the minimum support values of the fuzzy itemsets are increased, the number of induced fuzzy association rules decreases; the mined rules are therefore sensitive to the chosen support thresholds. A typical induced rule reads: if the temperature on the second day of a month is moderate, then it may be high on the third day of the month.

7 Conclusion and Future Work

In this paper, the proposed algorithm provides an efficient way to induce fuzzy association rules, since each fuzzy item has its own predefined minimum support. The resulting temperature prediction should be more accurate than earlier approaches that use a single support value. As future work, the membership functions, which in this paper are known in advance, could be set dynamically, and more complex operations could be supported. The multiple-support formulation also offers another viewpoint for defining the minimum support of fuzzy items.