1 Introduction

Originally, strong association rules were defined in [1] as those that have sufficiently high values of two parameters: (relative) support (probability that an association rule is satisfied) and confidence (conditional probability of the rule consequent given its antecedent). However, it has been shown in the literature that in general neither of these two measures is capable of determining (in)dependence between rules’ constituents correctly (see e.g. [12]). Thus, usage of other measures for rule evaluation is also under research [3, 5, 9, 10, 13,14,15,16,17,18]. In this paper, we formulate a generic notion of a canonical measure. For a number of example evaluation measures of association rules, we show that they are canonical. For association rules satisfying any set of criteria based on canonical measures, we offer their concise lossless representation in the form of so called representative rule templates. Also, we derive a number of properties of this representation.

Our paper has the following layout. In Sect. 2, we briefly recall and comment basic notions of itemsets, (association) rules, support, confidence, dependency of events, ACBC-measures, and a good interestingness measure. Our main contribution is presented in Sects. 35. In Sect. 3, we introduce and justify usefulness of the notion of a canonical measure and show that a number of example ACBC-measures are canonical. In Sect. 4, we introduce the notion of rule templates and show that representative rule templates are sufficient to represent all association rules that are strong with respect to any criteria expressible in terms of canonical measures. Also, we derive a number of properties of (representative) rule templates. In particular, we derive a formula determining how many association rules are covered by a rule template. In Sect. 5, we briefly address related work on lossless representations of association rules and make additional claims about our new representation. Section 6 concludes our contribution.

2 Basic Notions and Properties

In this section, we provide definitions and properties related to rules and their specific type called association rules [1]. Let \( {\mathcal{I}} \) be a set of items (e.g. products). Any set \( X\, \subseteq \,{\mathcal{I}} \) is called an itemset. A transaction dataset is denoted by \( {\mathcal{D}} \) and is defined as a set of itemsets. Each itemset in \( {\mathcal{D}} \) is called a transaction.

Let X and Y be any itemsets. An expression \( X \to Y \) is called a rule. If additionally \( X \cap Y = \varnothing \), then \( X \to Y \) is called an association rule. The set of all association rules is denoted by AR. Itemsets X and Y, which occur in \( X \to Y \), are called its antecedent and consequent, respectively. The itemset \( Z = X \cup Y \) is called the base of \( X \to Y \).

Itemsets and rules are typically characterized by support (or relative support) and confidence as follows. Support of an itemset X is denoted by sup(X) and is defined as the number of transactions in \( {\mathcal{D}} \) that contain X. Alternatively, instead of a support, a notion of a relative support is used: Relative support of an itemset X is defined as \( sup\left( X \right)/|{\mathcal{D}}| \). The relative support of itemset X can be regarded as the probability of the occurrence of X in a transaction. The relative support of X will be denoted by P(X). Clearly, if \( X \subseteq Y \), then \( sup\left( X \right) \ge sup\left( Y \right) \) and \( P\left( X \right) \ge P\left( Y \right) \).

Support of a rule \( X \to Y \) is denoted by \( sup(X \to Y) \) and is defined as the support of its base \( X \cup Y \); that is, \( sup(X \to Y) = sup(X \cup Y) \).

The confidence of a rule \( X \to Y \) is denoted by \( conf(X \to Y) \) and is defined as the conditional probability that Y occurs in a transaction provided X occurs in the transaction; that is: \( conf(X \to Y) = sup(X \to Y)/sup\left( X \right) = P\left( {XY} \right)/P\left( X \right) \).

An association rule is defined in [1] as strong if its (relative) support and confidence are greater than or equal to user defined minimum (relative) support and minimum confidence thresholds, respectively. Nevertheless, antecedents and consequents of rules that are strong with respect to support and confidence are not guaranteed to be dependent. Statistically, X and Y are dependent if \( P\left( {XY} \right) \ne P\left( X \right) \times P\left( Y \right) \). Otherwise, X and Y are independent. The dependence between X and Y is considered as positive if \( P\left( {XY} \right) > P\left( X \right) \times P\left( Y \right) \), and negative if \( P\left( {XY} \right) < P\left( X \right) \times P\left( Y \right) \). The example beneath shows the case when an antecedent and consequent of a rule are dependent negatively despite high values of (relative) support and confidence.

Example 1.

Let \( {\mathcal{D}} \) be a transaction dataset from Table 1 and \( \left\{ z \right\} \to \left\{ v \right\} \) be a rule of interest. The probabilities of the antecedent, consequent and base of rule \( \left\{ z \right\} \to \left\{ v \right\} \) are as follows: \( P\left( {\{ z\} } \right) = 0.5,P\left( {\{ v\} } \right) = 0.9,P\left( {\{ zv\} } \right) = 0.4 \) (see Table 2). Since \( P\left( {\{ z\} } \right) \times P\left( {\{ v\} } \right) = 0.45 \), which is greater than \( P\left( {\{ zv\} } \right) \), z and v are dependent negatively. On the other hand, \( conf(\left\{ z \right\} \to \left\{ v \right\}) = 0.4/0.5 = 0.8 \), which is a high value. □

Table 1. Example dataset \( {\mathcal{D}} \)
Table 2. Example rules found in \( {\mathcal{D}} \) from Table 1

Please note that (in)dependence of antecedent X and consequent Y of rule \( X \to Y \) is determined based on three probabilities: P(X), P(Y) and P(XY). In fact, large number of rule evaluation measures can be defined in terms of at most these three probabilities. Such measures were called ACBC-measures (Antecedent-Consequent-Base-Constants-measures) in [11]. Some of their popular examples are provided in Table 3.

Table 3. Example ACBC-measures of (association) rules

In [15], a rule evaluation measure μ was defined as a good interestingness measure if the following conditions were fulfilled for any rules \( X \to Y \), \( X^{{\prime }} \to Y^{{\prime }} \):

  1. 1.

    If \( P\left( {XY} \right) = P\left( X \right) \times P\left( Y \right) \), then \( \mu (X \to Y) = 0 \).

  2. 2.

    If \( P\left( {X^{{\prime }} } \right) = P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) = P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) > P\left( {XY} \right) \), then \( \mu (X^{{\prime }} \to Y^{{\prime }} ) > \mu (X \to Y) \).

  3. 3.

    If \( P\left( {X^{{\prime }} } \right) = P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) < P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) = P\left( {XY} \right) \), then \( \mu (X^{{\prime }} \to Y^{{\prime }} ) > \mu (X \to Y) \).

  4. 4.

    If \( P\left( {X^{{\prime }} } \right) < P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) = P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) = P\left( {XY} \right) \), then \( \mu (X^{{\prime }} \to Y^{{\prime }} ) > \mu (X \to Y) \).

Nevertheless, a number of rule evaluation measures, including all measures listed in Table 3, do not satisfy at least one of the postulates. In particular, relative support does not satisfy postulates 1, 3, 4; confidence does not satisfy postulates 1, 3; novelty does not satisfy postulate 3 (if P(X) = 0) and postulate 4 (if P(Y) = 0); lift, cosine, Jaccard, accuracy and F-score do not fulfill postulate 1.

In the next section, we offer a generic notion of a canonical measure, which embraces a large number of ACBC-measures (including all measures from Table 3).

3 Canonical Measures for Evaluating Rules

Let us consider two rules \( X \to Y \) and \( X^{{\prime }} \to Y^{{\prime }} \) such that X′ occurs in \( {\mathcal{D}} \) no more frequently than \( X \, (P\left( {X^{{\prime }} } \right) \le P\left( X \right)) \), Y′ occurs in \( {\mathcal{D}} \) no more frequently than Y \( (P\left( {Y^{{\prime }} } \right) \le P\left( Y \right)) \), but \( X^{{\prime }} \cup Y^{{\prime }} \) occurs in \( {\mathcal{D}} \) no less frequently than \( X \cup Y \, (P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right)) \). In such a case, if the dependence between Y′ and X′ is positive, then it is natural to expect that Y and X are either dependent positively, but to degree that is not higher than for Y′ and X′, or are independent, or are dependent negatively. A rule measure reflecting this expectation would evaluate rule \( X^{{\prime }} \to Y^{{\prime }} \) not lower than rule \( X \to Y \). A notion of a canonical measure, which we offer in this paper, is based on this observation.

Example 2.

Let \( {\mathcal{D}} \) be a transaction dataset from Table 1 and \( \left\{ x \right\} \to \left\{ y \right\} \), \( \left\{ z \right\} \to \left\{ y \right\} \) and \( \left\{ z \right\} \to \left\{ v \right\} \) be rules under investigation. In Table 2, we provide the values of probabilities of their antecedents, consequents and bases. We note that the probabilities of antecedents of \( \left\{ x \right\} \to \left\{ y \right\} \) and \( \left\{ z \right\} \to \left\{ y \right\} \) are the same and that the probabilities of their consequents are the same, but the probabilities of their bases are different; namely, \( P\left( {\{ zy\} } \right) > P\left( {\{ xy\} } \right) \), and the dependence between y and z is positive, while y and x are independent.

Let us now compare rule \( \left\{ z \right\} \to \left\{ y \right\} \) with rule \( \left\{ z \right\} \to \left\{ v \right\} \). The probabilities of their antecedents are the same, but the probability of the consequent of \( \left\{ z \right\} \to \left\{ y \right\} \) is lower than the probability of the consequent of \( \left\{ z \right\} \to \left\{ v \right\} \), while the probability of the co-occurrence of {zy} is greater than the probability of the co-occurrence of {zv}. In this case, the dependence between v and z is negative in contrast to the positive dependence between y and z. □

We define a measure \( \mu :\left[ {0..1} \right] \times \left[ {0..1} \right] \times \left[ {0..1} \right] \to R \) as canonical if for any values \( p_{a} ,p_{c} ,p_{b} \in \left[ {0,1} \right] \) and \( p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} ,p_{{b^{{\prime }} }} \in \left[ {0,1} \right] \) such that \( (p_{a} ,p_{c} \ge p_{b} ),(p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} \ge p_{{b^{{\prime }} }} ),(p_{{a^{{\prime }} }} \le p_{a} ),(p_{{c^{{\prime }} }} \le p_{c} ) \) and \( (p_{{b^{{\prime }} }} \ge p_{b} ) \) the following holds:

$$ \mu \left( {p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} ,p_{{b^{{\prime }} }} } \right) \ge \mu \left( {p_{a} ,p_{c} ,p_{b} } \right). $$

Proposition 1.

Let μ be a canonical measure, \( X \to Y \) and \( X^{{\prime }} \to Y^{{\prime }} \) be rules such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right) \). Then

$$ \mu \left( {P\left( {X^{{\prime }} } \right),P\left( {Y^{{\prime }} } \right),P\left( {X^{{\prime }} Y^{{\prime }} } \right)} \right) \ge \mu \left( {P\left( X \right),P\left( Y \right),P\left( {XY} \right)} \right). $$

Proof.

Let μ be a canonical measure, \( X \to Y \) and \( X^{{\prime }} \to Y^{{\prime }} \) be rules such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right) \). Let \( p_{a} = P\left( X \right) \), \( p_{c} = P\left( Y \right) \), \( p_{b} = P\left( {XY} \right) \), \( p_{{a^{{\prime }} }} = P\left( {X^{{\prime }} } \right) \), \( p_{{c^{{\prime }} }} = P\left( {Y^{{\prime }} } \right) \), \( p_{{b^{{\prime }} }} = P\left( {X^{{\prime }} Y^{{\prime }} } \right) \). Then, \( p_{a} ,p_{c} ,p_{b} \in \left[ {0,1} \right] \), \( p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} ,p_{{b^{{\prime }} }} \in \left[ {0,1} \right] \), \( \left( {p_{a} ,p_{c} \ge p_{b} } \right) \), \( \left( {p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} \ge p_{{b^{{\prime }} }} } \right) \), \( (p_{{a^{{\prime }} }} \le p_{a} ) \), \( (p_{{c^{{\prime }} }} \le p_{c} ) \) and \( (p_{{b^{{\prime }} }} \ge p_{b} ) \). Hence and by the fact that μ is a canonical measure, \( \mu \left( {p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} ,p_{{b^{{\prime }} }} } \right) \ge \mu \left( {p_{a} ,p_{c} ,p_{b} } \right) \). Therefore, \( \mu \left( {P\left( {X^{{\prime }} } \right),P\left( {Y^{{\prime }} } \right),P\left( {X^{{\prime }} Y^{{\prime }} } \right)} \right) \ge \mu \left( {P\left( X \right),P\left( Y \right),P\left( {XY} \right)} \right) \). □

In the remainder of the paper, when using a canonical measure to evaluate a rule \( X \to Y \), we will write interchangeably \( \mu \left( {P\left( X \right),P\left( Y \right),P\left( {XY} \right)} \right) \) and \( \mu (X \to Y) \). Thus, Proposition 1 can be rewritten equivalently as Proposition 2:

Proposition 2.

Let μ be a canonical measure, \( X \to Y \) and \( X^{{\prime }} \to Y^{{\prime }} \) be rules such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right) \). Then, \( \mu (X^{{\prime }} \to Y^{{\prime }} ) \ge \mu (X \to Y) \).

It can be easily shown for many ACBC-measures that they are canonical. In Proposition 3, we show that in fact all ACBC-measures from Table 3 are canonical.

Proposition 3.

Let \( X \to Y \) and \( X^{{\prime }} \to Y^{{\prime }} \) be rules such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right) \). Then:

  1. (a)

    \( relativeSup(X^{{\prime }} \to Y^{{\prime }} ) \ge relativeSup(X \to Y) \)

  2. (b)

    \( conf(X^{{\prime }} \to Y^{{\prime }} ) \ge conf(X \to Y) \)

  3. (c)

    \( novelty(X^{{\prime }} \to Y^{{\prime }} ) \ge novelty(X \to Y) \)

  4. (d)

    \( lift(X^{{\prime }} \to Y^{{\prime }} ) \ge lift(X \to Y) \)

  5. (e)

    \( cosine(X^{{\prime }} \to Y^{{\prime }} ) \ge cosine(X \to Y) \)

  6. (f)

    \( Jaccard(X^{{\prime }} \to Y^{{\prime }} ) \ge Jaccard(X \to Y) \)

  7. (g)

    \( accuracy(X^{{\prime }} \to Y^{{\prime }} ) \ge accuracy(X \to Y) \)

  8. (h)

    \( F{\hbox{ - }}score(X^{{\prime }} \to Y^{{\prime }} ) \ge F{\hbox{ - }}score(X \to Y) \)

Proof.

Let \( X \to Y \) and \( X^{{\prime }} \to Y^{{\prime }} \) are rules such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right) \). Then:

  • Ad (a) \( relativeSup(X^{{\prime }} \to Y^{{\prime }} ) = P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( {XY} \right) = relativeSup(X \to Y) \).

  • Ad (b) \( conf(X^{{\prime }} \to Y^{{\prime }} ) = \frac{{P(X^{{\prime }} Y^{{\prime }} )}}{{P(X^{{\prime }} )}} \ge \frac{P(XY)}{P(X)} = conf(X \to Y) \).

  • Ad (c) \( novelty(X^{{\prime }} \to Y^{{\prime }} ) = P\left( {X^{{\prime }} Y^{{\prime }} } \right) - P\left( {X^{{\prime }} } \right) \times P\left( {Y^{{\prime }} } \right) \ge P\left( {XY} \right) - P\left( X \right) \times P\left( Y \right) = novelty(X \to Y). \)

  • Ad (d–h) Proof in these cases is analogous to the proof of case (c). □

In the remainder of the paper, we assume that a set \( {\mathcal{M}} = \{ \mu_{1} , \ldots ,\mu_{n} \} \) of canonical measures is used to evaluate (association) rules with respect to their corresponding threshold values \( E = \{ \varepsilon_{1} , \ldots ,\varepsilon_{n} \} \). An expression \( \mu_{i} (X \to Y) \ge \varepsilon_{i} \), where μ i is a canonical measure and ε i is a threshold value, will be called a canonical evaluation criterion.

An association rule \( X \to Y \) will be called (\( {\mathcal{M}} \), Ε)-strong if ∀μ i \( {\mathcal{M}} \) (μ i (XY) ≥ ε i ). The set of all (\( {\mathcal{M}} \), Ε)-strong association rules will be denoted by \( AR_{{({\mathcal{M}},E)}} \).

4 Representative Rule Templates

In this section, we will first introduce a new notion of a rule template and examine its properties and relationship with association rules. Then, we will offer a lossless representation of (\( {\mathcal{M}} \), Ε)-strong association rules based on so called (\( {\mathcal{M}} \), Ε)-representative rule templates.

4.1 Rule Templates and Association Rules

A construct \( X\mathop \to \limits^{Z} Y \) will be called a rule template if \( X \cap Y = \varnothing \) and \( (X \cup Y) \subseteq Z \). Itemsets X, Y and Z, which occur in the rule template \( X\mathop \to \limits^{Z} Y \), are called its antecedent, consequent and base, respectively. The set of all rule templates is denoted by RT.

Please note that for \( Z = X \cup Y \), rule template \( X\mathop \to \limits^{Z} Y \) coincides with association rule \( X \to Y \). However, in general case, the base Z of rule template \( X\mathop \to \limits^{Z} Y \) is a (proper or improper) superset of the base \( X \cup Y \) of association rule X → Y. In spite of this difference, both the probability of base \( X \cup Y \) of rule \( X \to Y \) as well as the probability of base Z of rule template \( X\mathop \to \limits^{Z} Y \) exceeds neither P(X) nor P(Y).

Let μ be a canonical measure. We redefine μ for a rule template \( X\mathop \to \limits^{Z} Y \) as follows:

$$ \mu (X\mathop \to \limits^{Z} Y) = \mu \left( {P\left( X \right),P\left( Y \right),P\left( Z \right)} \right). $$

For example, \( novelty(X \to Y) = P\left( {XY} \right) - P\left( X \right) \times P\left( Y \right) \), while \( novelty(X\mathop \to \limits^{Z} Y) = P\left( Z \right) - P\left( X \right) \times P\left( Y \right) \).

A rule template \( X\mathop \to \limits^{Z} Y \) is called (\( {\mathcal{M}} \), Ε)-strong if ∀μ i \( {\mathcal{M}} \) (μ i (X \( \mathop \to \limits^{Z} \) Y) ≥ ε i ). The set of all (\( {\mathcal{M}} \), Ε)-strong rule templates will be denoted by \( RT_{{({\mathcal{M}},E)}} \).

Proposition 4.

Let \( X\mathop \to \limits^{Z} Y \) be a rule template and μ be a canonical measure. Then for each rule \( X^{{\prime }} \to Y^{{\prime }} \) such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( Z \right) \), the following holds: \( \mu (X^{{\prime }} \to Y^{{\prime }} ) \ge \mu (X\mathop \to \limits^{Z} Y) \).

Proof.

Let \( X\mathop \to \limits^{Z} Y \) be a rule template, μ be a canonical measure and \( X^{{\prime }} \to Y^{{\prime }} \) be a rule such that \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( Z \right) \). Let \( p_{a} = P\left( X \right) \), \( p_{c} = P\left( Y \right) \), \( p_{b} = P\left( {XY} \right) \), \( p_{{a^{{\prime }} }} = P\left( {X^{{\prime }} } \right) \), \( p_{{c^{{\prime }} }} = P\left( {Y^{{\prime }} } \right) \), \( p_{{b^{{\prime }} }} = P\left( Z \right) \). Then, \( p_{a} ,p_{c} ,p_{b} \in \left[ {0,1} \right] \), \( p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} ,p_{{b^{{\prime }} }} \in \left[ {0,1} \right] \), \( (p_{a} ,p_{c} \ge p_{b} ) \), \( (p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} \ge p_{{b^{{\prime }} }} ) \), \( (p_{{a^{{\prime }} }} \le p_{a} ) \), \( (p_{{c^{{\prime }} }} \le p_{c} ) \) and \( (p_{{b^{{\prime }} }} \ge p_{b} ) \). Hence and by the fact that μ is a canonical measure, \( \mu (p_{{a^{{\prime }} }} ,p_{{c^{{\prime }} }} ,p_{{b^{{\prime }} }} ) \ge \mu (p_{a} ,p_{c} ,p_{b} ) \). So, \( \mu (X^{{\prime }} \to Y^{{\prime }} ) = \mu \left( {P\left( {X^{{\prime }} } \right),P\left( {Y^{{\prime }} } \right),P\left( {X^{{\prime }} Y^{{\prime }} } \right)} \right) \ge \mu \left( {P\left( X \right),P\left( Y \right),P\left( Z \right)} \right) = \mu (X\mathop \to \limits^{Z} Y) \). □

Proposition 5.

Let \( X\mathop \to \limits^{Z} Y \) be a rule template and μ be a canonical measure. Then for each rule \( X^{{\prime }} \to Y^{{\prime }} \) such that \( X^{{\prime }} \supseteq X \), \( Y^{{\prime }} \supseteq Y \), \( (X^{{\prime }} \cup Y^{{\prime }} ) \subseteq Z \), the following holds: \( \mu (X^{{\prime }} \to Y^{{\prime }} ) \ge \mu (X\mathop \to \limits^{Z} Y) \).

Proof.

Let \( X\mathop \to \limits^{Z} Y \) be a rule template and μ be a canonical measure (*). Let \( X^{{\prime }} \to Y^{{\prime }} \) be a rule such that \( X^{{\prime }} \supseteq X \), \( Y^{{\prime }} \supseteq Y \), \( (X^{{\prime }} \cup Y^{{\prime }} ) \subseteq Z \). Then, \( P\left( {X^{{\prime }} } \right) \le P\left( X \right) \), \( P\left( {Y^{{\prime }} } \right) \le P\left( Y \right) \) and \( P\left( {X^{{\prime }} Y^{{\prime }} } \right) \ge P\left( Z \right) \). Hence, by (*) and Proposition 4, \( \mu (X^{{\prime }} \to Y^{{\prime }} ) \ge \mu (X\mathop \to \limits^{Z} Y) \). □

Proposition 5 tells us that whenever a rule template \( X\mathop \to \limits^{Z} Y \) reaches some value for a canonical measure, then a number of association rules \( X^{{\prime }} \to Y^{{\prime }} \) associated in a certain way with this rule template \( (X^{{\prime }} \supseteq X,Y^{{\prime }} \supseteq Y,(X^{{\prime }} \cup Y^{{\prime }} ) \subseteq Z) \) will also reach at least the same value of this measure. We will use this observation when defining a cover of a rule template.

The cover of rule template \( X\mathop \to \limits^{Z} Y \) is denoted by \( TC(X\mathop \to \limits^{Z} Y) \) and defined as follows:

$$ TC(X\mathop \to \limits^{Z} Y) = \{ (X^{{\prime }} \to Y^{{\prime }} ) \in AR \, | \, X^{{\prime }} \supseteq X,Y^{{\prime }} \supseteq Y,(X^{{\prime }} \cup Y^{{\prime }} ) \subseteq Z\} . $$

Each association rule in \( TC(X\mathop \to \limits^{Z} Y) \) will be called covered by \( X\mathop \to \limits^{Z} Y \).

Proposition 6.

Let \( X\mathop \to \limits^{Z} Y \) be an \( ({\mathcal{M}},E) \)-strong rule template. Then, each association rule \( (X^{{\prime }} \to Y^{{\prime }} ) \) in \( TC(X\mathop \to \limits^{Z} Y) \) is \( ({\mathcal{M}},E) \)-strong.

Proof.

Follows from definition of the cover of a rule template and Proposition 5. □

Thus, all association rules covered by an (\( {\mathcal{M}} \), Ε)-strong rule template are (\( {\mathcal{M}} \), Ε)-strong, too. We will focus now on determining a set of association rules that are covered by a given rule template and on determining its cardinality.

Example 3.

Let be a rule template. Then,  = . □

We find that the number of association rules covered by a single rule template depends on the number of items in its base that occur neither in the antecedent of the rule template nor in its consequent (see Theorem 1).

Theorem 1.

Let \( X\mathop \to \limits^{Z} Y \) be a rule template. Then:

$$ \left| {TC(X\mathop \to \limits^{Z} Y)} \right| = 3^{m} ,{\text{where}}\;m = \left| {Z\backslash (X \cup Y)} \right|. $$

Proof.

By definition of TC, \( X \to Y \) belongs to \( TC(X\mathop \to \limits^{Z} Y) \). In addition, \( TC(X\mathop \to \limits^{Z} Y) \) contains association rules having supersets of X as antecedents and supersets of Y as consequents provided additional items in antecedents and consequents come from \( Z\backslash (X \cup Y) \). In fact, association rules belonging to \( TC(X\mathop \to \limits^{Z} Y) \) could be created from \( X \to Y \) by extending its antecedent and consequent in the following way. Let V be a set of items from Z that occur neither in antecedent of \( X\mathop \to \limits^{Z} Y \) nor in its consequent; that is \( V = Z\backslash (X \cup Y) \). A new rule to be included in \( TC(X\mathop \to \limits^{Z} Y) \) can be obtained from rule \( X \to Y \) by performing one of the following three operations on each item v from V: (1) add v to the antecedent of \( X \to Y \), (2) add v to the consequent of \( X \to Y \), (3) do not use v when building a new rule. Thus, there are \( 3^{|V|} = 3^{{\left| {Z\backslash (X \cup Y)} \right|}} \) possible ways of building distinct association rules based on given rule template \( X\mathop \to \limits^{Z} Y \). □

Example 4.

Let be a rule template and . So, . In consequence, if is (\( {\mathcal{M}} \), Ε)-strong, then 6 591 association rules are (\( {\mathcal{M}} \), Ε)-strong, too. □

Theorem 2.

Let \( X\mathop \to \limits^{Z} Y \) and \( V\mathop \to \limits^{U} W \) be distinct rule templates such that \( V \supseteq X \), \( W \supseteq Y \), and \( U \subseteq Z \). Then, \( TC(X\mathop \to \limits^{Z} Y) \supset TC(V\mathop \to \limits^{U} W) \).

Proof.

Let \( X\mathop \to \limits^{Z} Y \) and \( V\mathop \to \limits^{U} W \) be distinct rule templates such that \( V \supseteq X \), \( W \supseteq Y \) and \( U \subseteq Z \). \( TC(X\mathop \to \limits^{Z} Y) \) = {(X′→Y′) ∈ AR| X′ ⊇ X, Y′ ⊇ Y, (X′∪Y′) ⊆ Z} ⊇ {(X′→Y′) ∈ AR| X′ ⊇ VX, Y′ ⊇ WY, (X′∪ Y′) ⊆ UZ} = \( TC(V\mathop \to \limits^{U} W) \). Hence, we proved that \( TC(X\mathop \to \limits^{Z} Y) \supseteq TC(V\mathop \to \limits^{U} W) \) (*). However, \( X\mathop \to \limits^{Z} Y \) and \( V\mathop \to \limits^{U} W \) are distinct rule templates, so \( V \supset X \) or \( W \supset Y \) or \( U \supset Z \).

Case VX:

Then, association rule \( X \to Y \) will belong to \( TC(X\mathop \to \limits^{Z} Y) \), but will not belong to \( TC(V\mathop \to \limits^{U} W) \) (since no rule in \( TC(V\mathop \to \limits^{U} W) \) will have antecedent X). Hence and by (*), \( TC(X\mathop \to \limits^{Z} Y) \supset TC(V\mathop \to \limits^{U} W) \).

Case WY:

Then, association rule \( X \to Y \) will belong to \( TC(X\mathop \to \limits^{Z} Y) \), but will not belong to \( TC(V\mathop \to \limits^{U} W) \) (since no rule in \( TC(V\mathop \to \limits^{U} W) \) will have consequent Y). Hence and by (*), \( TC(X\mathop \to \limits^{Z} Y) \supset TC(V\mathop \to \limits^{U} W) \).

Case UZ:

Then, association rule \( X \to Z\backslash X \) will belong to \( TC(X\mathop \to \limits^{Z} Y) \), but will not belong to \( TC(V\mathop \to \limits^{U} W) \) (since no rule in \( TC(V\mathop \to \limits^{U} W) \) will have base Z). Hence and by (*), \( TC(X\mathop \to \limits^{Z} Y) \supset TC(V\mathop \to \limits^{U} W) \). □

4.2 Representative Rule Templates as a Lossless Representation of Association Rules

We denote (\( {\mathcal{M}} \), Ε)-representative rule templates by \( RRT_{{({\mathcal{M}},E)}} \) and define as follows:

\( RRT_{{({\mathcal{M}},E)}} = \{ (X\mathop \to \limits^{Z} Y) \in RT_{{({\mathcal{M}},E)}} |\neg \exists (V\mathop \to \limits^{U} W) \in RT_{{({\mathcal{M}},E)}} \) such that \( (X\mathop \to \limits^{Z} Y) \ne (V\mathop \to \limits^{U} W),V \subseteq X,W \subseteq Y,U \supseteq Z\} \).

In Theorem 3, we claim that (\( {\mathcal{M}} \), Ε)-representative rule templates cover all (\( {\mathcal{M}} \), Ε)-strong association rules.

Theorem 3.

\( \bigcup_{{(X\mathop \to \limits^{Z} Y) \in RRT({\mathcal{M}},E)}} TC(X\mathop \to \limits^{Z} Y) = AR_{{({\mathcal{M}},E)}} \).

Proof.

We will prove Theorem 3 in two steps by showing that: (1) the union of covers of all (\( {\mathcal{M}} \), Ε)-strong rule templates equals \( AR_{{({\mathcal{M}},E)}} \); (2) the union of covers of all (\( {\mathcal{M}} \), Ε)-representative rule templates equals the union of covers of all (\( {\mathcal{M}} \), Ε)-strong rule templates, and by this, equals \( AR_{{({\mathcal{M}},E)}} \).

  1. (1)

    We note that for each (\( {\mathcal{M}} \), Ε)-strong association rule, say \( X^{{\prime }} \to Y^{{\prime }} \), there is an (\( {\mathcal{M}} \), Ε)-strong rule template; namely, \( X^{{\prime }} \mathop \to \limits^{{X^{{\prime }} \cup Y^{{\prime }} }} Y^{{\prime }} \) that covers \( X^{{\prime }} \to Y^{{\prime }} \). On the other hand, for each (\( {\mathcal{M}} \), Ε)-strong rule template \( X\mathop \to \limits^{Z} Y \), its cover \( TC(X\mathop \to \limits^{Z} Y) \subseteq AR_{{(\mathcal{M},E)}} \) (by Proposition 6). Thus, \( \bigcup_{{(X\mathop \to \limits^{Z} Y) \in RT({\mathcal{M}},E)}} TC(X\mathop \to \limits^{Z} Y) = AR_{{({\mathcal{M}},E)}} \).

  2. (2)

    Let \( RRT_{{({\mathcal{M}},E)}} \) do not contain an (\( {\mathcal{M}} \), Ε)-strong rule template \( X\mathop \to \limits^{Z} Y \). By definition of \( RRT_{{({\mathcal{M}},E)}} \), this implies that there is a non-empty set of rule templates \( S = \{ (V\mathop \to \limits^{U} W) \in RT_{{({\mathcal{M}},E)}} |(V\mathop \to \limits^{U} W) \ne (X\mathop \to \limits^{Z} Y),V \subseteq X,W \subseteq Y,U \supseteq Z\} \). By Theorem 2, the cover of each rule template in S is a proper superset of \( TC(X\mathop \to \limits^{Z} Y) \). Let U max be the base of a rule template in S with the maximum number of items. Let S′ contain all rule templates with base U max. Let V min be the antecedent of a rule template in S′ with the minimum number of items. Let S″ contain all rule templates in S′ with antecedent V min. Let W min be the consequent of a rule template in S″ with the minimum number of items. Let S″′ contain all rule templates in S″ with consequent W min. By construction of S″′, each rule template in S″′ is representative and its cover contains \( TC(X\mathop \to \limits^{Z} Y) \). In this way, for each non-representative rule template \( (X\mathop \to \limits^{Z} Y) \in RT_{{({\mathcal{M}},E)}} \), we will be able to identify at least one representative rule template \( V\mathop \to \limits^{U} W \) such that \( TC(V\mathop \to \limits^{U} W) \supset TC(X\mathop \to \limits^{Z} Y) \). So, \( \bigcup_{{(X\mathop \to \limits^{Z} Y) \in RRT({\mathcal{M}},E)}} TC(X\mathop \to \limits^{Z} Y) = \bigcup_{{(X\mathop \to \limits^{Z} Y) \in RT({\mathcal{M}},E)}} TC(X\mathop \to \limits^{Z} Y) = AR_{{({\mathcal{M}},E)}} \). □

Theorem 4.

Let \( X^{{\prime }} \to Y^{{\prime }} \) be an association rule. There is an (\( {\mathcal{M}} \), Ε)-representative rule template covering \( X^{{\prime }} \to Y^{{\prime }} \) if an only if \( X^{{\prime }} \to Y^{{\prime }} \) is an (\( {\mathcal{M}} \), Ε)-strong association rule.

Proof.

Let \( X^{{\prime }} \to Y^{{\prime }} \) be an association rule.

(⇒) Follows from Proposition 6.

(⇐) By Theorem 3, each (\( {\mathcal{M}} \), Ε)-strong association rule is covered by at least one (\( {\mathcal{M}} \), Ε)-representative rule template. □

5 Related Work on Lossless Representations of Association Rules

A number of concise lossless representations of strong association rules with respect to support and confidence have been proposed in the literature (see e.g. [7, 8] for an overview). Such representations typically consist of rules built from particular itemsets called generators; that is, itemsets the supports of which are less than the supports of all their proper subsets, and/or closed itemsets; that is, itemsets the supports of which are greater than the supports of all their proper supersets [2, 4, 6,7,8, 19, 20]. For example, in the case of representative association rules [6,7,8] and minimal non-redundant association rules [2], antecedents of rules are generators, whereas their consequents are set theoretical differences between closed itemsets and rules’ antecedents. The set of representative rules is a subset of the set of minimal non-redundant association rules [7, 8] and typically is by at least on order of magnitude less numerous. On the other hand, representative rules allow determination of pessimistic estimations of supports and confidences of strong association rules, while minimal non-redundant association rules allow exact determination of values of these measures. Yet, these and other developed representations of strong association rules with respect to support and confidence are not flexible enough to represent rules that are strong with respect to arbitrary ACBC/canonical measures.

In [11], we proposed the first representation of strong association rules with respect to any set of ACBC-measures. It consists of rule templates, where each rule template is a pair composed from a lower rule built from generators and an upper rule built from closed itemsets. For a rule template \( (X \to Y,Z \to V) \), the following holds: \( P\left( X \right) = P\left( Z \right) \), \( P\left( Y \right) = P\left( V \right) \), \( P\left( {XY} \right) = P\left( {ZV} \right) \). For each strong association rule \( U \to W \), there is a rule template \( (X \to Y,Z \to V) \) such that \( X \subseteq U \subseteq Z,Y \subseteq W \subseteq V \), which implies that: \( P\left( X \right) = P\left( U \right) = P\left( Z \right) \), \( P\left( Y \right) = P\left( W \right) = P\left( V \right) \), \( P\left( {XY} \right) = P\left( {UW} \right) = P\left( {ZV} \right) \) and that for any ACBC-measure μ: \( \mu (X \to Y) = \mu (U \to W) = \mu (Z \to V) \).

In the current paper, we proposed a rule representation which does not require rule templates in the form of pairs of two rules. In addition, a representative rule templates we offered here, may cover strong association rules whose values of canonical measures may be the same or greater than values of respective canonical measures of covering rule templates. This should result in higher conciseness of the new representation. Beneath we formulate additional properties of representative rule templates (lack of space does not allow us to provide their proves).

Proposition 7.

  1. (a)

    Antecedents and consequents of representative rule templates are generators.

  2. (b)

    Bases of representative rule templates are closed itemsets.

  3. (c)

    The set of representative rule templates is never more numerous than the set of rule templates from [11].

6 Summary

In this paper, we offered the notion of a canonical measure and proved that a number of example measures definable (at most) in terms of the probabilities of rule antecedents, consequents and bases are, in fact, canonical. Then we proposed representative rule templates as a representation of association rules satisfying multiple evaluation criteria expressible in terms of canonical measures. We proved that this representation allows deriving all and only association rules strong with respect to a given set of criteria based on canonical measures. We proved a number of properties of rule templates. In particular, we derived the formula determining the number of association rules covered by a rule template. The found formula suggests that representative rule templates can be a very concise representation of strong association rules. By means of an example, we showed that a single rule template may represent thousands of not less strong association rules.