Generalization of Derivation Rules for Join Dependencies in Database

Zykin, S. V.

doi:10.3103/S0146411621070191

Generalization of Derivation Rules for Join Dependencies in Database

Published: 01 February 2022

Volume 55, pages 731–737, (2021)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Automatic Control and Computer Sciences Aims and scope Submit manuscript

Generalization of Derivation Rules for Join Dependencies in Database

Download PDF

S. V. Zykin ORCID: orcid.org/0000-0002-0576-2149¹

53 Accesses
2 Citations
Explore all metrics

Abstract

In this paper, we consider a generalization of the derivation rules for join dependencies, which are used in designing a database scheme that satisfies the requirements of the fifth normal form. Attempts to construct systems of axioms of such dependencies based on derivation rules were made in previous works devoted to this problem. However, while the substantiation of the consistency (soundness) of the obtained axioms does not cause difficulties, then a satisfactory solution to prove the completeness in the general case was not obtained. First of all, this is due to the limitations of the derivation rules themselves. Here, we focus on two original axiom systems presented in the Sciore and Malvestuto works. For join dependencies, a system of rules that generalizes existing systems and at the same time has fewer restrictions is obtained. A proof of the derivability of known systems of axioms from the presented derivation rules is present. Besides, evidence of the consistency (soundness) of these rules is provided. A positive solution to the issue of the completeness of the formal system based on the presented rules was not found. In conclusion, the theoretical and practical importance of derivation rules for join dependencies is noted.

A Formal Context for Acyclic Join Dependencies

On Distributive Join Semilattices

A Formal Context for Closures of Acyclic Hypergraphs

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

INTRODUCTION

The basis for designing the logical structure of a database (DB) is the data dependencies in the application area. Currently, functional dependencies, multivalued dependencies, join dependencies [1–3], and inclusion dependencies [3–7] are relevant in the design. To consider the problem, we introduce the necessary notation. Let a finite set of all relations of a relational DB be given: $({{R}_{1}},{{R}_{2}},...,{{R}_{k}})$ is the result of decomposition during normalization of relations, $[{{R}_{i}}]$ is a set of attributes (${{R}_{i}}$ relation scheme), ${{R}_{i}}[V]$ is a projection of relation ${{R}_{i}}$ onto a set of attributes $V$, and $U = \{ {{A}_{1}},{{A}_{2}}, \ldots ,{{A}_{n}}\} $ is a finite set of all attributes, on which DB relations are specified.

This paper is aimed at studying properties of the join dependencies necessary for the construction of the fifth normal form. In order to emphasize the practical significance of this problem, we consider an example borrowed from the analytical materials of S.D. Kuznetsov. Let ${{A}_{1}}$ be an employee, ${{A}_{2}}$ be project, and ${{A}_{3}}$ be project stage. If we assume the presence of a dependency of join ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}})$, where ⋈ is the sign of the natural join operation [1, 2], then as a result of the decomposition of relation $R[{{A}_{1}},{{A}_{2}},{{A}_{3}}]$, two relations will be obtained:

$${{R}_{1}} = R[{{A}_{1}},{{A}_{2}}],\,\,\,\,{{R}_{2}} = R[{{A}_{2}},{{A}_{3}}].$$

The presence of dependency of ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}})$ guarantees that the equality

$R = {{R}_{1}}$ ⋈ R₂

is fulfilled for any implementation of $R$. The business rule corresponding to the dependency can be formulated as follows: If an employee participates in a project, then he automatically participates in the execution of all stages of this project. Note that dependency of ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}})$ is equivalent to multivalued dependency of ${{A}_{2}} \to \to {{A}_{1}}|{{A}_{3}}$, and therefore, decomposition relations $({{R}_{1}},{{R}_{2}})$ satisfy the fourth normal form (4NF).

If we assume the presence of dependency of relation ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}},{{A}_{1}}{{A}_{3}})$, then as a result of the decomposition of relation $R[{{A}_{1}},{{A}_{2}},{{A}_{3}}]$, three relations will be obtained:

$${{R}_{1}} = R[{{A}_{1}},{{A}_{2}}],{{R}_{2}} = R[{{A}_{2}},{{A}_{3}}],{{R}_{3}} = R[{{A}_{1}},{{A}_{3}}].$$

The presence of dependency of ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}},{{A}_{1}}{{A}_{3}})$ guarantees the fulfillment of the equality:

$R = {{R}_{1}}$ ⋈ R₂ ⋈ R₃

for any implementation of $R$. The business rule corresponding to the dependency is: “The employee participates in the execution of only some stages of the project,” about which the corresponding data is written to relation ${{R}_{3}}$. In relations ${{R}_{1}}$ and ${{R}_{2}}$, information about the employee’s participation in the project and the presence of a corresponding stage in the project must be also contained. Decomposition relations $({{R}_{1}},{{R}_{2}},{{R}_{3}})$ satisfy the fifth normal form (5NF).

We consider that we need to estimate the participation of an employee in a project. For a relational DB scheme formed by dependencies, additions are executed without destroying the existing scheme. For the dependency of ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}})$, it is needed to add the corresponding attribute to relation ${{R}_{1}}$. In this case, the employee’s participation in the entire project will be estimated. For the dependency of ⋈$({{A}_{1}}{{A}_{2}},{{A}_{2}}{{A}_{3}},{{A}_{1}}{{A}_{3}})$, a corresponding attribute must be added to relation ${{R}_{3}}$. In this case, the employee’s participation in each stage of the project will be estimated.

It is a common statement that join dependencies are rarely met in practice. The considered example refutes this statement. Similar dependencies arise wherever it is required to map objects with a limited set of events in the system.

1 RELATED WORK

Rissanen [8] was the first to introduce the concept of a join dependency and defined an important property of data: the lossless join of relations ${{R}_{1}}$, ${{R}_{2}}$, …, ${{R}_{k}}$ for any implementation of relation $R$:

$R = {{R}_{1}}$ ⋈ ${{R}_{2}}$ ⋈ … ⋈ ${{R}_{k}}$

which corresponds to join dependency ⋈$({{V}_{1}},{{V}_{2}}, \ldots ,{{V}_{k}})$, where ${{V}_{i}} = [{{R}_{i}}]$.

A system of rules for join dependencies is proposed in [9]. Let ${\mathbf{S}} = \{ {{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{p}}\} $, ${\mathbf{R}} = \{ {{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{q}}\} $, $attr({\mathbf{S}}) = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{p}}$, and $attr({\mathbf{R}}) = {{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{q}},| = $ be a logically implications, then:

A0: $\emptyset $ $\left| = \right.$ ⋈$(X)$ for any set $Y \subseteq attr({\mathbf{S}})$.

A1: ⋈$({\mathbf{S}})$ ⋈ $({\mathbf{S}},Y)$ if $Y \subseteq attr({\mathbf{S}})$.

A2: ⋈$({\mathbf{S}},Y,Z)$ $\left| = \right.$ ⋈$({\mathbf{S}},YZ)$.

A3: ⋈$({\mathbf{S}},Y)$, ⋈ $({\mathbf{R}})$ $\left| = \right.$ ⋈$({\mathbf{S}},{\mathbf{R}})$ if $attr({\mathbf{R}}) = Y$.

A4: ⋈$({\mathbf{S}},YA)$ $\left| = \right.$ ⋈$({\mathbf{S}},Y)$ if .

At was shown in [9], the rules A0–A4 are consistent (soundness), but do not have the completeness property. Besides, a generalization of the considered rules is considered when the set $U$ is expanded with attributes of a special type.

To find a soundness and complete formal system for join dependencies was attempted in [10]. A bounded formal system that allows deriving join dependencies using only generalized join dependencies is obtained. A nonbounded formal system that is complete for extended join dependencies is also obtained, and a robust and complete nonbounded Gentzen-style system for join dependencies is obtained. In this case, the question of the existence of a soundness and complete bounded formal system for extended join dependencies remains open.

A method for decomposing join dependencies in a relational database using the concept of a hinge is discussed in [11]. It is shown that the decomposition method can be used to improve the efficiency of integrity checking. Integrity checking for join dependencies means checking that the dependency is fulfilled against the relationship after the upgrade. In the general case, such a problem is $NP$-complete. The paper shows that this integrity check can be avoided by decomposing the relation into a set of smaller relations. The concept of $n$-cyclicity for join dependencies is considered. It is shown that at a fixed $n$ value, the integrity check can be performed for polynomial time.

A sound and complete formal system for full acyclic join dependencies, which is obtained by adding three rules to the axioms of multivalued dependencies, is proposed in [12]. The hypergraph model is used when determining acyclicity [2]. We write the axioms in this system as follows:

B1. ⋈$({{X}_{1}},{{X}_{2}})$ $\left| = \right.$ ⋈ $({{X}_{1}} \cup Y,{{X}_{2}} \cup Y)$, $Y \subseteq {{X}_{1}} \cup {{X}_{2}}$.

B2. ⋈$({{X}_{1}},{{X}_{2}}),$, ⋈$(Y,{{X}_{1}} \cap {{X}_{2}})$ $\left| = \right.$ ⋈ $({{X}_{1}},{{X}_{2}} \cap Y)$.

B3. $\left| = \right.$ ⋈ $(U)$.

B4. ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ $\left| = \right.$ ⋈ $({{Y}_{1}},{{Y}_{2}})$, where $\{ {{Y}_{1}},{{Y}_{2}}\} $ is the component (two-edge hypergraph).

B5. ⋈$(X,\{ {{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}}\} ),({{Y}_{1}},{{Y}_{2}})$, ⋈ $({{Y}_{1}},{{Y}_{2}})$ $\left| = \right.$ ⋈ $(X \cap {{Y}_{1}},X \cap {{Y}_{2}},\{ {{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}}\} )$, where ${{Y}_{1}} \cap {{Y}_{2}} \subseteq X,X \cap {{X}_{i}} \subseteq {{Y}_{1}}$ or $X \cap {{X}_{i}} \subseteq {{Y}_{2}}$, $i = \overline {1,k} $.

We will return to discussing these axioms below.

The relationship between the axiomatization of dependencies in relational databases and cylindrical algebras is considered in [13]. The join dependences and the corresponding class of (representable) cylindrical semigrids are considered. Since representable cylindrical semigrids have an infinite axiomatizable quasi-equivalent theory, there is no finite axiomatization for $n$-dimensional unbounded join dependences. In the work, counterexamples to the statement that cylindrical dependencies are finitely axiomatizable are obtained.

Full hierarchical dependencies (FHDs), which correspond to the natural join of at least two DB relations, are considered in [14, 15]. A complete set of derivation rules for FHD is defined in [14]. The first result establishes a finite axiomatization for FHD. The second main result establishes a finite axiomatization for the deducibility of an FHD. How the derivation of full hierarchical dependencies can be modeled using the derivation of multivalued dependencies and vice versa are shown in [15]. This allows both main results to be applied to a combined class of functional and multivalued dependencies. Besides, a new axiomatization is established for the class of nontrivial functional dependencies.

A generalization for the problem of deducibility of join atoms and inclusion atoms was obtained in [16], which gives a complete axiomatization for the problem of deducibility of built-in join dependences and inclusion dependences in the special case. In a more general sense, a way to search for axiomatization for various collections of dependencies in a DB using the mathematical logic methods is present in the paper.

The work of acyclic dependencies was continued in [17]. A formal context for join dependencies that generalizes the approach for analyzing multivalued dependencies is presented. This result is the basis for further study of acyclic join dependences in order to minimize their set.

In this paper, only classical [8] join dependencies, which are not supplemented by any restrictions or extensions, are considered. We assume that only this kind of dependencies is significant for practice.

2 DERIVATION RULES FOR JOIN DEPENDENCIES

Let $R$ be a relation on a set of attributes $U = \{ {{A}_{1}},{{A}_{2}}, \ldots ,{{A}_{n}}\} $.

Definition 1. A join dependency of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ is satisfied if, for any implementation of relation $R$, the following is satisfied:

$R[X] = R[{{X}_{1}}]$ ⋈ $R\left[ {{{X}_{2}}} \right]$ ⋈ … ⋈ $R\left[ {{{X}_{k}}} \right]$,

where $R[{{X}_{i}}]$ is the projection [1] of relation $R$ onto the set of attributes ${{X}_{i}}$, $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$.

In [2], an obvious attribute of the presence of a join dependency was formulated.

Statement 1. The dependency of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ is satisfied in $R$ if, for any implementation of $R$, the presence of tuples ${{t}_{1}},{{t}_{2}}, \ldots ,{{t}_{k}}$ not necessarily different guarantees the existence of a tuple: $t \in R$: $t[{{X}_{i}}] = {{t}_{i}}[{{X}_{i}}]$, $i = \overline {1,k} $, for which ${{t}_{i}}[{{X}_{i}} \cap {{X}_{j}}] = {{t}_{j}}[{{X}_{i}} \cap {{X}_{j}}]$ if ${{X}_{i}} \cap {{X}_{j}} \ne \emptyset $.

The analysis of Statement 1 shows that the separation of join dependencies into full and inline is artificial. In relation $R$, all the full join dependencies in practice can be built-in during the DB operation. This must not be the reason for rebuilding the decomposition of relations. Arbitrary dependency of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ is full in projection $R[X]$, where $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$, and generally embedded in relation $R$. In this case, it is appropriate to consider the definition domain of a dependency.

Definition 2. The definition domain of the join dependency of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ in relation $R$ will be called set $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$.

In [18], the concept of the definition domain of a functional dependency was introduced with the aim of separating the values “not defined (Null)” and “does not exist.” A formal theory, which is a natural generalization of the classical theory and allows getting rid of the values “does not exist” at the design stage of the DB scheme was also developed. The definition domain of a dependency in [18] is a set of attributes, which is consistent with Definition 2. Thus, the dependencies considered below are full, but in their own definition domain. In this case, the deduced dependencies may have a larger or smaller definition domain than that of the initial dependencies.

Consideration of formal theories in a DB allows answering the main problems of a DB design:

• whether the database project built on the basis of dependencies set is optimal and correct in relation to some criteria (presence of anomalies, minimal coverage, etc),

• whether the decomposition of $({{R}_{1}},{{R}_{2}}, \ldots ,{{R}_{k}})$ preserves dependencies as integrity constraints on the data.

The solution to these problems is based on the dependency derivation rules. We consider a set of rules for join dependencies.

(P0) $\emptyset $ $\left| = \right.$ ⋈ $(X)$, $X \subseteq U$.

(P1) ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ $\left| = \right.$ ⋈ $({{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{m}})$ if:

(p11) for any ${{X}_{i}}$, there is ${{Y}_{j}}$: ${{X}_{i}} \subseteq {{Y}_{j}}$;

(p12) for any ${{Y}_{j}}$, ${{Y}_{j}} \subseteq {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$ is satisfied.

(P2) ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ $\left| = \right.$ ⋈ $({{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{k}})$ if:

(p21) ${{X}_{i}} \cap ({{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{{i - 1}}} \cup {{X}_{{i + 1}}} \ldots \cup {{X}_{k}}) \subseteq {{Y}_{i}}$;

(p22) ${{Y}_{i}} \subseteq {{X}_{i}}$.

(P3) ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}})$ $\left| = \right.$ ⋈ $({{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{k}})$ if:

(p31) ${{Y}_{i}} = {{X}_{i}} \cup Z$, $Z \subseteq {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}.$

(P4) ⋈ $({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},V)$, ⋈ $({{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{m}})$ $\left| = \right.$ ⋈ $({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},V \cap {{Y}_{1}},V \cap {{Y}_{2}}, \ldots ,V \cap {{Y}_{m}})$ if:

(p41) ${{Y}_{i}} \cap {{Y}_{j}} \subseteq V,i \ne j$;

(p42) $V \cap ({{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}) \subseteq V \cap ({{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{m}})$.

We consider the compliance of the rules presented here.

Statement 2. Rules A0–A4 are a consequence of rules P0–P4.

Proof. Obviously, rules A0 and P0 are equivalent. Rule A1 is a consequence of P1 on a set of attributes $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$. Indeed, condition p12 of rule P1 coincides with condition $Y \subseteq attr({\mathbf{S}})$ of rule A1, and condition p11 of rule P1 is satisfied by the construction of rule A1. Since P1 corresponds to the more general case, A1 is a consequence of P1 on the set of attributes $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$. It is not difficult to check that $X = {{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{k}}$. Rule A2 is an obvious consequence of rule P1. Rule A3 is a special case of rule P4. Indeed, if $V = {{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{m}}$, then the conditions p41 and p42 will always be satisfied, and the rule P4 itself will coincide with rule A3. Rule A4 becomes an obvious consequence of rule P2 if we make identifications of ${\mathbf{S}} = ({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{{i - 1}}},{{X}_{{i + 1}}}, \ldots ,{{X}_{k}})$ and $YA = {{X}_{i}}$.

Statement 3. Rules B1–B5 are a consequence of rules P0–P4.

Proof. Obviously, rule P3 is a generalization of rule B1. Rule B2 is derived using rules P4 and P1. Indeed, we write the left side of rule B2 in the following form: ⋈$(X,V)$, ⋈$(Y,X \cap V)$, and it is necessary to obtain the rule derivation of ⋈$(X,V \cap Y)$. Substituting the left side into the rule P4, we obtain deducibility of ⋈$(X,V \cap Y,V \cap X)$, and conditions p41 and p42 are satisfied. Then, we apply rule P1 to the resulting expression, we obtain deducibility of ⋈$(X,V \cap Y)$, and the conditions p11 and p12 are satisfied, which is what was required to be shown. Rule B3 is equivalent to rule P0. Rule B4 is a special case of rule P1 since component [12] is a two-edged hypergraph obtained by joining the edges of the initial hypergraph. Rule B5 is a special case of rule P4. Indeed, condition ${{Y}_{1}} \cap {{Y}_{2}} \subseteq X$ guarantees the fulfillment of the condition p41 at $X = V$, and condition $X \cap Xi \subseteq {{Y}_{1}}$ or $X \cap {{X}_{i}} \subseteq {{Y}_{2}}$, $i = \overline {1,k} $ guarantees the fulfillment of the condition p42. Since the order of the join dependency arguments does not matter, rule B5 is the same as rule P4 at $m = 2$.

3 SOUNDNESS OF DERIVATION RULES

It is necessary to prove soundness for rules P0–P4 proposed here. In other words, the dependencies obtained due to rules P0–P4 will satisfy the logical implication. This is a very important problem from a practical viewpoint, since it guarantees the preservation of data integrity during DB design.

Theorem 1. Rules P0–P4 are sound.

Proof. The soundness of P0 rule is obvious.

To prove the rule P1, we denote: ${{R}_{X}} = R[{{X}_{1}}]$ ⋈ $R[{{X}_{2}}]$ ⋈ … ⋈ $R[{{X}_{k}}]$ and ${{R}_{Y}} = R[{{Y}_{1}}]$ ⋈ $R[{{Y}_{2}}]$ ⋈ … ⋈ $R[{{Y}_{m}}]$, where $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$ and $Y = {{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{m}}$. Lemma 5.5 [1] implies that $R[X] \subseteq {{R}_{X}}$ and $R[Y] \subseteq {{R}_{Y}}$. It is necessary to show that if ${{R}_{X}} \subseteq R[X]$, then ${{R}_{Y}} \subseteq R[Y]$. We consider arbitrary tuple ${{t}_{Y}} \in {{R}_{Y}}$, and it is necessary to show that ${{t}_{Y}} \in R[Y]$. For condition ${{t}_{Y}} \in {{R}_{Y}}$ to be satisfied, it is necessary to have unnecessarily different ${{t}_{1}},{{t}_{2}}, \ldots ,{{t}_{m}}$ tuples in relation $R$ (statement 1), for which ${{t}_{p}}[{{Y}_{p}}] = {{t}_{Y}}[{{Y}_{p}}]$, $p = \overline {1,m} $ is satisfied. We consider any pair of attribute sets ${{X}_{i}}$ and ${{X}_{j}}$. Condition p11 implies that there are $p$ and $q$: ${{X}_{i}} \subseteq {{Y}_{p}}$ and ${{X}_{j}} \subseteq {{Y}_{q}}$. If ${{X}_{i}} \cap {{X}_{j}} \ne \emptyset $, then ${{t}_{p}}[{{X}_{i}} \cap {{X}_{j}}] = {{t}_{q}}[{{X}_{i}} \cap {{X}_{j}}]$ since they are formed from single tuple ${{t}_{Y}}$. We denote ${{t}_{{pi}}} = {{t}_{p}}[{{X}_{i}}]$. Condition p11 implies that tuple ${{t}_{{pi}}}$ exists for any $i$: $1 \leqslant i \leqslant k$. Condition p12 implies that there is no attribute ${{A}_{l}} \in {{Y}_{p}}$ whose value ${{t}_{Y}}[{{A}_{l}}]$ does not belong to any tuple ${{t}_{{pi}}}$. A natural join of tuples ${{t}_{{pi}}}$ is the same as tuple ${{t}_{Y}}$. Therefore, ${{t}_{Y}} \in {{R}_{X}}$. We note that equality $X = Y$ follows from conditions p11 and p12. By assumption, ${{R}_{X}} = R[X]$, which means ${{t}_{Y}} \in R[Y]$. The rule is proven.

To prove rule P2, we use similar notation: ${{R}_{X}} = R[{{X}_{1}}]$ ⋈ $R[{{X}_{2}}]$ ⋈ … ⋈ $R[{{X}_{k}}]$ and ${{R}_{Y}} = R[{{Y}_{1}}]$ ⋈ $R[{{Y}_{2}}]$ ⋈ … ⋈ $R[{{Y}_{k}}]$, where $X = {{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}$ and $Y = {{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{k}}$. Lemma 5.5 [1] implies that $R[X] \subseteq {{R}_{X}}$ and $R[Y] \subseteq {{R}_{Y}}$. It is necessary to show that if ${{R}_{X}} \subseteq R[X]$, then ${{R}_{Y}} \subseteq R[Y]$. We consider arbitrary tuple ${{t}_{Y}} \in {{R}_{Y}}$; it suffices to show that ${{t}_{Y}} \in R[Y]$. For condition ${{t}_{Y}} \in {{R}_{Y}}$ to be satisfied, it is necessary to have unnecessarily different tuples ${{t}_{i}} \in R$, for which ${{t}_{i}}[{{Y}_{i}}] = {{t}_{Y}}[{{Y}_{i}}]$, $i = \overline {1,k} $ is fulfilled. We consider an arbitrary pair of sets ${{X}_{i}}$ and ${{X}_{j}}$, where $i \ne j$. Equality ${{X}_{i}} \cap {{X}_{j}} \subseteq {{X}_{i}} \cap ({{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{{i - 1}}} \cup {{X}_{{i + 1}}} \cup \ldots \cup {{X}_{k}})$ is fulfilled, and condition p21 implies that ${{X}_{i}} \cap {{X}_{j}} \subseteq {{Y}_{i}}$. On the other hand, ${{X}_{i}} \cap {{X}_{j}} \subseteq {{X}_{i}} \cap ({{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{{j - 1}}} \cup {{X}_{{j + 1}}} \cup \ldots \cup {{X}_{k}})$. Condition p21 implies that ${{X}_{i}} \cap {{X}_{j}} \subseteq {{Y}_{j}}$. As a result, we obtain that ${{X}_{i}} \cap {{X}_{j}} \subseteq {{Y}_{i}} \cap {{Y}_{j}}$ and ${{t}_{i}}[{{X}_{i}} \cap {{X}_{j}}] = {{t}_{j}}[{{X}_{i}} \cap {{X}_{j}}]$. By the property of the natural join operation, there is tuple ${{t}_{X}} = {{t}_{1}}[{{X}_{1}}]$ ⋈ ${{t}_{2}}[{{X}_{2}}]$ ⋈ … ⋈ ${{t}_{k}}[{{X}_{k}}] \in {{R}_{X}}$. When set ${{X}_{i}} \cap {{X}_{j}}$ is empty, there is no obstacle to joining tuples. Condition p22 implies that ${{t}_{Y}} = {{t}_{X}}[Y]$. Since ${{R}_{X}} = R[X]$ and $Y \subseteq X$, then ${{t}_{Y}} \in R[Y]$ is what was required to prove.

In the proof of P3, we use the same notation for ${{R}_{X}}$, ${{R}_{Y}}$, $X$, and $Y$. As before, $R[X] \subseteq {{R}_{X}}$ and $R[Y] \subseteq {{R}_{Y}}$. It is necessary to show that if ${{R}_{X}} \subseteq R[X]$, then ${{R}_{Y}} \subseteq R[Y]$. We consider arbitrary tuple ${{t}_{Y}} \in {{R}_{Y}}$, and it is necessary to show that ${{t}_{Y}} \in R[Y]$. The fulfillment of condition ${{t}_{Y}} \in {{R}_{Y}}$ guarantees the presence of unnecessarily different ${{t}_{i}} \in R$ tuples, for which ${{t}_{i}}[{{Y}_{i}}] = {{t}_{Y}}[{{Y}_{i}}]$, $i = \overline {1,k} $ is satisfied. These tuples satisfy the following conditions: ${{t}_{i}}[{{Y}_{i}} \cap {{Y}_{j}}] = {{t}_{j}}[{{Y}_{i}} \cap {{Y}_{j}}]$, $i = \overline {1,k} $. Since ${{X}_{i}} \cap {{X}_{j}} \subseteq {{Y}_{i}} \cap {{Y}_{j}}$ and $Z \subseteq {{Y}_{i}} \cap {{Y}_{j}}$, $i,j = \overline {1,k} $, then tuple ${{t}_{X}} = {{t}_{1}}[{{X}_{1}}]$ ⋈ ${{t}_{2}}[{{X}_{2}}]$ ⋈ … ⋈ ${{t}_{k}}[{{X}_{k}}]$ belongs to ${{R}_{X}}$. Condition ${{R}_{X}} \subseteq R[X]$ implies that ${{t}_{X}} \in R[X]$ and there is tuple $t \in R$, for which ${{t}_{X}} = t[X]$. By the totality of the considered equalities, we have ${{t}_{Y}} = t[Y]$, i.e., ${{t}_{Y}} \in R[Y]$. Consequently, ${{R}_{Y}} \subseteq R[Y]$ is what was required to be proved.

We prove the soundness of rule P4. Let ${{W}_{j}} = V \cap {{Y}_{j}}$, $j = \overline {1,m} $, $W = {{W}_{1}} \cup {{W}_{2}} \cup \ldots \cup {{W}_{m}}$. We show that ⋈$({{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{m}})$ $\left| = \right.$ ⋈ $({{W}_{1}},{{W}_{2}}, \ldots ,{{W}_{m}})$ is logically implied. For this reason, we use the P2 derivation rule. Condition p41 guarantees the fulfillment of condition ${{Y}_{i}} \cap {{Y}_{j}} \subseteq V \cap {{Y}_{j}}$, $i \ne j$, which, in turn, guarantees the fulfillment of condition p21: ${{Y}_{j}} \cap ({{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{{j - 1}}} \cup {{Y}_{{j + 1}}} \ldots \cup {{Y}_{k}}) \subseteq {{W}_{j}} = V \cap {{Y}_{j}}$. Condition p22 ${{W}_{j}} \subseteq {{Y}_{j}}$ is satisfied by construction. Next, we show that ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},V)$ $\left| = \right.$ ⋈ $({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},W)$ is logically implied. For this reason, we also use rule P2. Condition p42 implies condition p21: $V \cap ({{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}}) \subseteq {{W}_{j}} = V \cap ({{Y}_{1}} \cup {{Y}_{2}} \cup \ldots \cup {{Y}_{m}})$. Condition p22 ${{W}_{j}} \subseteq V$ is satisfied by construction. Since the soundness of rule P2 was already proved, the dependencies deduced with its help will be logically implied. It remains to show that the dependency on the right-hand side of rule P4 is implied. By assumption, the current implementation of relation $R$ satisfies dependencies of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},V)$ and ⋈$({{Y}_{1}},{{Y}_{2}}, \ldots ,{{Y}_{m}})$. Therefore, $R$ satisfies dependencies of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},W)$ and ⋈$({{W}_{1}},{{W}_{2}}, \ldots ,{{W}_{m}})$. It means that:

$R[{{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}} \cup W] = R[{{X}_{1}}]$ ⋈

⋈ $R[{{X}_{2}}]$ ⋈ … ⋈ $R[{{X}_{k}}]$ ⋈ $R[W]$

and

$R[W] = R[{{W}_{1}}]$ ⋈ $R[{{W}_{2}}]$ ⋈ … ⋈ $R[{{W}_{m}}].$

We substitute $W$ and $R[W]$ in the previous formula:

$R[{{X}_{1}} \cup {{X}_{2}} \cup \ldots \cup {{X}_{k}} \cup {{W}_{1}} \cup {{W}_{2}} \cup \ldots \cup {{W}_{m}}] = R[{{X}_{1}}]$ ⋈ $R[{{X}_{2}}]$ ⋈

… ⋈ $R[{{X}_{k}}]$ ⋈ $R[{{W}_{1}}]$ ⋈ $R[{{W}_{2}}]$ ⋈… ⋈ $R[{{W}_{m}}],$

which corresponds to the feasibility of dependency of ⋈$({{X}_{1}},{{X}_{2}}, \ldots ,{{X}_{k}},V \cap {{Y}_{1}},V \cap {{Y}_{2}}, \ldots ,V \cap {{Y}_{m}})$.

Comment. The proof of the theorem did not require the assumption that the hypergraphs corresponding to the join dependencies are acyclic.

To transform the set of rules into a system of axioms, it is necessary to prove the completeness of this system. Unfortunately, all efforts of researchers in this direction were not yet crowned with success.

CONCLUSIONS

The system of rules for join dependencies obtained in this work has theoretical significance from the viewpoint of the development of the database design theory. To analyze the completeness of the system of axioms, the chase procedure [2], which did not yield satisfactory results both here and in all previous studies, is traditionally used. This means that the search for new rules and the generalization of known rules should be continued. The obtained rules are of limited practical value in the design of the database scheme. Indeed, after separating functionally defined attributes into separate relations when decomposing relation R or when forming a generalized key (superkey) when synthesizing relations, each time one has to deal with a relation, in which there are no functional dependencies. However, this relation is not interpretable. The use of join dependencies allows finding the correct decomposition, obtaining interpreted relations, and guaranteeing the execution of lossless join property.

REFERENCES

Ullman, J., Principles of Database Systems, Stanford Univ.: Comput. Sci. Press, 1980.
Maier, D., The Theory of Relational Databases, Rockville: Comput. Sci. Press, 1983.
MATH Google Scholar
Abiteboul, S., Hull, R., and Vianu, V., Foundations of Databases: The Logical Level, Addison-Wesley, 1994.
Google Scholar
Casanova, M., Fagin, R., and Papadimitriou, C., Inclusion dependencies and their interaction with functional dependencies, Comput. J. Syst. Sci., 1984, vol. 28, no. 1, pp. 29–59. https://doi.org/10.1016/0022-0000(84)90075-8
Article MathSciNet MATH Google Scholar
Levene, M. and Vincent, M., Justification for inclusion dependency normal form, IEEE Trans. Knowl. Data Eng., 2000, vol. 12, no. 2, pp. 281–291. https://doi.org/10.1109/69.842267
Article Google Scholar
Köhler, H. and Link, S., Inclusion dependencies reloaded, The 24th ACM International on Conference on Information and Knowledge Management (CIKM ’15), 2015, pp. 1361–1370. https://doi.org/10.1145/2806416.2806539
Zykin, V. and Zykin, S., Analysis of typed inclusion dependences with null values, Autom. Control Comput. Sci., 2018, vol. 52, no. 7, pp. 638–646. https://doi.org/10.3103/S0146411618070258
Article Google Scholar
Rissanen, J., Theory of relations for databases – a tutorial survey, in Mathematical Foundations of Computer Science 1978, Winkowski, J., Ed., Berlin, Heidelberg: Springer Berlin Heidelberg, 1978, pp. 536–551.
Google Scholar
Sciore, E., A complete axiomatization of full join dependencies, J. ACM, 1982, vol. 29, no. 2, pp. 373–393. https://doi.org/10.1145/322307.322313
Article MATH Google Scholar
Beeri, C. and Vardi, M., Formal systems for join dependencies, Theor. Comput. Sci., 1985, vol. 38, pp. 99–116. https://doi.org/10.1016/0304-3975(85)90212-9
Article MathSciNet MATH Google Scholar
Gyssens, M., On the complexity of join dependencies, ACM Trans. Database Syst., 1986, vol. 11, no. 1, pp. 81–108. https://doi.org/10.1145/5236.5237
Article MathSciNet MATH Google Scholar
Malvestuto, F., A complete axiomatization of full acyclic join dependencies, Inf. Process. Lett., 1998, vol. 68, no. 3, pp. 133–139.
Article MathSciNet Google Scholar
Düntsch, I. and Mikulás, S., Cylindric structures and dependencies in relational databases, Theor. Comput. Sci., 2001, vol. 269, pp. 451–468. https://doi.org/10.1016/S0304-3975(01)00016-0
Article MathSciNet MATH Google Scholar
Hartmann, S., Köhler, H., and Link, S., Full hierarchical dependencies in fixed and undetermined universes, Ann. Math. Artif. Intell., 2007, vol. 50, pp. 195–226. https://doi.org/10.1007/s10472-007-9067-0
Article MathSciNet MATH Google Scholar
Biskup, J. and Link, S., Appropriate inferences of data dependencies in relational databases, Ann. Math. Artif. Intell., 2011, vol. 63, nos. 3–4, pp. 213–255.
Article MathSciNet Google Scholar
Hannula, M. and Kontinen, J., A finite axiomatization of conditional independence and inclusion dependencies, Inf. Comput., 2016, vol. 249, pp. 121–137. https://doi.org/10.1016/j.ic.2016.04.001
Article MathSciNet MATH Google Scholar
Baixeries, J., A formal context for acyclic join dependencies, in ISMIS 2017: Foundations of Intelligent Systems; Lecture Notes in Computer Science, Springer, 2017, vol. 10352, pp. 563–572. https://doi.org/10.1007/978-3-319-60438-1_55
Book Google Scholar
Zykin, S., Domains of functional dependences in databases, Tr. Inst. Mat. Mekh. Ural. Otd. Ross. Akad. Nauk, 2016, vol. 22, no. 3, pp. 117–129.
MathSciNet Google Scholar

Download references

Funding

The research was funded in accordance with the state task of the the Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, project no. FWNF-2022-0016.

Author information

Authors and Affiliations

Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, 630090, Novosibirsk, Russia
S. V. Zykin

Authors

S. V. Zykin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to S. V. Zykin.

Ethics declarations

The author declares that he has no conflicts of interest.

Additional information

Translated by A. Ivanov

About this article

Cite this article

Zykin, S.V. Generalization of Derivation Rules for Join Dependencies in Database. Aut. Control Comp. Sci. 55, 731–737 (2021). https://doi.org/10.3103/S0146411621070191

Download citation

Received: 06 July 2020
Revised: 07 September 2020
Accepted: 09 September 2020
Published: 01 February 2022
Issue Date: December 2021
DOI: https://doi.org/10.3103/S0146411621070191

Keywords:

Use our pre-submission checklist

Avoid common mistakes on your manuscript.