Distributional Learning of Regular Formal Graph System of Bounded Degree

Shoudai, Takayoshi; Matsumoto, Satoshi; Suzuki, Yusuke

doi:10.1007/978-3-319-63342-8_6

Takayoshi Shoudai¹⁵,
Satoshi Matsumoto¹⁶ &
Yusuke Suzuki¹⁷

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 10326))

Included in the following conference series:

International Conference on Inductive Logic Programming

529 Accesses
1 Citations

Abstract

In this paper, we describe how distributional learning techniques can be applied to formal graph system (FGS) languages. An FGS is a logic program that deals with term graphs instead of the terms of first-order predicate logic. We show that the regular FGS languages of bounded degree with the 1-finite context property (1-FCP) and bounded treewidth property can be learned from positive data and membership queries.

T. Shoudai—This work was partially supported by JSPS KAKENHI (26280087, 15K00313) and MEXT KAKENHI (24106010).

Access provided by CONRICYT-eBooks. Download conference paper PDF

Learning Context-Free Grammars from Positive Data and Membership Queries

Learning Deterministic Variable Automata over Infinite Alphabets

Partial Learning of Recursively Enumerable Languages

1 Introduction

In the field of algorithmic learning theory, many models and algorithmic techniques for learning from examples have been developed. Distributional learning was first proposed by Clark and Eyraud [3] to learn a subclass of context-free grammars efficiently. Recently, distributional learning techniques have been developed for learning of various subclasses of context-free grammars [11]. These techniques were extended to languages that have more complex structures [7]. Yoshinaka [12] introduced distributional properties on grammars and showed that grammars with distributional properties are learnable with standard distributional learning techniques if the grammars satisfy certain conditions, e.g. polynomial time decomposability of objects into contexts and substructures.

Graph grammar has been developed as an extension to graphs from strings of grammatical forms. Graph grammar has been applied to a wide range of fields including pattern recognition and image analysis. Uchida et al. [10] introduced a framework called formal graph system (FGS) as a graph grammar. An FGS is a logic program that deals with term graphs, which can be considered to be types of hypergraphs, instead of the terms of first-order predicate logic.

For the learning of graph grammar, Okada et al. [9] showed that some classes of graph pattern languages are learned from a minimally adequate teacher (MAT) in polynomial time. Hara and Shoudai [6] proposed an algorithm for learning the class of c-deterministic regular FGS languages in the framework of MAT learning. There have been other studies on graph grammars from the viewpoint of application, but discussions on computational learning of graph grammars are not yet sufficient. In this paper, we show that the regular FGS languages of bounded degree with the 1-finite context property (1-FCP) [2] and bounded treewidth property can be learned from positive data and membership queries with current distributional learning techniques [11].

2 Preliminaries

For a set or a list S, |S| denotes the number of all elements that are contained in S. For a set S, $S^{*}$ denotes the set of all finite lists consisting of elements in S. For a list S and an integer i $(1\le i\le |S|)$, S[i] denotes the i-th member of S. Let $\varSigma $ and $\varLambda $ be finite alphabets. Let X be an infinite alphabet, whose elements are called variables. We assume that each symbol $x\in X$ has a nonnegative integer rank(x), $\varSigma \cap X=\emptyset $ and $\varLambda \cap X=\emptyset $.

Definition 1

(Term graph). A term graph $g = (V,E,\varphi ,\psi ,H,\lambda ,ports)$ is defined as follows:

1.
(V, E) is a vertex- and edge-labeled (directed or undirected) graph,
2.
$\varphi : V\rightarrow \varSigma $ and $\psi : E\rightarrow \varLambda $ are vertex- and edge-labeling functions,
3.
H is a finite multiset of hyperedges that are elements of $2^V$,
4.
$\lambda : H\rightarrow X$ is a variable-labeling function, and
5.
$ports : H\rightarrow V^*$ is a mapping s.t. for every $h\in H$, ports(h) is a list of $rank(\lambda (h))$ distinct vertices in V. These vertices are called the ports of h.

We give an example of term graphs in Fig. 1. A hyperedge is drawn as a box with lines to its ports. The order of the ports is indicated by digits at these lines.

For a term graph g, its 7-tuple is denoted by $(V_g,E_g,\varphi _g,\psi _g,H_g,\lambda _g,ports_g)$. A term graph g is called ground if $H_g=\emptyset $ and both $\lambda _g$ and $ports_g$ are empty functions $\emptyset $. We define the size of a term graph g, denoted by |g|, as $|V_{g}|+|E_{g}|+|H_{g}|$. A term graph g is a star term graph if $E_g=\emptyset $ and $H_g=\{V_g\}$. For a star term graph g, $h_{g}$ denotes the unique hyperedge of g.

Definition 2

(Treewidth [5]). A tree decomposition of a term graph $g=(V,E,\varphi ,\psi ,H,\lambda ,ports)$ is a rooted tree $\mathcal{T}=(\mathcal{I},\mathcal{F})$ whose vertices $i\in \mathcal{I}$ are associated with $V_i\subseteq V$, $E_i\subseteq E$ and $H_i\subseteq H$ that satisfy the following conditions:

1.
For each $v\in V$, there is a vertex $i\in \mathcal{I}$ such that $v\in V_i$.
2.
For each $e=(u,v)\in E$, there is exactly one vertex $i\in \mathcal{I}$ such that $u,v\in V_i$ and $e\in E_i$.
3.
For each $h=\{v_1,\ldots ,v_m\}\in H$, there is exactly one vertex $i\in \mathcal{I}$ such that $v_1,\ldots ,v_m\in V_i$ and $h\in H_i$.
4.
For each $v\in V$, the subtree of $\mathcal{T}$ induced by $\{i\in \mathcal{I}\mid v\in V_i\}$ is connected.

The width of $\mathcal{T}$ is defined as $\max _{i\in \mathcal{I}}|V_i|-1$. The treewidth of g is the minimum width of any tree decomposition $\mathcal{T}=(\mathcal{I},\mathcal{F})$ of g.

Two term graphs f and g are said to be isomorphic, if there is a bijection $\pi $ from $V_f$ to $V_g$, such that (1) $(u,v)\in E_f$ if and only if $(\pi (u),\pi (v))\in E_g$, (2) $\varphi _f(u) = \varphi _g(\pi (u))$ for each vertex $u\in V_f$ and $\psi _f(u,v) = \psi _g(\pi (u),\pi (v))$ for each edge $(u,v)\in E_f$, (3) $\{v_1,\ldots ,v_{\ell }\} \in H_f$ if and only if $\{\pi (v_1),\ldots ,\pi (v_{\ell })\} \in H_g$, (4) $\lambda _f(\{v_1,\ldots ,v_{\ell }\})=\lambda _f(\{u_1,\ldots ,u_{\ell }\})$ if and only if $\lambda _g(\{\pi (v_1),\ldots ,\pi (v_{\ell })\}) = \lambda _g(\{\pi (u_1),\ldots ,\pi (u_{\ell })\})$ for each hyperedge $\{v_1,\ldots ,v_{\ell }\}, \{u_1,\ldots ,u_{\ell }\} \in H_f$, and (5) $ports_f(\{v_1,\ldots ,v_{\ell }\}) = ports_g(\{\pi (v_1),\ldots ,\pi (v_{\ell })\})$ for each $\{v_1,\ldots ,v_{\ell }\}$ $\in H_f$. A bijection $\pi $ satisfying (1)–(5) is called an isomorphism from f to g.

Let d and w be nonnegative integers. The degree of a vertex v is defined as $|\{e\in E_{g} \mid v\in e\}| + |\{h\in H_{g} \mid v\in h\}|$. $\mathcal{G}(\varSigma ,\varLambda ,X)$ (resp. $\mathcal{G}_{d,w}(\varSigma ,\varLambda ,X)$) denotes the set of all term graphs (resp. all term graphs of maximum degree d and treewidth w) over $\langle \varSigma ,\varLambda ,X\rangle $. Moreover, $\mathcal{G}(\varSigma ,\varLambda )$ (resp. $\mathcal{G}_{d,w}(\varSigma ,\varLambda )$) denotes the set of all ground term graphs (resp. all ground term graphs of maximum degree d and treewidth w).

Let f be a term graph in $\mathcal{G}(\varSigma ,\varLambda ,X)$ and $\sigma $ an ordered list of $\ell $ distinct vertices in $V_f$ $(0\le \ell \le |V_f|)$. A pair $[f,\sigma ]$ is called a term graph fragment. If f is a ground term graph, we call it a ground term graph fragment. Let $\mathcal{F}(\varSigma ,\varLambda )$ be the set of all ground term graph fragments. For nonnegative integers d and w, $\mathcal{F}_{d,w}(\varSigma ,\varLambda )=\{[f,\sigma ]\in \mathcal{F}(\varSigma ,\varLambda )\mid f\in \mathcal{G}_{d,w}(\varSigma ,\varLambda ) \text{ and } |\sigma | \le d+1\}$. For a term graph fragment $[f,\sigma ]$ and a variable $x\in X$ with $rank(x)=|\sigma |$. Let $\sigma =(v_1,\ldots ,v_\ell )$ $(\ell \ge 1)$. The binding $x:=[f,\sigma ]$ for a term graph g is defined to be an operation on g that works in the following way: For each $h\in H_g$ with $\lambda _{g}(h)=x$, let $f'=(V_{f'},E_{f'},\varphi _{f'},\psi _{f'},H_{f'},\lambda _{f'},ports_{f'})$ be a copy of f. For a vertex $v\in V_f$, we denote the corresponding copy vertex of $f'$ by $v'$. We attach $f'$ to g by removing the hyperedge h from $H_g$ and by identifying the ports $u_1,\ldots ,u_\ell $ of h in g with $v_1',\ldots ,v_\ell '$ in $f'$, respectively. We set the new vertex-label of $u_i$ to be the original vertex-label of $u_i$, i.e., $\varphi _g(u_i)$. A substitution $\theta $ is a finite set of bindings $\{x_1:=[f_1,\sigma _1],\ldots ,x_n:=[f_n,\sigma _n]\}$, where $x_i$’s are mutually distinct variables in X and each $f_i$ has no hyperedge labeled with a variable in $\{x_1,\ldots ,x_n\}$. We give an example of term graphs and substitutions in Fig. 2.

Definition 3

(Formal graph system [10]). Let $g_1,\ldots ,g_n\in \mathcal{G}(\varSigma ,\varLambda ,X)$ $(n\ge 1)$. Let $\varPi _n$ be a finite set of n-ary predicate symbols. Let $\varPi = \bigcup _{i \ge 0} \varPi _i$. For $p \in \varPi _n$, we say that $p(g_1,\ldots ,g_n)$ is an atom. Let $A,B_1,\dots ,B_m$ be atoms $(m\ge 0)$ consisting of term graphs in $\mathcal{G}(\varSigma ,\varLambda ,X)$ and predicate symbols in $\varPi $. A graph rewriting rule over $\langle \varSigma ,\varLambda ,X,\varPi \rangle $ is a clause of the form $A\leftarrow B_1,\ldots ,B_m$. For a clause $A\leftarrow B_1,\ldots ,B_m$, the atom A is called the head and the right hand side of the arrow $B_1,\ldots ,B_m$ is called the body of the rule. Let $\varGamma $ be a finite set of graph rewriting rules over $\langle \varSigma ,\varLambda ,X,\varPi \rangle $. A formal graph system (abbreviated to FGS) is the 5-tuple $S = (\varSigma ,\varLambda ,X, \varPi , \varGamma )$.

For a substitution $\theta $ and an atom $p(g_1,\ldots ,g_n)$, we define $p(g_1,\ldots ,g_n)\theta $ to be $p(g_1\theta ,\ldots ,g_n\theta )$. For a graph rewriting rule $A \leftarrow B_1,\ldots ,B_m$, we also define $(A \leftarrow B_1,\ldots ,B_m)\theta $ to be $A\theta \leftarrow B_1\theta ,\ldots ,B_m\theta $.

Definition 4

Let $S = (\varSigma ,\varLambda ,X, \varPi , \varGamma )$ be an FGS. For a clause C, relation $\varGamma \vdash C$ is defined recursively in the following way:

1.
If $C\in \varGamma $, then $\varGamma \vdash C$ holds.
2.
If $\varGamma \vdash C$, then $\varGamma \vdash C\theta $ for an arbitrary substitution $\theta $.
3.
If $\varGamma \vdash A \leftarrow B_1,\ldots ,B_n$ and for some i $(1\le i\le n)$, $\varGamma \vdash B_i \leftarrow C_1,\ldots ,C_m$, then $\varGamma \vdash A \leftarrow B_1,\ldots ,B_{i - 1},C_1,\ldots ,C_m,B_{i + 1},\ldots ,B_n$ holds.

For an FGS $S=(\varSigma ,\varLambda ,X,\varPi ,\varGamma )$ and a unary predicate symbol p, we define the graph language of (S, p) as $GL(S,p) = \{g \in \mathcal{G}(\varSigma ,\varLambda ) \mid \varGamma \vdash p(g) \leftarrow \}$. We say that a graph language $L\subseteq \mathcal{G}(\varSigma ,\varLambda )$ is definable by an FGS or an FGS language if such a pair $(\varGamma ,p)$ exists. In Fig. 3, we give an example of the FGSs and its FGS language.

Let $\varSigma $ be a set of vertex labels and $\varPi $ a set of predicate symbols. Let $\delta $ be a function from $\varPi $ to $\varSigma ^{*}$. We call the function $\delta $ a pointer of $\varPi $ if for any predicate symbol p, $\delta (p)[i]\not =\delta (p)[j]$ for all i, j $(1\le i < j\le |\delta (p)|)$. Let $\delta (\varPi )$ be the set of all vertex labels appearing in $\delta (p)$ for all predicate symbols $p\in \varPi $. For a term graph g and a list of vertices $\sigma =(v_1,\ldots ,v_{\ell })\in V_{g}^{\ell }$ $(\ell \ge 1)$, $\varphi _g(\sigma )$ denotes $(\varphi _{g}(v_1),\ldots ,\varphi _{g}(v_{\ell }))$.

Definition 5

(Regular formal graph system [10]). We say that an FGS $S=(\varSigma ,\varLambda ,X,\varPi ,\varGamma )$ is regular with a pointer $\delta $ of $\varPi $ if all graph rewriting rules in $\varGamma $ are of the form $q_0(g_0) \leftarrow q_1(g_1),\ldots ,q_m(g_m)$ that satisfies the following conditions:

1.
All $q_{i} \in \varPi $ $(0 \le i \le m)$ are unary predicate symbols.
2.
Each $g_i$ $(1 \le i \le m)$ is a star term graph s.t. $\varphi _{g_i}(ports_{g_i}(h_{g_i}))=\delta (q_i)$.
3.
There is a list $(v_1,\ldots ,v_{|\delta (q_0)|}) \in V_{g_0}^{|\delta (q_0)|}$ s.t. $\varphi _{g_0}(v_1,\ldots ,v_{|\delta (q_0)|})=\delta (q_0)$ and for any $u\in V_{g_0}{\setminus }\{v_1,\ldots ,v_{|\delta (q_0)|}\}$, $\varphi _{g_0}(u) \in \varSigma {\setminus }\delta (\varPi )$.
4.
$\bigcup _{h \in H_{g_{0}}}\{\lambda _{g_{0}}(h)\} = \bigcup _{i=1}^{m}\{\lambda _{g_{i}}(h_{g_{i}})\}$ and $\lambda _{g_i}(h_{g_i}) \ne \lambda _{g_j}(h_{g_j})$ for $1 \le i < j \le m$.
5.
For every $h_1,h_2\in H_{g_0}$, $h_1\not = h_2$ if and only if $\lambda _{g_0}(h_1)\not =\lambda _{g_0}(h_2)$.

A regular FGS $S=(\varSigma ,\varLambda ,X,\varPi ,\varGamma )$ with a pointer $\delta $ is denoted by $(S, \delta )$ or $((\varSigma ,\varLambda ,X,\varPi ,\varGamma ),\delta )$. Below we call a regular FGS with a pointer a regular FGS.

Let $(S, \delta )$ be a regular FGS and p a unary predicate symbol in $\varPi $. We define the graph language of $(S, \delta ,p)$ as $GL(S, \delta ,p) = \{g \in \mathcal{G}(\varSigma ,\varLambda ) \mid \varGamma \vdash p(g) \leftarrow \}$. We say that a graph language $L\subseteq \mathcal{G}(\varSigma ,\varLambda )$ is definable by a regular FGS or a regular FGS language if a triplet $(S,\delta ,p)$ exists such that $L=GL(S,\delta ,p)$. In Fig. 4, we give an example of the regular FGSs and its regular FGS language.

Definition 6

(Chomsky normal form). Let $f_0$ be a ground term graph of one vertex or two vertices with one edge, $f_1$ a term graph with two hyperedges and no edge, and $f_2,f_3$ star term graphs. Let $p_0,p_1,p_2,p_3$ be unary predicate symbols. A regular FGS $(S, \delta )$ is in Chomsky normal form if every graph rewriting rule of S is of the form.

Terminal rule: $p_0(f_0) \leftarrow $,
Unary rule: $p_1(f_1) \leftarrow p_2(f_2)$.
Binary rule: $p_1(f_1) \leftarrow p_2(f_2),\ p_3(f_3)$.

The regular FGS in Figs. 3 and 4 is written in Chomsky normal form.

We say that a term graph g is connected if for any two vertices u and v of g, there is a sequence of vertices $v_0(=u),v_1,\ldots ,v_m(=v)$ for an integer m such that for all i $(0\le i\le m-1)$, $v_i$ and $v_{i+1}$ are contained in the same edge or hyperedge. In this paper, we assume that all term graphs are connected.

Graph grammar has been defined in various ways. One of the famous context-free graph grammars is a hyperedge replacement grammar (HRG) [4]. Uchida et al. [10] showed that a class of graphs is generated by an HRG if and only if it is defined by a regular FGS. This result shows that regular FGSs can generate interesting graph classes including trees, two-terminal series parallel graphs (in Fig. 4), and so on.

In the research of HRGs, Lautemann [8] gave some conditions on either grammar or input graphs whose parsing can be done in polynomial time. A parsing algorithm due to Lautemann is known to be polynomial time for graphs that are connected and of bounded degree. As a more precise characterization of the algorithm’s complexity, Chiang et al. [1] showed that the parsing algorithm runs in polynomial time if the maximum degree and treewidth of graphs in an HRG are bounded by some constants. Hence, we conclude the following lemma:

Lemma 1

([1, 10]). Let $(S,\delta )$ be a regular FGS and p a unary predicate symbol. Given a ground term graph g, the problem of deciding whether or not $g\in GL(S,\delta ,p)$ is computed in $O((3^dn)^{w + 1})$ time, where n is the number of vertices of g, d is the maximum degree of g, and w is the maximum treewidth of the term graphs in the heads of graph rewriting rules in S.

3 Learning Regular FGS with 1-Finite Context Property

We consider $\mathcal{G}_{d,w}(\varSigma ,\varLambda )$ as a universal set ($d, w\ge 0$). A positive presentation of a nonempty graph language $L \subseteq \mathcal{G}_{d,w}(\varSigma , \varLambda )$ is an infinite sequence $g_{1}, g_{2}, \ldots $ of elements in L such that $\{g_{1},g_{2},\ldots \} = L$.

An inductive inference machine (IIM, for short) is an effective procedure, or a certain type of Turing machine, which outputs a regular FGS and a predicate symbol each time a ground term graph is given. Let $L_{*} \subseteq \mathcal{G}_{d,w}(\varSigma , \varLambda )$ be a target graph language. We assume that an IIM has an access to an oracle $Mem_{L_{*}}$ who answers membership queries. The query asks whether or not an arbitrary ground term graph g is included in $L_{*}$. Let $\tau = g_{1}, g_{2}, \ldots $ be a positive presentation of $L_{*}$. An IIM outputs a regular FGS $(S_{i}, \delta _{i})$ and a predicate symbol $p_{i}$ by using membership queries each time a ground term graph $g_{i}$ in $\tau $ is given. An IIM is said to converge to a regular FGS $(S,\delta )$ and a predicate symbol p for $\tau $ with polynomial time update by using membership queries, if M outputs a regular FGS $(S_{i},\delta _{i})$ and a predicate symbol $p_{i}$ in polynomial time w.r.t. the sum of the size of the given ground term graphs so far, i.e., $|g_{1}| + |g_{2}| + \cdots + |g_{i}|$, and there exists a positive integer $n \ge 1$ with $(S_{m},\delta _{m}) = (S,\delta )$ and $p_{m} = p$ for any $m \ge n$. Let $\mathcal{C} \subseteq 2^{\mathcal{G}_{d,w}(\varSigma , \varLambda )}$ be a class and $L_{*} \in \mathcal{C}$. A class $\mathcal{C}$ is said to be identifiable in the limit with polynomial time update by using membership queries from positive data, if there exists an IIM M such that for any $L_{*} \in \mathcal{C}$ and any presentation $\tau $ of $L_{*}$, M converges to a regular FGS $(S,\delta )$ and a predicate symbol p for $\tau $ with $GL(S,\delta ,p) =L_{*}$ with polynomial time update by using membership queries.

Let $g = (V_g,E_g,\varphi _g,\psi _g,\emptyset ,\emptyset ,\emptyset )$ be a ground term graph and $\sigma =(v_1,\ldots ,v_\ell )$ a list of distinct vertices in $V_g$ $(1\le \ell \le |V_g|)$. Let x be a new variable label in X that does not appear so far. For the ground term graph fragment $[g,\sigma ]$, we denote by $g(\sigma )$ the term graph $(V_g,E_g,\varphi _g,\psi _g,\{h\},\lambda _g,ports_g)$ where $h=\{v_1,\ldots ,v_\ell \}$, $\lambda _g(h)=x$, and $ports_g(h)=\sigma $. In order to make the argument easier, we assume that g has no isolated vertex. Let $\{E_\alpha ,E_\beta \}$ be a partition of $E_g$. Let $V_\alpha $ be the set of all endpoints of edges in $E_\alpha $ and $V_\beta $ the set of all endpoints of edges in $E_\beta $. Let $\sigma $ be one of the ordered lists of all vertices in $V_\alpha \cap V_\beta $. We obtain two ground term graph fragments $[\alpha ,\sigma ]$ and $[\beta ,\sigma ]$. We easily see that $\alpha (\sigma )\{x:=[\beta ,\sigma ]\}$ and $\beta (\sigma )\{x:=[\alpha ,\sigma ]\}$ are isomorphic to g.

For $[\alpha ,\sigma _\alpha ],[\beta ,\sigma _\beta ]\in \mathcal{F}(\varSigma ,\varLambda )$, we define an operation $\odot $ as follows:

$$ [\alpha ,\sigma _\alpha ]\odot [\beta ,\sigma _\beta ]=\left\{ \begin{array}{ll}\alpha (\sigma _\alpha )\{x:=[\beta ,\sigma _\beta ]\} &{} \text{ if } |\sigma _\alpha |=|\sigma _\beta |,\\ undefined &{} \text{ otherwise. } \end{array}\right. $$

In Fig. 5, we give an example of $\alpha $ and $\beta $ by a partition of $E_g$ of a ground term graph g. Note that in general, $[\alpha ,\sigma _\alpha ]\odot [\beta ,\sigma _\beta ]$ is not always equivalent to $[\beta ,\sigma _\beta ]\odot [\alpha ,\sigma _\alpha ]$, because the vertex labels in the first operand always survive by any binding. If $\varphi _\alpha (\sigma _\alpha )=\varphi _\beta (\sigma _\beta )$, $[\alpha ,\sigma _\alpha ]\odot [\beta ,\sigma _\beta ]=[\beta ,\sigma _\beta ]\odot [\alpha ,\sigma _\alpha ]$ holds.

Let d and w be constant nonnegative integers. For a nonempty finite set of ground term graphs $D\subseteq \mathcal{G}_{d,w}(\varSigma ,\varLambda )$, let

$$\begin{aligned} Sub(D)= & {} \{[\beta ,\sigma _\beta ] \in \mathcal{F}_{d,w}(\varSigma ,\varLambda ) \mid \exists [\alpha ,\sigma _\alpha ]\in \mathcal{F}_{d,w}(\varSigma ,\varLambda )[\,[\alpha ,\sigma _\alpha ]\odot [\beta ,\sigma _\beta ]\in D\,]\},\\ Con(D)= & {} \{[\alpha ,\sigma _\alpha ] \in \mathcal{F}_{d,w}(\varSigma ,\varLambda ) \mid \exists [\beta ,\sigma _\beta ]\in \mathcal{F}_{d,w}(\varSigma ,\varLambda )[\,[\alpha ,\sigma _\alpha ]\odot [\beta ,\sigma _\beta ]\in D\,]\}. \end{aligned}$$

We give an example of Con(D) in Fig. 6. It is easy to see that $Con(D)\subseteq Sub(D)$ holds. For any $[\beta ,\sigma _\beta ]\in Sub(D){\setminus } Con(D)$, there is a ground term graph fragment $[\alpha ,\sigma _\alpha ]\in Con(D)$ such that $\alpha $ is isomorphic to $\beta $ via an isomorphism $\xi $ with $\xi (\sigma _\alpha )=\sigma _\beta $, ignoring the vertex labels in $\sigma _\alpha $. Thus, unlike string grammars, we only use Con(D) to learn a target regular FGS language from positive data. We have the following proposition:

Proposition 1

Let D be a nonempty finite subset of $\mathcal{G}_{d,w}(\varSigma ,\varLambda )$. Both |Sub(D)| and |Con(D)| are of polynomial size w.r.t. $\sum _{g\in D} |g|$.

Let $(S,\delta )= ((\varSigma ,\varLambda ,X,\varPi ,\varGamma ),\delta )$ be a regular FGS. For a term graph f and $q \in \varPi $, if there are distinct $|\delta (q)|$ vertices $v_1,\ldots ,v_{|\delta (q)|}$ in $V_f$ such that $(\varphi _f(v_1),\ldots ,\varphi _f(v_{|\delta (q)|}))=\delta (q)$ and for any $v\in V_f{\setminus }\{v_1,\ldots ,v_{|\delta (q)|}\}$, $\varphi _f(v)$ is not a member of $\delta (q)$, we define $\varphi _f^{-1}(\delta (q))=(v_1,\ldots ,v_{|\delta (q)|})$, otherwise we define $\varphi _f^{-1}(\delta (q))=()$. Let p, q in $\varPi $ and $[g,\sigma _g]$ in $\mathcal{F}_{d,w}(\varSigma ,\varLambda )$. We define

$$ C(S,\delta ,p,q,[g,\sigma _g])=\{f\in \mathcal{G}_{d,w}(\varSigma ,\varLambda )\mid [g,\sigma _g]\odot [f,\varphi _f^{-1}(\delta (q))]\in GL(S,\delta ,p)\}. $$

Definition 7

(1-FCP). Let $(S,\delta )$ be a regular FGS and p, q unary predicate symbols of S. A term graph fragment $[g,\sigma _g]$ is said to be a context of q w.r.t. $(S,\delta ,p)$ if $C(S,\delta ,p,q,[g,\sigma _g])=GL(S,\delta ,q)$ holds. We say that $(S,\delta ,p)$ has the 1-finite context property (1-FCP) if every predicate $q\in \varPi $ has a context of it.

For the ground term graphs $\alpha ,\beta $ in Fig. 5, $[\alpha ,(v_2,v_3)]$ is a context of $q_{(2,2)}$ and $[\beta ,(v_2,v_3)]$ is a context of $q_{(1,1)}$ w.r.t. $(S^{(3)}_3,\delta ^{(3)}_3,p)$ in Fig. 7. We give an example $C(S^{(3)}_3,\delta ^{(3)}_3,p,q_{(2,1)},[g,\sigma _g])$ in Fig. 8 for some $[g,\sigma _g]$. We easily see that for any $d\ge 2$, the regular FGS in Fig. 7 has the 1-finite context property (1-FCP).

Definition 8

(1-FCP regular FGS language class). 1-FCP-$\mathcal{RFGSL}(d,w)$ denotes the set of all regular FGS languages $L\subseteq \mathcal{G}_{d,w}(\varSigma ,\varLambda )$ that satisfies the following conditions:

1.
L is definable by $(S,\delta ,p)= ((\varSigma ,\varLambda ,X,\varPi ,\varGamma ),\delta ,p)$ that has the 1-FCP,
2.
$\varGamma $ is written in Chomsky normal form, and
3.
The treewidth of each term graph in $\varGamma $ is at most w. Therefore, the maximum length of ports of the hyperedges in $\varGamma $ is also at most $w+1$.

Let $L_{*}\subseteq \mathcal{G}_{d,w}(\varSigma ,\varLambda )$ be a target regular FGS language. We give a learning algorithm for 1-FCP-$\mathcal{RFGSL}(d,w)$ in Algorithm 1, which is a process of searching in Con(D) for contexts of the predicate symbols in $L_{*}$. We construct a regular FGS $S(F,K)=(\varSigma ,\varLambda ,X,\varPi ,\varGamma )$, pointer $\delta $, and initial predicate p as follows:

$\varSigma =\varSigma '\cup \{s_1,\ldots ,s_{w+1}\}$, where $\varSigma '=\{a \mid \exists g_i\in D, \exists v\in V_{g_i} [~\varphi _{g_i}(v)=a~]\}$ and $\varSigma '\cap \{s_1,\ldots ,s_{w+1}\}=\emptyset $.
$\varLambda =\{a \mid \exists g_i\in D,\exists e\in E_{g_i} [~\psi _{g_i}(e)=a~]\}$.
X: We use a new variable label only when needed.
$\varPi = \{\llbracket \alpha ,\sigma _\alpha \rrbracket \mid [\alpha ,\sigma _\alpha ]\in F \subseteq Con(D)\}$. Let $\llbracket \emptyset ,() \rrbracket $ be the initial predicate p.
$\delta $, $\varGamma $: In Table 1, we describe the pointer $\delta (q)$ for each predicate q in $\varPi $ and the graph rewriting rules in $\varGamma $. In the table, we use the following notations. Let $k,\ell $ be two positive integers $(k\le \ell )$ and $\mathcal{P}_{k,\ell }$ the set of all list of k distinct positive integers that are less than or equal to $\ell $. Let $\sigma =(\ell _1,\ldots ,\ell _k) \in \mathcal{P}_{k,\ell }$. For a list of elements $\nu =(v_1,\ldots ,v_\ell $) $(k\le \ell )$, $\chi _\sigma (\nu )$ denotes $(v_{\ell _1},\ldots ,v_{\ell _k})$ and $\bar{\chi }_\sigma (\nu )$ denotes the list obtained from $\nu $ by deleting $v_{\ell _1},\ldots ,v_{\ell _k}$.

The graph rewriting rule $R_1$ in Fig. 9 is an example of the graph rewriting rules constructed by Table 1 for the target regular FGS language $GL(S^{(3)}_3,\delta ^{(3)}_3,p)$.

Table 1. $(S(F,K),\delta ,p)$: There are three types of terminal rules and one type of binary rule. Each graph rewriting rule is created if the corresponding condition is satisfied. All conditions can be determined by asking to the membership oracle $Mem_{L_*}$. The unary rules can be constructed in a similar way to the binary rules. We omit its detail.

Full size table

Theorem 1

Let d and w be constant integers greater than zero. The class 1-FCP-$\mathcal{RFGSL}(d,w)$ is identifiable in the limit with polynomial time update by using membership queries from positive data.

Proof

Let $(S_{1},\delta _{1},p_{1}), (S_{2},\delta _{2},p_{2}), \ldots , (S_{i},\delta _{i},p_{i}), \ldots $ be hypotheses output by Algorithm 1, and $(S_{i},\delta _{i},p_{i}) = ((\varSigma ,\varLambda ,X,\varPi _{i},\varGamma _{i}),\delta _{i},p_{i})$. We prove that there exists a positive integer k such that $GL(S_{n},\delta _{n},p_{n})=L_{*}$ for any integer $n \ge k$. Let $(S_{*},\delta _{*}) = ((\varSigma ,\varLambda ,X,\varPi _{*},\varGamma _{*}),\delta _{*})$ be a regular FGS and $p_{*}$ a predicate symbol in $\varPi _{*}$ with $L_{*} = GL(S_{*},\delta _{*},p_{*})$. Let $G_{i}$ be a ground term graph given to Algorithm 1 at the i-th time, and $D_{i} = \{G_{1}, G_{2}, \ldots , G_{i}\}$. From the property of positive presentations, there exists a positive integer $n \ge 1$ such that $Con(D_{n})$ has a ground term graph fragment $[g,\sigma _{g}]$ with $C(S_{*},\delta _{*},p_{*},q,g) = GL(S_{*},\delta _{*},q)$ for any $q \in \varPi _{*}$. From the n-th input and after, for any predicate symbol $q \in \varPi _{*}$, Algorithm 1 has a ground term graph fragment corresponding to q. Thus, any graph rewriting rule in $\varGamma _{*}$ is included in $\varGamma _{m}$ for any $m \ge n$. It follows that $L_{*} \subseteq GL(S_{m},\delta _{m},p_{m})$ for any $m \ge n$.

We assume that for any $n \ge 1$, there exists a positive integer $m \ge n$ such that $GL(S_{m},\delta _{m},p_{m}) \not \subseteq L_{*}$. Then there exists a ground term graph $G' \in GL(S_{m},\delta _{m},p_{m}){\setminus } L_{*}$. Since $G' \in GL(S_{m},\delta _{m},p_{m})$, there exist a graph rewriting rule $p_{1}(f_{1}) \leftarrow p_{2}(f_{2}), p_{3}(f_{3})$ in $\varGamma _{m}$ and ground term graph fragments $[\rho _2,\sigma _{\rho _2}]$ and $[\rho _3,\sigma _{\rho _3}]$ such that $[g_2,\sigma _{g_2}] \odot [\rho _2,\sigma _{\rho _2}] \in L_{*}$, $[g_3,\sigma _{g_3}] \odot [\rho _3,\sigma _{\rho _3}] \in L_{*}$ and $G'$ is isomorphic to $[g_1,\sigma _{g_1}] \odot [[\rho _2,\chi _{\sigma _2}(\sigma _{\rho _2})] \odot [\rho _3,\chi _{\sigma _3}(\sigma _{\rho _3})],\xi ]$, where $p_{1}$, $p_{2}$ and $p_{3}$ correspond to $[g_1,\sigma _{g_1}]$, $[g_2,\sigma _{g_2}]$ and $[g_3,\sigma _{g_3}]$, respectively. There exist ground term graph fragments $[\tau _2,\sigma _{\tau _2}], [\tau _3,\sigma _{\tau _3}] \in K=Con(D_{\ell })$ with $[g_2,\sigma _{g_2}] \odot [\tau _2,\sigma _{\tau _2}] \in L_{*}$, $[g_3,\sigma _{g_3}] \odot [\tau _3,\sigma _{\tau _3}] \in L_{*}$ and $[g_1,\sigma _{g_1}] \odot [[\tau _2,\chi _{\sigma _2}(\sigma _{\tau _2})] \odot [\tau _3,\chi _{\sigma _3}(\sigma _{\tau _3})],\xi ] \not \in L_{*}$ for some positive integer $\ell $. Thus, $p_{1}(f_{1}) \leftarrow p_{2}(f_{2}), p_{3}(f_{3})$ is removed from $\varGamma _{\ell }$. This contradicts that $p_{1}(f_{1}) \leftarrow p_{2}(f_{2}), p_{3}(f_{3})$ in $\varGamma _{m}$. Therefore, we can show that there exists a positive integer k such that $GL(S_{n},\delta _{n},p_{n})=L_{*}$ for any integer $n \ge k$. From Proposition 1, $|Con(D_{n})|$ is of polynomial size w.r.t. $\sum _{i=1}^{n}|G_{i}|$ at the n-th step. Thus, from Lemma 1, the n-th hypothesis $(S_{n},\delta _{n},p_{n})$ is output by Algorithm 1 with polynomial update time w.r.t $\sum _{i=1}^{n}|G_{i}|$. $\Box $

4 Conclusions

We have considered the problem of learning FGS languages from the viewpoint of the computational learning theory. First, we introduced the class 1-FCP-$\mathcal{RFGSL}$ of regular FGS languages of bounded degree and treewidth with 1-finite context property (1-FCP). We also presented an algorithm for learning class 1-FCP-$\mathcal{RFGSL}$ by using current distributional learning techniques [11]. Finally, we showed that class 1-FCP-$\mathcal{RFGSL}$ can be identifiable in the limit with polynomial time update by using membership queries from positive data. This result will lead us to develop new techniques for learning other classes of FGS languages with distributional properties.

Clark et al. [2, 3] and Yoshinaka [11, 12] discussed the learnabilities of the class of languages of context-free grammars with the finite kernel property (FKP) and finite context property (FCP). As future work, we will consider the polynomial time learnabilities of the class of regular FGS languages with the FKP and FCP.

References

Chiang, D., Andreas, J., Bauer, D., Hermann, K.M., Jones, B., Knight, K.: Parsing graphs with hyperedge replacement grammars. In: Proceeding of ACL 2013, pp. 924–932. Association for Computational Linguistic (2013)
Google Scholar
Clark, A.: A learnable representation for syntax using residuated lattices. In: Groote, P., Egg, M., Kallmeyer, L. (eds.) FG 2009. LNCS, vol. 5591, pp. 183–198. Springer, Heidelberg (2011). doi:10.1007/978-3-642-20169-1_12
Chapter Google Scholar
Clark, A., Eyraud, R.: Polynomial identification in the limit of substitutable context-free languages. J. Mach. Learn. Res. 8, 1725–1745 (2007)
MathSciNet MATH Google Scholar
Drewes, F., Kreowski, H.J., Habel, A.: Hyperedge replacement graph grammars. In: Handbook of Graph Grammars and Computing by Graph Transformation, vol. 1, pp. 95–162. World Scientific (1997)
Google Scholar
Gildea, D.: Grammar factorization by tree decomposition. Comput. Linguist. 37(10), 231–248 (2011)
Article MathSciNet Google Scholar
Hara, S., Shoudai, T.: Polynomial time MAT learning of c-deterministic regular formal graph systems. In: Proceeding IIAI-AAI 2014, pp. 204–211. IEEE (2014)
Google Scholar
Kasprzik, A., Yoshinaka, R.: Distributional learning of simple context-free tree grammars. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds.) ALT 2011. LNCS (LNAI), vol. 6925, pp. 398–412. Springer, Heidelberg (2011). doi:10.1007/978-3-642-24412-4_31
Chapter Google Scholar
Lautemann, C.: The complexity of graph languages generated by hyperedge replacement. Acta Informatica 27(5), 399–421 (1990)
Article MathSciNet MATH Google Scholar
Okada, R., Matsumoto, S., Uchida, T., Suzuki, Y., Shoudai, T.: Exact learning of finite unions of graph patterns from queries. In: Hutter, M., Servedio, R.A., Takimoto, E. (eds.) ALT 2007. LNCS (LNAI), vol. 4754, pp. 298–312. Springer, Heidelberg (2007). doi:10.1007/978-3-540-75225-7_25
Chapter Google Scholar
Uchida, T., Shoudai, T., Miyano, S.: Parallel algorithms for refutation tree problem on formal graph systems. IEICE Trans. Inf. Syst. E78–D(2), 99–112 (1995)
Google Scholar
Yoshinaka, R.: Integration of the dual approaches in the distributional learning of context-free grammars. In: Dediu, A.-H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 538–550. Springer, Heidelberg (2012). doi:10.1007/978-3-642-28332-1_46
Chapter Google Scholar
Yoshinaka, R.: General perspective on distributionally learnable classes. In: Proceeding of MoL 2015, pp. 87–98. Association for Computational Linguistic (2015)
Google Scholar

Download references

Author information

Authors and Affiliations

Faculty of International Studies, Kyushu International University, Kitakyushu, Japan
Takayoshi Shoudai
Faculty of Science, Tokai University, Hiratsuka, Japan
Satoshi Matsumoto
Graduate School of Information Sciences, Hiroshima City University, Hiroshima, Japan
Yusuke Suzuki

Authors

Takayoshi Shoudai
View author publications
You can also search for this author in PubMed Google Scholar
Satoshi Matsumoto
View author publications
You can also search for this author in PubMed Google Scholar
Yusuke Suzuki
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Takayoshi Shoudai .

Editor information

Editors and Affiliations

Department of Computer Science, University of York, York, United Kingdom
James Cussens
Imperial College London, London, United Kingdom
Alessandra Russo

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Shoudai, T., Matsumoto, S., Suzuki, Y. (2017). Distributional Learning of Regular Formal Graph System of Bounded Degree. In: Cussens, J., Russo, A. (eds) Inductive Logic Programming. ILP 2016. Lecture Notes in Computer Science(), vol 10326. Springer, Cham. https://doi.org/10.1007/978-3-319-63342-8_6

Download citation

DOI: https://doi.org/10.1007/978-3-319-63342-8_6
Published: 16 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63341-1
Online ISBN: 978-3-319-63342-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us