1 Introduction

Current research on knowledge graph completion is often concerned with latent approaches based on the idea of embedding a knowledge graph into a low-dimensional vector space. At the same time, symbolic approaches have attracted less attention [13]. However, such approaches have a significant advantage: they yield an explanation in terms of the rules that trigger a prediction. In this paper we propose a bottom-up technique for efficiently learning logical rules from large knowledge graphs, inspired by classic bottom-up rule learning approaches such as Golem [8] and Aleph [10]. Our approach is called AnyBURL (Anytime Bottom-Up Rule Learning). We report on experiments in which we evaluated AnyBURL on datasets that have been labelled as hard cases for simple (rule-based) approaches. Our approach performs as well as, and sometimes better than, most models that have been proposed recently. Moreover, the required resources in terms of memory and runtime are significantly smaller than those of latent approaches. This paper is an extended abstract of an IJCAI 2019 paper [6].

2 Language Bias and Algorithm

A small subset of a knowledge graph is shown in Fig. 1. Suppose that we are interested in finding rules that explain why Ed speaks Dutch, which corresponds to the fact speaks(ed, d). To construct useful rules, we look at all paths of length n that start at ed or d. Note that we allow a path to be constructed by following in- and outgoing edges. We have marked three paths starting at ed in Fig. 1. Two of these paths are acyclic paths ending somewhere in the knowledge graph, while the third path is, together with speaks(ed, d), cyclic. On the left side of the figure these paths are shown in terms of their corresponding bottom rules.

Fig. 1. Three paths sampled from a knowledge graph

In [6] we argued that any useful rule that can be generalized from such a bottom rule must belong to one of a small number of rule types. Thus, we can instantiate these types directly instead of building up a complete generalization lattice. In the following we list all rules that result from generalizing the second and the third bottom rule. We use upper-case letters to refer to variables.

$$\begin{aligned} speaks(X,d) &\leftarrow married(X, A_2),\ born(A_2, a) &(1)\\ speaks(X,d) &\leftarrow married(X, A_2),\ born(A_2, A_3) &(2)\\ speaks(X,Y) &\leftarrow lives(X, A_2),\ lang(A_2, Y) &(3)\\ speaks(X,d) &\leftarrow lives(X, A_2),\ lang(A_2, d) &(4)\\ speaks(ed,Y) &\leftarrow lives(ed, A_2),\ lang(A_2, Y) &(5) \end{aligned}$$

The main algorithm of AnyBURL samples random paths of length n and generalizes them to rules such as the ones shown above. AnyBURL then computes the confidence and support of each rule. If the rule fulfils a quality criterion defined by the user, e.g., exceeding a minimal support or confidence threshold, it is stored. AnyBURL checks how many new rules are learned within a certain time interval. If the fraction of new rules learned within this interval falls below a threshold, it increases n and continues to search for longer rules.
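The following is a minimal sketch of this anytime loop. It is not AnyBURL's actual code: samplePath, generalize, and score are hypothetical placeholders, and the interval length, confidence threshold, and saturation criterion are assumed values.

```java
import java.util.HashSet;
import java.util.Set;

public class AnytimeLoopSketch {

    static final double MIN_CONFIDENCE = 0.05; // user-defined quality criterion (assumed value)
    static final double SATURATION = 0.9;      // duplicate fraction that triggers longer paths (assumed)

    public static void main(String[] args) {
        Set<String> storedRules = new HashSet<>();
        int n = 2;                                          // current path length
        long deadline = System.currentTimeMillis() + 1000;  // overall time budget (assumed)

        while (System.currentTimeMillis() < deadline) {
            int sampled = 0, alreadyKnown = 0;
            long intervalEnd = System.currentTimeMillis() + 100;  // one learning interval

            while (System.currentTimeMillis() < intervalEnd) {
                String rule = generalize(samplePath(n));     // sample a path, generalize to a rule
                sampled++;
                if (score(rule) < MIN_CONFIDENCE) continue;  // discard low-quality rules
                if (!storedRules.add(rule)) alreadyKnown++;  // store; count duplicates
            }
            // if too few new rules were found in this interval, search for longer rules
            if (sampled > 0 && (double) alreadyKnown / sampled > SATURATION) n++;
        }
    }

    // --- hypothetical placeholders, not part of AnyBURL ---
    static String samplePath(int n) { return "path-of-length-" + n; }
    static String generalize(String path) { return "rule-from-" + path; }
    static double score(String rule) { return Math.random(); }
}
```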

Given a completion task such as r(a, ?), we have to compute a ranking of the top-k candidates that can substitute the question mark. It would be straightforward to create such a ranking based on the learned rules if each entity were generated by at most one rule: we could simply order the proposed entities by the confidence values of the rules that suggested them. However, an entity is usually suggested by several rules. If we assumed that these rules were independent, we could use a Noisy-Or aggregation. However, the assumption that the rules are independent is often not valid; in [6] we also report on experiments that support this claim. Instead, we apply a rather simple but efficient approach: we order the candidates by the maximum of the confidences of all rules that generated them. If the maximum score of several candidates is the same, we order these candidates by the second-best rule that generates them, and so on, until we find a rule that makes a difference. The results of this approach can be computed without grounding all relevant rules.
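This aggregation amounts to comparing candidates lexicographically on their descending-sorted confidence lists. The sketch below illustrates the idea; the candidate names and confidence values are invented, and the tie-break when one list is a prefix of the other (more supporting rules wins) is an assumption not specified in the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class MaxAggregationSketch {

    // Compare two descending confidence lists: the higher confidence wins at
    // the first position where they differ.
    static int compare(List<Double> a, List<Double> b) {
        int len = Math.min(a.size(), b.size());
        for (int i = 0; i < len; i++) {
            int c = Double.compare(b.get(i), a.get(i));  // descending order
            if (c != 0) return c;
        }
        return Integer.compare(b.size(), a.size());      // assumed tie-break
    }

    static List<Double> sortedDesc(List<Double> xs) {
        List<Double> copy = new ArrayList<>(xs);
        copy.sort(Collections.reverseOrder());
        return copy;
    }

    public static void main(String[] args) {
        // candidate -> confidences of the rules that suggested it (invented values)
        Map<String, List<Double>> candidates = Map.of(
            "dutch",  List.of(0.9, 0.4),
            "german", List.of(0.9, 0.3),   // tied on the best rule, loses on the second
            "french", List.of(0.7));

        List<String> ranking = new ArrayList<>(candidates.keySet());
        ranking.sort((x, y) -> compare(
            sortedDesc(candidates.get(x)), sortedDesc(candidates.get(y))));
        System.out.println(ranking);  // [dutch, german, french]
    }
}
```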

3 Results

In our experiments we have used the standard knowledge base completion evaluation datasets FB15k and WN18 [1] plus their harder variants FB15k-237 [11] and WN18RR [2]. The first block in Table 1 lists the results of current state-of-the-art completion techniques. We selected five models that achieved very good results and were published within the last year at top conferences. After a learning phase of 1000 s, AnyBURL achieves results that are on the same level as, and sometimes slightly better than, those of most of these models. For example, AnyBURL achieves better hits@1, hits@10, and MRR results than ConvE [2] on all datasets except FB15k-237, where the results of AnyBURL are only slightly worse. AnyBURL outperforms SimplE [4] already when using the rules that have been learned after only 10 s.

Table 1. Comparing our approach against current state of the art

An exception is the ComplEx-N3 model proposed in [5], a variant of the model originally introduced in [12], which achieves better results than any other model, including AnyBURL. However, while we ran AnyBURL on a standard laptop, the reported ComplEx-N3 runtimes are based on the use of a Quadro GP100 GPU. Moreover, the AnyBURL runtimes are significantly lower.

In line six of Table 1 we present results for the rule learner AMIE+, as reported in [7] in an environment similar to ours. AMIE+ does not have an anytime behaviour; however, its parameters have been chosen per dataset to exploit a time frame of 10 h as well as possible. Thus, the results are roughly comparable to the 10000 s results of AnyBURL. AnyBURL generates significantly better results than AMIE+ on all datasets.

4 Conclusion

In a short time, AnyBURL generates results that are as good as, and sometimes better than, those of many recent state-of-the-art techniques, while using only limited resources (both in terms of memory and computational power). Compared to latent representation models, AnyBURL shares several advantages with other rule-based approaches: (1) the candidate ranking can be explained in terms of the rules that generated it; (2) the generated model (= rule set) can be reused for a dataset that uses the same predicates and an overlapping set of (important) constants; (3) a rule-based approach does not require learning dataset-specific hyperparameters. In the future we plan to extend the language bias of AnyBURL. AnyBURL is implemented as a Java program without any dependencies. It is available at http://web.informatik.uni-mannheim.de/AnyBURL/.