Keywords

1 Introduction

The migration of information systems from the classic desktop software to the web application can be observed as a permanent trend. This trend also applies to the knowledge based systems. The “webalisation” of information systems causes many practical and implementation problems and challenges, but we can also identify in this field a number of interesting research problems. In this paper we present a modified goal-driven inference algorithm for web knowledge based systems.

A goal-driven algorithm is a one of the two popular strategies of inference in the knowledge based systems it started from a goal and ended with a fact that leads to the goal. Since it is easy to implement, a goal-driven inference is a key to building many practically used domain expert systems. In the modern web applications goal-driven inference could be divided between the client and server part of the web application. In contrast to that, the data-driven inference can be implemented entirely on the server side, without any conversation with user, only obtaining starting facts is required.

In the presented work we introduce the modified goal-driven inference algorithm, which allow us to divide some parts of them into the two layers of the web application. Proposed approach assumes that the rule knowledge base is decomposed into the decision oriented group of rules. We argue that the knowledge base in the form of such rules group contains enough information which allows to divide inference into the client and server side, ensuring the convenience and the effectiveness.

The first part of the work briefly presents a problem description and related work. The following part of the work describes the rules partitioning approach, then the utilization of this approach in optimization of inference algorithm is described and the modified version of algorithm is presented. Next, a simple case study is presented and the preliminary evaluation of modified algorithm concludes the presented work.

2 Problem Description and Related Works

A goal-driven inference always has a single goal or goals list to confirmation, this approach starts with the desired rule’s conclusion matching to the current goal and works backward to find supporting facts [1]. If this rule requires additional information before it can succeed, the inference can execute additional rules, recursively if necessary. An inference engine will search the rules until it finds one which has a conclusion that matches a desired goal. If all conditions in the rule’s premise are facts, the current goal is confirmed. If some of the rule’s premise conditions are not known to be a fact, this conditions are added to the list of goals as new goals, pushing the other goals down in the list. At any time the algorithm only works on the one top goal [2]. If no rule is available to confirm whether the condition is a fact, the algorithm asks the environment about the truth of the considered condition. The environment may vary depending on the system application. Typically the user is the source of fact, but in the context of embedded systems, facts can be provided by the technical equipment [3].

The disadvantages of goal-driven inference follow from the inefficiency of searching in the large knowledge bases with rules that are not organized in any kind of structure and from the fact that recursive algorithms are difficult to follow [1, 2]. In large rule bases recursive calls are very often misguided, but they take time and consume memory resources. When we consider inference process distributed over the multilayer web application, missed recursive calls and large search space become a significant problem. When we consider classical inference algorithm, all above described operations are realized within the single function/class/module, implemented in the particular programming language. In the context of web-based implementation, specified operations have to be implemented in different way.

Traditional rule based systems were developed as the desktop applications and a number of development tools are available for developing traditional systems. Meanwhile, web applications have grown rapidly and have had significant impact on the application of traditional expert system. Several tools and languages are available for developing web-based expert systems—these tools use traditional expert system techniques and offer in addition the capacity for Web-based development [4, 5].

System Acquire [6], which allows the development of web-based user interfaces, is supported through a client–server development kit that supports Java and ActiveX controls, unfortunately, detailed information is enigmatic. System ExSys [7] provides the Corvid Servlet Runtime implements the Exsys Corvid Inference Engine as a Java Servlet. In this mode, the user interface is defined by HTML templates. Corvid systems can be also integrated with Adobe Flash. The Exsys Corvid Servlet Runtime uses Java Servlet technology, allowing the proven Corvid Inference Engine to be run on a server with only HTML pages sent to the client machine running the system.

The JESS is a rule engine and scripting language [8], which provides console for programming and enables basic input and output, it cannot be used directly in the web-based application but it is possible to use JESS within the JSP platform [9, 10]. The XpertRule KBS interfaces over the Web with a thin client using Microsoft’s Active Server Page technology. Web Deployment Engine is a JavaScript rules runtime engine which runs within a browser [11]. Applications developed using the Knowledge Builder Rules Authoring Studio can be generated as Java Script/HTML files for deployment as Web applications. The JavaScript engine runs the rules, calculations and the JS/HTML user interface. The eXpertise2Go’s Rule-Based Expert System provides free expert system building and delivering tools that implement expert systems as Java applets, Java applications and Android apps [12].

The goal-driven inference is available in Prolog, some of implementations allow to run interpreter within the web application. The SWI-Prolog [13] interpreter can be run in script mode, the script runs the SWI-Prolog interpreter with suppressed interactive output and the script file will produce a result of the query only. It is possible to run inference by CGI program on the server which builds the appropriate Prolog query and execute interpreter. This approach works on the server side and is not tailored to the specific of web application. It requires the usage of relevant program which builds a Prolog query, and executes the Prolog script and generates HTML [5].

Existing tools described above allow us to develop web-based expert systems, but these tools use traditional expert system techniques and offer in addition the capacity for Web-based development. Inference techniques are usually server oriented, front-end layer are used typically for visualization and simple interaction with user. It is hard to find new, modern approaches to web-based expert systems and new implementation of inference algorithms. We argue that in the modern web applications goal-driven inference must be divided between the client and server part of the web application, the data-driven inference can be implemented entirely on the server side. In the literature we can find some attempts to build web-based domain expert systems, e.g. [14, 15]. In general, our proposal is similar, but we are focused on the implementation of a domain independent system. We propose the following architecture of the knowledge based system:

  • Server side—management of rule base, which is decomposed and stored in the relational data base. All selections of applicable rule or rules are performed by the server-side services available via specialized API. The resulting information is transferred into the client side in the form of JSON objects. Server side is passive and is focused on rule-oriented services for inference.

  • Client side—initialize inference when the goal is known, realizes the main event loop, including:

    • confirmation of the rule’s condition against a dynamically created fact set,

    • acquisition of the fact from application environment (usually from the user),

    • initialization of recursive inference calls for sub-goals.

Client side utilizes JavaScript functions embedded in the HTML document, generated by the proper server side scripts from the application layer. The JavaScript functions use rules obtained from the server services via asynchronous AJAX requests. This distributed environment was used in the system described in the [16, 17].

3 Methods

Modified goal-driven inference algorithm is based on the proposed method of rule knowledge base partitioning, which allow us to obtain modular knowledge base. A significant part of this work contains a detailed description of proposed approach. introduced approach differs from other methods of the modularization. The presented solution is an extension of the research presented in [18, 19].

3.1 Knowledge Base and Rules Partition

The knowledge base is a pair KB = (RS, FS) where RS is a non-empty finite set of rules and FS is a finite set of facts. RS = {r 1, …, r n}, each rule r ∈ RS will have a form of Horn’s clause: \(r{:}\; p_{1} \wedge p_{2} \wedge \cdots \wedge p_{m} \to c,\) where m—the number of literals in the conditional part of rule r, and m ≥ 0, p ii-th literal in the conditional part of rule r, i = 1…m, c—literal of the decisional part of rule r. For each rule r ∈ RS we define following functions: concl(r)—the value of this function is the conclusive literal of rule r; cond(r)—the value of this function is a set of conditional literals of rule r. We will also consider the facts as clauses without any conditional literals. The set of all such clauses f will be called set of facts and will be denoted by FS: FS = {f: ∀ fFS cond(f) = {} ∧ f = concl(f)}.

For each rule set RS with n rules, there is a finite power set 2RS with cardinality 2n. Any arbitrarily created subset of rules R ∈ 2RS will be called a group of rules. In this work we will discuss specific subset PR ⊆ 2RS called partition of rules. Any partition PR is created by partitioning strategy denoted by PS, which defines specific content of groups of rules R ∈ 2RS, creating a specific partition of rules PR. We may consider many partitioning strategies for a single rule base, but in this work we will only present a few selected strategies. Each partitioning strategy PS for rules set RS generates the partition of rules PR ⊆ 2RS: PR = {R 1, R 2, …, R k }, where: k—the number of groups of rules creating the partition PR, R i i-th group of rules, R ∈ 2RS and i = 1,…, k.

Rules partitions terminologically correspond to the mathematical definition of the partition as a division of a given set into the non-overlapping and non-empty subset. The groups of rules which create partition are pairwise disjoint and utilize all rules from RS. The partition strategies for rule based knowledge bases are divided into two categories: simple and complex strategies. For simple strategies, the membership criterion decides about the membership of rule r in a particular group R ⊆ PR according to the membership function mc, time complexity not higher than O(n · k), where n = |RS| and k = |PR|. For complex strategies, the particular algorithm decides about the membership of the rule r in some group R ⊆ PR, with time complexity typically higher than any simple partition strategy. An example of a complex strategy is described in the [20].

3.2 Simple Partitioning Strategy

Creation of simple partition for rules set requires the definition of the membership criteria which assigns particular rule r ∈ R to the given group of rules R ⊆ PR. Proposed approach assumes that the membership criteria will be defined by the mc function, which is defined individually for every simple partition strategy. The function: RS × PR → [0…1] has the value 1 if the rule r ∈ RS with no doubt belongs to the group R ⊆ PR, 0 in the opposite case. The value of the function from the range 0 < mc < 1 means the partial membership of the rule r to the group R. Let us assume that threshold value 0 ≤ T ≤ 1 exists. The value of the mc (rR) function can be higher, higher or equal, equal, less, less or equal to the T value. Generally we can define simple partition of rule based knowledge base PR as follows: \(PR = \left\{ {R{:}\; R \in 2^{RS} \wedge \forall {\text{r}}\epsilon R \;mc\left({r, R} \right) \ge T} \right\}.\) Special case of the simple strategies is the strategy called selection. The selection divides the set of rules RS into the two subsets R and RSR. Thus we achieve the partition PR = {R, RS−R}. In practical sense, selection is the operation with linear time complexity O(n) where n denotes the number of all rules in the knowledge base.

The algorithm of creating the partition which bases on simple strategy is presented in the pseudo-code below. The input parameters are: knowledge base RS, the function mc that defines the membership criteria and the value of the threshold T. Output data is the partition PR. Time complexity of such algorithm is O(n · k), where n = |R|, k = |PR|. For each rule r ∈ RS we have to check whether the goal partition PR contains the group R with rule r (the value of the mc function has to be at least T: mc(r, R) ≥ T). If such a rule doesn’t exist the given rule r becomes the seed of a new group which is added to the created partition PR. The simple partitioning and selection algorithm are simple, were described in [20], and for this reason will be omitted.

3.3 Decision Oriented Partitioning Strategies

Let us consider the following partitioning strategy PS 1, which creates groups of the rules from R by grouping rules with the same attribute in conclusions. The membership criteria for rule r and group R is given by the function mc defined as follows: mc(r, R) = 1 if ∀r i  ∈ R concl(r i ) = concl(r), 0 otherwise. When we use the simple partition algorithm (Alg01:createPartitions) with the mc function defined in this way, we obtain decision oriented partitions. Each group of the rules generated by this algorithm will have the following form: R = {r ∈ R: ∀r i  ∈ R concl(r i ) = concl(r)}. The number of groups in the partition k: 1 ≤ k ≤ n depends on the number of different decisions included in conclusions of such rules. When we distinguish different decision, by the different conclusions appearing in the rules—we get one group for each conclusion. In every group we have rules with the same conclusion: a fixed (a, v) pair. Such partition PR 1 will be called basic decision based partition. Basic decision partition contains the set of rules, each rule in the every set contains the same attribute-value pair in the conclusion. All rules grouped within a rule set take part in an inference process confirming the goal described by the particular attribute-value—for each R ∈ PR 1 the conclusion set |Concl(R)| = 1. If we consider the specified (a, v) pair, we think about particular kind of concept or about particular property state. From pragmatic point of view we can say that for each group R of basic decision based partition, single pair (a, v) ∈ Concl(R) represent concrete. Basic decision partition represent the basic relations occurring in rule base—we can say that this partition defines the scope of the knowledge about concrete from real-word concepts within particular rule base.

Let us consider the second partitioning strategy PS 2, the membership criterion for rule r and group R is given by the function mc defined as follows: mc(r, R) = 1 if ∀r i  ∈ R attrib(concl(r i )) = attrib(concl(r)), 0 otherwise. When we utilize the simple partition algorithm with the mc function defined in such way, we obtain different ordinal decision oriented partitions. Each group of the rules generated by this algorithm may have the following form: R = {r ∈ R: ∀r i  ∈ R attrib(concl(r i )) = attrib (concl(r))}. The number of groups in the partition k: 1 ≤ k ≤ n depends again on the number of different decisions included in conclusions of these rules. Currently we distinguish decisions by the different attribute appearing in the conclusion part of the rules—we obtain one group for each decision attribute. This kind of partitioning strategy is called ordinal decision partitioning strategy. Partitions produced by the ordinal decision partition can be constructed as the composition of the basic decision partitions. Ordinal decision partition represents the relations occurring in a rule base—we can say that this partitioning strategy can be considered as a model of decision about concepts from real-word.

3.4 Modified Goal-Driven Inference

Modification of the classical goal-driven inference algorithm is based on extracting information of internal rules dependencies. This information allows to perform only promising recursive calls of backward inference algorithm, optimization relies on reducing the number of rules searched for each run of inference and reducing the number of unnecessary recursive calls.

Modified algorithm as input data takes PR—the decision partition, FS—the set of facts and g—the goal of the inference. As the output data it takes FS—the set of facts, including possible new facts obtained through inference, the function’s result as boolean value, true if the goal g is in the set of facts: g ∈ FS, false otherwise.

Only promising groups of rules are selected for further processing (select R ∈ PR where g ∈ Concl(R)), where Concl(R) is the set of conclusions for the group of rule R, containing literals appearing in the conclusion parts of the rules r from R. Only the selected subset of the not activated rules of R is processed in each iteration. Finally, only the promising recursive calls are made (w ∈ In C(R)). In C(R) denotes connected inputs of the rules group, defined in the following way: In C(R) = {(a, v) ∈ Cond(R): ∃ r∈R (a, v) = concl(r)}, where Cond(R) is the set of conditions for the group of rule R, containing literals appearing in the conditional parts of the rules r from R. In each iteration the set R contains only proper rules matching to the currently considered goal. It completely eliminates the necessity of searching for rules with conclusion matching to the inference goal, it is not necessary to search within the whole set of rules R—this information is simply stored in the decision-partitions and does not have to be generated.

4 A Simple Case Study and Discussion

To illustrate the conception of inference modification, we consider an example rule base:

r 1: (a, 1) ∧ (b, 1) → (c, 1)

r 2: (a, 1) ∧ (b, 2) → (c, 2)

r 3: (a, 1) ∧ (b, 3) → (c, 1)

r 4: (b, 3) ∧ (d, 3) → (e, 1)

r 5: (b, 3) ∧ (d, 2) → (e, 1)

r 6: (b, 3) → (e, 2)

r 7: (d, 4) → (f, 1)

r 8: (d, 4) ∧ (g, 1) → (f, 1)

r 9: (c, 1) → (d, 4)

The different variants of the partitions could be build and stored by the server side services after any knowledge base modification or could be created in the inference initialization phase. In the Table 1 we present two decision oriented partitions. In Case II the asterisk ‘*’ means any attribute value, only attributes are considered. We use basic decision partition for small or “flat” rules bases (without sub-goals). The ordinal decision partitions are useful when we consider large set with a high probability of occurrence of the sub-goals confirmation.

Table 1 Decision oriented partitions

The modified algorithm proposed in this work extracts information of internal rules dependencies from partitioned knowledge base. Important role in the modification plays the information obtained from the set of conclusions for the group of rule: Concl(R). It is possible to determine the rules set matching to the given fact only by searching within conclusions sets. When we consider the goal (f, 1), we can determine matching rules set R5 (Case I) or R3 (Case II) through the single search within the conclusions sets. This searching operation can be done by the server service—resulting rules set can be transferred into the client side as the XML or JSON data for further processing. Typically the matching group of rules has a significantly lower cardinality than the entire set of rules.

The main inference loop could be done be the JavaScript code in the client-side. Client-side code selects the rule from the rules set transferred from the server service, confirms condition from selected rule’s premise, manages the dynamically gathered facts. When the algorithm have to confirm sub-goal, it can determine the usefulness of each recursive call by examining whether the sub-goal is in the set In C (R). When we again consider the goal (f, 1), we have to analyze rules r 7 and r 8. The literal (d, 4) becomes a new sub-goal and recursive call is necessary. This call is promising, because there is another decision partition R 6 (Case I) and R 4 (Case II) supporting sub-goal (d, 4). When we consider sub-goal (g, 1), it is possible to immediately reject potential recursive call—there are no connected rules subset supporting sub-goal (g, 1). This can be done through a single asynchronous call of proper server service via AJAX. The classic version of the algorithm does not known whether the call is promising.

5 The Preliminary Experimental Results

The complexity of decision partition is O(n · k), where n = |RS|, k = |PR|, where the number of groups in the partition k: 1 ≤ k ≤ n typically is significantly smaller than the number of rules n. We have to store additional information for created rules partitions, however additional memory or disk space occupation for data structures seems acceptable. For n rules and k rules group we need approximately is · n + ps · m bytes of additional memory for data structures (is—size of integer, ps—size of a pointer or reference).

Therefore, for n = 1000 rules, k = 100 rules group we need approximately 2.5 KB (precise amount of memory or disk space depends on used programming language, the conception of organization the data structures and designated system platform).

The proposed algorithm has been tested on artificial knowledge bases prepared for tests and on all real-word knowledge bases available for authors. We present only summary of the results for real-word bases—optimistic and pessimistic results with relation to the results of classic goal-driven algorithm (100 %). A limited scope of this work does not allow us to present a detailed information about test methodology and results of tests for recursive calls (Table 2).

Table 2 The results of the experiments

The specific internal structure of knowledge bases, goals specification, and facts set configurations causes the significant differences between the optimistic and pessimistic case. Experiments for the artificial rules bases randomly generated also confirm that when the number of groups increase, the number of searched rules decreases. The optimistic case let to reduce the number of searched rules to the 2 % (in pessimistic case we still reduce the number of rules with relation to the classical algorithm). We understand that we need more experiments on the real knowledge bases, presented results are preliminary. The implementation works are still in progress [16, 17], but the next stage of research will focus on the experiments on two real bases counting over 1200 and 4000 rules.

6 Summary

We introduced a modified goal-driven algorithm and the conception of distribution of such algorithm over the web-based software architecture. Modification of the classical inference algorithm is based on information extracted from the rules of groups generated by the decision partition. The proposed modification consists of the reduction of the search space by choosing only the rules from particular rule group, according to a current structure of decision oriented rules partition and the estimation of the usefulness for each recursive call for sub-goals. Therefore, only promising recursive call of the classical backward algorithm will be made. Every rule base already contains the information necessary to achieve modification steeps mentioned above. We only have to discover and utilize these information. The goal-driven inference proposed in this work is currently used in the two experimental versions of web-based expert system described in [16, 17].