1 Introduction

Action programming languages like Golog [14] allow to define complex behaviors for agents on top of their high-level actions, which are expressed in some form of expressive (first-order) logical formalism such as the Situation Calculus [17, 24]. Among other things, the logical representation allows to model realistic scenarios of agents with only incomplete information about the world state, where further information has then to be gathered at runtime by sensing.

For defining an agent’s behavior, one can either use programming, meaning the programmer completely predetermines the agent’s actions, or one can resort to planning, i.e. provide a description of a desired goal and leave it to the system to find a suitable course of action. An advantage of Golog is that it gives the programmer the freedom to choose anywhere between these two extremes by freely combining deterministic and nondeterministic parts.

As an example, consider a mobile robot in an office environment. One of its tasks is to deliver letters between mailboxes in the different offices, and the procedure to take all letters out of a mailbox m may look like this:

It reads as follows: As long as there are objects of type letter in m, select (π) some letter l such that In(l,m) holds and take it out. Note that in addition to constructs known from imperative programming languages like sequential composition (“;”) and while loops, Golog programs may contain nondeterministic constructs such as the pick operator π. The program is viewed as a plan sketch where certain parts have been left open, and it is left to the system to fill these gaps.

After taking the letters, the robot should deliver them to their recipients. This is a typical case where planning is needed: We know the task requires a number of move actions (to travel between locations), interspersed with put_in actions (to put letters in mailboxes), but narrowing it further down in terms of a Golog program is difficult. One option is the following, highly nondeterministic subprogram:

$$\mathbf{while} ~(\lnot \mathit{Goal})\ \mathbf{do}\ (\pi a)\ [\phi(a)?; a].$$

The program says that as long as the Goal formula is currently not true, the agent has to nondeterministically choose some action a for which ϕ applies and execute it. In our example, Goal would express that letters for which the recipient is known are delivered to their destination mailboxes:

$$\forall l: \mathit{letter}.~ (\exists m:\mathit{mailbox}~\mathit{Addressee}(l,m)) \supset \mathit{Delivered}(l)$$

ϕ is some criterion to filter suitable actions; in the example it may express that a is among the actions put_in and move:

$$\begin{array}{l}\exists l_1,l_2:\mathit{loc}~(a = \mbox{\texttt{move}} (l_1,l_2))~\lor\\\quad \exists l:\mathit{letter}, m:\mathit{mailbox}~(a = \mbox{\texttt{put\_in}} (l,m)).\end{array}$$

Any successful execution of the above program corresponds to an action sequence that reaches a situation where Goal holds, i.e. a plan. When the Golog interpreter does a lookahead to search for such an action sequence, it thus performs nothing else than classical, sequential planning.

Planning is hence both necessary and possible in principle, but the performance of Golog in these cases is generally poor compared to state-of-the-art planners [2, 4]. The reason is that current Golog implementations typically resort to blind search for lookahead, while the focus of planning research ever since the introduction of Strips [8] has been on designing systems of high efficiency, so that a wide range of techniques and heuristics is available today. It seems natural to ask how this progress can be exploited to improve the performance of Golog. It is conceivable to do this by simply copying techniques from successful planners into Golog. However, this endeavor is undesirable not only as it would mean a pure re-implementation of existing results, but also because it will have to be reiterated for future developments in the area. Instead, it would be much more desirable to plug in any existing planner by means of a simple interface.

Since there is already a common interface to state-of-the-art planning systems, namely the Planning Domain Description Language (Pddl) [19], we use the following approach: Whenever the Golog interpreter encounters a planning sub-problem, we translate it into Pddl, call a Pddl planner on it, and continue the execution with the returned plan. Note that classical planning alone is not sufficient for the runtime control of the agent, as there are additional important tasks. Most notably, the agent typically has only partial knowledge about its environment and has to gather necessary information at runtime using its sensors. For example, before executing takeAllLetters(m), the robot has to use the sensing action look_into(m) to learn which letters there are in m. Golog then takes care of updating the knowledge base of the agent with the results of the sensing actions.

For the embedding, two issues arise. First, we have to ensure that the translation between Golog and Pddl is correct in the sense that the same plans are admitted. Moreover, there is the question how the two formalisms relate in terms of expressivity, i.e. which subset of the source formalism is representable by the target formalism. In this paper, we report on how we addressed these issues within the Platas project (Planning Techniques and Action Languages), which is part of the DFG-funded project cluster LogWiss.Footnote 1

2 The Situation Calculus and Golog

The Situation Calculus [17, 24] is a dialect of first-order logic for reasoning about dynamic domains. Changes in the world are assumed to be the result of primitive actions, which are performed by some implicit agent and modeled by terms like move(l 1,l 2). Properties which are affected by performing such actions are called fluents and can be predicates like Holding(l,s) or functions like position(robot,s). The last argument of a fluent is a situation, which should be understood as the current history of actions that have been executed. The constant S 0 is used to denote the initial situation, and when a is an action and s a situation, then do(a,s) denotes the situation that results from performing a in s. For example Holding(l,do(take_out(l,m),S 0)) means that the agent is holding letter l after taking it out of m. A particular domain is described by a basic action theory (BAT), which is a set of Situation Calculus formulas consisting of:

  1. 1.

    Sentences that describes the initial situation, e.g.

    $$\lnot\exists x ~\mathit{Holding} (x,S_0) \land\forall l:\mathit{letter}\,\lnot\mathit{Delivered}(l,S_0),$$
  2. 2.

    a precondition formula for each action, e.g.Footnote 2

    $$\begin{array}{l}\mathit{Poss}(\mbox{\texttt{take\_out}}(x,y),s) \equiv\\\quad \exists l:\mathit{loc}.~\mathit{At}(\mathit{robot},l,s) \land \mathit{At} (y,l,s)\land \mathit{In} (x,y,s);\end{array}$$
  3. 3.

    for each fluent, a successor state axiom (SSA) encoding how the predicate is affected by actions, e.g.

    $$\begin{array}{l}\mathit{Holding}(x,\mathit{do}(a,s)) \equiv\exists y (a = \mbox{\texttt{take\_out}}(x,y)) \lor \\\quad \mathit{Holding}(x,s) \land\lnot\exists y (a = \mbox{\texttt{put\_in}}(x,y)).\end{array}$$

Based on the Situation Calculus, Golog allows the definition of complex actions called programs. The standard variant supports the constructs shown in Fig. 1. An important aspect is that apart from imperative constructs such as conditionals, loops and procedures, a program can also contain nondeterministic parts, e.g. δ 1 | δ 2 (do either δ 1 or δ 2) and δ (perform δ zero or more times). A program thus represents an incomplete sketch of a problem solution, where nondeterministic parts constitute gaps to be filled by the system.

Fig. 1
figure 1

Basic set of Golog constructs

Variants of Golog extend this basic set of constructs by additional functionality: ConGolog [5] adds concurrency in form of interleaved processes, exogenous events and interrupts. A problem for real-world applications though is that programs are executed off-line in ConGolog, meaning that the interpreter first analyzes the entire program to find a full conforming action sequence before the first action is actually executed. Moreover, especially scenarios where the agent has incomplete knowledge about its environment make it necessary to gather information at runtime through sensing. IndiGolog [6] is an extension that tackles both issues and is hence particularly suited for our purposes. Programs are generally executed on-line without lookahead, which is only applied to subprograms explicitly marked via the search operator Σ(δ). Furthermore it supports sensing actions.

3 Pddl

Pddl [18, 19] was introduced in 1998 as the input language for the First International Planning Competition (IPC) [18] and has, after several revisions, by now become a de-facto standard for the formulation of planning domains and tasks.

In contrast to the Situation Calculus where changes to the world are described fluent-centric by SSAs, Pddl is an action-centric language where actions are specified by their applicability conditions and their effect on the world. For example, the move action from our running example could be formulated in Pddl as shown in Fig. 2. A full Pddl task is specified by the domain description that describes the dynamics of the world (mostly by such actions) and a task description that specifies task-specific aspects, e.g. additional constants, the initial state and the goal. Note that in contrast to BATs there is always a single, fully specified initial state. Since actions are deterministic, Pddl planning systems have always full information about the world state.

Fig. 2
figure 2

Pddl definition of action move

The development of Pddl has been strongly connected to the IPC and, therefore, has been oriented on the capabilities of state-of-the-art planning systems. At the same time, the introduction of new language features gave impulses to new developments. To name a few, Pddl now also covers numeric and temporal aspects (like simple and continuous durative actions that can be executed concurrently [9]), and preferences [10]. Since most of these features are tackled in special competition tracks, planning systems do not have to support all features but can specialize on certain fragments.

4 Mapping Pddl to the Situation Calculus

In order to establish a common semantical basis for Pddl and Golog, we show how a Pddl problem description is mapped to a corresponding BAT. The embedding of planners into Golog actually requires a translation in the other direction, which we obtain by taking the back-image of our mapping. This has two advantages: First, it is simpler since the Situation Calculus is more expressive than Pddl. Second, the mapping can be viewed as an alternative, declarative semantics for Pddl whose semantical definition [9] extends the state-transitional one for Strips [15] to the additional features of Pddl. In this definition, states are represented as sets of literals and a transition is specified by the addition and deletion of literals. Lin and Reiter [16] identified this as problematic in particular for incomplete theories and showed that there is an exact correspondence between Strips and certain forms of BATs in the sense that adding and deleting literals can be viewed as a mere mechanism for computing the progression of such a BAT, which means updating it to reflect the changes due to an action. We generalize this result to incrementally larger subsets of Pddl.

The mapping requires to switch from the action-centric view of Pddl to the fluent-centric one of the Situation Calculus. Pednault already sketched the general idea when he first introduced ADL [21], and Reiter [24] formalized it in his famous solution to the frame problem: For each fluent F, we collect all positive effects on it in a single positive effect axiom, and all negative ones in a single negative effect axiom. If move is the only action that affects at, we get:

Then we assume that these effects indeed describe all and only possible changes to the fluent, yielding the SSA:

$$\mathit{At}(x,y,\mathit{do}(a,s)) \equiv\gamma^+_{\mathit{At}} \lor \mathit{At} (x,y,s) \land \lnot \gamma^-_{\mathit{At}}.$$

In words, At is true after executing a if it was made true (\(\gamma^{+}_{\mathit{At}}\)) or if it was already true and not made false (\(\gamma^{-}_{\mathit{At}}\)). We verified that a similar construction is correct for the ADL fragment of Pddl, which extends basic Strips with conditional effects as well as negated, disjunctive, and quantified preconditions, but does not add higher language features like durative actions or numeric functions: In this fragment, an update to a Pddl state w.r.t. some action corresponds to the progression of the constructed BAT through that action. The progression is guaranteed to exist due to the closed-world and domain-closure assumptions being made in Pddl [2].

We then extended these results to a fragment that includes the temporal and numeric features of Pddl [3]. It uses an explicit notion of time, where a plan has to specify at which time points actions happen. In addition to simple instantaneous actions, it includes durative actions whose execution stretches over some time interval. In our robot scenario, for instance, it is reasonable to model move as durative since the robot needs a certain amount of time to get from one location to another. Our treatment of time and durative actions in the Situation Calculus is based on work by Pinto [22] and Reiter [23]: actions are extended with an additional argument for the execution time, e.g. take_out(x,y) becomes take_out(x,y,t). Furthermore, a durative action a′ is encoded by splitting it into two simple actions start(a′,t) and end(a′,t) that mark the start and end points of its execution interval. We then have a special fluent keep track of which durative actions are in progress, using the SSA:

In Pddl, preconditions and effects of durative actions are annotated with one of the time qualifiers “at start”, “at end”, and “overall”, where the former two refer to the start and end time points, respectively, and the latter to the open interval between them. As long as a precondition or an effect is only concerned with the start (end) time point, our mapping to the Situation Calculus is straightforward: We simply treat it as a precondition or effect of the corresponding start(a′,t) (end(a′,t)) action. However, there might also be inter-temporal effects. In the example of a durative variant of our robot’s move action, we may have the effect that at the end an object arrives together with the robot at the destination if the object is held at the beginning:

figure a

In such cases, we introduce auxiliary fluents that “memorize” whether the condition was true at the beginning:

We also came up with representations of the further temporal features of Pddl such as continuous effects, invariants, and timed initial literals. We proved correctness in the sense that valid temporal Pddl plans correspond to sequences of start and end actions admitted by the BAT. The remaining features of full Pddl such as trajectory constraints, preferences and derived predicates can be mapped similarly [11].

5 Expressivity

Planners are typically designed for specific Pddl fragments since restricting the scope of the system allows for more specialized and efficient algorithms. We are therefore interested in using a fragment that is as small as possible when we formulate a Golog sub-task in Pddl. To understand which fragment of BATs corresponds to which fragment of Pddl, we compare their expressivity using the compilation scheme framework introduced by Nebel [20]. Compilation schemes compare how concisely planning domains and plans can be expressed in different formalisms, not how concisely an individual planning instance can be expressed. The intuition is that it is justifiable to perform significant work on translating a domain description from one formalism to another, as long as this remains a one-off effort, and individual instances of the domain can subsequently be transformed efficiently.

As we are measuring the expressivity of a formalism, the mapping may use arbitrary computational resources, but we require that the result is at most polynomially sized and that the transformation of the domain description does not depend on the initial state and goal. In contrast, the translation of the initial state and the goal is very limited: it should be efficiently computable and not depend on the whole specification. If such a mapping is solution preserving, i.e. there is a plan for the original task iff there is a plan for the result task, we call it a compilation scheme. To compare the expressivity of two planning formalisms we moreover have to measure the size of generated plans. If it is possible to formulate a compilation scheme so that the size of the shortest plan for the result task is bounded linearly by the size of an optimal plan for the source task, we conclude that the target formalism is at least as expressive as the source formalism. If plans are required to grow faster, the source formalism is more expressive. In our project, it turned out that compilation schemes always directly provided a polynomial-time translation to Pddl and the resulting plan can directly be interpreted as a solution situation for the source task.

A particularly interesting subset of Pddl is its ADL fragment, which is more expressive than the rather restrictive Strips but still supported by a large number of planning systems. Based on earlier work [7], we identified a maximal subset of the Situation Calculus that is equally expressive [25, 26]. Put shortly, we can compile all BATs that provide full information on the initial state in an explicit (non-compact) form into the ADL fragment. More precisely, this Situation Calculus fragment can briefly be characterized as all BATs whose initial database consists of exactly the following sentences (where c i denotes a constant):

  1. 1.

    For each relational fluent F there is an expression

  2. 2.

    There are analogous expressions for all situation-independent predicates.

  3. 3.

    There are analogous expressions for all functions except constants and action functions with object arguments.

  4. 4.

    There are unique names axioms c i c j for each pair c i , c j of different constants.

  5. 5.

    Optionally, there may be a domain closure axiom of the form (x=c 1)∨⋯∨(x=c n ).

6 Experimental Results

Based on our results we extended IndiGolog with access to a planning system that supports the ADL fragment, and conducted experiments to assess the practical benefit of the embedding. A simple simulator played the role of the agent’s environment by generating appropriate sensing results.

In one experiment, we tested the mail robot domain of our running example. We let the building consist of a number of locations, each of which is either an office or a hallway. Each office is connected to some hallway, and hallways are connected to one another. Furthermore, each mailbox serves both for incoming and outgoing mail and is located in some office. Initially, each letter is in one of the mailboxes. The robot executes the following program: As long as not all letters are delivered (which we signal from outside), randomly pick a mailbox, go to it, get all letters from it and deliver them. In our encoding, moving to the mailbox is one planning sub-task, getting the letters out is achieved via the procedure takeAllLetters from the introduction, and delivering the letters is another planning sub-problem.

We randomly generated benchmark scenarios for different numbers of offices (= number of mailboxes), hallways and letters, with 10 instances of each combination. We then compared the performance of two variants of our system which differ in how planning sub-tasks are treated: In one, the internal iterative deepening mechanism of Golog is used to solve them. In the other one, each planning task is translated into a Pddl problem as described above, after which an external Pddl planner is called, and execution is continued in Golog with the returned solution plan. The planner we used is the state-of-the-art Fast Downward planning system [12] in a configuration that guarantees optimal plans (A search with the lmCut heuristic [13]). All experiments were conducted on a 64 bit Intel Core2 Duo PC with 2.66 Ghz and 2 GiB of main memory and a 300 seconds timeout per task.

With the external Pddl planner, IndiGolog could complete all 180 benchmark tasks within the time limit, while it only did so for 63 with its internal search, not solving any of the 16 letter instances. Figure 3 shows the median runtimes for both variants. The difference between the two is considerable (note the logarithmic scale), showing that the embedding definitely improves performance and scalability.

Fig. 3
figure 3

Median runtimes in seconds

7 Conclusion

The areas of planning on the one hand and action languages on the other hand have developed largely independently for many years. The planning community was mostly concerned with developing efficient systems, whereas the focus in action logics was more on formalisms of high expressivity. The Platas project is a contribution to integrate both fields, yielding results of both theoretical and practical relevance: The mapping presented in Sect. 4 can be viewed as a declarative semantics for Pddl, which facilitates the analysis of the language through expressing its properties in terms of entailments in a well-understood logic. The comparison of Pddl with the Situation Calculus by means of compilation schemes as described in Sect. 5 furthermore gives us insight into how the two formalisms relate in terms of mutual expressivity, and hence which subset of one formalism can be translated into the other. Both aspects form the theoretical basis for the practical integration of efficient planners into a Golog system. As shown in Sect. 6, this approach enables a drastic increase in performance, while it retains the full expressivity of the underlying action language.

Future work will for one extend the above beyond Pddl to other planning formalisms. For another, we aim to increase the efficiency of Golog also on sub-tasks where the agent has only partial knowledge, possibly at the cost of losing the completeness of the underlying reasoning method. A first step in this direction has already been explored [1].