1 Introduction

Accountability has become an increasingly common term in public discourse, with frequent demands for officials and organisations such as politicians, business leaders, government agencies and public service bodies to be held accountable for their actions (or lack of action). Dubnick [1] describes the term “accountability” as a cultural keyword—one that was “culturally innocuous” until the 1960s–70s, but has since undergone a massive growth in usage and become an “expansive, ambiguous, and often enigmatic term with considerable cultural gravitas”.

With the increasing capabilities and uptake of machine learning and other AI techniques to aid human decision-making, the public desire for accountability has begun to encompass the development and deployment of AI software [2, 3], and is likely to lend increasing urgency to research on the ethical use of AI [4,5,6] (see also DeepMind’s “Ethics and Society” initiative). Due to the conspicuous success of deep learning classifiers and reinforcement learning systems (e.g., Alphabet’s AlphaGo), one particular research focus is on understanding and addressing the inherent biases due to the dependency of such systems on large sets of training data [7]. This is an example of accountability applied to the people and organisations involved in developing and deploying AI: academic (and increasingly public) debate is driving the development and application of norms of best practice [7].

However, in the context of AI systems that can act autonomously, the question arises of whether, and how, such systems could themselves be considered “accountable”. This is particularly important for systems that are adaptive, i.e., those that have the flexibility to modify their behaviour-generating processes in response to changes in their knowledge of the world and their interactions with other “agents”, which might be humans or other autonomous software systems. This paper addresses the accountability of adaptive autonomous systems, with a particular focus on agents that reason using goals and plans, such as belief-desire-intention (BDI) agents [8,9,10], which have a long history of investigation by researchers in the field of multi-agent systems.

The contributions of this article are: (i) a survey of the relevant literature on accountability, drawing from diverse areas such as sociology, healthcare, management, policy-making and artificial intelligence (especially autonomous and multi-agent systems); (ii) a differentiation and correlation of concepts closely connected to accountability, such as responsibility and answerability, together with a discussion of the functional purpose of accountability; (iii) a justified list of requirements for accountable autonomous agents and research questions stemming from these; and (iv) a preliminary formalisation of one core aspect of accountability: answerability.

The rest of this paper is organised as follows. Section 2 surveys contributions from disparate areas, to answer the question “what is accountability?”. Section 3 proposes, based on the literature surveyed, requirements to support accountability in autonomous practical reasoning agents; for each requirement we list associated research questions. In Sect. 4 we present a preliminary formal model of one aspect of accountability: answerability. We conclude the paper in Sect. 5, discussing our approach, contributions and further research.

2 What Is Accountability?

There has been a small amount of prior work related to accountability of autonomous systems, but it is not clear that this work has formed a consensus on what accountability entails, or how well that work aligns with the view of accountability in other academic fields. Therefore, in this section we survey the literature on accountability from disparate fields such as policy-making, sociology, management and computing science (especially artificial intelligence and multi-agent systems). Our aim is to identify the key requirements that an autonomous agent would need to satisfy in order to be considered accountable.

Chopra and Singh [11] describe accountability as a normative concept in the context of socio-technical systems: “accountability requirements describe how principals ought to act in each other’s eyes, providing a basis for their mutual expectations”. They give two examples of accountability requirements: a meeting participant who is accountable for turning up to a meeting after accepting an invitation, and a food company that is accountable to a regulator for maintaining certain tracking information and providing it on demand. However, it is not clear from this discussion to what degree (if any) the authors believe the computational representations and processes needed to support accountability might differ from existing techniques developed by multi-agent systems researchers for reasoning about norms and commitments [12, 13].

Baldoni et al. [14] propose the study of computational accountability. They consider accountability to be an ethical value, and define accountability as “the acknowledgment and assumption of responsibility for decisions and actions that an individual, or an organization, has towards another party”. They note that, implicitly, “individuals are expected to account for their actions and decisions when put under examination”. The paper focuses on multi-agent systems that track the state of conditional social commitments using business artifacts, in order to “coordinate their activities, e.g. through responsibility assignment, as well as to identify liabilities”. It is argued that the “analysis of accountability can be accomplished by looking at commitment relationships”.

In later work, Baldoni et al. [15, 16] take the viewpoint of accountability as a mechanism, summarised by Bovens et al. [17] as “an institutional relation or arrangement in which an agent can be held to account by another agent or institution”. They consider how such an institutional mechanism can be provided by design in a multi-agent system (MAS), and seek to provide “structures that allow assessing who is accountable without actually infringing on the individual and private nature of agents” and to “determine action impact or significance by identifying the amount of disruption it causes in terms of other agents and/or work affected” [15]. To this end, they present five “necessary-but-not-sufficient principles that an MAS system must exhibit in order to support accountability determination” [15]. These principles state that (i) agents should interact within the scope of an organisation, (ii) must join the organisation by taking on a role, (iii) can be accountable only for goals they have explicitly accepted, and (iv) may specify the resources they need to satisfy a goal (which may be provided, or not, at the organisation’s discretion). The fourth principle is endowed with particular significance for accountability determination: “Should an uninformed agent stipulate insufficient provisions for an impossible goal that is then accepted by an organization, that agent will be held accountable because by voicing its provisions, it declared an impossible goal possible” [16]. Baldoni et al. operationalise these principles as an “accountability protocol” to be followed when an agent joins an organisation. This protocol ensures the creation of specific types of commitment between agents and between agents and the organisation. This work is situated within a particular paradigm of organisational multi-agent systems in which organisations are supported by specialised coordination artifacts, whereas we seek a more general model of computational accountability.

Dignum [18] addresses the question of how AI systems can be designed responsibly to ensure they are “sensitive to moral principles and human value [sic]”. She discusses three principles of responsible AI: accountability, responsibility and transparency (ART). Accountability is described as “the need to explain and justify one’s decisions and actions to its partners, users and others with whom the system interacts”. In addition, there is a need for moral values and social norms to be represented and included in the system’s deliberations and explanations of its decisions.

Other multi-agent systems researchers have investigated related concepts such as responsibility, which we discuss in Sect. 2.1, after a more general look at the literature on accountability.

Dubnick [1] notes that it is difficult to find a definition of accountability that is not circular or specific to a qualifying adjective (e.g. “political accountability”). In the latter case, Dubnick observes that “whatever substantive meaning might be in the word accountability is overwhelmed and subordinated to the demands of the specific task environment”. Fox [19] also notes the lack of clarity around the meaning of accountability and related concepts, stating that “the terms transparency and accountability are both quite malleable and therefore – conveniently – can mean all things to all people”.

Bovens et al. [17] discuss how accountability is viewed in the social psychology, accounting, public administration, political science, international relations and constitutional law literatures. They observe that there is a “minimal consensus” in the academic literature on what accountability involves. Schillemans [20] expresses this consensus as follows:

(1) Accountability is about providing answers, about answerability, towards others with a legitimate claim in some agents’ work. (2) Accountability is furthermore a relational concept: it focuses our attention on agents who perform tasks for others \(\ldots \) . (3) Accountability is retrospective \(\ldots \) and focuses on the behavior of some agent in general, ranging from performance and results to financial management, regularity or normative and professional standards. (4) \(\ldots \) accountability consists of three analytically distinct phases. In the first phase, the agent/accountor/actor renders an account on his conduct and performance to a significant other. This may be coined the information phase. In the second phase, the principal/accountee/forum assesses the \(\ldots \) transmitted information and both parties often engage in a debate on this account. The principal/accountee/forum may ask for additional information and pass judgment on the behaviour of the agent/accountor/actor. The agent/accountor/actor will then answer to questions and if necessary justify and defend his course of action. This is the debating phase. Finally, the principal/accountee/forum comes to a concluding judgment and decides whether and how to make use of available sanctions. This is the sanctions or judgment phase.

From this, we note that accountability revolves around some form of accountability relationship between an accountee and accountor. As discussed in Sect. 3.1, many of the properties of this relationship have not yet been formalised.

Emanuel and Emanuel [21] give a definition of accountability in the domain of healthcare: “Accountability \(\ldots \) entails procedures and processes by which one party provides a justification and is held responsible for its actions by another party that has an interest in the actions”. They consider the following components of accountability: the locus of accountability, i.e. who can be held accountable; the domain of accountability, i.e. the activities, practices or issues for which “a party can legitimately be held responsible and called on to justify or change its action”; and the procedures of accountability, divided into the evaluation of compliance and the dissemination of evaluations to seek “responses or justifications” from accountable parties.

2.1 Related Concepts

Dubnick [1, Fig. 2.4] categorises various concepts related to accountability. Those motivated by “moral pull” (i.e., due to external forces) are liability, answerability, responsibility and responsiveness (in the legal, organisational, professional and political settings, respectively); those motivated by “moral push” (i.e., due to internal managerial efforts) are obligation, obedience, fidelity and amenability (in the same four settings, respectively).

The relationships between accountability, responsibility and answerability seem especially subject to varying viewpoints. Dubnick [1] notes that one can be “responsible for some event, for example the marriage of two people who met because (one) did not take the empty seat between them on the bus, without being held to account for it”. Eshleman [22] discusses various philosophical views on moral responsibility. The accountability view holds that “an agent is responsible, if and only if it is appropriate for us to hold her responsible, or accountable, via the reactive attitudes \(\ldots \) (e.g. resentment)”. Another influential view, referred to by Eshleman as the answerability view, is that “someone is responsible for an action or attitude just in case it is connected to her capacity for evaluative judgment in a way that opens her up, in principle, to demands for justification from others”.

In the practice of business management, a Responsible, Accountable, Consulted, and Informed (RACI) matrix is a recognised [23] tool to map where responsibility and accountability are assigned for activities. In this context, the responsible parties are those who work on the activity (responsibility may be shared), whereas the accountable party is the (unique) person with “yes or no authority” over the activity and “about whom it is said ‘The buck stops here’ ” [24].

Researchers in multi-agent systems and deontic logic have addressed the concept of responsibility as the problem of assigning blame for failures of group plans or norms [25,26,27,28,29,30,31,32,33,34,35,36]. This problem has been well studied in the literature, and as determining responsibility is a process performed by a principal, it is largely orthogonal to our focus in this paper: the capabilities needed for an accountable agent to play its role in an accountability relationship with a principal. Therefore, we do not attempt to summarise the literature on responsibility as blame assignment.

In the context of the responsible development of AI systems, Dignum [18] defines transparency as “the need to describe, inspect and reproduce the mechanisms through which AI systems make decisions and learn to adapt to their environment, and to the governance of the data used or created”. Fox [19] discusses the relationship between transparency and accountability in human institutions, which is conventionally expressed as “transparency generates accountability”. After reviewing the empirical literature, he concludes that transparency is necessary for accountability, but far from sufficient. In particular, his analysis shows that “opaque transparency” (limited to providing access to information) does not necessarily result in accountability, whereas an overlap between transparency and accountability occurs when there is answerability, i.e. the capacity or right to demand answers. However, answerability without consequences (e.g. sanctions) is a “soft” form of accountability. To guarantee “hard accountability” (answerability plus consequences, such as sanctions), the intervention of other “public sector actors” is needed.

Winikoff [37] considers the question of the trustability of autonomous systems, i.e., how humans can come to trust them, and proposes three prerequisites for such trust: there should be a social framework for recourse; if the system makes a decision with negative consequences for the user, the system should be able to explain its behaviour; and the system should be subject to verification and validation to give assurance that key behavioural properties hold.

2.2 The Functional Purpose of Accountability

When setting out to design accountable software agents it is important to consider the functional purpose of accountability. Is accountability simply something that satisfies a human desire to feel empowered (even if there is no other effect), or are there some system-level benefits? In the former case, there may be no point in creating accountable agents unless they are interacting with people or other agents. In the latter case, it is necessary to identify the benefits that we wish our agents (or their society) to enjoy.

The purpose of accountability has been analysed in the human context. Bovens provides this commentary [38]:

“So why is accountability important? \(\ldots \) In the academic literature and in policy publications about public accountability, three answers recur, albeit implicitly, time and again. Accountability is important to provide a democratic means to monitor and control government conduct, for preventing the development of concentrations of power, and to enhance the learning capacity and effectiveness of public administration.”

The first and last of these answers seem most relevant to software agents (assuming that our agents are not power-seeking). The first reason (control) is also noted by Mulgan [39]:

“The core sense of accountability is clearly grounded in the general purpose of making agents or sub-ordinates act in accordance with the wishes of their superiors. Subordinates are called to account and, if necessary, penalized as means of bringing them under control.”

We note that this also highlights a motivational aspect of accountability: a rational agent (as software agents are generally designed to be) will be likely to prioritise goals for which it is accountable, and devote more resources to them. This is due to the expected costs of requests for answers and possible sanctions in the event of sub-standard performance or failure.

Bovens elaborates on the third reason above (enhancing learning) as follows:

“The purpose of public accountability is to induce the executive branch to learn. The possibility of sanctions from clients and other stakeholders in their environment in the event of errors and shortcomings motivates them to search for more intelligent ways of organising their business. Moreover, the public nature of the accountability process teaches others in similar positions what is expected of them, what works and what does not.”

The last sentence implies a norm-alignment and spreading function of accountability, as Bovens notes elsewhere in his article: “Norms are (re)produced, internalised and, where necessary, adjusted through accountability”.

We conclude that for (software) multi-agent systems, accountability has a role to play in motivating good performance, and in monitoring and control (when one agent is a subordinate of another). It can also allow for incremental system improvement through learning or instruction, e.g. one agent may send new plans to another agent as an outcome of an accountability dialogue, and can enable the alignment and spreading of norms. When human users or partners are involved, we also see accountability contributing to the alignment of values.

3 Requirements for Accountable Autonomous Agents

Based on the literature discussed above, we propose that in order to support accountability, an autonomous practical reasoning agent should have the following four properties (a minimal code sketch of these properties is given after the list):

  • Expectation-Aware. The agent should be able to understand when it becomes subject to the expectations of others, for example through norms and commitments, such as the obligation to provide answers to accountability queries. It should also expect to be held to account, and possibly incur a sanction, after poor performance and failure—this provides the motivation to perform well. Its practical reasoning should be informed by these expectations. This property is likely to be crucial in ensuring that the following two properties are exercised correctly.

  • Answerable. The agent should be able to answer retrospective queries about its decision-making, within some pre-established scope. These queries may not be made immediately, so it must maintain sufficient information about its past reasoning to enable these queries to be answered. Note that answerability is similar to the concept of explainability, but includes the relational aspects of accountability: an accountable agent is answerable to a specific party that may send queries within some (possibly limited) scope, and these must be answered.

  • Argumentative. Full accountability cannot be achieved by one-off queries alone. To enable accountability processes to lead to system improvement (including norm and value alignment), an accountable agent should be capable of undertaking extended accountability dialogues in which beliefs, plans, norms and values are challenged, justified and further queried.

  • Meta-Cognitive. The agent must be able to adapt its reasoning mechanisms as a result of accountability dialogues. For example, an agent may need to update its plans, its plan selection mechanism, its failure-handling mechanism, its norms, or its values as a result of advice from its principal. The ability of an agent to alter its own decision-making components is known as meta-cognition [40]; note that we do not require the agent to monitor its own cognition, but rather to make changes when required by accountability mechanisms.

Additionally, when the scope of accountability includes actions that affect people, the following property is also required:

  • Value-Aware. The agent should maintain information about the relative importance of human values to its organisation or human partner(s) or client(s), and take these into account during its reasoning [41]. This is in line with Dignum’s ART model of responsible AI [18].
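As a minimal sketch, rather than a definitive design, the following Python fragment illustrates one way these properties (including Value-Aware) might surface as an agent-side interface; all class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass
class AccountabilityRelation:
    """The relationship between an account-giver and its account-taker (principal)."""
    account_taker: str      # to whom the agent is answerable
    query_language: str     # identifier of the agreed query language
    scope: Any              # agreed scope of admissible queries
    retrospection: float    # how far back in time queries may reach
    reply_deadline: float   # maximum time allowed to answer a query


class AccountableAgent(Protocol):
    # Expectation-Aware: let accountability expectations inform deliberation.
    def adopt_accountability(self, relation: AccountabilityRelation) -> None: ...

    # Answerable: answer a retrospective query from the recorded reasoning trace.
    def answer(self, relation: AccountabilityRelation, query: Any) -> Any: ...

    # Argumentative: respond to a challenge within an extended accountability dialogue.
    def respond(self, relation: AccountabilityRelation, challenge: Any) -> Any: ...

    # Meta-Cognitive: revise plans, norms or selection mechanisms following a dialogue.
    def revise(self, advice: Any) -> None: ...

    # Value-Aware: report the value priorities currently used in deliberation.
    def value_priorities(self) -> dict[str, float]: ...
```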

3.1 Research Questions

Various research questions stem from the requirements above. When extending autonomous agents to meet the requirements, we have:

  • Expectation-Aware. Research on norm-aware planning in BDI agents, e.g., [42], indicates that it is desirable and possible to extend a standard practical reasoning mechanism to address normative concerns. Our research questions are:

    • What practical reasoning approach is most appropriate to be extended with expectations stemming from accountability relationships?

    • What is the minimal information required to enable expectation-aware behaviour in autonomous agents?

    • What game-theoretic considerations arise when an agent agrees (or declines) to be accountable for something?

  • Answerable. There is a wealth of research on summarising and presenting data and information to different stakeholders, e.g., [43, 44]. We anticipate that queries will refer to rich and comprehensive records of decision-making processes and their rationale. Some questions arising are:

    • What knowledge/information should be represented to support accountability?

    • What extensions/adaptations are required in the decision-making process(es) to ensure that the knowledge and information identified in the previous question are adequately represented?

    • What kinds of queries should be supported in accountability relations?

  • Argumentative. Research on formal argumentation has matured and has been applied to many contexts/domains [45]. Some issues arising are:

    • Which formal argumentation techniques can be (re-)used, adapted or extended in the context of accountability queries, and how can this be done?

    • How can accountable behaviour (stemming from practical reasoning) be combined with, or extended by, argumentation capabilities?

    • How can argumentation interactions support and affect accountable behaviour (stemming from practical reasoning)?

  • Meta-Cognitive. Multi-agent plan selection and revision have been explored through different approaches (e.g., [41, 46, 47]), indicating that practical reasoning must tackle meta-cognitive issues: agents not only build and follow plans, but must also reconsider or revisit decisions and reason about their own decision processes. Some questions arising are:

    • Is there a need for many levels of meta-cognition, whereby agents become aware of being aware of being aware, and so on, or would a single level of meta-cognition suffice?

    • Would meta-interpretation [48, 49] be an adequate and flexible approach to both meta-cognition and answerability?

    • Should practical reasoning always embed meta-cognitive concerns or should these only be addressed when agents are accountable for some behaviour or result?

  • Value-Aware. Accountable agents seek to act, or answer queries, in a manner that promotes the values of the organisation(s), human partner(s) or client(s) to which they are accountable. In this context, research questions include:

    • How can the actions for which one is held accountable be shown to align with the values that should be promoted? Existing work on argument-based practical reasoning (e.g., [50]) demonstrates the links between actions and values, but not between accountability and values.

    • How can the lack of promotion of a value (e.g., due to the sub-standard execution of a task) trigger the accountability process?

4 Towards a Formalisation of Accountability

In this section we propose an initial high-level formalism of accountability, focusing on answerability. We assume the accountable agent is equipped with a well-studied form of expectation-awareness: the ability to represent and perform practical reasoning informed by norms such as obligations [51]. We consider that answerability is naturally expressed as an organisational norm, or as a commitment (if implicitly created via a commitment protocol [52]). We focus here on the normative view and model answerability as a conditional obligation norm. It is not the intention of this paper to define or commit to any specific formalisation for obligations, so for brevity we use an existing notation from the literature, the logic of Dignum et al. [53] for specifying temporal deontic constraints:

\[ \mathop {PREV}(\mathop {ask}(at, ag, q)) \wedge \mathop {in\_scope}(q, QL , S) \rightarrow O(\mathop {valid\_reply}(ag, at, q, S, \delta {}t) < \mathop {now} + rt) \]

where:

  • ag and at refer to the account-giver (or accountable party) and the account-taker (or principal), following the terminology of Chopra and Singh [11].

  • \( QL \) is an agreed (or imposed) query language in which accountability queries will be expressed.

  • S is an agreed (or imposed) scope of queries—not all queries that can be expressed in QL may be relevant to the accountability relationship. Restrictions might include the types of goal considered, and the roles under which the queried activities are performed. We make no commitment regarding how S is expressed.

  • \(\delta {}t\) is the length of the retrospective time period that accountability queries can ask about (where \(\delta {}t = \infty \) means there is no limit). This limits the time interval for which ag must keep records of its decision-making processes.

  • rt is the maximum time allowed for an answer to an accountability query to be sent.

  • \(\mathop {PREV}(a)\) means that the action leading to the current state was a.

  • \(\mathop {ask}(at, ag, q)\) is the action of at asking ag the query q.

  • \(\mathop {in\_scope}(q, QL ,S)\) denotes the condition that the query q is expressed in the query language \( QL \) and is within the scope S.

  • \(O(a < t)\) denotes the obligation for action a to be done before time t.

  • \(\mathop {valid\_reply}(ag, at, q, S, \delta {}t)\) is the action of ag sending at a valid answer for query q within scope S, based on a trace of its reasoning for the last \(\delta {}t\) time units. We do not attempt, within this obligation, to specify the notion of a valid reply. Instead, we consider this an abstract action, and assume that ag and at have a common understanding of what counts as [54] a valid reply. Below we propose one option.

  • \(\mathop {now}\) is a special variable used in the logic of Dignum et al. [53] to refer to the time at which the obligation’s conditions become true.
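For illustration, consider a hypothetical instantiation (the query and response time below are invented for the example): suppose at asks ag the query \(q_0\) = “why was goal \(g_7\) dropped?”, expressed in \( QL \) and within scope S, and that \(rt = 10\). The norm above then gives rise to the obligation

\[ O(\mathop {valid\_reply}(ag, at, q_0, S, \delta {}t) < \mathop {now} + 10) \]

i.e., ag must send at a valid reply to \(q_0\), computed from its recorded reasoning over the preceding \(\delta {}t\) time units, within 10 time units of the query.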

We now consider what could count as a valid answer to the query. An answerable agent should be obliged to provide information about its practical reasoning that led to the queried behaviour, and that is relevant to the query. Before formalising this, we define some notation.

  • \(\tau _{ag}^{[t-\delta {}t,t]}\) denotes a full trace of the agent ag’s reasoning during the interval \([t-\delta {}t,t]\). As well as recording successful plan executions, this trace must include information about options considered and not selected, and action and plan failures.

  • Given a full trace \(\tau \), we write \(\tau |_{q}^{S}\) to denote the restriction of the trace to contain only information relevant to the query q and scope S, and omit S if there is no scope restriction. We leave as an open question whether such a notion of relevance can be defined—if not, \(\tau |_{q}^{S} = \tau \).

We assume that queries are expressed declaratively, with answers returned as variable bindings (or \(\bot \) to indicate failure), and that the trace is viewed as a set of facts, which can therefore be decomposed into disjoint sets of facts. We then propose the following conditions for a query reply to be considered valid (where \(\sigma \) ranges over variable substitutions and \(\uplus \) denotes disjoint union):
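One way these conditions might be written, reusing the restriction notation above and abbreviating \(\tau _{ag}^{[t-\delta {}t,t]}\) as \(\tau \) (the predicate name and exact formulation here are a sketch rather than a definitive rendering):

\[ \begin{array}{l} \mathit{valid}(\bot , q, S) \text{ iff } \lnot \exists \sigma .\ \tau |_{q}^{S} \models \sigma (q)\\[4pt] \mathit{valid}(\langle \sigma , R\rangle , q, S) \text{ iff } \tau |_{q}^{S} \models \sigma (q)\\ \qquad \qquad \wedge \ \exists \tau '.\ \tau |_{q}^{S} = \tau ' \uplus R \ \wedge \ \tau ' \not \models \sigma (q) \end{array} \]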

The first clause states that a reply containing \(\bot \) is valid if the query cannot be answered using the time- and scope-restricted trace. The first line of the second clause expresses the condition that the answer is correct, i.e. \(\sigma (q)\) is entailed by the scope- and time-restricted trace. The second line first extracts a set of reasons from the trace, to help justify the query result, and then requires that at least some of the reasons provided in the answer are necessary for the truth of the answer—removing them from the trace would not allow the query to be answered. When these conditions hold, a reply containing the substitution, i.e. a set of variable bindings, and the reasons is considered valid. This notion of a valid answer does not fully specify the reasons that should be given to justify the answer. We believe these will be domain- and context-dependent, and in general, we envisage the need for a dialogue between the two agents to build up mutual information through a series of queries.
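To make these two clauses concrete, the following minimal Python sketch checks reply validity under strong simplifying assumptions: the restricted trace is a set of ground facts, a query is a single atom whose variables are strings starting with “?”, and entailment is plain membership after substitution (so the necessity check is correspondingly simple; with rule-based entailment it would be less trivial). All names are illustrative.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(query, fact):
    """Return a substitution unifying the query atom with a ground fact, or None."""
    if len(query) != len(fact) or query[0] != fact[0]:
        return None
    subst = {}
    for q, f in zip(query[1:], fact[1:]):
        if is_var(q):
            if subst.get(q, f) != f:
                return None
            subst[q] = f
        elif q != f:
            return None
    return subst


def answers(trace, query):
    """All substitutions under which the trace entails the query."""
    return [s for fact in trace if (s := match(query, fact)) is not None]


def apply_subst(subst, query):
    return tuple(subst.get(t, t) for t in query)


def valid_reply(trace, query, reply):
    """reply is either None (the 'bottom' reply) or a pair (substitution, reasons)."""
    if reply is None:
        return not answers(trace, query)           # clause 1: no answer exists
    subst, reasons = reply
    ground = apply_subst(subst, query)
    return (ground in trace                        # the answer is correct,
            and reasons <= trace                   # the reasons come from the trace,
            and ground not in trace - reasons)     # and removing them defeats the answer


# A toy restricted trace: goal g7 was adopted at t1 and dropped at t3;
# resource r2 was unavailable at t2.
trace = {("dropped", "g7", "t3"), ("unavailable", "r2", "t2"), ("adopted", "g7", "t1")}
print(valid_reply(trace, ("dropped", "?g", "?t"),
                  ({"?g": "g7", "?t": "t3"}, {("dropped", "g7", "t3")})))  # True
print(valid_reply(trace, ("dropped", "g9", "?t"), None))                   # True
```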

The use of \(\tau _{ag}^{[t-\delta {}t,t]}\) above implies that ag should give an answer that is correct with respect to the full trace over the required retrospective time window. However, that does not necessarily mean that ag must actually record the full trace as implied by its semantics. Given a query scope S, it may be possible to answer queries within that scope using a subset of the information in \(\tau _{ag}^{[t-\delta {}t,t]}\). We explain this intuition by using the notion of an abstraction of a transition system. We can view the full trace as a transition system on time-stamped agent internal states (but note that the transitions must include the evaluation of failed reasoning rule conditions, as well as successes). Answering queries with a subset of information means reasoning with an abstraction of the transition system [55], which is defined over information states that are (potentially lossy) projections of the full agent states.

For a projection function f and a trace \(\tau \), we denote the abstracted transition system that f induces by \(\tau ^f\). The task for the account-giver (or its designer) is then, given a scope S, to find a projection function \(f_S\) such that the following property holds:
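One possible formulation of this property, reusing the restriction notation above (the precise statement is open to refinement), is:

\[ \forall q\, \forall \sigma .\ \ \mathop {in\_scope}(q, QL , S) \rightarrow \big (\tau ^{f_S} \models \sigma (q) \iff \tau |_{q}^{S} \models \sigma (q)\big ) \]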

This states that answering queries within scope S by projecting traces using \(f_S\) produces the same answers as would be obtained using scope-restricted traces.

This model of answerability opens a number of research directions, including the following:

  • There is a need to underpin the notation above with a formal model of agent reasoning. In the context of debugging BDI agent programs, Winikoff [49] provides such a model, in which “why?” and “why not?” questions are answered using traces of agent reasoning. His formalism provides much of what is needed here. However, some aspects of this approach may not suit the problem of answerability. For example, queries may be asked some time after the computation in question was run, especially in the case of suboptimal outcomes or failures, and the account-taker may only have partial observability of the agent trace when asking its queries. Also, Winikoff’s semantics assume that new beliefs can be semantically associated with the actions that gave rise to them. In practice, the world is more complicated: actions can have various degrees of success and failure, and their effects can vary accordingly. Also, the effects may not always be immediately observable. To cater for these complexities, a richer domain model may be needed, and explanations may need to be contingent on the most likely causes of observations.

  • A range of useful notions of query language and scope should be investigated. Winikoff investigated questions seeking reasons for why, at a given point of execution, plan steps were or were not performed, or specific conditions were or were not believed. These could be generalised to richer models of agent reasoning, e.g., those incorporating norms [51] and values [41]. Another potentially useful query type when the account-taker lacks the full trace is “could you have performed X?” for a plan or action X. For argumentative agents, the notion of a query language should be extended to include assertions such as “P would have been a better plan to choose”. A sketch of the kinds of queries such a language might contain is given after this list.

  • The problem of choosing a projection function \(f_S\) given a scope S is important to ensure that agents only need to record the minimal required information. Also, there is the inverse question of what scope of queries can be answered by an agent that keeps a specific type of audit trail.
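As a sketch only, and without committing to any particular query language, the following Python data types illustrate the kinds of queries and assertions discussed above; all names and fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WhyPerformed:          # why was this plan step performed at this point?
    step: str
    at_time: float


@dataclass
class WhyNotPerformed:       # why was this plan step not performed?
    step: str
    at_time: float


@dataclass
class WhyBelieved:           # why was this condition believed (or not believed)?
    condition: str
    believed: bool
    at_time: float


@dataclass
class CouldHavePerformed:    # "could you have performed X?" (when the trace is only partially observable)
    plan_or_action: str


@dataclass
class BetterPlanAssertion:   # argumentative: "P would have been a better plan to choose"
    plan: str
```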

5 Conclusions, Discussion and Future Work

This paper surveyed the meaning and purpose of accountability across many areas, connecting it with, and differentiating it from, closely related concepts such as responsibility and transparency. We identified the functional purpose of accountability: it enables the monitoring and control of self-interested agents in a multi-agent system, and it facilitates incremental improvements to the system. The improvement comes about as agents, aware of what they are accountable for, factor this into their choices of autonomous behaviour; the interactions among agents as they query and answer each other, guided by their accountability relations, will enable the sharing of “best practices” (plans which withstand scrutiny and criticism) while aligning and spreading global norms. We have put forward requirements for accountable practical reasoning agents, and for each of these requirements we listed related research questions. We also sketched a formalisation of one aspect of accountability, answerability, as part of an investigation into the normative constructs, information model and reasoning mechanisms necessary for accountable practical reasoning.

Concerns about advances in AI and their impact on society have caught the attention of the media, governments and the general public. AI, coupled with autonomous behaviour, has immense potential, and various initiatives have championed ethical and responsible principles for such systems and their design. We hope to have taken a step towards accountable autonomy, whereby the design and execution of practical reasoning agents are influenced by accountability. Ultimately, this paper aims to raise awareness of accountability and related ethical matters among the multi-agent systems and software agents community. We also intend this paper as a call to arms: as a community, building on the wealth of our research, we can lead the AI community in this quest for ethical and responsible AI.

In addition to the various research questions raised in previous sections, we are currently extending BDI practical reasoning technologies to explore accountability issues. We are also developing our formalisation of accountability, especially its connections with normative aspects as well as norm-aware BDI reasoning.