Introduction

Although there is evidence of pursuits in the area of Artificial Intelligence (AI) before 1956, it is widely considered that its birthplace was Dartmouth College, where John McCarthy and nine other scientists spent two months working on a detailed study of the subject (McCarthy et al. 2006; Russell and Norvig 2009). Fast forward to 2019, and it’s difficult to imagine an industry that has not been impacted by AI. For example, artificial agents are used to drive cars (Daily et al. 2017), help us with personal assistant tasks (Leviathan and Matias 2017), beat the world’s best players in games like Go (Silver et al. 2017), assist us in providing better healthcare (Jiang et al. 2017), and even help militaries gain strategic advantages (Sapaty 2015).

Due to the increase in scope and autonomy of artificial agents, many philosophers and ethicists have raised concerns around deploying them without the necessary measures in place for safe and ethical integration into society (Moor 2006; Dameski 2018; Allen and Wallach 2012; Anderson and Anderson 2007). In particular, there are concerns about how increasingly autonomous artificial agents will treat human beings and whether this treatment will be considered ethical. Moor (2006) states it bluntly when he writes: “we want machines to treat us well”. The emergent field of enquiry dealing with how machines treat us is called Machine Ethics (Anderson and Anderson 2007; Moor 2006; Allen and Wallach 2012), and it is primarily focused on “developing computer systems and robots capable of making moral decisions” (Allen and Wallach 2012).

Many philosophers have argued that computationally-based agents can be considered artificial moral agents (AMAs) if they are built to incorporate the relevant ethical dimensions in their decision-making processes (Abney 2012; Scheutz and Malle 2017; Floridi and Sanders 2004; Sullins 2006; Moor 2006; Johnson 2006). Abney (2012), for instance, argues that non-cognitive and emotional elements contribute to moral decision making. He further argues, however, that they do not ultimately determine whether or not an agent is moral. What ultimately determines the morality of an agent, according to Abney (2012), is its ability to deliberately and rationally arrive at ethical decisions and actions. In other words, a rational, though emotionless, robot could be classified as an AMA if it were to meet the requirement above. This is the central philosophical idea in the claim that computationally-based agents can be AMAs. It is a claim that computational rationality can entail artificial moral agency.

The AMA project is held back by the seemingly disjointed manner in which its advocates have sought to advance it. For example, there are numerous projects from the sciences that have sought to build AMAs independently of meaningful considerations from normative ethics. These projects often end up with poorly conceptualised AMAs that will not stand the test of philosophical scrutiny. Similarly, there have been plenty of philosophical arguments, both for and against the possibility of artificial moral agency. However, philosophical arguments alone will not advance the AMA project. This dichotomy of approaches in the AMA project can “distract from the immediate task of making increasingly autonomous robots safer and more respecting of moral values, given present or near-future technology” (Allen and Wallach 2012).

Consequently, the purpose of this article is to invite both developers (i.e. engineers and scientists) and philosophers to consider how models of computational rationality might be applied in the building of well conceptualised and formulated AMAs. I will do this by putting forward a proposal for such a model of computational rationality applied to the problem of artificial morality. This will hopefully shift the discussion from a mostly philosophical debate about whether or not artificial morality is possible, to a discussion of the models that can practically demonstrate it. The next three sections will seek to clarify the concepts of computational rationality and artificial moral agency, before delving into the proposed model and some of its anticipated limitations.

Computational rationality

Computational rationality is perhaps best described as approximating decision making for maximum utility while using the optimal computational resources (Lewis et al. 2014). It is about making rational decisions within a computational framework. As Gershman et al. (2015) note, computational rationality is a convergence of ideas from AI, cognitive science and neuroscience around intelligence, and in particular, its computational nature. They go to great lengths in their work to show how ideas of computation from AI have inspired researchers in the cognitive and neurosciences, and vice versa. To get a proper grasp of computational rationality, however, we need to look back a few decades to the works of Simon (1955), Horvitz (1987), and others.

Many of the ideas in computational rationality stem from the tradition of Herbert Simon, who was an economist and political scientist. While Turing (1950) and others were postulating about the nature of machine intelligence, Simon brought much-needed constraints on the kind of rationality that could be achieved by computationally bounded agents. He started looking at candidate definitions for bounded rationality when he was deriving a model for rational choice (Simon 1955, 1972; Selten 1990). He argued that agents do not always have all the information they require to make a decision and that their internal computation is limited in how it can use the available data to make rational decisions. Bounded rationality was, therefore, a way for him to “formulate the process of rational choice in situations where we wish to take explicit account of the ‘internal’ as well as the ‘external’ constraints that define the problem of optimisation for the organism” (Simon 1955, p. 2).

These ideas inspired many works in AI, a field which also found itself dealing with creating intelligent agents that operate under many of the constraints that Herbert Simon saw in general organisms. Most notably, Horvitz (1987, 1988), and others at the then Medical Computer Science group at Stanford, took the ideas forward (Horvitz et al. 1989). Horvitz argued that probability and utility theories, both of which were generally considered normative for decision making in computer science, were insufficient for the real-world problems that machine intelligence systems were trying to solve. Real-world problems often go beyond the standard axiomatic basis defined by utility and probability theories, as they are often characterised by uncertain and limited information (thus making the process of modelling and knowledge representation difficult). Furthermore, machine intelligence systems have limited computational resources, which makes the application of classical decision-theoretic approaches to many real-world problems difficult and, many times, intractable (Horvitz 1987).

To deal with these problems, Horvitz suggested looking at various optimisation and heuristic strategies to resolve some of the challenges in real-world decision making. Notably, he proposed the notions of flexible inference and decision-theoretic control. Various inference techniques have been developed over the years that allow partial inference with limited information or partial execution. This also paved the way for the concept of meta-reasoning, which refers to a program that is aware of various inference strategies and can select the best strategy based on the type of problem that needs to be solved (Horvitz 1989). These types of inference strategies present a natural fit for the optimisation and heuristic framework of Horvitz. Decision-theoretic control represents the ability of the agent to determine how best to execute a specific inference strategy based on a trade-off between computation time, precision, maximum expected utility (MEU) and the cost of delaying the action. Balancing these trade-offs, along with suitable or multiple inference strategies, represents the core idea in the approach of Horvitz.
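To make this trade-off concrete for developers, the sketch below shows one minimal way a meta-reasoner might weigh the expected utility of an inference strategy against the cost of the delay it introduces. The strategies, numbers, and the linear delay-cost model are illustrative assumptions of mine, not a reconstruction of Horvitz's own formalism.

```python
# A minimal, hypothetical sketch of decision-theoretic control: a meta-reasoner
# picks the inference strategy whose estimated result quality, net of the cost
# of the computation time it needs, is highest. All names and numbers below are
# illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    expected_utility: float   # estimated MEU of the decision this strategy yields
    expected_runtime: float   # estimated seconds of computation it needs

def cost_of_delay(seconds: float, urgency: float) -> float:
    """Cost of postponing action, growing with computation time and task urgency."""
    return urgency * seconds

def select_strategy(strategies: list[Strategy], urgency: float) -> Strategy:
    """Choose the strategy with the best utility-minus-delay trade-off."""
    return max(
        strategies,
        key=lambda s: s.expected_utility - cost_of_delay(s.expected_runtime, urgency),
    )

if __name__ == "__main__":
    candidates = [
        Strategy("exact_inference", expected_utility=0.95, expected_runtime=30.0),
        Strategy("bounded_search", expected_utility=0.85, expected_runtime=2.0),
        Strategy("fast_heuristic", expected_utility=0.60, expected_runtime=0.1),
    ]
    # With plenty of time, exhaustive inference wins; under pressure, cheaper,
    # partial strategies become rational choices.
    print(select_strategy(candidates, urgency=0.001).name)  # exact_inference
    print(select_strategy(candidates, urgency=0.2).name)    # fast_heuristic
```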

The ideas of Horvitz and Simon have persisted well over time, with many AI researchers adopting them (Marwala 2013; Zilberstein 2013; Russell and Subramanian 1995; Genewein et al. 2015; Lewis et al. 2014; Gershman et al. 2015). Russell and Subramanian (1995) use these ideas to develop what they call provably bounded-optimal agents. Bounded-optimal agents are machine intelligence systems whose solutions to problems are optimal for the information that they can acquire from the task environment and the limitations of their programs and architectures. In other words, optimality is what the agent can achieve, given its internal and external constraints, and not necessarily what a perfectly rational agent would do for a given task. This conception of a bounded-optimal agent formed the foundation for what is now referred to as computationally rational agents in recent literature (Gershman et al. 2015; Lewis et al. 2014).

Fittingly, the work of Gershman et al. (2015), which includes Horvitz as a co-author, likely represents one of the clearest pictures of what computational rationality is, and what it can be. As the authors note, computational rationality has the potential to be a “unifying framework for the study of intelligence in minds, brains, and machines” (Gershman et al. 2015, p. 278). I support this claim and further posit that computational rationality can be a unifying framework not only for ideas in the sciences, but also in Philosophy, and more specifically, in Machine Ethics. After all, it was Aristotle who first placed a strict emphasis on practical rationality as a basis for virtuous and ethical action (Miller 1984). I aim to clarify how exactly computational rationality can be an integrative framework for machine ethics by showing how the ideas of Gershman et al. (2015), Horvitz (1987), Russell and Subramanian (1995), and others, can be applied to the question of building artificial moral agents. I will do this by discussing the epistemic capacities required for moral agency and considering whether these capacities can be replicated or approximated within a framework of computational rationality.

Artificial moral agency

Before delving into the details of the computability of the capacities necessary for moral agency, I need to first define what I mean by an artificial moral agent. Generally speaking, the idea of agency denotes the capacity of an agent to act independently (Schlosser 2015). Moral agency, in turn, denotes the capacity of an agent to act independently in making morally charged decisions and taking morally charged actions, and to bear a level of responsibility and accountability for the consequences of those decisions and actions (Parthemore and Whitby 2014). Moral agency implies a certain understanding and knowledge of what is good and what is bad (morality) and being able to discern what is right from what is wrong (ethics). Moral agency should not be confused with moral goodness or ethical uprightness. Its emphasis is on the agent’s ability to be responsible for its decisions and actions, regardless of whether those actions are evaluated as morally good or bad.

The definition above gives us a good idea of the notion of moral agency, but it does not address who or what can be included in the class of moral agents. How the concept of moral agency is framed is important because asking who is a moral agent already presupposes personhood, which is generally taken to be embodied in human beings. Parthemore and Whitby (2014) suggest framing the question more broadly by asking “when is any agent a moral agent?”. Such open-ended framing of the question allows one to consider a wider set of agents for inclusion in the class of moral agents. When one asks the question in this way, three broad categories of moral agents seem to emerge from the literature. These categories are: biological moral agents (Torrance 2008; Churchland 2014; Liao 2010; Rottschaefer 2000); conscious moral agents (Parthemore and Whitby 2013, 2014; Himma 2009); and artificial moral agents (Abney 2012; Scheutz and Malle 2017; Floridi and Sanders 2004; Sullins 2006; Moor 2006; Johnson 2006). I will place my focus on artificial moral agents.

The proponents of artificial moral agency can be further subdivided into two groups. The first group argues that most, if not all, of the full range of moral decisions can be computed by some near-term or future artificial agent (Abney 2012; Sullins 2006; Allen and Wallach 2012). The second group argues that only certain kinds of moral decisions can be computed using current approaches to AI and that the full range of moral decisions will require super-rational capacities (Scheutz and Malle 2017; Johnson 2006). Let us call the former group of views strong machine ethics, and the latter weak machine ethics. Strong machine ethics refers to the argument that moral agency can likely be fully achieved with an appropriate level of (computational) intelligence. On the other hand, weak machine ethics refers to the argument that full moral agency, at least in its historic and somewhat anthropomorphic roots (Torrance 2013), will not be achieved using current computational approaches to AI. As a result, robots will only have a pseudo or functional morality. I will consider definitions of artificial moral agency from both the strong and weak machine ethics perspectives.

Given this context, I can now discuss my candidate definition for artificial moral agency. To do this, it is essential to understand that current approaches to machine ethics are primarily computational, i.e. they are dealing with computational morality. Outside of significant advances in new approaches to designing artificial agents, it seems unlikely that this will change soon. Even those that recognise that some notion of consciousness will be required for general intelligence (Franklin 2003), and indeed full moral agency (Wallach et al. 2011), are only working towards functional approximations of it—mostly using a combination of cognitive architectures and computational implementations (Franklin et al. 2014; Lucentini and Gudwin 2015). The nature of machine ethics implementations, it would seem, will remain almost certainly computational, at least for the foreseeable future.

The definition of moral agency given by Parthemore and Whitby (2014, p. 1) serves as a good reference. However, the previous discussion showed that different people mean different things when they use the term ‘moral agent’. What is important for researchers and designers in machine ethics is to state clearly in which sense we mean the term ‘moral agent’ and to specify exactly what our definition of it is. To illustrate, I define artificial moral agency (in the weak sense) by modifying Parthemore and Whitby’s definition as follows:

An artificial moral agent is a computationally-based agent that one appropriately holds responsible for its actions and their consequences, and artificial moral agency is the distinct type of agency that such an agent possesses.

I refer to moral agency in the weak sense, meaning that I believe not all moral decisions can be made rationally—super-rational capacities are required for others. A strong machine ethics view of artificial moral agency can also be defined and clarified by following a similar process. The definition of artificial moral agency above is somewhat ontological in that it emphasises the nature of the agent. However, in theory, a definition based on the agent’s moral capability could also be derived. Thankfully, Moor (2006) has already developed a taxonomy that helps characterise the level of ethical capability in artificial agents.

Moor describes four different kinds of AMAs, each according to capability. These four kinds are (in order of increasing ethical capability): ethical impact agents; implicit ethical agents; explicit ethical agents; and full ethical agents. Though a full examination of Moor’s taxonomy is outside of the scope of this article, I submit that my sample definition is quite consistent with what Moor calls an explicit ethical agent. For the remainder of this article, I will use the sample definition of artificial moral agency stated above (in the weak sense), complemented by the use of the term explicit ethical agent, to be what I mean when referring to an AMA.

Artificial moral agency within a framework of computational rationality

I now need to show how the concept of artificial moral agency is compatible with a framework of computational rationality. Firstly, I will argue that the capacities necessary for moral agency lend themselves naturally to being computable. Secondly, I will argue that many of the problems computational rationality was envisaged to solve are also present in computational morality, and that these same problems can be addressed through a framework of computational rationality in the tradition of Gershman et al. (2015), Horvitz (1987), Russell and Subramanian (1995), and others. Let me begin by examining the claim that the capacities required for moral agency can be computed.

So far, I have avoided stating which capacities are required for moral agency. In the literature, these capacities can include emotions, empathy, free will, rationality, cognition (including mental and intentional states), concepts, awareness, amongst others (Wallach et al. 2011; Parthemore and Whitby 2013, 2014; Himma 2009; Torrance 2008). One way to get around this issue is to focus on what these various capacities give you as a result. In other words, instead of arguing about which capacities (and combinations thereof) will result in some facet of moral agency, focus on the outcome that is expected to be achieved. This is precisely what philosophers such as Sullins (2006) and Floridi and Sanders (2004) do by focusing on the top-level requirements for artificial moral agency and abstracting away the detail regarding the exact capacities required. I choose to focus on the requirements expressed by Floridi and Sanders because they conceptualise artificial moral agency within a weak machine ethics framework, as opposed to Sullins, who conceptualises it within a strong machine ethics framework.

Floridi and Sanders define the requirements for artificial moral agency as interactivity (being aware of and responsive to environmental stimuli), adaptability (the ability to change internal states according to environmental stimuli) and autonomy (the ability to change internal states according to the agent’s own transition rules, independently of environmental stimuli) (Floridi and Sanders 2004). Focusing on the top-level requirements for moral agency and abstracting away details around required capacities is essentially a focus on ‘mindless’ morality—a form of morality that distinctly suits a computational framing of moral agency. It does not care how autonomy or intentionality, for instance, are achieved—it only cares that they are achieved. This is precisely what Floridi and Sanders (2004) are alluding to when they talk about moral agency at different levels of abstraction (LoA). At a low enough LoA, a human being would also not be considered a moral agent since we would be dealing with their biological make-up, the neurobiological processes in their brains and other cognitive processes which, at that level, would seem indistinguishable from a machine.

Similarly, artificial agents observed at a low enough LoA are simply electronic components and code, and at that level, we cannot decide on moral agency. However, at a high enough LoA, these low-level processes and components are abstracted such that we only see the outcomes of their decisions. We wouldn’t ordinarily know how exactly the AMA functions, only that it seems to have goals and intentions, can function autonomously, and can learn new things over time. At that LoA, we would be forced to admit that the robot acts in a manner that is consistent with our expectations of moral agents (Coeckelbergh 2014).
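To give this reading a concrete, if simplified, shape, the sketch below expresses interactivity, adaptability and autonomy as purely computational properties of an agent observed at a suitable LoA. The class, its methods and its internal state are my own illustrative assumptions and not a formalism drawn from Floridi and Sanders.

```python
# A minimal, hypothetical sketch of Floridi and Sanders' three requirements read
# as computational properties. Everything here is an illustrative assumption.

class CandidateMoralAgent:
    def __init__(self):
        self.learned_values = {}   # stimulus -> learned moral appraisal
        self.internal_clock = 0

    # Interactivity: sense an environmental stimulus and respond to it.
    def respond(self, stimulus):
        return self.learned_values.get(stimulus, "no-preference")

    # Adaptability: environmental feedback changes the agent's internal state.
    def adapt(self, stimulus, feedback):
        self.learned_values[stimulus] = feedback

    # Autonomy: the state also changes by the agent's own transition rules,
    # independently of any current stimulus.
    def tick(self):
        self.internal_clock += 1


# Observed at this level of abstraction, we only see stimuli going in and
# responses coming out, not how the internal transitions are realised.
agent = CandidateMoralAgent()
agent.adapt("person_in_crosswalk", "yield")
print(agent.respond("person_in_crosswalk"))   # yield
```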

Floridi and Sanders’ approach to defining the requirements for artificial moral agency is not without its critiques, the strongest of which likely comes from Himma (2009). He argues that, under Floridi and Sanders’ formulation, rattlesnakes, for example, could be wrongly considered to be moral agents. If, as Himma’s example goes, the rattlesnake acts in response to hunger and kills something, then it would have acted autonomously, certainly interactively, and apparently with some ability to learn. The crux of Himma’s argument seems to be that only praise- or blameworthy agents could be moral agents. There are two issues with Himma’s argument, especially as it pertains to artificial moral agency.

Firstly, Himma’s argument presupposes that discourse around moral agency is equivalent to responsibility analysis and that no room exists for prescriptive discourse in the identification of moral agents (Floridi and Sanders 2004). Secondly, and to use his example, the rattlesnake would not qualify as a moral agent, according to Floridi and Sanders’ requirements, because it cannot learn moral values. It is only responding to instinct.

An artificial agent, on the other hand, can be programmed to simulate the capacity to learn (morally), and thus could qualify as an AMA. How good an AMA it will be (i.e. responsibility analysis) is a different matter altogether, and will require us to build models of computational morality and to evaluate them. To be clear, without consciousness or intentional/unconscious mental states, the AMA could not be a full moral agent, but that is why we put the qualifier ‘artificial’ in front of ‘moral agent’. In theory, its moral performance will lie somewhere between that of a rattlesnake and that of a full moral agent such as a human being (Moor 2006).

I have argued that the capacities required for moral agency, as expressed by Floridi and Sanders (2004), lend themselves to being computable. However, that is not the only reason that artificial moral agency is compatible with a framework of computational rationality. Computational rationality exists as a framework primarily because artificial agents are not perfectly rational. They face many internal and external constraints, such as limited computational resources, limited information about the problem at hand, limited time (and space) within which to make a decision, the tractability of the problem itself, and so on.

As it turns out, AMAs face many of the same constraints and limitations as computationally rational agents. AMAs have to make moral decisions despite limitations of computational resources, information, time, and the tractability of the moral decision itself. I posit that the problem of computational morality is simply a special case, albeit a complex one, of computational rationality, and that many of the approaches to solving computational rationality in the general case can be used to further enhance the prospects for computational morality.

For example, the emergence of hybrid approaches, i.e. combinations of model-based (top-down) and model-free (bottom-up) methods, as a superior choice for certain complex tasks in computational rationality (Gershman et al. 2015), and the fact that prominent researchers in machine ethics believe that a combination of top-down and bottom-up approaches will likely be required to solve certain kinds of complex moral decisions (Allen et al. 2005), lends further credence to the idea that the two domains are more related than different. Just as Russell and Subramanian (1995) popularised the concept of a bounded-optimal agent, perhaps it is time to start talking about bounded-optimal artificial moral agents, i.e. AMAs that arrive at moral decisions based on the information they can acquire from the environment, given the limitations of their software architectures and programming. A toy sketch of the hybrid idea follows below; after that, I will briefly discuss a basic conceptual model for a computationally rational AMA.
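The following toy sketch is only meant to fix the hybrid idea: a bottom-up, learned (model-free) score proposes actions, while top-down, explicit rules filter out impermissible ones. The actions, scores, and rule are invented for illustration and do not come from any of the cited systems.

```python
# A toy illustration of a hybrid (top-down + bottom-up) choice. All actions,
# scores and rules below are invented assumptions for illustration only.

def hybrid_choice(candidate_actions, learned_score, hard_constraints):
    """Return the best-scoring action that violates no explicit top-down rule."""
    permitted = [a for a in candidate_actions
                 if all(rule(a) for rule in hard_constraints)]
    if not permitted:
        return None   # defer to a human or a safe default if nothing is permitted
    return max(permitted, key=learned_score)


actions = ["swerve_left", "swerve_right", "brake_hard"]
learned = {"swerve_left": 0.7, "swerve_right": 0.9, "brake_hard": 0.5}.get
no_oncoming_lane = lambda a: a != "swerve_right"   # assumed top-down prohibition

print(hybrid_choice(actions, learned, [no_oncoming_lane]))   # swerve_left
```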

A model for an optimally-bounded, computationally rational AMA

The proposed model for a computationally rational AMA is based on the idea of an optimally-bounded, computationally rational agent that has been discussed thus far. I openly base the model on Russell and Norvig (2009, p. 55) (see Fig. 1), whose conception of a general learning agent is simple yet comprehensive. The ideas in computational rationality can be integrated into the model of any general artificial and intelligent agent, so long as its key tenets, such as bounded-optimality, the separation of meta-reasoning from specific algorithms for reasoning, and the use of formal and heuristic methods, are preserved. Figure 1 depicts the structure of a general learning agent which can perform certain actions in an environment, through its sensors and actuators, according to a set performance standard (perhaps set by a human being). The general learning agent can also improve its decision making and performance capability over time, and generate new problems (goals) that can help it to improve performance further and learn new ways to reason.

Fig. 1
figure 1

A generic representation of a general learning agent (Russell and Norvig 2009)

I present Fig. 2 as a proposed high-level conceptual model for a computationally rational AMA. The agent gathers bounded information from the environment and processes it in the ethical performance element, which is responsible for ethical as well as general reasoning. The decisions and actions from this element are then transferred back to the environment (via the relevant actuators and communication mechanisms). The learning element and problem generator are left as-is from Russell and Norvig’s conception. They are responsible for updating the performance element with new ways to reason and for generating new ideas for future performance, respectively. The critic element is also similar, except that, in addition to allowing external input (e.g. human input) to modify the performance of the agent, it also allows the agent to provide a human-understandable rationale for its performance.

Fig. 2
figure 2

A conceptual model for an optimally-bounded, computationally rational AMA

Figure 3 zooms in on the ethical performance element, where the ethical meta-reasoner is responsible for deciding on the best ethical framework (or combination thereof) and one or more programs to execute in order to arrive at an optimally-bounded ethical decision. The ethical performance element thus separates high-level meta-reasoning activities from their execution. However, it still exposes the ethical meta-reasoner to the information from the environment, to allow it to make the optimal choice of execution strategy. At a high level, the proposed AMA would meet the requirements of interactivity (it can receive information from the environment and act on it), adaptability (it can change its performance state through the ethical performance and learning elements), and autonomy (it can behave in a somewhat autonomous manner through the problem generator, which generates new ideas about how to execute performance in the future). Additionally, it can receive a new performance standard and explain its current performance to a human being. A speculative sketch of this element in code follows Fig. 3.

Fig. 3
figure 3

A detailed view of the ethical performance element
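The following is a speculative sketch of how the ethical performance element of Figs. 2 and 3 might be organised in code. The program names, frameworks, and the quality/cost scoring are placeholder assumptions of mine rather than part of the proposed model; the point is only the separation of ethical meta-reasoning from the programs it selects and executes under a resource bound.

```python
# A speculative sketch of the ethical performance element, not a working AMA.
# The scoring, budget handling, and example programs are placeholder assumptions.

from dataclasses import dataclass, field
from typing import Callable, Dict, List

Percept = Dict[str, float]   # bounded information gathered from the environment
Decision = str

@dataclass
class EthicalProgram:
    name: str
    framework: str                          # e.g. a rule-based or outcome-based scheme
    run: Callable[[Percept], Decision]
    expected_quality: float                 # estimated quality of its decisions
    expected_cost: float                    # estimated computational cost (seconds)

@dataclass
class EthicalPerformanceElement:
    programs: List[EthicalProgram]
    rationale: List[str] = field(default_factory=list)

    def meta_reason(self, percept: Percept) -> EthicalProgram:
        """Ethical meta-reasoner: pick the program with the best quality/cost
        trade-off that fits the time budget implied by the percept."""
        budget = percept.get("time_budget", 1.0)
        feasible = [p for p in self.programs if p.expected_cost <= budget] or self.programs
        return max(feasible, key=lambda p: p.expected_quality - p.expected_cost)

    def act(self, percept: Percept) -> Decision:
        program = self.meta_reason(percept)
        decision = program.run(percept)
        # Stored so the critic can surface a human-readable rationale later on.
        self.rationale.append(f"{program.framework}:{program.name} -> {decision}")
        return decision


element = EthicalPerformanceElement(programs=[
    EthicalProgram("cautious_rules", "deontic", lambda p: "stop", 0.7, 0.05),
    EthicalProgram("utility_search", "consequentialist", lambda p: "proceed", 0.9, 2.0),
])
print(element.act({"time_budget": 0.1}))   # stop: only the cheap program fits the budget
```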

With regard to potential limitations of the model, I have argued in earlier sections that the AMA is conceptualised to have weak machine ethics (Sect. 3). As such, we can expect that the AMA would only be capable of making some, but not all, moral decisions. At this stage, it would be difficult to determine which moral decisions it would be able to make, and such a determination lies outside the scope of this article. However, I can speculate that moral decision making in situations where (bounded) information is readily available and accessible to the AMA should be theoretically possible. Such contexts could include highly domain-specific environments, such as self-driving cars, healthcare robots, loan approval bots, home assistants, and the like.

The model depends heavily on the availability of bounded information. Thus, I expect that moral decisions requiring little to no external information (i.e. abstract decision-making) would be difficult to compute, at least initially, until the AMA learns a sufficient representation of moral values. Furthermore, there is the general issue (not necessarily a limitation, but an unknown) of how the model would internally represent its learned moral values, and how this would map to actions that affect real agents in the real world.

Conclusion

The purpose of this article was to advance an argument and a model for artificial moral agency based on a framework of computational rationality. This was done by showing that computational rationality can be an integrative framework that combines the scientific and philosophical elements of artificial moral agency in a consistent and logical manner. In particular, I argued that the capacities required for artificial moral agency, as well as the aspects of functional consciousness that underpin them, are computable. I further argued that computational morality is a special, albeit complex, case of computational rationality, and hence that many techniques originally developed for general rationality can be adapted for computational morality. I then briefly proposed a conceptual model for a bounded-optimal, computationally rational AMA.

Some philosophers and scientists might reject the idea of a bounded-optimal artificial moral agent. After all, the stakes can be quite high when it comes to moral decision making, as the wrong decision could have significant moral and societal implications. However, we need to start somewhere, and I suggest that starting from a weak machine ethics perspective allows us to begin to test its limits and the sorts of domains and contexts where it can be applied. The model proposed is an invitation for dialogue and feedback, and the hope is that many philosopher-developer pairs can be formed to solve the problem of constraining weak AI systems and making them more respecting of human moral values.

I have specifically chosen to omit mentions of the ethical frameworks that the AMA should follow, as the main purpose of this article was to locate artificial moral agency within a framework of computational rationality. Future research needs to focus on the kinds of ethical frameworks an optimally-bounded, computationally rational AMA ought to follow. Further research into appropriate software architectures for the AMA, and the type of programs that can form part of the ethical performance element’s program space, is also required.