Abstract
There is considerable interest in ethical designs for artificial intelligence (AI) that do not pose risks to humans. This paper proposes using elements of Hutter’s agent-environment framework to define a decision support system for simulating, visualizing and analyzing AI designs in order to understand their consequences. The simulations do not have to be accurate predictions of the future; rather, they show the futures that an agent design predicts will fulfill its motivations, which AI designers can explore to find risks to humans. In order to safely create a simulation model, this paper shows that the most probable finite stochastic program to explain a finite history is finitely computable, and that there is an agent that makes such a computation without any unintended instrumental actions. It also discusses the risks of running an AI in a simulated environment.
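The abstract's central computation, selecting the most probable finite stochastic program to explain a finite history, can be illustrated with a toy sketch. This is not the paper's algorithm: it restricts the candidate "programs" to a hypothetical set of Bernoulli models and scores each by a length-based prior 2^(-|p|) times its likelihood on the observed history, in the spirit of the algorithmic-probability approach of Hutter (2005) and Li and Vitanyi (1997). The candidate set and description lengths below are invented for illustration.

```python
import math

def log_score(history, p_one, description_bits):
    """log2 of (prior * likelihood) for a Bernoulli(p_one) model,
    with prior 2^(-description_bits)."""
    ll = 0.0
    for bit in history:
        pr = p_one if bit == 1 else 1.0 - p_one
        if pr == 0.0:
            return float("-inf")  # model assigns zero probability to history
        ll += math.log2(pr)
    return -description_bits + ll

def most_probable_model(history, candidates):
    """candidates: list of (p_one, description_bits) pairs.
    Returns the candidate maximizing prior-weighted likelihood."""
    return max(candidates, key=lambda c: log_score(history, c[0], c[1]))

# Hypothetical candidates: coarser probabilities get shorter descriptions.
candidates = [(0.5, 1), (0.25, 2), (0.75, 2), (0.9, 4), (0.1, 4)]
history = [1, 1, 0, 1, 1, 1, 0, 1]
best = most_probable_model(history, candidates)
```

Because the candidate set is finite and each score is computed in finitely many steps, the argmax is finitely computable, which is the flavor of the result the abstract claims for the far richer class of finite stochastic programs.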
References
Asimov, I.: Runaround. Astounding Science Fiction (1942)
Bostrom, N.: Ethical issues in advanced artificial intelligence. In: Smit, I., et al. (eds.) Cognitive, Emotive and Ethical Aspects of Decision Making in Humans and in Artificial Intelligence, vol. 2, pp. 12–17. Int. Inst. of Adv. Studies in Sys. Res. and Cybernetics (2003)
Bostrom, N.: The superintelligent will: Motivation and instrumental rationality in advanced artificial agents. Minds and Machines (forthcoming)
Chalmers, D.: The Singularity: A Philosophical Analysis. J. Consciousness Studies 17, 7–65 (2010)
Dewey, D.: Learning What to Value. In: Schmidhuber, J., Thórisson, K.R., Looks, M. (eds.) AGI 2011. LNCS (LNAI), vol. 6830, pp. 309–314. Springer, Heidelberg (2011)
Elliott, G.: US Nuclear Weapon Safety and Control. MIT Program in Science, Technology, and Society (2005), http://web.mit.edu/gelliott/Public/sts.072/paper.pdf
Ghahramani, Z.: Learning Dynamic Bayesian Networks. In: Giles, C.L., Gori, M. (eds.) IIASS-EMFCSC-School 1997. LNCS (LNAI), vol. 1387, pp. 168–197. Springer, Heidelberg (1998)
Goertzel, B.: Universal ethics: the foundations of compassion in pattern dynamics (2004), http://www.goertzel.org/papers/UniversalEthics.html
Hibbard, B., Santek, D.: The Vis5D system for easy interactive visualization. In: Proc. IEEE Visualization 1990, pp. 129–134 (1990)
Hibbard, B.: Super-intelligent machines. Computer Graphics 35(1), 11–13 (2001)
Hibbard, B.: The technology of mind and a new social contract. J. Evolution and Technology 17(1), 13–22 (2008)
Hibbard, B.: Temptation. Rejected for the AGI-09 Workshop on the Future of AI (2009), https://sites.google.com/site/whibbard/g/hibbard_agi09_workshop.pdf
Hibbard, B.: Model-based utility functions. J. Artificial General Intelligence 3(1), 1–24 (2012a)
Hibbard, B.: Avoiding Unintended AI Behaviors. In: Bach, J., Goertzel, B., Iklé, M. (eds.) AGI 2012. LNCS (LNAI), vol. 7716, pp. 107–116. Springer, Heidelberg (2012b), https://sites.google.com/site/whibbard/g/hibbard_agi12a.pdf
Hutter, M.: Universal artificial intelligence: sequential decisions based on algorithmic probability. Springer, Heidelberg (2005)
Hutter, M.: Feature reinforcement learning: Part I. Unstructured MDPs. J. Artificial General Intelligence 1, 3–24 (2009a)
Hutter, M.: Feature dynamic Bayesian networks. In: Goertzel, B., Hitzler, P., Hutter, M. (eds.) Proc. Second Conf. on AGI, AGI 2009, pp. 67–72. Atlantis Press, Amsterdam (2009b)
Kurzweil, R.: The singularity is near. Penguin, New York (2005)
Li, M., Vitanyi, P.: An introduction to Kolmogorov complexity and its applications. Springer, Heidelberg (1997)
Lloyd, S.: Computational Capacity of the Universe. Phys. Rev. Lett. 88, 237901 (2002)
Muehlhauser, L., Helm, L.: The singularity and machine ethics. In: Eden, Søraker, Moor, Steinhart (eds.) The Singularity Hypothesis: a Scientific and Philosophical Assessment. Springer, Heidelberg (2012)
Omohundro, S.: The basic AI drives. In: Wang, P., Goertzel, B., Franklin, S. (eds.) Proc. First Conf. on AGI, AGI 2008, pp. 483–492. IOS Press, Amsterdam (2008)
Puterman, M.L.: Markov Decision Processes - Discrete Stochastic Dynamic Programming. Wiley, New York (1994)
Ring, M., Orseau, L.: Delusion, Survival, and Intelligent Agents. In: Schmidhuber, J., Thórisson, K.R., Looks, M. (eds.) AGI 2011. LNCS (LNAI), vol. 6830, pp. 11–20. Springer, Heidelberg (2011)
Sutton, R.S., Barto, A.G.: Reinforcement learning: an introduction. MIT Press (1998)
Waser, M.: Designing a safe motivational system for intelligent machines. In: Baum, E., Hutter, M., Kitzelmann, E. (eds.) Proc. Third Conf. on AGI, AGI 2010, pp. 170–175. Atlantis Press, Amsterdam (2010)
Waser, M.: Rational Universal Benevolence: Simpler, Safer, and Wiser Than “Friendly AI”. In: Schmidhuber, J., Thórisson, K.R., Looks, M. (eds.) AGI 2011. LNCS (LNAI), vol. 6830, pp. 153–162. Springer, Heidelberg (2011)
Yudkowsky, E.: Coherent Extrapolated Volition (2004), http://www.sl4.org/wiki/CoherentExtrapolatedVolition
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Hibbard, B. (2012). Decision Support for Safe AI Design. In: Bach, J., Goertzel, B., Iklé, M. (eds) Artificial General Intelligence. AGI 2012. Lecture Notes in Computer Science(), vol 7716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35506-6_13
DOI: https://doi.org/10.1007/978-3-642-35506-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35505-9
Online ISBN: 978-3-642-35506-6