with Friendly AI issues. They themselves assume that if they are good-intentioned people the AIs they make are automatically good intentioned, and this is not true. It’s actually a very difficult mathematical and engineering problem. I think most of them are just insufficiently good at thinking of uncomfortable thoughts. They started out not thinking, ‘Friendly AI is a problem that will kill you.’”
Yudkowsky said that AI makers are infected by the idea of a blissful AI-enhanced future that lives in their imaginations. They have been thinking about it since the AI bug first bit them.
“They do not want to hear anything that contradicts that. So if you present unfriendly AI to them it bounces off. As the old proverb goes, most of the damage is done by people who wish to feel themselves important. Many ambitious people find it far less scary to think about destroying the world than to think about never amounting to much of anything at all. All the people I have met who think they are going to win eternal fame through their AI projects have been like this.”
These AI makers aren’t mad scientists or people any different from you and me—you’ll meet several in this book. But recall the availability bias from chapter 2. When faced with a decision, humans tend to choose the option that’s recent, dramatic, or otherwise front and center in their minds. Annihilation by AI isn’t generally available to AI makers. Not as available as making advances in their field, getting tenure, publishing, getting rich, and so on.
In fact, few AI makers, in contrast to AI theorists, are concerned with building Friendly AI. Of the dozen or so AI makers I’ve spoken with, only one is worried enough to work on Friendly AI or any other defensive measure. Maybe the thinkers overestimate the problem, or maybe the makers’ problem is not knowing what they don’t know. In a much-read online paper, Yudkowsky put it like this:
The human species came into existence through natural selection, which operates through the nonchance retention of chance mutations. One path leading to global catastrophe—to someone pressing the button with a mistaken idea of what the button does—is that Artificial Intelligence comes about through a similar accretion of working algorithms, with the researchers having no deep understanding of how the combined system works. [italics mine]
Not knowing how to build a Friendly AI is not deadly, of itself.… It’s the mistaken belief that an AI will be friendly which implies an obvious path to global catastrophe.
Assuming that human-level AIs (AGIs) will be friendly is wrong for a lot of reasons. The assumption becomes even more dangerous after the AGI’s intelligence rockets past ours, and it becomes ASI—artificial superintelligence. So how do you create Friendly AI? Or could you impose friendliness on advanced AIs after they’re already built? Yudkowsky has written a book-length online treatise about these questions entitled Creating Friendly AI: The Analysis and Design of Benevolent Goal Architectures. Friendly AI is a subject so dense yet important that it exasperates even its chief proponent, who says of it, “it only takes one error for a chain of reasoning to end up in Outer Mongolia.”
Let’s start with a simple definition. Friendly AI is AI that has a positive rather than a negative impact on mankind. Friendly AI pursues goals, and it takes action to fulfill those goals. To describe an AI’s success at achieving its goals, theorists use a term from economics: utility. As you might recall from Econ 101, consumers behaving rationally seek to maximize utility by spending their resources in the way that gives them the most satisfaction. Generally speaking, for an AI, satisfaction is gained by achieving goals, and an act that moves it toward achieving its goals has high “utility.”
Values and preferences in addition to goal satisfaction can be packed into an AI’s definition of utility, or “utility function.”
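To make the economics analogy concrete, here is a minimal, hypothetical sketch of an agent that ranks its available actions by a utility function. Everything in it (the actions, the values, the weights) is invented for illustration; it is not drawn from Yudkowsky’s treatise, only a toy picture of what “packing values and preferences into utility” could look like in code.

```python
# A toy utility-maximizing agent. All names and numbers are
# invented for illustration, not taken from any real AI system.

def utility(outcome, weights):
    """Score an outcome as a weighted sum of how well it satisfies
    each value the agent cares about."""
    return sum(weights[value] * degree for value, degree in outcome.items())

# Each candidate action maps to an outcome: the degree (0.0 to 1.0)
# to which it satisfies each value.
actions = {
    "finish_task":    {"goal_progress": 1.0, "human_safety": 0.0},
    "ask_permission": {"goal_progress": 0.3, "human_safety": 0.9},
}

# The weights encode the agent's values and preferences, not just raw
# goal satisfaction -- the "packing" described in the text. Weighting
# human_safety heavily changes which action wins.
weights = {"goal_progress": 1.0, "human_safety": 5.0}

best = max(actions, key=lambda a: utility(actions[a], weights))
print(best)  # "ask_permission": utility 4.8 beats finish_task's 1.0
```

With safety weighted at 5.0 the agent defers to a human; drop that weight to zero and “finish_task” wins instead, which is a small-scale picture of why what gets packed into the utility function matters so much.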