In 1950 Alan Turing famously posed a litmus test for machine intelligence: a person conversing through typed messages, much as in a modern computer chat, must determine whether the other party is human or machine. If the difference can't be detected, then the machine has achieved "intelligence."
A passing grade on the Turing test is now often conflated with “the Singularity,” a label made most famous by the futurist and technology pioneer Ray Kurzweil for the moment when “strong AI” is achieved — that is, when a machine equals human intelligence in all its many manifestations.
There are highly regarded voices, among them the science sophisticates Stephen Hawking and Elon Musk, cautioning that we are seeing the beginning of the end. Their primary concern is that an exponentially fast, snowballing self-improvement cycle of intelligence will result in a "superintelligence" that will very possibly have no need for its relatively slow-witted forebears, i.e., humans. And in the short run, they fear that the human tendency to cede work and responsibility to machines might lead to dystopian scenarios, like the creation of amoral autonomous weapons, the subject of a recent open letter calling for "a ban on offensive autonomous weapons beyond meaningful human control." The letter currently has more than 1,000 signatories, including Musk, Hawking, and many AI researchers.
What does it mean for a machine to have achieved human intelligence? As controversies over IQ tests and the SAT show, we humans can be notoriously poor evaluators of intelligence, and most broad definitions or measuring sticks fail in one way or another. Early research in AI focused on narrow intellectual tasks like chess and checkers. Then researchers began to realize that there were all kinds of intelligences posing huge challenges to engineering and mathematics. There's the physical intelligence of being able to pick your way through brush in the woods, shake another person's hand, or pick up an egg. There's the social intelligence required to understand and carry on a conversation, make a joke, or realize that the person you are speaking with is sad or happy. And there's the intelligence of nonhuman species, for whom arithmetic testing is a nonstarter and a rather unintelligent anthropomorphism.
The Turing test label might be most constructively attached to tasks around social or emotional intelligence. After all, a computer that plays chess like a human would lose some of the time, and in very un-computer-like ways. A machine that does arithmetic like a human would make a mistake. For analytical and some physical tasks, the goal is to beat the human, not to imitate one. These tasks are gauged not by the Turing test but by another kind; with a nod to Hollywood, let's call it the Terminator test. The Terminator test gauges not whether an intelligence is a convincing likeness of a human's but whether it replaces, and possibly surpasses, a human's. A rapid cycle of self-improvement along Terminator-test dimensions does not necessarily spell the disaster-movie end of mankind; it may just mean faster and cheaper chess software.
The original Turing test requires an algorithm capable of generating humanlike textual responses on a screen. This is a highly restricted area of human performance, one in which we have made great advances over the years. That said, here too the "rules" are unclear, and they are certainly a cultural moving target. A Turing test for a generation raised on text messaging and tweets might be very different from a test for Turing's midcentury generation of British public-school students.
The engineering challenges around intelligence have paid extraordinary dividends. Better and better chess programs required new advances in search techniques. IBM's Watson isn't just a great Jeopardy! player; it also embodies important advances in machine learning. On the emotional and social side, the challenges of the original Turing test have generated a huge amount of important research in machine learning and related technology (natural language processing, voice recognition, and voice synthesis software), as well as a greater understanding of how the mind works in particular domains.
Some of the most extensive AI efforts are now being directed toward the "visual Turing test." The traditional test would require a machine to respond to a sequence of yes-no questions about an image, which would, in all likelihood, only weakly distinguish human from machine. But at some point, the challenge will require a machine to describe what's going on in a photograph, whether interactively or on its own. That is wildly complicated: image-processing software is getting better and better at recognizing the objects in an image, but understanding their interactions is a whole different ballgame. Are two people talking or fighting? Is the dog opening his mouth for a biscuit or about to bite your hand? Are you putting the cup on the table or taking it off? Understanding and communicating visual language, the effortless way in which we describe scenes to one another, is an extraordinary cognitive accomplishment.
Progress on the visual Turing test will yield all kinds of dividends, some creepy (cue Big Brother), some helpful, and some mind-blowing. Recently researchers at the University of California at San Diego and the University of Toronto determined that machine-learning algorithms are better than humans at detecting real pain in human facial expressions: humans got it right 55 percent of the time, while the machines performed at 85 percent accuracy. Meet the empathic Terminator.
The arguments we seem to be having about artificial intelligence are not unlike the conversations about technology in the 1950s, or those we continue to have about genetic engineering.
The example of genetics is instructive, for despite foolish and unscientific abuses like eugenics and forced sterilization, research continues apace. That is largely because we hope it will lead to a world of precise and painless medical intervention based on direct manipulation of the genome. And scientists and medical companies eagerly anticipate the accompanying payday.
The genetics community self-polices, as in a researcher-led National Academy of Sciences call for a moratorium on genome editing. It also populates governmental entities like the Presidential Commission for the Study of Bioethical Issues. Where are the analogous organizations for digital ethics? Various disciplines have long monitored the effects of mechanization on the work force and society. But few digital inventions have received the same level of scrutiny. Why?
One explanation is the focus on Turing’s question and the comparative neglect of the Terminator test. Intelligence is not unitary and need not be human or even in the service of human goals. We need to become sophisticated about a range of intelligences, including crowdsourced recommendation systems, surveillance systems, search engines, and, the focus of the open letter, autonomous weapons.
In a world increasingly lived online, these are the kinds of newly deployed intelligences that are shaping our actions and interactions. In some cases, such as automated movie queues and algorithmic music selection, they seem less likely to amplify or support human values (e.g., privacy) than to substitute for and eliminate them. They can provide incredible advantages, but they can also threaten to undermine judgment in the pursuit of economic efficiency. The National Security Agency has a host of reasons for wanting a transparent society, but so does Madison Avenue.
If the road to the Singularity is paved with Turing tests and Terminator tests, should we halt AI work? Oddly, the loudest, or at least the most publicized, voices are those of science pioneers who in one way or another (physically or financially) have benefited greatly from precisely the kinds of research they now cry out against. One might object to their remarks, which can sound like "this research approach was great for me, but I worry about you." But it is more likely that these pioneers have seen very clearly how the ethical and social worries are catching up to, and perhaps even overtaking, the huge profits that have been or could be made from such technologies.
But when you think about it, many of our current concerns relate not to the smartest technologies but to ubiquitous and relatively dumb ones. Preoccupation with the ill-defined sci-fi version of the Singularity is a distraction from the real and actionable, if difficult, conversations we should have about the steady collection of our personal data and its seamless inclusion in commercial transactions, and about autonomous weapons.
In mathematics we also have a notion of a singularity. It is effectively the mathematical definition of a phenomenon like Hawking's black holes: places of no escape pocking a landscape that is generally benign and easily, enjoyably navigable. The way in which we deal with these phenomena mathematically is to investigate all sorts of approaches to the danger zone, gently probing the curvature around the singularity to assess the nature of phenomena nearby.
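A minimal illustration, using the simplest textbook singularity rather than a black hole: the function \( f(x) = 1/x \) blows up at \( x = 0 \). We never evaluate it at the singularity itself; instead we probe from either side,

\[
\lim_{x \to 0^{+}} \frac{1}{x} = +\infty,
\qquad
\lim_{x \to 0^{-}} \frac{1}{x} = -\infty,
\]

and the two differing answers chart the terrain around the point of no escape without our ever stepping into it.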
We should apply to the AI Singularity the same contextual approach we use in math. When we do, we'll probably find that there's not one definitive Turing test but a spectrum of Turing and Terminator tests. We'll probably also find that, paradoxically, the Singularity isn't so singular after all: there is not one Singularity but many. And in investigating them, we'll have to engage multiple kinds of intelligence in assessing multiple kinds of intelligence.
To begin, let’s seek the humanity of our machines even as we investigate the machinery of our humanity.