Authors: Erik Brynjolfsson, Andrew McAfee
• When is the next Halley’s Comet?
Responded, “You have no meetings matching Halley’s.”
• I want to go to Lake Superior.
Responded with directions to the company Lake Superior X-Ray.
9
Siri’s sometimes bizarre and frustrating responses became well known, but the power of the technology is undeniable. It can come to your aid exactly when you need it. On the same trip that afforded us some time in an autonomous car, we saw this firsthand. After a meeting in San Francisco, we hopped in our rental car to drive down to Google’s headquarters in Mountain View. We had a portable GPS device with us, but didn’t plug it in and turn it on because we thought we knew how to get to our next destination.
We didn’t, of course. Confronted with an Escherian maze of elevated highways, off-ramps, and surface streets, we drove around looking for an on-ramp while tensions mounted. Just when our meeting at Google, this book project, and our professional relationship seemed in serious jeopardy, Erik pulled out his phone and asked Siri for “directions to U.S. 101 South.” The phone responded instantly and flawlessly: the screen turned into a map showing where we were and how to find the elusive on-ramp.
We could have pulled over, found the portable GPS and turned it on, typed in our destination, and waited for our routing, but we didn’t want to exchange information that way. We wanted to speak a question and hear and see (because a map was involved) a reply. Siri provided exactly the natural language interaction we were looking for. A 2004 review of the previous half-century’s research in automatic speech recognition (a critical part of natural language processing) opened with the admission that “Human-level speech recognition has proved to be an elusive goal,” but less than a decade later major elements of that goal have been reached. Apple and other companies have made robust natural language processing technology available to hundreds of millions of people via their mobile phones.
10
As noted by Tom Mitchell, who heads the machine-learning department at Carnegie Mellon University: “We’re at the beginning of a ten-year period where we’re going to transition from computers that can’t understand language to a point where computers can understand quite a bit about language.”
11
Digital Fluency: The Babel Fish Goes to Work
Natural language processing software is still far from perfect, and computers are not yet as good as people at complex communication, but they’re getting better all the time. And in tasks like translation from one language to another, surprising developments are underway: while computers’ communication abilities are not as deep as those of the average human being, they’re much broader.
A person who speaks more than one language can usually translate between them with reasonable accuracy. Automatic translation services, on the other hand, are impressive but rarely error-free. Even if your French is rusty, you can probably do better than Google Translate with the sentence “Monty Python’s ‘Dirty Hungarian Phrasebook’ sketch is one of their funniest ones.” Google offered, “Sketch des Monty Python ‘Phrasebook sale hongrois’ est l’un des plus drôles les leurs.” This conveys the main gist, but has serious grammatical problems.
There is less chance you could have made progress translating this sentence (or any other) into Hungarian, Arabic, Chinese, Russian, Norwegian, Malay, Yiddish, Swahili, Esperanto, or any of the other sixty-three languages besides French that are part of the Google Translate service. But Google will attempt a translation of text from any of these languages into any other, instantaneously and at no cost for anyone with Web access.
12
The Translate service’s smartphone app lets users speak more than fifteen of these languages into the phone and, in response, will produce synthesized, translated speech in more than half of the fifteen. It’s a safe bet that even the world’s most multilingual person can’t match this breadth.
For years instantaneous translation utilities have been the stuff of science fiction (most notably The Hitchhiker’s Guide to the Galaxy’s Babel Fish, a strange creature that once inserted in the ear allows a person to understand speech in any language).
13
Google Translate and similar services are making it a reality today. In fact, at least one such service is being used right now to facilitate international customer service interactions. The translation services company Lionbridge has partnered with IBM to offer GeoFluent, an online application that instantly translates chats between customers and troubleshooters who do not share a language. In an initial trial, approximately 90 percent of GeoFluent users reported that it was good enough for business purposes.
14
Human Superiority in Jeopardy!
Computers are now combining pattern matching with complex communication to quite literally beat people at their own games. In 2011, the February 14 and 15 episodes of the TV game show Jeopardy! included a contestant that was not a human being. It was a supercomputer called Watson, developed by IBM specifically to play the game (and named in honor of legendary IBM CEO Thomas Watson, Sr.). Jeopardy! debuted in 1964 and in 2012 was the fifth most popular syndicated TV program in America.
15
On a typical day almost 7 million people watch host Alex Trebek ask trivia questions on various topics as contestants vie to be the first to answer them correctly.
*
The show’s longevity and popularity stem from its being easy to understand yet extremely hard to play well. Almost everyone knows the answers to some of the questions in a given episode, but very few people know the answers to almost all of them. Questions cover a wide range of topics, and contestants are not told in advance what those topics will be. Players also have to be simultaneously fast, bold, and accurate—fast because they compete against one another for the chance to answer each question; bold because they have to try to answer a lot of questions, especially harder ones, in order to accumulate enough money to win; and accurate because money is subtracted for each incorrect answer.
Jeopardy!’s producers further challenge contestants with puns, rhymes, and other kinds of wordplay. A clue might ask, for example, for “A rhyming reminder of the past in the city of the NBA’s Kings.”
16
To answer correctly, a player would have to know what the acronym NBA stood for (in this case, it’s the National Basketball Association, not the National Bank Act or chemical compound n-Butylamine), which city the NBA’s Kings play in (Sacramento), and that the clue’s demand for a rhyming reminder of the past meant that the right answer is “What is a Sacramento memento?” instead of a “Sacramento souvenir” or any other factually correct response. Responding correctly to clues like these requires mastery of pattern matching and complex communication. And winning at Jeopardy! requires doing both things repeatedly, accurately, and almost instantaneously.
During the 2011 shows, Watson competed against Ken Jennings and Brad Rutter, two of the best knowledge workers in this esoteric industry. Jennings won Jeopardy! a record seventy-four times in a row in 2004, taking home more than $3,170,000 in prize money and becoming something of a folk hero along the way.
17
In fact, Jennings is sometimes given credit for the existence of Watson.
18
According to one story circulating within IBM, Charles Lickel, a research manager at the company interested in pushing the frontiers of artificial intelligence, was having dinner in a steakhouse in Fishkill, New York, one night in the fall of 2004. At 7 p.m., he noticed that many of his fellow diners got up and went into the adjacent bar. When he followed them to find out what was going on, he saw that they were clustered in front of the bar’s TV watching Jennings extend his winning streak beyond fifty matches. Lickel saw that a match between Jennings and a Jeopardy!-playing supercomputer would be extremely popular, in addition to being a stern test of a computer’s pattern matching and complex communication abilities.
Since Jeopardy! is a three-way contest, the ideal third contestant would be Brad Rutter, who beat Jennings in the show’s 2005 Ultimate Tournament of Champions and won more than $3,400,000.
19
Both men had packed their brains with information of all kinds, were deeply familiar with the game and all of its idiosyncrasies, and knew how to handle pressure.
These two humans would be tough for any machine to beat, and the first versions of Watson weren’t even close. Watson could be ‘tuned’ by its programmers to be either more aggressive in answering questions (and hence more likely to be wrong) or more conservative and accurate. In December 2006, shortly after the project started, when Watson was tuned to try to answer 70 percent of the time (a relatively aggressive approach) it was only able to come up with the right response approximately 15 percent of the time. Jennings, in sharp contrast, answered about 90 percent of questions correctly in games when he buzzed in first (in other words, won the right to respond) 70 percent of the time.
20
But Watson turned out to be a very quick learner. The supercomputer’s performance on the aggression vs. accuracy tradeoff improved quickly, and by November 2010, when it was aggressive enough to win the right to answer 70 percent of a simulated match’s total questions, it answered about 85 percent of them correctly. This was an impressive improvement, but it still didn’t put the computer in the same league as the best human players. The Watson team kept working until mid-January of 2011, when the matches were recorded for broadcast in February, but no one knew how well their creation would do against Jennings and Rutter.
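The aggression vs. accuracy tradeoff can be made concrete with a small sketch. It assumes, purely for illustration, that a player assigns each clue a confidence score and attempts an answer only when that score clears a threshold; lowering the threshold means answering more often but being wrong more often. The function name, the threshold values, and the ten scored clues are all invented for this example and do not describe how Watson actually worked.

```python
from typing import List, Tuple


def attempt_stats(scored: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """Return (fraction of clues attempted, fraction of attempts answered correctly).

    `scored` pairs a hypothetical confidence score with whether the
    answer would have been right. A clue is attempted only when its
    confidence is at or above `threshold`.
    """
    attempted = [correct for confidence, correct in scored if confidence >= threshold]
    if not attempted:
        return 0.0, 0.0
    fraction_attempted = len(attempted) / len(scored)
    precision = sum(attempted) / len(attempted)
    return fraction_attempted, precision


# Invented data: (confidence, would the answer have been correct?)
clues = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False), (0.60, True),
    (0.50, False), (0.40, True), (0.30, False), (0.20, False), (0.10, False),
]

# A conservative setting and an aggressive one (both thresholds are arbitrary).
for threshold in (0.75, 0.35):
    attempted, precision = attempt_stats(clues, threshold)
    print(f"threshold={threshold:.2f} attempted={attempted:.0%} correct={precision:.0%}")
```

With this toy data, the conservative setting attempts fewer clues and gets more of them right, while the aggressive setting attempts 70 percent of the clues at lower accuracy. Improving a player, in these terms, means raising accuracy at the same level of aggressiveness.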
Watson trounced them both. It correctly answered questions on topics ranging from “Olympic Oddities” (responding “pentathlon” to “A 1976 entry in the ‘modern’ this was kicked out for wiring his epee to score points without touching his foe”) to “Church and State” (realizing that the answers all contained one or the other of these words, the computer answered “gestate” when told “It can mean to develop gradually in the mind or to carry during pregnancy”). While the supercomputer was not perfect (for example, it answered “chic” instead of “class” when asked about “stylish elegance, or students who all graduated in the same year” as part of the category “Alternate Meanings”), it was very good.
Watson was also extremely fast, repeatedly buzzing in before Jennings and Rutter to win the right to answer questions. In the first of the two games played, for example, Watson buzzed in first 43 times, then answered correctly 38 times. Jennings and Rutter combined to buzz in only 33 times over the course of the same game.
21
At the end of the two-day tournament, Watson had amassed $77,147, more than three times as much as either of its human opponents. Jennings, who came in second, added a personal note on his answer to the tournament’s final question: “I for one welcome our new computer overlords.” He later elaborated, “Just as factory jobs were eliminated in the twentieth century by new assembly-line robots, Brad and I were the first knowledge-industry workers put out of work by the new generation of ‘thinking’ machines. ‘Quiz show contestant’ may be the first job made redundant by Watson, but I’m sure it won’t be the last.”
22
The Paradox of Robotic ‘Progress’
A final important area where we see a rapid recent acceleration in digital improvement is robotics—building machines that can navigate through and interact with the physical world of factories, warehouses, battlefields, and offices. Here again we see progress that was very gradual, then sudden.
The word robot entered the English language via the 1921 Czech play, R.U.R. (Rossum’s “Universal” Robots) by Karel Capek, and automatons have been an object of human fascination ever since.
23
During the Great Depression, magazine and newspaper stories speculated that robots would wage war, commit crimes, displace workers, and even beat boxer Jack Dempsey.
24
Isaac Asimov coined the term robotics in 1941 and provided ground rules for the young discipline the following year with his famous Three Laws of Robotics:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
25
Asimov’s enormous influence on both science fiction and real-world robot-making has persisted for seventy years. But one of those two communities has raced far ahead of the other. Science fiction has given us the chatty and loyal R2-D2 and C-3PO, Battlestar Galactica’s ominous Cylons, the terrible Terminator, and endless varieties of androids, cyborgs, and replicants. Decades of robotics research, in contrast, gave us Honda’s ASIMO, a humanoid robot best known for a spectacularly failed demo that showcased its inability to follow Asimov’s third law. At a 2006 presentation to a live audience in Tokyo, ASIMO attempted to walk up a shallow flight of stairs that had been placed on the stage. On the third step, the robot’s knees buckled and it fell over backward, smashing its faceplate on the floor.
26
ASIMO has since recovered and demonstrated skills like walking up and down stairs, kicking a soccer ball, and dancing, but its shortcomings highlight a broad truth: a lot of the things humans find easy and natural to do in the physical world have been remarkably difficult for robots to master. As the roboticist Hans Moravec has observed, “It is comparatively easy to make computers exhibit adult-level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.”
27