Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers

BOOK: Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers
2.41Mb size Format: txt, pdf, ePub

Nine Algorithms That
Changed the Future

THE INGENIOUS IDEAS THAT
DRIVE TODAY'S COMPUTERS

John MacCormick

P  R  I  N  C  E  T  O  N    U  N  I  V  E  R  S  I  T  Y    P  R  E  S  S    
P  R  I  N  C  E  T  O  N    A  N  D    O  X  F  O  R  D

Copyright © 2012 by Princeton University Press

Published by Princeton University Press,
41 William Street, Princeton, New Jersey 08540

In the United Kingdom: Princeton University Press,
6 Oxford Street, Woodstock, Oxfordshire OX20 1TW

All Rights Reserved

Library of Congress Cataloging-in-Publication Data

MacCormick, John, 1972–
   Nine algorithms that changed the future : the ingenious ideas that drive
   today's computers / John MacCormick.
      p.     cm.
   Includes bibliographical references and index.
   ISBN 978-0-691-14714-7 (hardcover : alk. paper)
   1. Computer science. 2. Computer algorithms.
   3. Artificial intelligence. I. Title.

QA76M21453 2012
006.3-dc22                      2011008867

A catalogue record for this book is available from the British Library

This book has been composed in Lucida using T
E
X

Typeset by T&T Productions Ltd, London

Printed on acid-free paper

press.princeton.edu

Printed in the United States of America

10  9  8  7  6  5  4  3  2  1

The world has arrived at an age of cheap complex devices of
great reliability; and something is bound to come of it.

—Vannevar Bush, “As We May Think,” 1945

FOREWORD

Computing is transforming our society in ways that are as profound as the changes wrought by physics and chemistry in the previous two centuries. Indeed, there is hardly an aspect of our lives that hasn't already been influenced, or even revolutionized, by digital technology. Given the importance of computing to modern society, it is therefore somewhat paradoxical that there is so little awareness of the fundamental concepts that make it all possible. The study of these concepts lies at the heart of computer science, and this new book by MacCormick is one of the relatively few to present them to a general audience.

One reason for the relative lack of appreciation of computer science as a discipline is that it is rarely taught in high school. While an introduction to subjects such as physics and chemistry is generally considered mandatory, it is often only at the college or university level that computer science can be studied in its own right. Furthermore, what is often taught in schools as “computing” or “ICT” (information and communication technology) is generally little more than skills training in the use of software packages. Unsurprisingly, pupils find this tedious, and their natural enthusiasm for the use of computer technology in entertainment and communication is tempered by the impression that the creation of such technology is lacking in intellectual depth. These issues are thought to be at the heart of the 50 percent decline in the number of students studying computer science at university over the last decade. In light of the crucial importance of digital technology to modern society, there has never been a more important time to re-engage our population with the fascination of computer science.

In 2008 I was fortunate in being selected to present the 180th series of Royal Institution Christmas Lectures, which were initiated by Michael Faraday in 1826. The 2008 lectures were the first time they had been given on the theme of computer science. When preparing these lectures I spent much time thinking about how to explain computer science to a general audience, and realized that there are very few resources, and almost no popular books, that address this need. This new book by MacCormick is therefore particularly welcome.

MacCormick has done a superb job of bringing complex ideas from computer science to a general audience. Many of these ideas have an extraordinary beauty and elegance which alone makes them worthy of attention. To give just one example: the explosive growth of web-based commerce is only possible because of the ability to send confidential information (such as credit card numbers, for example) secretly and securely across the Internet. The fact that secure communication can be established over “open” channels was for decades thought to be an intractable problem. When a solution was found, it turned out to be remarkably elegant, and is explained by MacCormick using precise analogies that require no prior knowledge of computer science. Such gems make this book an invaluable contribution to the popular science bookshelf, and I highly commend it.

Chris Bishop
Distinguished Scientist,
Microsoft Research Cambridge
Vice President,
The Royal Institution of Great Britain
Professor of Computer Science,
University of Edinburgh

1

Introduction: What Are the Extraordinary Ideas Computers Use Every Day?

This is a gift that I have…a foolish extravagant spirit, full of forms, figures, shapes, objects, ideas, apprehensions, motions, revolutions.

—W
ILLIAM
S
HAKESPEARE
,
Love's Labour's Lost

How were the great ideas of computer science born? Here's a selection:

• In the 1930s, before the first digital computer has even been built, a British genius founds the field of computer science, then goes on to prove that certain problems cannot be solved by any computer to be built in the future, no matter how fast, powerful, or cleverly designed.

• In 1948, a scientist working at a telephone company publishes a paper that founds the field of information theory. His work will allow computers to transmit a message with perfect accuracy even when most of the data is corrupted by interference.

• In 1956, a group of academics attend a conference at Dartmouth with the explicit and audacious goal of founding the field of artificial intelligence. After many spectacular successes and numerous great disappointments, we are still waiting for a truly intelligent computer program to emerge.

• In 1969, a researcher at IBM discovers an elegant new way to structure the information in a database. The technique is now used to store and retrieve the information underlying most online transactions.

• In 1974, researchers in the British government's lab for secret communications discover a way for computers to communicate securely even when another computer can observe everything that passes between them. The researchers are bound by government secrecy—but fortunately, three American professors independently discover and extend this astonishing invention that underlies all secure communication on the internet.

• In 1996, two Ph.D. students at Stanford University decide to collaborate on building a web search engine. A few years later, they have created Google, the first digital giant of the internet era.

As we enjoy the astonishing growth of technology in the 21st century, it has become impossible to use a computing device—whether it be a cluster of the most powerful machines available or the latest, most fashionable handheld device—without relying on the fundamental ideas of computer science, all born in the 20th century. Think about it: have
you
done anything impressive today? Well, the answer depends on your point of view. Have you, perhaps, searched a corpus of billions of documents, picking out the two or three that are most relevant to your needs? Have you stored or transmitted many millions of pieces of information, without making a single mistake—despite the electromagnetic interference that affects all electronic devices? Did you successfully complete an online transaction, even though many thousands of other customers were simultaneously hammering the same server? Did you communicate some confidential information (for example, your credit card number) securely over wires that can be snooped by dozens of other computers? Did you use the magic of compression to reduce a multimegabyte photo down to a more manageable size for sending in an e-mail? Or did you, without even thinking about it, exploit the artificial intelligence in a hand-held device that self-corrects your typing on its tiny keyboard?

Each of these impressive feats relies on the profound discoveries listed earlier. Thus, most computer users employ these ingenious ideas many times every day, often without even realizing it! It is the objective of this book to explain these concepts—the great ideas of computer science that we use every day—to the widest possible audience. Each concept is explained without assuming any knowledge of computer science.

ALGORITHMS: THE BUILDING BLOCKS OF THE GENIUS AT YOUR FINGERTIPS

So far, I've been talking about great “ideas” of computer science, but computer scientists describe many of their important ideas as “algorithms.” So what's the difference between an idea and an algorithm? What, indeed,
is
an algorithm? The simplest answer to this question is to say that an algorithm is a precise recipe that specifies the exact sequence of steps required to solve a problem. A great example of this is an algorithm we all learn as children in school: the algorithm for adding two large numbers together. An example is shown above. The algorithm involves a sequence of steps that starts off something like this: “First, add the final digits of the two numbers together, write down the final digit of the result, and carry any other digits into the next column on the left; second, add the digits in the next column together, add on any carried digits from the previous column…”—and so on.

The first two steps in the algorithm for adding two numbers.

Note the almost mechanical feel of the algorithm's steps. This is, in fact, one of the key features of an algorithm: each of the steps must be absolutely precise, requiring no human intuition or guesswork. That way, each of the purely mechanical steps can be programmed into a computer. Another important feature of an algorithm is that it always works, no matter what the inputs. The addition algorithm we learned in school does indeed have this property: no matter what two numbers you try to add together, the algorithm will eventually yield the correct answer. For example, although it would take a rather long time, you could certainly use this algorithm to add two 1000-digit numbers together.

You may be a little curious about this definition of an algorithm as a precise, mechanical recipe. Exactly how precise does the recipe need to be? What fundamental operations are permitted? For example, in the addition algorithm above, is it okay to simply say “add the two digits together,” or do we have to somehow specify the entire set of addition tables for single-digit numbers? These details might seem innocuous or perhaps even pedantic, but it turns out that nothing could be further from the truth: the real answers to these questions lie right at the heart of computer science and also have connections to philosophy, physics, neuroscience, and genetics. The deep questions about what an algorithm really is all boil down to a proposition known as the
Church-Turing thesis.
We will revisit these issues in
chapter 10
, which discusses the theoretical limits of computation and some aspects of the Church-Turing thesis. Meanwhile, the informal notion of an algorithm as a very precise recipe will serve us perfectly well.

Now we know what an algorithm is, but what is the connection to computers? The key point is that computers need to be programmed with very precise instructions. Therefore, before we can get a computer to solve a particular problem for us, we need to develop an algorithm for that problem. In other scientific disciplines, such as mathematics and physics, important results are often captured by a single formula. (Famous examples include the Pythagorean theorem,
a
2
+ b
2
=
c
2
, or Einstein's
E =
mc
2
.) In contrast, the great ideas of computer science generally describe
how
to solve a problem—using an algorithm, of course. So, the main purpose of this book is to explain what makes your computer into your own personal genius: the great algorithms your computer uses every day.

WHAT MAKES A GREAT ALGORITHM?

This brings us to the tricky question of which algorithms are truly “great.” The list of potential candidates is rather large, but I've used a few essential criteria to whittle down that list for this book. The first and most important criterion is that the algorithms are used by ordinary computer users every day. The second important criterion is that the algorithms should address concrete, real-world problems—problems like compressing a particular file or transmitting it accurately over a noisy link. For readers who already know some computer science, the box on the next page explains some of the consequences of these first two criteria.

The third criterion is that the algorithms relate primarily to the
theory
of computer science. This eliminates techniques that focus on computer hardware, such as CPUs, monitors, and networks. It also reduces emphasis on design of infrastructure such as the internet. Why do I choose to focus on computer science theory? Part of my motivation is the imbalance in the public's perception of computer science: there is a widespread belief that computer science is mostly about programming (i.e., “software”) and the design of gadgets (i.e., “hardware”). In fact, many of the most beautiful ideas in computer science are completely abstract and don't fall in either of these categories. By emphasizing these theoretical ideas, it is my hope that more people will begin to understand the nature of computer science as an intellectual discipline.

You may have noticed that I've been listing criteria to eliminate potential great algorithms, while avoiding the much more difficult issue of defining greatness in the first place. For this, I've relied on my own intuition. At the heart of every algorithm explained in the book is an ingenious trick that makes the whole thing work. The presence of an “aha” moment, when this trick is revealed, is what makes the explanation of these algorithms an exhilarating experience for me and hopefully also for you. Since I'll be using the word “trick” a great deal, I should point out that I'm not talking about the kind of tricks that are mean or deceitful—the kind of trick a child might play on a younger brother or sister. Instead, the tricks in this book resemble tricks of the trade or even magic tricks: clever techniques for accomplishing goals that would otherwise be difficult or impossible.

The first criterion—everyday use by ordinary computer users—eliminates algorithms used primarily by computer professionals, such as compilers and program verification techniques. The second criterion—concrete application to a specific problem—eliminates many of the great algorithms that are central to the undergraduate computer science curriculum. This includes sorting algorithms like quicksort, graph algorithms such as Dijkstra's shortest-path algorithm, and data structures such as hash tables. These algorithms are indisputably great and they easily meet the first criterion, since most application programs run by ordinary users employ them repeatedly. But these algorithms are generic: they can be applied to a vast array of different problems. In this book, I have chosen to focus on algorithms for specific problems, since they have a clearer motivation for ordinary computer users.

Some additional details about the selection of algorithms for this book. Readers of this book are not expected to know any computer science. But if you do have a background in computer science, this box explains why many of your old favorites aren't covered in the book.

Thus, I've used my own intuition to pick out what I believe are the most ingenious, magical tricks out there in the world of computer science. The British mathematician G. H. Hardy famously put it this way in his book
A Mathematician's Apology
, in which he tried to explain to the public why mathematicians do what they do: “Beauty is the first test: there is no permanent place in the world for ugly mathematics.” This same test of beauty applies to the theoretical ideas underlying computer science. So the final criterion for the algorithms presented in this book is what we might call Hardy's beauty test: I hope I have succeeded in conveying to the reader at least some portion of the beauty that I personally feel is present in each of the algorithms.

Let's move on to the specific algorithms I chose to present. The profound impact of search engines is perhaps the most obvious example of an algorithmic technology that affects all computer users, so it's not surprising that I included some of the core algorithms of web search.
Chapter 2
describes how search engines use
indexing
to find documents that match a query, and
chapter 3
explains
PageRank—
the original version of the algorithm used by Google to ensure that the most relevant matching documents are at the top of the results list.

Even if we don't stop to think about it very often, most of us are at least
aware
that search engines are using some deep computer science ideas to provide their incredibly powerful results. In contrast, some of the other great algorithms are frequently invoked without the computer user even realizing it. Public key cryptography, described in
chapter 4
, is one such algorithm. Every time you visit a secure website (with https instead of http at the start of its address), you use the aspect of public key cryptography known as
key exchange
to set up a secure session.
Chapter 4
explains how this key exchange is achieved.

The topic of
chapter 5
, error correcting codes, is another class of algorithms that we use constantly without realizing it. In fact, error correcting codes are probably the single most frequently used great idea of all time. They allow a computer to recognize
and correct
errors in stored or transmitted data, without having to resort to a backup copy or a retransmission. These codes are everywhere: they are used in all hard disk drives, many network transmissions, on CDs and DVDs, and even in some computer memories—but they do their job so well that we are never even aware of them.

Other books

Yesterday's Lies by Lisa Jackson
Sky Island by L. Frank Baum
Rough Ride by Kimmage, Paul
Quest Maker by Laurie McKay
Bonds of Earth by G. N. Chevalier