Kahn and Crowther debated the gridlock question at length. Over points such as these Kahn's conceptual views came into open conflict with the pragmatic bent of the rest of the IMP Guys and cracked open a wide disagreement between them. The rest of the team just wanted to get the network up and running on schedule. As the network grew they'd have time to improve its performance, work out problems, and perfect the algorithms.
But Kahn persisted. “I could see things that to me were obvious flaws,” he said. “The most obvious one was that the network could deadlock.” Kahn was certain the network would lock up, and he told Heart and the others so immediately. They argued with him. “Bob was interested in the theory of things and the math, but he wasn't really interested in the implementation,” said Crowther. Crowther and Kahn began to talk it through, and the two had what Crowther described as “grand little fights.” The flow-control scheme wasn't designed for a huge network, and with a small number of nodes Crowther thought they could get by with it.
Heart thought that Kahn was worrying too much about hypothetical, unlikely network conditions. His approach was to wait and see. Some of the others thought Kahn didn't understand many of the problems with which they were grappling. “Some of the things he was suggesting were off-the-wall, just wrong,” said Ornstein. Kahn wanted to watch simulations of network traffic on a screen. He wanted to have a program that would show packets moving through the network. In fact, the packets would never move at a humanly observable speed; they'd be going “zip-zip” in microseconds and milliseconds. “We said, 'Bob, you'll never understand the problems looking at it that way.'” The other IMP Guys respected Kahn, but some believed he was going in the wrong direction. Gradually, they paid less attention to him. “Most of us in the group were trying to get Kahn out of our hair,” Ornstein said.
Heart scotched Kahn's suggestion that they use a simulation. Heart hated to see his programming team spend time on simulations or on writing anything but operational code. They were already becoming distracted by something else he disliked: building software tools. Heart feared delay. Over the years he had seen too many programmers captivated by tool building, and he had spent a career holding young engineers back from doing things that might waste money or time. The people in Heart's division knew that if they asked him for the okay to clock hours writing editors, assemblers, and debuggers, they would meet with stiff resistance, perhaps even a shouting match. So no one ever asked; they just did it, building tools when they thought it was the right thing to do, regardless of what Heart thought. This was software they would eventually need when the time came to test the system. All were customized pieces of programming, specifically designed for the ARPA project.
As summer peaked, a troubling problem loomed: BBN was still awaiting Honeywell's delivery of the first production IMP, with all the debugged interfaces built to BBN's specifications. The programming team had given up waiting and gone ahead with its work by loading a lower-grade development machine with a simulation program the team had designed to mimic the operations of the production model IMP and its I/O interfaces. Still, testing the software on the real machine was the preferred approach. And whenever the machine came in, Barker would first need time to debug it. The time left was dwindling. By late summer, the machine still hadn't crossed BBN's loading dock. Scheduled delivery of the IMP to California was now only a few weeks away, and BBN's own reputation was on the line.
Finally, about two weeks before Labor Day, Honeywell rushed the first ruggedized 516 IMP out its shop door and over to Cambridge. As soon as the machine touched the floor at BBN, Barker was ready to work on it. He powered up IMP Number One in the backroom.
Barker loaded the IMP diagnostic code. When he tried running it, nothing happened. The machine didn't respond. On closer inspection, it was apparent that the machine BBN received was not what it had ordered. This 516 had few of the modifications that Barker and Ornstein had worked out painstakingly in debugging the prototype; in fact, it was wired just like the original dysfunctional prototype had been wired. With the deadline closing in, Barker had only one recourse: fix it at BBN. This time at least he already knew where every wire should go. With the machine sitting in the middle of the large room, Barker went to work implementing all of the design modifications necessary to make it a functioning IMP.
Within a few days, Barker had coaxed the machine to life. He managed to activate the IMP's interfaces, whereupon the computer began crashing at random intervals. The randomness of the crashes was unusually bad. Intermittent problems of this sort were the devil. The IMP would run for anywhere from twelve hours to forty hours at a stretch, then die and be somewhere “off in the boonies.” What to do? Recalled Ornstein, “We couldn't figure out what the hell was going on.”
As Labor Day approached, they pressured the IMP, putting it through as many hard tests as possible. It might run fine for twenty-four hours, then inexplicably die. Barker would look for a clue, chase what appeared to be a problem, fix it, and still the machine would crash again. With only a few days left before the delivery deadline, it looked like they were not going to make it.
Barker, who had been nursing the computer, suspected the problem was in the machine's timing chain. It was just a hunch.
The IMP had a clock used by the operating system to keep time in the machine, not as humans would by marking seconds, minutes or hours, but counting time in 1-microsecond (one million ticks per second) increments, fast for its day but a hundred times slower than today's personal computers. This clock provided a framework in which the IMP operated, and it regulated the computer's many functions synchronously. In a communications system, messages arrive unannounced; signals interrupt the machine asynchronously. Like a telephone call in the middle of dinner, an incoming packet shows up on its own schedule at the IMP's door and says, “Take me now.”
The computer had a sophisticated system for handling the incoming interruptions in a methodical manner, so as not to upset the synchronous operation of all its functions. If not properly designed, such synchronizers can be thrown off by an incoming signal occurring at just the wrong moment. Synchronizer bugs are rare. But when they occur and the synchronizer fails to respond properly to an interrupt, the consequences are profoundly disturbing to the machine's total operation. One might call it a nervous breakdown; computer scientists have another term for it: The synchronizer goes into a “metastable” condition. “Under such circumstances,” Ornstein said, “the machine invariably dies in a hopelessly confused state, different each time.”
Ornstein knew all too well about synchronizer bugs. He had dealt with the problem in the computer he and Wes Clark had built a few years earlier in St. Louis. Ornstein was the author of some of the first published papers on the subject, and was one of the few people in the world who actually had any experience with this particular gremlin.
Their unpredictability made synchronizer bugs among the most frustrating of bugs because of the absence of any recognizable pattern to the resulting crashes. Unlike most other problems that could cause computers to crash, a synchronizer bug left behind virtually no useful forensic evidence that might point a diagnostician to the problem. In fact, the absence of clues was one of the most useful clues. Furthermore, the failures caused by this bug were so infrequent (only once every day or so even in full-bore tests) that it was impossible to detect any evidence on an oscilloscope. Only the most astute debuggers had any idea what they were dealing with.
This seemed to be the problem Ornstein and Barker had on their hands. But who knew, because you couldn't actually trace it. What to do now? The Honeywell 516 had never been used in an application as demanding as the packet-switching network. It was a fast machine; the IMP Guys had chosen it precisely for its I/O capabilities. No one else was likely ever to see the problem in a typical application of the 516 computer. “If their machine died once a year,” Ornstein said, “they'd never notice. They would just restart.” But the IMP Guys were driving the machine hard. The flow of packets into and out of the IMP happened faster than the Honeywell designers had anticipated. The 516 machine didn't seem capable of handling such traffic. Maybe BBN had been overly optimistic. Ornstein and Barker went to Honeywell and insisted that the manufacturer “dig out of the woodwork, way in the backroom” the designer of the 516 computer. He was a very smart guy, Ornstein had to admit, but at first the Honeywell man refused to admit that a metastable state was possible in the machine. He had never read Ornstein's papers, and had never seen the problem. “Though filled with disbelief,” said Ornstein, he “at least understood what we were saying.”
Under normal conditions, the 516 would run for years without experiencing the synchronizer problem. However, under ARPA's packet-switching network conditions, the machine was failing once every day or so. Try telling Frank Heart, Mr. Reliability, that he'd just have to live with that.
Ornstein and Barker huddled. It was only a guess that the IMP had a synchronizer problem. To test the hypothesis, Ornstein designed and wired an “aggravator” that deliberately produced data requests at what Barker called a “fierce rate.” It increased the probability of getting interruptions at the exact nanosecond that would reveal the problem. The aggravator had a knob that worked like a tuner. Using the knob, Ornstein and Barker could “tune” the timing of requests to bring in a signal perfectly out of kilter with the clock, the worst case. Then, using an oscilloscope, they observed the machine's “heartbeat” and other internal functions.
The debugging crew went to work. The patterns they were looking for on the oscilloscope would be so faint as to be visible only in a darkened room. So with all the lights out in the IMP room and with all their diagnostic equipment and the Honeywell turned on, they watched, while fooling with the aggravator. The traces they saw on the scope were bright, regularly positioned, and steadily paced: the vital signs of a healthy machine.
Even with the aggravator, it took the debugging team quite a while to find what it was looking for. Still, every few minutes a very faint ghost trace flitted across the oscilloscope. Was that it? The fleeting trace was perhaps the only telltale sign that the crashes were caused by a timing problem: a synchronizer stalled in a metastable condition for a few nanoseconds too long. It was the computer equivalent of the one split second of confusion or indecision by a race car driver that ends suddenly in a fatal crash. The evidence seemed fairly incontrovertible, and Honeywell finally acknowledged it.
In the meantime, Barker designed a possible fix, and rewired the IMP's central timing chain. When Barker brought the machine back up, loaded in his diagnostic code, and looked in the scope, the ghost traces were gone.
While Barker and Ornstein were reasonably certain that the problem was fixed, they had no way of knowing for sure unless the machine ran for a few consecutive days without crashing. And they didn't have a few days. Heart had already approved shipping the first Interface Message Processor to California the next day. IMP Number One was almost out the door.
Do It to It Truett
Steve Crocker and Vint Cerf had been best friends since attending Van Nuys High School in L.A.'s San Fernando Valley. They shared a love for science, and the two spent more than a few Saturday nights building three-dimensional chess games or trying to re-create Edwin Land's experiments with color perception.
Vint was a wiry, intense, effusive kid. He joined his high school ROTC unit to avoid gym class. On the days he didn't show up at school in his ROTC uniform, Vint wore a jacket and tie. And he always toted a large brown briefcase. By local standards, it was an unusual mode of dress, even in the late 1950s. “I used the coat and tie to distinguish myself from the crowd, maybe a nerd's way of being different,” he recalled. Nonetheless, much to the consternation of his friends, Vint never had trouble attracting the attention of the opposite sex. He was, everyone agreed, one of a kind.
From an early age, Vint aspired to match the accomplished track record of his father, who had risen through the ranks to become a senior executive at North American Aviation (now Rockwell International). Both of Vint's younger brothers played football and took turns as president of the student body. Vint was the bookworm. His literary tastes tilted toward fantasy. Well into his adult life, he regularly set aside several days to reread The Lord of the Rings trilogy. Vint did particularly well in chemistry, but his passion was math. When Steve Crocker started the math club at Van Nuys High, Vint was one of the first to join.
As a result of premature birth, Vint was hearing-impaired. Although hearing aids in both ears later corrected much of the deficit, he grew up devising clever strategies for communicating in the hearing world. Years later, after they became friends, Bob Kahn brought some of Cerf's aural tricks to his friends' attention and Cerf eventually wrote a paper called “Confessions of a Hearing-Impaired Engineer,” in which he shared some of his secrets.
In particularly noisy environments (cafeterias, restaurants, and homes with dogs and small children), the deaf person's reliance on conversational context often suffers badly. A typical strategy here is to dominate the conversation, not by doing all the talking, but by asking a lot of questions. In this way, the deaf listener will at least know what question the speaker is addressing, even if he cannot hear all of the response. In a group conversation, this can backfire embarrassingly if the question you ask is one which was just asked by someone else. A variation (equally embarrassing) is to enthusiastically suggest something just suggested, for example:
Friend A: I wonder what the origin of this term is?
Friend B: Why don't we look it up in The Oxford English Dictionary?
Friend A: Yeah, but too bad we don't have an O.E.D.
Cerf: I know. Why don't we look it up in The Oxford English Dictionary?
Steve Crocker drifted in and out of Vint's life. Steve's parents were divorced, and he spent his high school years shuttling between suburban Chicago and the San Fernando Valley. Always precocious, Steve grew up knowing he was probably the smartest kid in any given room. At age thirteen, while home one day with a cold, he taught himself the elements of calculus. And at the end of tenth grade, he learned the rudiments of computer programming. “I remember being thrilled when I finally understood the concept of a loop,” Crocker recalled, “which enabled the computer to proceed with a very lengthy sequence of operations with only a relatively few instructions. I was a bit callow, but I remember thinking this was the kind of revelation that must have led Archimedes to run down the street naked yelling, 'Eureka!'”
Around 1960, when Steve had returned to L.A., Vint followed him into the computer lab at UCLA. Although still in high school, Steve had gotten permission to use the UCLA computer, but the only free time he and Vint had was on the weekends. One Saturday they arrived to find the computer lab building locked. “I couldn't see any choice but to give up and go home,” said Crocker. But they looked up and saw an open second-story window. They looked at each other. “Next thing I know, Vint is on my shoulders,” Crocker recalled. Cerf went through the window and, once inside, opened the door and taped the latch so they could get in and out of the building. “When the Watergate burglars did the same thing a dozen years later and got caught, I shuddered,” said Crocker.
After high school, Cerf attended Stanford on a four-year scholarship from his father's company. He majored in math but soon got hooked on serious computing. “There was something amazingly enticing about programming,” he said. “You created your own universe and you were the master of it. The computer would do anything you programmed it to do. It was this unbelievable sandbox in which every grain of sand was under your control.”
After graduating in 1965, Cerf decided he wanted to work for a while before going on to graduate school. IBM was recruiting on the Stanford campus, and Cerf took a job at IBM in Los Angeles. He went to work as the systems engineer for an IBM time-sharing system. Realizing he needed better grounding in computer science, he soon joined his friend Crocker, now a graduate student in UCLA's computer science department. Computer science was still a young discipline, and UCLA's Ph.D. program, one of the first in the country, was one of only a dozen in existence at the time. Cerf arrived just as Crocker was leaving for MIT. Crocker's thesis advisor at UCLA was Jerry Estrin, the same professor Paul Baran had worked with a few years earlier. Estrin had an ARPA contract for the “Snuper Computer,” which used one computer to observe the execution of programs running on a second machine. Estrin took on Cerf as a research student for the project; it became the basis for Cerf's doctoral thesis. In the summer of 1968 Crocker returned to UCLA and joined Cerf in Estrin's group.
For both Cerf and Crocker, 1968 marked the beginning of a lifelong fascination with the networking of computers. For Cerf, computer networking would become the centerpiece of his professional career. Although Crocker would move on to other things for long stretches at a time, he too would eventually return to the field of networking.
In the fall of 1968, ARPA transferred its contract from Estrin to Len Kleinrock at UCLA. Kleinrock was setting up his Network Measurement Center, with a $200,000 annual contract from ARPA. By coincidence, when Kleinrock got the contract, the person in the office next door conveniently moved out, so Kleinrock expanded his domain; he tore down the wall between the two offices and installed a large conference table for meetings with students and staff. The meetings were frequent as Kleinrock busily built a small empire.
In planning the ARPA network, Larry Roberts had conceived of the Network Measurement Center as the organization that would be responsible for most of the performance testing and analysis. The measurement center was intended to be roughly analogous to a test track where drivers push the outer limits of high-performance cars. Kleinrock and his group were in charge of gathering data (total network response time, traffic density, delays, and capacity), the measures needed to evaluate how the network is performing. Like Bob Kahn, Kleinrock had a theoretician's bent; his business was simulation, modeling, and analysis. Through simulations, he had come as close as he could to monitoring the ways in which networks perform without actually having a network to run. He welcomed the chance to test his theories on the real thing.
The engineers at BBN didn't pay too much attention to Kleinrock. They thought he was a trifle heavy on theory and fairly light on engineering. The skepticism was mutual, for Kleinrock believed that the BBN team was largely uninterested in performance. BBN's programmers were outstanding, but, said Kleinrock, “By and large, a programmer simply wants to get a piece of software that works. That's hard enough. Whether it works efficiently or well is not usually the issue.” He was unaware, perhaps, of Walden and Crowther's obsession with software efficiency, but in any case, perfecting network performance, Kleinrock decided, was his job.
Before long Kleinrock was managing forty students who helped run the center. Crocker and Cerf were among the senior members of Kleinrock's group. Another important member was Jon Postel. He had a long bushy beard, wore sandals year-round and had never put on a tie in his life. Always dapper and generally more conservative, Cerf presented a striking contrast to Postel's steadfastly casual appearance. Crocker, the unofficial leader, was somewhere in the middle. He had grown a beard at MIT (“Cops looked at me a little harder, but girls were a lot friendlier, and that was a trade-off I could live with,” Crocker said), but was willing to put on a pair of dress shoes every now and then.
While Cerf and Crocker were academic stars, Postel, who was twenty-five, had had a more checkered academic career. He had grown up in nearby Glendale and Sherman Oaks, and he too had attended Van Nuys High School, where his grades were mediocre. Postel's interest in computers developed at a local community college. By the time he got to UCLA to finish an undergraduate degree in engineering (the closest thing to computer science at the time) computing was his life. UCLA eventually decided to establish computer science as a formal department, at just about the time Postel was entering the university's graduate school. Postel was quiet, but he had strong opinions. The people running the computer science department occasionally interpreted the firmness of Postel's opinions as a bad attitude.
In 1966 Cerf had married a young illustrator named Sigrid. She was profoundly deaf. Their first meeting had been contrived by their hearing-aid dealer, who scheduled adjacent appointments for them one Saturday morning in hopes that they would cross paths and hit it off. They went to lunch and Sigrid was awestruck by her companion's eclectic curiosity. Vint seemed to dance in his chair with excitement as he described his work with computers. They extended their tête-à-tête with a visit to the Los Angeles County Museum of Art to see some of Sigrid's favorite paintings. Unschooled in art but eager to learn, Cerf stared for a long time at a huge Kandinsky. “This thing reminds me of a green hamburger,” he finally remarked. A year later they were married, with Steve Crocker as Vint's best man (roles that would be reversed a few years later). Crocker's electronics expertise came in handy when, minutes before the ceremony was to begin, he discovered the tape recorder for the wedding music was malfunctioning. Best man and frantic groom retreated to a tiny room near the altar and fixed it just in time.
Kleinrock, although only ten years older than the rest of his group, had a great reputation in queueing theory (the study of how long people and things spend waiting in lines, how long the lines get, and how to design systems to reduce waiting). He had already published a book and he was in charge of a growing lab; his energy seemed boundless. Moreover, he was one of just a handful of scientists who had produced analytic models of store-and-forward networks before Roberts got started on the ARPA project.
At the time, the UCLA computer science department owned a computer made by Scientific Data Systems called the Sigma-7, the latest in that firm's line of computers. UCLA also had three major computer centers equipped with IBM 7094 mainframes. But the Sigma-7 was the machine assigned to the graduate students. No one liked the Sigma-7 much. It was unreliable and difficult to program. As a member of the UCLA team put it, the Sigma-7 was a dog. (“But it was our dog,” Cerf said years later.) It was also the only computer they had to play with (until, that is, the ARPA network came along). Not only would the computer scientists at UCLA be receiving the first IMP, but presumably the network would open doors to all kinds of different host machines at the other sites.
The most pressing task in the summer of 1969 was to build the interface, a combination of hardware and software, between the Sigma-7 and the IMP. As the UCLA guys understood it, BBN was working out some specifications for how to construct such a connection. The host-to-IMP interface had to be built from scratch each time a new site was established around a different computer model. Later, sites using the same model could purchase copies of the custom interface.
Nearly as urgent was the more far-reaching challenge of writing the software that allowed host computers throughout the network to communicate with one another. This was to be the host-to-host protocol, a very broad-based set of operating terms that would be common to all machines. It had to be like a traveler's check: good anywhere and able to support a gamut of applications, from remote log-ins to file transfers to text processing. Inventing it wouldn't be easy.
In the summer of 1968, a small group of graduate students from the first four host sites (UCLA, SRI, UC Santa Barbara, and the University of Utah) had met in Santa Barbara. They knew that the network was being planned, but they'd been given few details beyond that. But networking in general, and the ARPA experiment in particular, were hot topics.
The meeting was seminal, if only because of the enthusiasm it generated. “We had lots of questions: how IMPs and hosts would be connected, what hosts would say to each other, and what applications would be supported,” Crocker said. “No one had any answers, but the prospects seemed exciting. We found ourselves imagining all kinds of possibilities (interactive graphics, cooperating processes, automatic database query, electronic mail), but no one knew where to begin.”
From that meeting emerged a corps of young researchers devoted to working on, thinking through, and scheming about the network's host-to-host communications. To speed up the process, they decided to meet regularly. Theoretically, a computer network would cut down on some of the ARPA-funded travel, but before long Crocker was traveling enough that Kleinrock had to procure a separate travel budget for him.