At the time the paper was written, the system had not been completed. However, Dijkstra claimed, the rigor with which the system design would be proved correct would greatly assuage any fear that the system, once implemented and put into operation, would fail.20
We will see later how programs could be proved correct. What interests us especially about the THE multiprogramming system, however, is how the credo of divide and rule was realized in the design of a complex piece of software. In fact, divide and rule applied in two different ways. First, Dijkstra conceived the whole computing system logically as “a society of sequential processes.”21 Each distinct user program corresponded to a sequential process; each input and output device (in operation) constituted a sequential process.22
A sequential process (not to be confused with a sequential circuit or machine [see Chapter 15, Sections II–IV]) is a procedure or a program in a state of execution on its own “virtual” (or abstract) processor; it is a serial process. Sequential processes are abstract. Logically speaking, each process goes about “doing its own thing,” independent of other processes, except when they need to communicate with one another. A “society” of sequential processes logically operates concurrently, each on its own virtual processor. Because the processes are abstract, there is no sense of physical time; rather, there is a sense of abstract time, so that each process executes at its own abstract speed. So when two processes P1 and P2, working concurrently, need to communicate (for example, P1 sends a message to P2, which must receive it), they must be synchronized at the time communication is made. Such communication would involve P1 writing its message into a storage area (a “buffer”) and P2 reading this message. So accesses to the buffer by P1 and P2 must not interfere with each other. This is the “mutual exclusion problem” involving shared resources that Dijkstra first studied and solved (by inventing a fundamental mechanism he called semaphores).23
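To make the mechanism concrete, here is a minimal sketch in Python (an anachronism, of course; nothing below is Dijkstra's own code or notation) of two sequential processes exchanging a message through a one-slot buffer, with two semaphores guaranteeing that writer and reader take turns:

    # A sketch, not THE's actual code: two "sequential processes" exchange
    # a message through a shared one-slot buffer, synchronized by semaphores.
    import threading

    buffer = None                    # the shared storage area
    empty = threading.Semaphore(1)   # 1 when the slot is free to be written
    full = threading.Semaphore(0)    # 1 when the slot holds an unread message

    def p1():                        # the sending process
        global buffer
        empty.acquire()              # wait for the slot to be free
        buffer = "message from P1"   # write: no reader can be active now
        full.release()               # signal that a message is ready

    def p2():                        # the receiving process
        full.acquire()               # wait for a message to arrive
        msg = buffer                 # read: no writer can be active now
        empty.release()              # free the slot again
        print(msg)

    threads = [threading.Thread(target=f) for f in (p2, p1)]
    for t in threads: t.start()
    for t in threads: t.join()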
The abstractness of sequential processes was important, for the “society” of such processes lives and works harmoniously and cooperatively, quite independent of the physical processors (material artifacts) that actually run these processes.24 In fact, the THE system was, in part, a realization of a theory of “cooperating sequential processes” Dijkstra had developed in 1965.25
The organization of the whole computing system into a “society” of abstract, independent but cooperating sequential processes was one aspect of the divide-and-rule strategy used in building the THE multiprogramming system. The processes were also structured hierarchically into multiple levels. Each level comprised one or more sequential processes that created abstractions out of more basic, more concrete processes or resources. Thus, the lowest level, level 0, abstracted across physical processors. Above this level, the number of processors actually shared among the processes was rendered invisible;26 processors, so to speak, lost their individual identity.27 At the next level, level 1, processes abstracted physical memory into segments (in the virtual memory sense [see Chapter 15, Section XII]). Above level 2, each process had its own “console” to hold “conversations” with the operator. Virtual consoles were created; the fact that there may have been only one (shared) physical console “disappeared from the picture.”28 At level 3, sequential processes concerned with buffering input streams and unbuffering output streams were realized. Virtual input/output communication units were created. Finally, at level 4, the user programs resided.29 Decomposing and organizing the computing system in these two ways, dividing and ruling, produced what Dijkstra would later call the “hierarchical ordering of sequential processes.”30
Structuring a programming system hierarchically in this bottom-up fashion, one level at a time, has great aesthetic appeal; it is an elegant way of managing the complexity of a large system. After the sequential processes at level i have been designed, implemented, and proved correct, one need not worry about them; they become reliable components for building the sequential processes at level i + 1.
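The flavor of this discipline can be suggested in modern dress. The sketch below is entirely hypothetical (THE itself was written in assembly language for the Electrologica X8); it shows each level built solely from the abstractions of the level beneath it, so that the “user program” at the top never touches a physical resource directly:

    # A toy illustration of level-by-level abstraction, not THE's design.

    class Level0:                       # abstracts the physical processor(s)
        def run(self, work):            # callers see "a" processor, not CPUs
            return work()

    class Level1:                       # abstracts memory into "segments"
        def __init__(self, cpu):
            self.cpu, self.segments = cpu, {}
        def store(self, name, value):
            self.segments[name] = value
        def load(self, name):
            return self.segments[name]

    class Level2:                       # gives each process a virtual console
        def __init__(self, memory):
            self.memory = memory
        def write(self, text):          # one physical console, many virtual
            print(text)

    # A "user program" built only on the abstractions of the levels below.
    cpu = Level0()
    memory = Level1(cpu)
    console = Level2(memory)
    cpu.run(lambda: (memory.store("s0", "hello"),
                     console.write(memory.load("s0"))))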
But how does one prove that an individual program (or a sequential process) is correct? And why is it important to prove correctness, as opposed to the usual method in programming of testing a program experimentally on a computer?
Dijkstra raised this latter issue in his 1969 paper, “Structured Programming,” presented at the second NATO conference on software engineering. Proving program correctness was preferable to program testing because the latter had an inherent limitation: although empirical testing of a program can reveal the presence of errors, it can never prove their absence.31
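A tiny, entirely hypothetical illustration of the aphorism: the tests below all pass, and so reveal no error, yet the program is wrong.

    def is_even(n):
        # Intended: return True exactly when n is an even integer.
        return n % 2 == 0 or n < -100   # bug lurking in an untested region

    # These tests all succeed -- they reveal the presence of no error...
    assert is_even(0) and is_even(2) and is_even(4)
    assert not is_even(1) and not is_even(3)

    # ...yet is_even(-101) returns True. Testing failed to show the
    # presence of the error; only a proof could establish its absence.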
This assertion became a much-quoted aphorism, a mantra, in later times. The logic of the statement was impeccable. We do not know whether Dijkstra had read the writings of the philosopher of science Sir Karl Popper (1902–1994),32 but there was complete resonance between Dijkstra's aphorism and Popper's assertion that no amount of empirical evidence can ever prove the truth of a scientific theory, but just one piece of counterevidence is sufficient to falsify the theory. Thus, the “falsifiability” of a scientific theory (demonstrating an error in the theory) is analogous to demonstrating the presence of a bug in a program; demonstrating the absence of error in a scientific theory is the analog of demonstrating the absence of a bug in a program. Neither can be shown by empirical (“inductive”) means, that is, by experiment or observation.
The kind of science with which Popper was concerned was the natural sciences: physics, chemistry, biology, and the like. In the natural sciences, the object of interest being “natural,” one has to perform experiments or make observations; a theory in physics or in biology has to be tested against reality. Popper's point was that one must test a scientific theory (that is unavoidable), not to prove the theory (because that is logically impossible), but to falsify or refute it.
But programs (software) are artifacts; they are not discovered, but invented. Moreover, programs are symbolic artifacts in the same sense that mathematical theorems are symbolic artifacts. Computer science is, after all, a science of the artificial, not a science of the natural (see Chapter 1). Thus, there is a way out for programs that does not exist for a theory in a natural science. One can formally prove the correctness of a program in the same way one can formally prove the correctness of a theorem. Indeed, in this view, a program is a theorem. It says that “if this procedure is followed, then such and such a function will be computed.” And so, as in mathematics and logic, one can apply the axiomatic approach.
We have already seen the axiomatic method “in action” in the realm of computing in Turing's work on the Entscheidungsproblem (see Chapter 4) and in Allen Newell's and Herbert Simon's work on the LT, or Logic Theorist (see Chapter 14, Section V). To prove a theorem in mathematics or logic, one has a set of basic definitions, axioms, and rules of inference, along with a body of already proved theorems. One then applies the rules of inference to appropriate axioms, definitions, and prior theorems to construct a logically rigorous chain of inferences whose end product is the theorem of interest; this chain of reasoning is the proof.
A similar approach can apply (indeed, for Dijkstra, must apply) in the case of programs. Thus, a new arena in computer science within the subparadigm of programming methodology was born: formal program verification.
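As an illustration of the style this new arena would adopt (the notation is C. A. R. Hoare's later “triple,” shown here purely for flavor, not anything in Dijkstra's 1969 paper), a specification brackets a program fragment between a precondition and a postcondition, and axioms license one-step inferences:

    % A Hoare triple {P} S {Q} asserts: if P holds before S executes and
    % S terminates, then Q holds afterward. The assignment axiom,
    %     {Q[E/x]}  x := E  {Q},
    % yields, for instance, the one-step proof
    \{\, y + z \ge 0 \,\} \quad x := y + z \quad \{\, x \ge 0 \,\}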
Dijkstra was by no means the first to think such thoughts. In 1949, at the Cambridge conference on automatic calculating machines where the EDSAC was first demonstrated publicly (see Chapter 8, Section XVI), the redoubtable Alan Turing presented a two-page paper in which he anticipated the germinal ideas underlying formal program verification.33
Turing pointed out that “checking” (his word) whether a program is correct can be greatly facilitated if the programmer states assertions that are expected to be true at certain points in the program. For example (using Algol-like notation), immediately after the execution of the assignment statement x := y + z, the assertion x = y + z will always be true. Or, following the execution of the statement if x ≤ 0 then x := 0 else x := x + 1, the assertion x ≥ 0 will always be true, regardless of which of the two assignments is executed.
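Turing's two examples can be restated as executable assertions; the following is a modern paraphrase in Python, not anything Turing wrote:

    # After an assignment, the corresponding equality holds.
    y, z = 3, 4
    x = y + z
    assert x == y + z          # true immediately after the assignment

    # After the conditional, x >= 0 holds on both paths.
    for x in (-5, 7):          # exercise the then-branch and the else-branch
        if x <= 0:
            x = 0
        else:
            x = x + 1
        assert x >= 0          # true regardless of which assignment ran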
These assertions, Turing noted, can be such that once they are checked individually to be correct, the correctness of the whole program follows. We are reminded here of the “assertion boxes” Herman Goldstine and John von Neumann created as part of the flow diagram notation they invented for specifying algorithms (see Chapter 9, Section III).
Turing also made the distinction between what would later be called “partial” and “total” correctness. Partial correctness is the correctness of a program assuming that it terminates, that is, that it comes to a stop. Total correctness is concerned with proving, in addition, that the program does terminate. Recall that finiteness or termination is a defining characteristic of an algorithm (see Chapter 15, Section VIII).
Turing illustrated his ideas by way of an example. The problem he used was an algorithm (“a routine,” in his words) to compute the factorial of a number n (denoted n!) without using a multiply operation, multiplication being carried out instead by repeated additions. The algorithm was represented by a flowchart.
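Although Turing presented the routine as a flowchart, its substance can be paraphrased in Python, decorated with assertions of the kind he proposed (the paraphrase is ours, not his):

    def factorial(n):
        assert n >= 0                  # assertion at entry (precondition)
        result = 1
        for i in range(2, n + 1):      # bounded loops: termination assured
            product = 0
            for _ in range(i):         # multiply by i via repeated addition
                product += result
            result = product
        assert result >= 1             # assertion at exit
        return result

    assert factorial(0) == 1 and factorial(5) == 120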
Sometimes an idea is too far ahead of its time. Turing's paper apparently made no waves; it died a quiet death and lay forgotten until well after formal program verification became an established subparadigm. The paper was “discovered” almost 20 years afterward.34
It is quite possible to follow Dijkstra's gospel of structured programming without actually proving the correctness of the resulting program. Structured programming represented both a mentality and a method for managing the intellectual burden of developing computer programs. One can follow the divide-and-rule philosophy, proceeding from the higher level specification of the program as a whole through decomposition into parts and subparts in a top-down, hierarchical fashion, without formal proofs along the way. However, the jewel in the crown of structured programming was formal verification. For this, Dijkstra's preference was the axiomatic approach of mathematics. Herein lay the aesthetics.
It is one thing to aspire to prove programs correct as one proves theorems. It is another thing to achieve this aspiration. To understand this issue, we recall the fundamentals of Dijkstra's gospel. The function of an individual program must first be specified rigorously in some formal language, such as a language of logic (for example, the “predicate calculus”). Let us denote this specification by S; it describes the intended behavior of the program (yet to be written). The program itself must then be described in some appropriate programming language. Let us denote this program by P. The process of formal proof involves showing, by logical arguments, that P does indeed behave in the manner prescribed by S. But this requires the availability of a precisely defined semantics of the language in which P has been written. Let us denote this semantics by σ. Verification thus requires the ménage à trois <S, P, σ>.
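In miniature, and purely as a hypothetical illustration: S can be stated as a pre/postcondition pair, P as code, and σ is supplied by the programming language's definition.

    # S, the specification: for all integers a, b, the result r satisfies
    #     (r == a or r == b) and r >= a and r >= b
    # P, the program claimed to meet S:
    def maximum(a, b):
        return a if a >= b else b
    # sigma, the semantics: the meaning the language definition assigns to
    # "if", ">=", and "return" -- exactly what a formal proof that P
    # satisfies S would have to appeal to. A test is evidence, not proof:
    assert maximum(3, 7) == 7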
As we have seen, although the syntax of Algol 60 was defined formally by a context-free grammar expressed in the meta-language BNF (see Chapter 13, Section XVI), its semantics was in English. A much more formal and unambiguous meta-language than English (or any other natural language) would have to be used for the purpose of formal verification.
In fact, inventing a meta-language for a formal semantics of programming languages was very much in the air during the 1960s, quite apart from the needs of structured programming. After all, the user of a programming language would need to know the precise meaning of the various (precisely defined) syntactic units in the language, such as declarations of variables, assignment statements, conditional (if then) statements, repetition (for) statements, procedures (proc), program blocks (begin end statements), and so on. Furthermore, the compiler writer needed to know the precise meaning of these syntactic units in order to translate a program in that language correctly into object code for the target machine.
One approach, originating in the ideas of John McCarthy (the inventor of LISP [see Chapter 14, Section IX]) in 1963 and of Peter Landin (1930–2009) in 1964, associated with each statement type in a programming language an interpretation of how that statement would be executed on a standardized (“canonical”) abstract machine.35
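A drastically simplified sketch of the idea, hypothetical and in Python rather than in McCarthy's or Landin's formalisms: each statement type is given its meaning as a transformation of the state of an abstract machine, here just a dictionary of variable bindings.

    # The "meaning" of a statement = a function from machine state to state.
    def execute(stmt, state):
        kind = stmt[0]
        if kind == "assign":                 # ("assign", variable, expr)
            _, var, expr = stmt
            state[var] = expr(state)
            return state
        if kind == "if":                     # ("if", condition, then, else)
            _, cond, then_stmt, else_stmt = stmt
            return execute(then_stmt if cond(state) else else_stmt, state)
        raise ValueError("unknown statement type: " + kind)

    # The meaning of:  if x <= 0 then x := 0 else x := x + 1
    program = ("if", lambda s: s["x"] <= 0,
                     ("assign", "x", lambda s: 0),
                     ("assign", "x", lambda s: s["x"] + 1))

    print(execute(program, {"x": -5}))       # {'x': 0}
    print(execute(program, {"x": 7}))        # {'x': 8}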