The algorithm in figure 12.1 performs O(n) addition operations on n-bit numbers, where n is the number of bits in x and y. Note that shiftedX * 2 can be efficiently obtained by either left-shifting its bit representation or by adding shiftedX to itself. Both operations can be easily performed using primitive ALU operations. Thus this algorithm lends itself naturally to both software and hardware implementations.
Figure 12.1 Multiplication of two n-bit numbers.
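Since the figure's pseudocode is not reproduced here, the following sketch illustrates the same shift-and-add logic in Java. The method name, the 16-bit word size, and the restriction to nonnegative operands are assumptions of this example, not part of the book's figure.

// Shift-and-add multiplication in the spirit of figure 12.1 (an illustrative sketch).
// Assumes nonnegative operands and a 16-bit word, so n = 16.
class MultiplySketch {
    static int multiply(int x, int y) {
        int n = 16;                          // number of bits in a word (assumed)
        int sum = 0;
        int shiftedX = x;
        for (int j = 0; j < n; j++) {
            if (((y >> j) & 1) == 1) {       // if the j-th bit of y is 1
                sum = sum + shiftedX;
            }
            shiftedX = shiftedX + shiftedX;  // same as shiftedX * 2, computed by addition
        }
        return sum;
    }
}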
A Comment about Notation
The algorithms in this chapter are written using a self-explanatory pseudocode syntax. The only non-obvious convention is that we use indentation to represent blocks of code (avoiding curly brackets or begin/end keywords). For example, in figure 12.1, sum = sum + shiftedX belongs to the single-statement body of the if statement, whereas shiftedX = shiftedX * 2 ends the two-statement body of the for statement.
Division
The naïve way to compute the division of two n-bit numbers x/y is to repeatedly subtract y from x until it is impossible to continue (i.e., until x < y). The running time of this algorithm is clearly proportional to the quotient, and may be as large as O(x), that is, exponential in the number of bits n. To speed up this algorithm, we can try to subtract large chunks of y's from x in each iteration. For example, if x = 891 and y = 5, we can tell right away that we can deduct a hundred 5's from x and the remainder will still be greater than 5, thus shaving 100 iterations from the naïve approach. Indeed, this is the rationale behind the school method for long division x/y. Formally, in each iteration we try to subtract from x the largest possible shift of y, namely, y · T, where T is the largest power of 10 such that y · T ≤ x. The binary version of this opportunistic algorithm is identical, except that T is a power of 2 instead of 10.
Figure 12.2 Division.
Writing down this long division algorithm as we have done for multiplication is an easy exercise. We find it more illuminating to formulate the same logic as a recursive program that is probably easier to implement, shown in figure 12.2.
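Since figure 12.2 is not reproduced here, the following Java sketch shows one way the recursive logic might look. The method name and the restriction to nonnegative operands are assumptions of this example.

// Recursive division of a nonnegative x by a positive y (a sketch of figure 12.2's logic).
// Assumes the operands are small enough that doubling y does not overflow an int.
class DivideSketch {
    static int divide(int x, int y) {
        if (y > x) {
            return 0;                    // base case: y no longer fits into x
        }
        int q = divide(x, 2 * y);        // how many times does 2y fit into x?
        if (x - (2 * q * y) < y) {
            return 2 * q;                // no room for one more y in the remainder
        } else {
            return 2 * q + 1;            // one more y fits
        }
    }
}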
The running time of this algorithm is determined by the depth of the recursion. Since in each level of recursion the value of y is multiplied by 2, and since we terminate once y > x, it follows that the recursion depth is bounded by n, the number of bits in x. Each recursion level involves a constant number of addition, subtraction, and multiplication operations, implying a total running time of O(n) such operations.
This algorithm may be considered suboptimal since each multiplication operation also requires O(n) addition and subtraction operations. However, careful inspection reveals that the product 2 · q · y can be computed without any multiplication. Instead, we can rely on the value of this product in the previous recursion level, and use addition to establish its current value.
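To make this refinement concrete, here is one possible way to carry the product across recursion levels; the field name prod and this particular bookkeeping are assumptions of the sketch, not the book's code. The invariant maintained is that after divide(x, d) returns q, prod equals q · d.

// Division that maintains the product 2*q*y incrementally, so no multiplication is needed.
class DivideSketch2 {
    static int prod;                 // after divide(x, d) returns q, prod == q * d

    static int divide(int x, int y) {
        if (y > x) {
            prod = 0;                // 0 * y
            return 0;
        }
        int q = divide(x, y + y);    // after this call, prod == q * (2y) == 2*q*y
        if (x - prod < y) {
            return q + q;            // prod already equals (2q) * y
        } else {
            prod = prod + y;         // (2q + 1) * y == 2*q*y + y
            return q + q + 1;
        }
    }
}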
Square Root
Square roots can be computed efficiently in a number of different ways, for example, by using the Newton-Raphson method or a Taylor series expansion. For our purpose though, a simpler algorithm will suffice. The square root function y = √x has two convenient properties. First, it is monotonically increasing. Second, its inverse function, x = y², is something that we already know how to compute (multiplication). Taken together, these properties imply that we have all we need to compute square roots using binary search. Figure 12.3 gives the details.
Note that each loop iteration takes a constant number of arithmetic operations. Since the number of iterations is bounded by n/2, the algorithm's running time is O(n) arithmetic operations.
Figure 12.3 Square root computation using binary search.
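The figure itself is not reproduced here; the following Java sketch illustrates the binary-search idea, under the assumption of 16-bit nonnegative values (so n = 16 and the loop runs n/2 = 8 times). It searches, bit by bit, for the largest y such that y² ≤ x.

// Square root by binary search over the bits of the result (a sketch of figure 12.3's idea).
class SqrtSketch {
    static int sqrt(int x) {              // assumes 0 <= x <= 32767 (a 16-bit word)
        int y = 0;
        for (int j = 7; j >= 0; j--) {    // j = n/2 - 1 down to 0, for n = 16
            int candidate = y + (1 << j); // try switching on the j-th bit of y
            if (candidate * candidate <= x) {
                y = candidate;            // keep the bit: candidate^2 still does not exceed x
            }
        }
        return y;                         // the largest y with y*y <= x
    }
}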
12.1.2 String Representation of Numbers
Computers represent numbers internally using binary codes. Yet humans are used to dealing with numbers in a decimal notation. Thus, when humans have to read or input numbers, and only then, a conversion to or from decimal notation must be performed. Typically, this service is implicit in the character handling routines supplied by the operating system. We now turn to describe how these OS services are actually implemented.
Of course the only characters of interest here are the ten digit symbols that represent actual numbers. The ASCII codes of these characters are as follows: '0' = 48, '1' = 49, '2' = 50, '3' = 51, '4' = 52, '5' = 53, '6' = 54, '7' = 55, '8' = 56, '9' = 57.
As gleaned from the ASCII code, single-digit characters can be easily converted into their numeric representation, and vice versa, as follows. To compute the ASCII code of a given digit 0 ≤ x ≤ 9, we can simply add x to 48, the code of '0'. Conversely, the numeric value represented by an ASCII code 48 ≤ c ≤ 57 is obtained by c − 48. And once we know how to convert single digits, we can proceed to convert any given integer. These conversion algorithms can be based on either iterative or recursive logic, so we present one of each in figures 12.4 and 12.5, respectively.
12.1.3 Memory Management
Dynamic Memory Allocation
Computer programs declare and use all sorts of variables, including simple data items like integers and booleans and complex ones like arrays and objects. One of the greatest virtues of high-level languages is that programmers don't have to worry about the details of allocating RAM space to these variables and recycling the space when it is no longer needed. Instead, all these memory management chores are done behind the scenes by the compiler, the operating system, and the virtual machine implementation. This section describes the role of the operating system in this joint effort.
Figures 12.4 and 12.5 String-numeric conversions.
Different variables are allocated memory at different points of time during the program’s life cycle. For example, static variables may be allocated by the compiler at compile time, while local variables are allocated on the stack each time a subroutine starts running. Other memory is dynamically allocated during the program’s execution, and that’s where the OS enters the picture. For example, each time a Java program creates a new array or a new object, a memory block whose size can be determined only during run-time should be allocated. And when the array or the object is no longer needed, its RAM space may be recycled. In some languages like C++ and Jack, de-allocation of un-needed space is the responsibility of the programmer, while in others, like Java, “garbage collection” occurs automatically. The RAM segment from which memory is dynamically allocated is called heap, and the agent responsible for managing this resource is the operating system.
Operating systems use various techniques for handling dynamic memory allocation and de-allocation. These techniques are implemented in two functions traditionally called alloc() and deAlloc(). We present two versions of these algorithms: a basic one and an improved one.
Basic Memory Allocation Algorithm
The data structure that this algorithm manages is a single pointer, called free, which points to the beginning of the heap segment that has not yet been allocated. Figure 12.6a gives the details.
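A minimal sketch of this basic scheme follows, written in Java over a simulated RAM array. The names and the particular heap boundaries are illustrative assumptions, not the book's code; the essential point is that alloc simply carves blocks off the heap by advancing a single free pointer, and deAlloc reclaims nothing.

// Basic memory allocation: carve blocks off the heap with a single 'free' pointer
// (the idea of figure 12.6a). Freed blocks are never recycled in this scheme.
class BasicAllocSketch {
    static int[] ram = new int[32768];   // simulated RAM (assumed 32K words)
    static int heapBase = 2048;          // assumed start of the heap segment
    static int heapEnd = 16384;          // assumed end of the heap segment
    static int free = heapBase;          // next address not yet allocated

    // Returns the base address of a newly allocated block of the given size,
    // or -1 if the heap is exhausted.
    static int alloc(int size) {
        if (free + size > heapEnd) {
            return -1;                   // out of heap space
        }
        int pointer = free;              // the new block starts at the current free address
        free = free + size;              // advance free past the allocated block
        return pointer;
    }

    // In the basic algorithm, de-allocation simply ignores the returned block.
    static void deAlloc(int pointer) {
        // no-op: the space is not reclaimed
    }
}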