Authors: Jon Stokes
Tags: #Computers, #Systems Architecture, #General, #Microprocessors
In the previous chapter, you learned that a computer repeats three basic
steps over and over again in order to execute a program:
1. Fetch the next instruction from the address stored in the program counter and load that instruction into the instruction register. Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register.
You should also recall that step 3, the execute step, can itself consist of
multiple sub-steps, depending on the type of instruction being executed
(arithmetic, memory access, or branch). In the case of the arithmetic
instruction add A, B, C, the example we used last time, the three sub-steps
are as follows:
1. Read the contents of registers A and B.
2. Add the contents of A and B.
3. Write the result back to register C.
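To make those three sub-steps concrete, here is a minimal Python sketch of them. It models the register file as a dictionary; the register names A through D come from the DLW-1 examples in this book, but the values and the helper function are purely illustrative, not something the hardware actually runs.

```python
# Toy model of the execute sub-steps for "add A, B, C" on the DLW-1.
# The register names come from the chapter; the values are made up.
registers = {"A": 7, "B": 5, "C": 0, "D": 0}

def execute_add(src1, src2, dest):
    # Sub-step 1: read the contents of the two source registers.
    operand1 = registers[src1]
    operand2 = registers[src2]
    # Sub-step 2: add the two operands.
    result = operand1 + operand2
    # Sub-step 3: write the result back to the destination register.
    registers[dest] = result

execute_add("A", "B", "C")
print(registers["C"])  # prints 12
```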
Thus the expanded list of actions required to execute an arithmetic instruction is as follows (substitute any other arithmetic instruction for add in the following list to see how it’s executed):
1. Fetch the next instruction from the address stored in the program counter and load that instruction into the instruction register. Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register. Because the instruction is not a branch instruction but an arithmetic instruction, send it to the arithmetic logic unit (ALU).
   a. Read the contents of registers A and B.
   b. Add the contents of A and B.
   c. Write the result back to register C.
At this point, I need to make a modification to the preceding list. For
reasons we’ll discuss in detail when we talk about the instruction window
in Chapter 5, most modern microprocessors treat sub-steps 3a and 3b as
a group, while they treat step 3c, the register write, separately. To reflect this conceptual and architectural division, this list should be modified to look as
follows:
1. Fetch the next instruction from the address stored in the program counter, and load that instruction into the instruction register. Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register. Because the instruction is not a branch instruction but an arithmetic instruction, send it to the ALU.
   a. Read the contents of registers A and B.
   b. Add the contents of A and B.
4. Write the result back to register C.
In a modern processor, these four steps are repeated over and over again until the program is finished executing. These are, in fact, the four stages in a classic RISC¹ pipeline. (I’ll define the term pipeline shortly; for now, just think of a pipeline as a series of stages that each instruction in the code stream must pass through when the code stream is being executed.) Here are the four stages in their abbreviated form, the form in which you’ll most often see them:
1. Fetch
2. Decode
3. Execute
4. Write (or “write-back”)
Each of these stages could be said to represent one phase in the lifecycle of an instruction. An instruction starts out in the fetch phase, moves to the decode phase, then to the execute phase, and finally to the write phase. As I mentioned in “The Clock” on page 29, each phase takes a fixed, but by no means equal, amount of time. In most of the example processors with which you’ll be working in this chapter, all four phases take an equal amount of time; this is not usually the case in real-world processors. In any case, if the DLW-1 takes exactly 1 nanosecond (ns) to complete each phase, then the DLW-1 can finish one instruction every 4 ns.
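To see where that 4 ns figure comes from, here is a small back-of-the-envelope sketch in Python. The 1 ns-per-phase figure is the chapter’s example value for the DLW-1; the function and the instruction counts are just illustration.

```python
# Back-of-the-envelope timing for the non-pipelined DLW-1.
PHASES = ("fetch", "decode", "execute", "write")
NS_PER_PHASE = 1  # the chapter's example value: each phase takes 1 ns

def nanoseconds_to_run(num_instructions):
    # Without pipelining, each instruction must pass through all four
    # phases before the next one can begin, so every instruction costs
    # len(PHASES) * NS_PER_PHASE nanoseconds.
    return num_instructions * len(PHASES) * NS_PER_PHASE

print(nanoseconds_to_run(1))    # 4 ns for one instruction
print(nanoseconds_to_run(100))  # 400 ns for one hundred instructions
```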
¹ The term RISC is an acronym for Reduced Instruction Set Computing. I’ll cover this term in more detail in Chapter 5.
Basic Instruction Flow
One useful division that computer architects often employ when talking about CPUs is that of front end versus back end. As you already know, when instructions are fetched from main memory, they must be decoded for execution. This fetching and decoding takes place in the processor’s front end.
You can see in Figure 3-1 that the front end roughly corresponds to the
control and I/O units in the previous chapter’s diagram of the DLW-1’s
programming model. The ALU and registers constitute the back end of the
DLW-1. Instructions make their way from the front end down through the
back end, where the work of number crunching gets done.
Figure 3-1: Front end versus back end (front end: the control unit, with the program counter and instruction register, plus the I/O unit and its data and address buses; back end: registers A–D, the processor status word, and the ALU)
We can now modify Figure 1-4 to show all four phases of execution
(see Figure 3-2).
Figure 3-2: Four phases of execution (fetch, decode, execute, write)
From here on out, we’re going to focus primarily on the code stream,
and more specifically, on how instructions enter and flow through the
microprocessor, so the diagrams will need to leave out the data and results
streams entirely. Figure 3-3 presents a microprocessor’s basic instruction flow
in a manner that’s straightforward, yet easily elaborated upon.
Figure 3-3: Basic instruction flow (front end: fetch and decode; back end: execute in the ALU, then write)
In Figure 3-3, instructions flow from the front end’s fetch and decode
phases into the back end’s execute and write phases. (Don’t worry if this
seems too simple. As the level of complexity of the architectures under
discussion increases, so will the complexity of the diagrams.)
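If it helps to see that division in code form, here is a rough Python sketch of Figure 3-3’s instruction flow. The split of labor (fetch and decode in the front end; execute and write in the back end) is taken from the text, but the data structures and the toy instruction format are my own illustration, not the DLW-1’s real instruction encoding.

```python
# Rough, illustrative model of Figure 3-3: instructions flow from the
# front end (fetch, decode) into the back end (execute, write).
# Each "instruction" here is just a tuple: (opcode, src1, src2, dest).
program = [("add", "A", "B", "C"), ("add", "C", "D", "D")]
registers = {"A": 1, "B": 2, "C": 0, "D": 3}
program_counter = 0

def front_end():
    global program_counter
    # Fetch: read the instruction the program counter points to,
    # then increment the program counter.
    instruction = program[program_counter]
    program_counter += 1
    # Decode: break the instruction into an opcode and its operands.
    return instruction

def back_end(opcode, src1, src2, dest):
    # Execute: this toy ALU only knows how to add.
    if opcode != "add":
        raise ValueError(f"unknown opcode: {opcode}")
    result = registers[src1] + registers[src2]
    # Write: store the result back into the destination register.
    registers[dest] = result

while program_counter < len(program):
    back_end(*front_end())

print(registers)  # {'A': 1, 'B': 2, 'C': 3, 'D': 6}
```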
Pipelining Explained
Let’s say my friends and I have decided to go into the automotive manufacturing business and that our first product is to be a sport utility vehicle
(SUV). After some research, we determine that there are five stages in
the SUV-building process:
Stage 1: Build the chassis.
Stage 2: Drop the engine into the chassis.
Stage 3: Put the doors, a hood, and coverings on the chassis.
Stage 4: Attach the wheels.
Stage 5: Paint the SUV.
Each of these stages requires the use of highly trained workers with very
specialized skill sets—workers who are good at building chasses don’t know
much about engines, bodywork, wheels, or painting, and likewise for engine
builders, painters, and the other crews. So when we make our first attempt to
put together an SUV factory, we hire and train five crews of specialists, one
for each stage of the SUV-building process. There’s one crew to build the
chassis, one to drop the engines, one to put the doors, hood, and coverings
on the chassis, another for the wheels, and a painting crew. Finally, because
the crews are so specialized and efficient, each stage of the SUV-building
process takes a crew exactly one hour to complete.
Now, since my friends and I are computer types and not industrial engineers, we had a lot to learn about making efficient use of factory resources.
We based the functioning of our first factory on the following plan: Place all
five crews in a line on the factory floor, and have the first crew start an SUV at Stage 1. After Stage 1 is complete, the Stage 1 crew passes the partially finished SUV off to the Stage 2 crew and then hits the break room to play some foosball, while the Stage 2 crew builds the engine and drops it in. Once the
Stage 2 crew is done, the SUV moves down to Stage 3, and the Stage 3 crew
takes over, while the Stage 2 crew joins the Stage 1 crew in the break room.
The SUV moves on down the line through all five stages in this way, with
only one crew working on one stage at any given time while the rest of the
crews sit idle. Once the completed SUV finishes Stage 5, the crew at Stage 1
starts on another SUV. At this rate, it takes exactly five hours to finish a single SUV, and our factory completes one SUV every five hours.
In Figure 3-4, you can see the SUV pass through all five stages. The SUV
enters the factory floor at the beginning of the first hour, where the Stage 1
crew begins work on it. Notice that all of the other crews are sitting idle while the Stage 1 crew does its work. At the beginning of the second hour, the
Stage 2 crew takes over, and the other four crews sit idle while waiting on
Stage 2. This process continues as the SUV moves down the line, until at the beginning of the sixth hour, one SUV stands completed while another has entered Stage 1.
Figure 3-4: The lifecycle of an SUV in a non-pipelined factory (the factory floor and the completed SUVs shown hour by hour, from 1hr through 6hr)
Fast-forward one year. Our SUV, the Extinction LE, is selling like . . .
well, it’s selling like an SUV, which means it’s doing pretty well. In fact, our SUV is selling so well that we’ve attracted the attention of the military and
have been offered a contract to provide SUVs to the U.S. Army on an ongoing
basis. The Army likes to order multiple SUVs at a time; one order might
come in for 10 SUVs, and another order might come in for 500 SUVs. The
more of these orders that we can fill each fiscal year, the more money we can
make during that same period and the better our balance sheet looks. This,
of course, means that we need to find a way to increase the number of SUVs
that our factory can complete per hour, known as our factory’s SUV completion rate. By completing more SUVs per hour, we can fill the Army’s orders faster and make more money each year.
The most intuitive way to go about increasing our factory’s SUV completion rate is to try and decrease the production time of each SUV. If we can
get the crews to work twice as fast, our factory can produce twice as many
SUVs in the same amount of time. Our crews are already working as hard
as they can, though, so unless there’s a technological breakthrough that
increases their productivity, this option is off the table for now.
Since we can’t speed up our crews, we can always use the brute-force
approach and just throw money at the problem by building a second assembly
line. If we hire and train five new crews to form a second assembly line, also
capable of producing one car every five hours, we can complete a grand total
of two SUVs every five hours from the factory floor, double the SUV completion rate of our present factory. This doesn’t seem like a very efficient use of factory resources, though, since not only do we have twice as many crews
working at once but we also have twice as many crews in the break room at
once. There has to be a better way.
Faced with a lack of options, we hire a team of consultants to figure out a
clever way to increase overall factory productivity without either doubling the
number of crews or increasing each individual crew’s productivity. One year
and thousands of billable hours later, the consultants hit upon a solution.
Why let our crews spend four-fifths of their work day in the break room,
when they could be doing useful work during that time? With proper scheduling of the existing five crews, our factory can complete one SUV each hour, thus drastically improving both the efficiency and the output of our assembly line. The revised workflow would look as follows:
1. The Stage 1 crew builds a chassis.
2. Once the chassis is complete, they send it on to the Stage 2 crew.
3. The Stage 2 crew receives the chassis and begins dropping the engine in, while the Stage 1 crew starts on a new chassis.
4. When both Stage 1 and Stage 2 crews are finished, the Stage 2 crew’s work advances to Stage 3, the Stage 1 crew’s work advances to Stage 2, and the Stage 1 crew starts on a new chassis.
Figure 3-5 illustrates this workflow in action. Notice that multiple crews
have multiple SUVs simultaneously in progress on the factory floor. Compare
this figure to Figure 3-4, where only one crew is active at a time and only one
SUV is on the factory floor at a time.
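To put some numbers behind the analogy, here is a small Python sketch of the two schedules; it is my own illustration, not something from the original text. With five one-hour stages, the non-pipelined factory completes one SUV every five hours, while the pipelined factory, once its first SUV rolls off the line at the end of hour five, completes one more every hour after that.

```python
# Illustrative throughput comparison for the five-stage SUV factory.
STAGES = 5           # chassis, engine, body, wheels, paint
HOURS_PER_STAGE = 1  # each crew takes exactly one hour per stage

def completed_without_pipelining(hours):
    # Only one SUV is on the factory floor at a time, so each SUV takes
    # STAGES * HOURS_PER_STAGE hours from start to finish.
    return hours // (STAGES * HOURS_PER_STAGE)

def completed_with_pipelining(hours):
    # A new SUV enters Stage 1 every hour. The first SUV emerges once the
    # line fills (after five hours); one more emerges every hour after that.
    fill_time = STAGES * HOURS_PER_STAGE
    return 0 if hours < fill_time else hours - fill_time + 1

for hours in (5, 10, 100):
    print(hours, completed_without_pipelining(hours), completed_with_pipelining(hours))
# 5 hours:   1 SUV  vs. 1 SUV
# 10 hours:  2 SUVs vs. 6 SUVs
# 100 hours: 20 SUVs vs. 96 SUVs
```

As the runs get longer, the pipelined factory’s advantage approaches a factor of five, one new SUV per hour instead of one every five hours.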