The Elements of Computing Systems: Building a Modern Computer from First Principles

Conflicting Uses of the A Register
As was just illustrated, the programmer can use the A register to select either a data memory location for a subsequent C-instruction involving M, or an instruction memory location for a subsequent C-instruction involving a jump. Thus, to prevent conflicting use of the A register, in well-written programs a C-instruction that may cause a jump (i.e., with some non-zero j bits) should not contain a reference to M, and vice versa.
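The rule above lends itself to a mechanical check. What follows is a minimal sketch in Python, assuming symbolic C-instructions of the form dest=comp;jump in which the dest and jump fields are optional. The function name mixes_m_and_jump is hypothetical and is not part of any tool supplied with the book.

```python
def mixes_m_and_jump(instruction: str) -> bool:
    """Return True if a symbolic C-instruction both references M and may jump."""
    text = instruction.replace(" ", "")
    # Split off the optional jump field (everything after ';').
    body, _, jump = text.partition(";")
    # Split off the optional dest field (everything before '=').
    dest, equals, comp = body.partition("=")
    if not equals:
        # No '=' present: the whole body is the comp field.
        dest, comp = "", body
    # The jump field is deliberately excluded here: the mnemonic JMP contains
    # the letter M but does not reference memory.
    references_m = "M" in dest or "M" in comp
    may_jump = jump != ""
    return references_m and may_jump
```

Under these assumptions, D=M;JMP is flagged, whereas M=D and 0;JMP are not.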
4.2.4 Symbols
Assembly commands can refer to memory locations (addresses) using either constants or symbols. Symbols are introduced into assembly programs in the following three ways:

Predefined symbols:
A special subset of RAM addresses can be referred to by any assembly program using the following predefined symbols:

Virtual registers:
To simplify assembly programming, the symbols R0 to R15 are predefined to refer to RAM addresses 0 to 15, respectively.

Predefined pointers:
The symbols SP, LCL, ARG, THIS, and THAT are predefined to refer to RAM addresses 0 to 4, respectively. Note that each of these memory locations has two labels. For example, address 2 can be referred to using either R2 or ARG. This syntactic convention will come into play in the implementation of the virtual machine, discussed in chapters 7 and 8.

I/O pointers:
The symbols SCREEN and KBD are predefined to refer to RAM addresses 16384 (0x4000) and 24576 (0x6000), respectively, which are the base addresses of the screen and keyboard memory maps. The use of these I/O devices is explained later.

Label symbols:
These user-defined symbols, which serve to label destinations of goto commands, are declared by the pseudo-command “(Xxx)”. This directive defines the symbol Xxx to refer to the instruction memory location holding the next command in the program. A label can be defined only once and can be used anywhere in the assembly program, even before the line in which it is defined.

Variable symbols:
Any user-defined symbol Xxx appearing in an assembly program that is not predefined and is not defined elsewhere using the “(Xxx)” command is treated as a variable, and is assigned a unique memory address by the assembler, starting at RAM address 16 (0x0010).
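The three kinds of symbols can be summarized in a small symbol table. The following Python sketch is illustrative only: the names PREDEFINED and resolve_symbols are hypothetical, and the choice to hand out consecutive addresses to variables in order of first appearance is an assumption of this sketch (the text above only requires unique addresses starting at 16).

```python
# Predefined symbols of the Hack assembly language, as listed above.
PREDEFINED = {f"R{i}": i for i in range(16)}
PREDEFINED.update({"SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
                   "SCREEN": 16384, "KBD": 24576})

def resolve_symbols(a_symbols, labels):
    """Map every symbol to an address.

    a_symbols: symbols in the order they appear in @symbol commands.
    labels: dict mapping label symbols, declared with "(Xxx)", to
            instruction memory addresses.
    """
    table = dict(PREDEFINED)
    table.update(labels)
    next_variable = 16  # assumption: variables get 16, 17, 18, ...
    for symbol in a_symbols:
        if symbol not in table:
            # Neither predefined nor a label, so it is a variable.
            table[symbol] = next_variable
            next_variable += 1
    return table
```

For example, given the symbols i, LOOP, sum, i in that order and the label LOOP bound to instruction address 4, the sketch assigns i to RAM address 16 and sum to RAM address 17, and leaves LOOP at 4.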
4.2.5 Input/Output Handling
The Hack platform can be connected to two peripheral devices: a screen and a keyboard. Both devices interact with the computer platform through memory maps. This means that drawing pixels on the screen is achieved by writing binary values into a memory segment associated with the screen. Likewise, listening to the keyboard is done by reading a memory location associated with the keyboard. The physical I/O devices and their memory maps are synchronized via continuous refresh loops.
 
Screen
The Hack computer includes a black-and-white screen organized as 256 rows of 512 pixels per row. The screen’s contents are represented by an 8K memory map that starts at RAM address 16384 (0x4000). Each row in the physical screen, starting at the screen’s top left corner, is represented in the RAM by 32 consecutive 16-bit words. Thus the pixel at row r from the top and column c from the left is mapped on the c%16 bit (counting from LSB to MSB) of the word located at RAM[16384 + r · 32 + c/16], where c/16 denotes integer division. To write or read a pixel of the physical screen, one writes or reads the corresponding bit in the RAM-resident memory map (1 = black, 0 = white).
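The pixel-to-bit mapping just described can be sketched in Python as follows. The RAM is modelled here as a plain list of 16-bit integers, and the names pixel_location and set_pixel are hypothetical helpers introduced only for illustration.

```python
SCREEN = 16384  # base address of the screen memory map

def pixel_location(row: int, col: int):
    """Return (word_address, bit_index) for the pixel at (row, col)."""
    if not (0 <= row < 256 and 0 <= col < 512):
        raise ValueError("pixel outside the 256 x 512 screen")
    # 32 words per row; 16 pixels per word; bit 0 is the LSB.
    return SCREEN + row * 32 + col // 16, col % 16

def set_pixel(ram, row, col, black=True):
    """Blacken or whiten one pixel in a RAM modelled as a list of words."""
    address, bit = pixel_location(row, col)
    if black:
        ram[address] |= 1 << bit
    else:
        ram[address] &= ~(1 << bit) & 0xFFFF
```

For instance, the top left pixel (row 0, column 0) is bit 0 of RAM[16384], and the bottom right pixel (row 255, column 511) is bit 15 of RAM[24575], the last word of the 8K map.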
Keyboard
The Hack computer interfaces with the physical keyboard via a single-word memory map located in RAM address 24576 (0x6000). Whenever a key is pressed on the physical keyboard, its 16-bit ASCII code appears in RAM[24576]. When no key is pressed, the code 0 appears in this location. In addition to the usual ASCII codes, the Hack keyboard recognizes the keys shown in figure 4.6.
4.2.6 Syntax Conventions and File Format
Binary Code Files
A binary code file is composed of text lines. Each line is a sequence of sixteen “0” and “1” ASCII characters, coding a single machine language instruction. Taken together, all the lines in the file represent a machine language program. The contract is such that when a machine language program is loaded into the computer’s instruction memory, the binary code represented by the file’s nth line is stored in address n of the instruction memory (the count of both program lines and memory addresses starts at 0). By convention, machine language programs are stored in text files with a “hack” extension, for example, Prog.hack.
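The loading contract can be illustrated with a short Python sketch. The function name load_hack is hypothetical, and the instruction memory is modelled as a list whose index n holds the instruction stored at address n.

```python
def load_hack(text: str):
    """Parse the contents of a .hack file into a list of 16-bit integers."""
    rom = []
    for number, line in enumerate(text.splitlines()):
        # Each line must consist of exactly sixteen '0' and '1' characters.
        if len(line) != 16 or set(line) - {"0", "1"}:
            raise ValueError(f"line {number}: expected sixteen 0/1 characters")
        rom.append(int(line, 2))
    return rom
```

Because both line numbers and addresses start at 0, the first line of the file ends up in rom[0], the second in rom[1], and so on.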
 
Assembly Language Files
By convention, assembly language programs are stored in text files with an “asm” extension, for example, Prog.asm. An assembly language file is composed of text lines, each representing either an instruction or a symbol declaration:

Instruction:
an A-instruction or a C-instruction.

(Symbol):
This pseudo-command causes the assembler to assign the label Symbol to the memory location in which the next command of the program will be stored. It is called a “pseudo-command” since it generates no machine code.

Figure 4.6 Special keyboard codes in the Hack platform.
 
(The remaining conventions in this section pertain to assembly programs only.)
 
Constants and Symbols
Constants must be non-negative and are always written in decimal notation. A user-defined symbol can be any sequence of letters, digits, underscore (_), dot (.), dollar sign ($), and colon (:) that does not begin with a digit.
 
Comments
Text beginning with two slashes (//) and ending at the end of the line is considered a comment and is ignored.
 
White Space
Space characters are ignored. Empty lines are ignored.
 
Case Conventions
All the assembly mnemonics must be written in uppercase. The rest (user-defined labels and variable names) is case sensitive. The convention is to use uppercase for labels and lowercase for variable names.
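The conventions on comments, white space, constants, and symbols can be sketched as a small line classifier in Python. This is an illustration under stated assumptions, not the assembler built in chapter 6: the names clean and classify are hypothetical, “letters” is taken to mean ASCII letters, and C-instructions are not validated.

```python
import re

# Letters, digits, '_', '.', '$', ':'; must not begin with a digit.
SYMBOL = re.compile(r"[A-Za-z_.$:][A-Za-z0-9_.$:]*")
CONSTANT = re.compile(r"[0-9]+")  # non-negative, decimal notation

def clean(line: str) -> str:
    """Strip the comment and all white space from one source line."""
    code = line.split("//", 1)[0]
    return "".join(code.split())

def classify(line: str) -> str:
    """Return 'blank', 'label', 'A', or 'C' for one source line."""
    code = clean(line)
    if not code:
        return "blank"
    if code.startswith("(") and code.endswith(")"):
        if not SYMBOL.fullmatch(code[1:-1]):
            raise ValueError(f"invalid label symbol: {code}")
        return "label"
    if code.startswith("@"):
        operand = code[1:]
        if not (CONSTANT.fullmatch(operand) or SYMBOL.fullmatch(operand)):
            raise ValueError(f"invalid constant or symbol: {code}")
        return "A"
    return "C"  # everything else is assumed to be a C-instruction
```

Since symbols are case sensitive, the sketch never changes the case of its input; LOOP and loop remain distinct symbols.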
4.3 Perspective
The Hack machine language is almost as simple as machine languages get. Most computers have more instructions, more data types, more registers, more instruction formats, and more addressing modes. However, any feature not supported by the Hack machine language may still be implemented in software, at a performance cost. For example, the Hack platform does not supply multiplication and division as primitive machine language operations. Since these operations are obviously required by any high-level language, we will later implement them at the operating system level (chapter 12).
In terms of syntax, we have chosen to give Hack a somewhat different look-and-feel than the mechanical nature of most assembly languages. In particular, we have chosen a high-level language-like syntax for the C-command, for example, D=M and D=D+M instead of the more traditional LOAD and ADD directives. The reader should note, however, that these are just syntactic details. For example, the + character plays no algebraic role whatsoever in the command D=D+M. Rather, the three-character string D+M, taken as a whole, is treated as a single assembly mnemonic, designed to code a single ALU operation.
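To make the last point concrete, the following Python sketch treats each comp mnemonic as an opaque dictionary key. Only a handful of entries are shown; the 7-bit codes (the a-bit followed by the six c-bits) follow the C-instruction specification given earlier in the chapter, and the names COMP and encode_comp are hypothetical.

```python
# A few comp mnemonics and their 7-bit codes: a-bit, then c1..c6.
COMP = {
    "D":   "0001100",
    "M":   "1110000",
    "D+A": "0000010",
    "D+M": "1000010",
}

def encode_comp(mnemonic: str) -> str:
    """Look up a comp mnemonic as a whole string; no expression is parsed."""
    return COMP[mnemonic]
```

The string D+M is found by exact lookup, while a string such as D*M is simply absent from the table and is rejected with a KeyError. No arithmetic expression is ever evaluated.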
One of the main characteristics that gives machine languages their particular flavor is the number of memory addresses that can appear in a single command. In this respect, Hack may be described as a “½ address machine”: Since there is no room to pack both an instruction code and a 15-bit address in the 16-bit instruction format, operations involving memory access will normally be specified in Hack using two instructions: an A-instruction to specify the address and a C-instruction to specify the operation. In comparison, most machine languages can directly specify at least one address in every machine instruction.
Indeed, Hack assembly code typically ends up being (mostly) an alternating sequence of A- and C-instructions, for example, @xxx followed by D=D+M, @YYY followed by 0;JMP, and so on. If you find this coding style tedious or even peculiar, you should note that friendlier macro commands like D=D+M[xxx] and GOTO YYY can easily be introduced into the language, causing Hack assembly code to be more readable as well as about 50 percent shorter. The trick is to have the assembler translate these macro commands into binary code effecting @xxx followed by D=D+M, @YYY followed by 0;JMP, and so on.
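The macro idea can be sketched as a textual preprocessing step in Python. The two macro forms handled here are the ones named in the text; everything else (the function name expand_macro, the regular expression, and the refusal to expand commands that also use the A register) is an assumption of this sketch rather than a feature of the Hack assembler.

```python
import re

# Matches a command containing one memory operand of the form M[symbol].
MEMORY_OPERAND = re.compile(r"^(?P<before>.*)M\[(?P<symbol>[^\]]+)\](?P<after>.*)$")

def expand_macro(command: str):
    """Expand one macro command into a list of plain Hack commands."""
    parts = command.split()
    # GOTO YYY  ->  @YYY followed by 0;JMP
    if len(parts) == 2 and parts[0] == "GOTO":
        return ["@" + parts[1], "0;JMP"]
    text = "".join(parts)
    match = MEMORY_OPERAND.match(text)
    if match is None:
        return [text]  # not a macro: pass through unchanged
    rest = match["before"] + match["after"]
    if "A" in rest:
        # The generated A-instruction would overwrite the A register.
        raise ValueError("macro with M[...] must not also use A: " + command)
    # D=D+M[xxx]  ->  @xxx followed by D=D+M
    return ["@" + match["symbol"], match["before"] + "M" + match["after"]]
```

Each macro line expands into two plain commands, which is where the “about 50 percent shorter” estimate for macro-based source code comes from.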
The assembler, mentioned several times in this chapter, is the program responsible for translating symbolic assembly programs into executable programs written in binary code. In addition, the assembler is responsible for managing all the system- and user-defined symbols found in the assembly program, and for replacing them with physical memory addresses, as needed. We return to this translation task in chapter 6, in which we build an assembler for the Hack language.
4.4 Project
Objective
Get a taste of low-level programming in machine language, and get acquainted with the Hack computer platform. In the process of working on this project, you will also become familiar with the assembly process, and you will appreciate visually how the translated binary code executes on the target hardware.
 
Resources
In this project you will use two tools supplied with the book: an assembler, designed to translate Hack assembly programs into binary code, and a CPU emulator, designed to run binary programs on a simulated Hack platform.
 
Contract
Write and test the two programs described in what follows. When executed on the CPU emulator, your programs should generate the results mandated by the test scripts supplied in the project directory.
 

Multiplication Program (Mult.asm): The inputs of this program are the current values stored in R0 and R1 (i.e., the two top RAM locations). The program computes the product R0*R1 and stores the result in R2. We assume (in this program) that R0>=0, R1>=0, and R0*R1<32768. Your program need not test these conditions, but rather assume that they hold. The supplied Mult.tst and Mult.cmp scripts will test your program on several representative data values.
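Since Hack has no multiplication primitive, the product must be built from operations the ALU does offer. One possible approach, shown here as a Python reference model and not as the intended solution, is repeated addition. The function name mult_reference is hypothetical, and the sketch relies on the stated assumptions R0>=0, R1>=0, and R0*R1<32768.

```python
def mult_reference(r0: int, r1: int) -> int:
    """Compute r0 * r1 using only addition, decrement, and a test against zero."""
    product = 0        # plays the role of R2
    remaining = r1     # working copy, so the value of R1 is left unchanged
    while remaining > 0:
        product += r0
        remaining -= 1
    return product
```

A model of this kind can serve as an oracle when checking the values that the assembly program leaves in R2.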
