that both memory addresses and instructions are ordinary numbers
that can be stored in memory. All of the instructions in a program like
Program 1-1 are represented inside the computer as strings of numbers.
Indeed, a program is one long string of numbers stored in a series of
memory locations.
How is a program like Program 1-1 rendered in numerical notation so
that it can be stored in memory and executed by the computer? The answer
is simpler than you might think.
As you may already know, a computer actually only understands 1s and
0s (or “high” and “low” electric voltages), not English words like add, load, and store, or letters and base-10 numbers like A, B, 12, and 13. In order for the computer to run a program, therefore, all of its instructions must be rendered
in binary notation. Think of translating English words into Morse code’s dots and dashes and you’ll have some idea of what I’m talking about.
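As a small illustration of binary notation, the following Python sketch renders the base-10 numbers 12 and 13 as strings of 1s and 0s. The helper name to_binary and the 8-bit width are assumptions made for this example; they are not part of the DLW-1.

```python
def to_binary(value: int, width: int = 8) -> str:
    """Return value as a zero-padded binary string of the given width."""
    if value < 0 or value >= 2 ** width:
        raise ValueError("value does not fit in the requested width")
    return format(value, f"0{width}b")


print(to_binary(12))  # 00001100
print(to_binary(13))  # 00001101
```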
Machine Language on the DLW-1
The translation of programs of any complexity into this binary-based machine language is a massive undertaking that’s meant to be done by a computer, but I’ll show you the basics of how it works so you can understand what’s going
on. The following example is simplified, but useful nonetheless.
The English words in a program, like add, load, and store, are mnemonics (meaning they’re easy for people to remember), and they’re all mapped to
strings of binary numbers, called opcodes, that the computer can understand.
Each opcode designates a different operation that the processor can perform.
Table 2-1 maps each of the mnemonics used in Chapter 1 to a 3-bit opcode
for the hypothetical DLW-1 microprocessor. We can also map the four
register names to 2-bit binary codes, as shown in Table 2-2.
Table 2-1: Mapping of Mnemonics to Opcodes for the DLW-1

Mnemonic    Opcode
add         000
sub         001
load        010
store       011

Table 2-2: Mapping of Registers to Binary Codes for the DLW-1

Register    Binary Code
A           00
B           01
C           10
D           11

The binary values representing both the opcodes and the register codes
are arranged in one of a number of 16-bit (or 2-byte) formats to get a complete
machine language instruction, which is a binary number that can be stored in RAM and used by the processor.
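Tables 2-1 and 2-2 amount to two small lookup tables. Here is a minimal Python sketch of them; the dictionary names OPCODES and REGISTERS are my own labels for this example and are not DLW-1 terminology.

```python
# Table 2-1: mnemonic -> 3-bit opcode, kept as bit strings for readability.
OPCODES = {"add": "000", "sub": "001", "load": "010", "store": "011"}

# Table 2-2: register name -> 2-bit binary code.
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}

print(OPCODES["load"])  # 010
print(REGISTERS["C"])   # 10
```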
NOTE
Because programmer-written instructions must be translated into binary codes before a computer can read them, it is common to see programs in any format (binary, assembly, or a high-level language like BASIC or C) referred to generically as
“code” or “codes.” So programmers sometimes speak of “assembler code,” “binary code,” or “C code” when referring to programs written in assembly, binary, or C
language. Programmers also will often describe the act of programming as “writing code” or “coding.” I have adopted this terminology in this book, and will henceforth use the term “code” regularly to refer generically to instruction sequences and programs.
Binary Encoding of Arithmetic Instructions
Arithmetic instructions have the simplest machine language instruction
formats, so we’ll start with them. Figure 2-1 shows the format for the machine
language encoding of a register-type arithmetic instruction.
Byte 1 (bits 0–7):   mode (bit 0) | opcode (bits 1–3) | source1 (bits 4–5) | source2 (bits 6–7)
Byte 2 (bits 8–15):  destination (bits 8–9) | 000000 (bits 10–15)
Figure 2-1: Machine language format for a register-type instruction
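The layout in Figure 2-1 can be sketched in Python as a function that assembles the fields into a 16-bit string. This is only an illustration under stated assumptions: bit 0 is taken to be the leftmost bit, the operands are assumed to be given in the order source1, source2, destination, and the function name encode_register_type is hypothetical.

```python
# Lookup tables copied from Tables 2-1 and 2-2 (arithmetic mnemonics only).
ARITHMETIC_OPCODES = {"add": "000", "sub": "001"}
REGISTERS = {"A": "00", "B": "01", "C": "10", "D": "11"}


def encode_register_type(mnemonic: str, source1: str, source2: str,
                         destination: str) -> str:
    """Build the two bytes of a register-type arithmetic instruction.

    Assumes bit 0 is the leftmost bit of the returned string.
    """
    if mnemonic not in ARITHMETIC_OPCODES:
        raise ValueError(f"not an arithmetic mnemonic: {mnemonic}")
    bits = (
        "0"                              # bit 0: mode bit, 0 = register type
        + ARITHMETIC_OPCODES[mnemonic]   # bits 1-3: opcode
        + REGISTERS[source1]             # bits 4-5: first source register
        + REGISTERS[source2]             # bits 6-7: second source register
        + REGISTERS[destination]         # bits 8-9: destination register
        + "000000"                       # bits 10-15: padding zeros
    )
    return bits[:8] + " " + bits[8:]     # show Byte 1 and Byte 2 separately


print(encode_register_type("add", "A", "B", "C"))  # 00000001 10000000
print(encode_register_type("sub", "C", "D", "A"))  # 00011011 00000000
```

Keeping the fields as bit strings rather than shifting integers makes each field line up visibly with its position in the figure.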
In a register-type arithmetic instruction (that is, an arithmetic instruction
that uses only registers and no immediate values), the first bit of the
instruction is the mode bit. If the mode bit is set to 0, then the instruction is a register-type instruction; if it’s set to 1, then the instruction is of the immediate type.
Bits 1–3 of the instruction specify the opcode, which tells the computer
what type of operation the instruction represents. Bits 4–5 specify the instruction’s
first source register, 6–7