1. Assembler Made Painless (I Hope)

by Kevin Cole

March 14, 1984

Dedicated to

CHICKEN

my favorite student


1.1. PREFACE

Updated 2020.09.10: This was written 36 years ago, and at a time when students had access to the Digital Equipment Corporation (DEC) manuals. Also, in an era before hypertext. New users stumbling onto this may be better served by the PDP-11 Processor Handbook from the University of Calgary, Department of Computer Science, which is a mere 19 years old as of this vintage.

I intend to explain much of the workings of a computer by analogy. The style is somewhat loose and flippant at times, but for the most part there is quite a bit of useful information to be absorbed. I am concentrating on what I consider to be of great importance in learning assembly language, and on the areas in which most people I have tutored had the greatest difficulty. So let’s get straight to work.

1.2. INTRODUCTION

At the present time, most computers store information as electro-magnetic impulses. Many things in the physical world are “bipolar” in nature. That is, they have two states. In the case of magnets, the way in which the electrons are aligned creates “poles” and by using electric current these poles can be reversed. This creates a switch with two conditions which can be interpreted as any of the following: on/off, true/false, 0/1, +/-… You get the picture. It is helpful to remember all of the above, because, depending on the context, the switch can represent any of the above conditions, and more. For convenience, this two-state switch (or binary switch) is referred to as a BIT.

1.3. BITS, BYTES and WORDS

Computer memory is filled with millions of these tiny electromagnetic bits. But, organized as independent elements, they cannot represent much. Therefore, bits are organized into groups of 8, 16, or 32. That is to say the computer will never interpret bits on an individual basis, but rather as part of a group of consecutive bits. This group of bits is referred to as a WORD. The size of a word varies from computer to computer. Most microcomputers have an 8-bit word, the large computers have a 32-bit word, and the PDP-11 minicomputers have 16-bit words. There are other machines which use unusual word sizes, but I won’t go into that.

Sometimes, it is not handy to grab 16 bits at once. Occasionally, (actually quite often), it is more useful to get only 8 bits at a time. 8 consecutive bits are referred to as a BYTE.

Word sizes on various machines

Computer          Type        Word size (bytes)   Word size (bits)   Notes
APPLE II-e        Micro       1                   8
Commodore VIC20   Micro       1                   8
PDP-11            Mini        2                   16
Apple Macintosh   Micro       4                   32
IBM-360           Mainframe   4                   32
VAX               Mid-size    4                   32
DEC 10            Mainframe   ?                   36                 (see note)

  • The DECsystem 10 uses a concept called variable length bytes. Not discussed here.

Let’s look at some of the more frequent uses for bits, bytes and words.

Here’s a group of 8 bits (1 byte):

0 1 1 0 1 1 0 1

This can be interpreted several different ways. Each bit could be set up as a flag or switch to indicate when a particular condition has occurred. For example, when a calculation is performed the leftmost bit (also referred to as the high order bit) might be set to 1 if the result is negative, and set to 0 for positive results. The second bit from the left might be set when there is an error in the mathematical operation being performed, for example division by zero.

Another way to interpret the byte is as a binary number. In this case:

\begin{gather*} (0 \times 2^7) + (1 \times 2^6) + (1 \times 2^5) + (0 \times 2^4) + (1 \times 2^3) + (1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0) =\\ (0 \times 128) + (1 \times 64) + (1 \times 32) + (0 \times 16) + (1 \times 8) + (1 \times 4) + (0 \times 2) + (1 \times 1) =\\ 0 + 64 + 32 + 0 + 8 + 4 + 0 + 1 =\\ 109 \end{gather*}
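The same sum can be checked mechanically. What follows is only a sketch, written in Python rather than assembler and used purely as a calculator; the function name bits_to_int is made up for the illustration.

```python
def bits_to_int(bits):
    """Interpret a string such as '01101101' as an unsigned binary number.

    The leftmost character is the high order bit.
    """
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position   # each bit times its place value
    return total


value = bits_to_int("01101101")             # the byte used in the text
print(value)
print(value == int("01101101", 2))          # Python's built-in conversion agrees
```

Each pass through the loop is one of the parenthesized terms in the sum above.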

1.4. ASCII

Other ways of interpreting it involve using the value as a code. Perhaps, as a kid, you played around with the idea of codes for passing notes between you and someone else, that told your innermost feelings about a member of the opposite sex, or something you did not want read by parents or teachers… In computerland, the code is not exactly used for that reason, but is a way of storing textual information in memory. The standard code used today is referred to as ASCII (American Standard Code for Information Interchange), pronounced “ass-kee”. In this code, every character on your terminal keyboard, or on a printer, or in a file on disk, is represented as an 8-bit code. Actually, only 7 bits are currently needed for the code. The high order bit is not used on most machines. There are 128 different characters in the ASCII code ranging in value from 0 to 127.

The first 32 codes are non-printing control-characters. These are generated by holding the control key as you strike some other key. Actually, some of these have a visible effect, however they are still referred to as non-printing. Some of these characters, because of their frequent use, also have a separate key of their own that does not have to be struck in conjunction with the control key. The most useful of these to remember are:

Control-key values and corresponding key name

Control character       Keyboard key name   Notes
Control-[           =   ESCAPE              (sometimes marked ALTMODE)
Control-H           =   BACKSPACE
Control-I           =   TAB
Control-J           =   LINE FEED           (sometimes mapped to ENTER)
Control-M           =   CARRIAGE RETURN     (sometimes mapped to ENTER)

Try these. (For example, next time you are using the machine type Control-I instead of TAB.)

(I know it may seem that I am belaboring this point, but bear with me.)

The code for a Control-A is 00000001 binary. Control-B is 00000010. Control-C = 00000011… If you understand binary as well as you indicated, you should see the pattern developing. Believe me, it’s important. If you are debugging a machine language program, and you don’t know the ASCII code, you’ll be at it more than twice as long as necessary.
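The pattern can be spelled out in a few lines. This is a sketch in Python, not something you would type into the PDP; the helper name control_code is invented here. It relies only on what the cheat sheet shows: each control character sits exactly 64 below the corresponding key in the 64-95 group.

```python
def control_code(key):
    """ASCII code produced by holding Control and striking `key`."""
    return ord(key.upper()) - 64            # drop the "64s" bit of the key's code


for key in "ABC":
    print("Control-" + key, format(control_code(key), "08b"))

# The named keys from the table above.
for key, name in [("[", "ESCAPE"), ("H", "BACKSPACE"), ("I", "TAB"),
                  ("J", "LINE FEED"), ("M", "CARRIAGE RETURN")]:
    print("Control-" + key, control_code(key), name)
```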

The next group of 32 (codes 32-63) are your digits and special symbols; punctuation, generally speaking. IMPORTANT: The ASCII code for the digit 2 is NOT the same as the value 2 !!! The digits on your keyboard and printer are merely SYMBOLS. The ASCII code for the digit 2 is 00110010 binary (or 50 decimal). If you try to print the value 2 (binary 00000010) it will print as a Control-B. Just as the same sound spoken at different pitches has different meanings in Chinese, so too, two bytes which are identical can have different meanings, depending on the context in which they are used. This is one of the most difficult and most important things to remember about assembler language.
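To see the difference between the symbol and the value, here is a small Python sketch (the variable names are just for illustration). Subtracting the code for the symbol “0” is the usual way to turn a digit symbol into the number it stands for.

```python
digit_symbol = "2"

code = ord(digit_symbol)                    # ASCII code of the SYMBOL 2
print(code, format(code, "08b"))

value = code - ord("0")                     # the VALUE the symbol stands for
print(value, format(value, "08b"))

# The byte whose value is 2 is a control character (Control-B), not a digit.
print(repr(chr(2)))
```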

Next are the upper case letters (and a few punctuation symbols). These are codes 64-95. And lastly, the lower case letters (and still a few more punctuation marks) codes 96-127.

See the cheat sheet included in this for a better understanding. I will refer to it often in here.

1.5. INSTRUCTIONS

The most common use for a byte is as part of an instruction to the computer. This too is a code. For instance, on the PDP a high-order byte of 00000011 says Branch when Equal to zero (BEQ). The codes for each instruction are so tedious to memorize, that it is not worth the trouble. For the most part, you will have access to some reference which allows you to find the code for an instruction quickly. And, since there is NO standard here, it does no good to memorize the instruction codes for a PDP-11 and then go work on an IBM. The codes are completely different.

Therefore, it is better to memorize the scheme used for the mnemonics for the various instructions. Like knowing that a mnemonic which starts with the letter “B” is most likely to be a Branch instruction. This generally holds true for all machines.

1.6. ADDRESSING

Ok, so now you have these millions of bits organized into consecutive groups. Now comes the problem of how to get to a particular group. Well, each byte is assigned an ADDRESS starting at 0 and increasing by 1 until all memory is exhausted. Unfortunately, here’s where the problems start…

I mentioned earlier that the PDP-11 “thinks” in 16-bit words. This means it usually grabs two bytes at a shot. This makes things complicated when you start to think about how memory on the 11 is organized. But I will labor to make it a bit more accessible. (Ha ha. Get it? A BIT more ACCESSible… Never mind.)

Most machines tend to number their bytes consecutively, but because Digital Equipment Corporation wants to give students a hard time, they decided to do things differently… Actually there is a method to their particular brand of madness, and I hope it will become clearer as we continue. Above, I mentioned that the computer we are using here works in units of 16 bits, or 2 bytes. And each byte has an address. Because it uses a 16-bit word, it can work with positive integers between 0 and \(2^{16} - 1\) (0000000000000000 to 1111111111111111, which is 10000000000000000 minus 1, take my word for it). Representing negative numbers becomes a bit more complicated. It uses a scheme called 2’s-complement notation. More on that subject later. Because the rightmost 8 bits hold the lower portion of the value, it is considered to be the low-order byte and has the lower address value. This is always the even-addressed byte. And the high-order byte is the odd-addressed byte.

Imagining it as boxes filled with values:

Address:    1          0          3          2          5          4
Contents:   00000011   11010000   10101010   11110000   00000000   00000000

The numbers in the top row represent the address of each byte in memory. The values beneath them are the contents of each byte. Confused yet? If not, I’ll try harder to confuse you.
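If it helps, here is the same picture as a Python sketch. The list named memory and the helper word_at are inventions for this illustration; the byte values are the ones in the boxes, stored so that the list position is the address.

```python
# Position in the list = byte address, contents taken from the boxes above.
memory = [0b11010000, 0b00000011,    # addresses 0 and 1
          0b11110000, 0b10101010,    # addresses 2 and 3
          0b00000000, 0b00000000]    # addresses 4 and 5


def word_at(memory, address):
    """Return the 16-bit word that starts at an even byte address."""
    if address % 2 != 0:
        raise ValueError("words start on even addresses")
    low = memory[address]            # even address: low-order byte
    high = memory[address + 1]       # odd address: high-order byte
    return (high << 8) | low         # high byte goes in the left half


for address in (0, 2, 4):
    print(address, format(word_at(memory, address), "016b"))
```

Reading the boxes left to right, odd byte then even byte, gives exactly the 16-bit pattern the function builds.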

1.7. OCTAL

After a while, it becomes very tedious work to look at everything in binary. So on most machines, HEXADECIMAL or BASE 16 is used to represent groups of 4 bits conveniently. Thus, the contents (or the address) of a 16-bit word could be written as 4 hexadecimal digits. However, DEC chose OCTAL or BASE 8. There is no justifiable reason for this. It is one of very few things DEC did which is just plain stupid. You’ll see why as we go on. Anyway, octal represents 3 bits as a single digit. Refer to the cheat sheet for an illustration of how to count in any of the useful computer numbering systems. I am going to assume you either are already sufficiently familiar with it from previous schooling or that your assembly language teacher has covered it in depth.
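The grouping idea (3 bits per octal digit, 4 bits per hexadecimal digit) can be sketched in Python. The helper names group_bits and to_digits are assumptions made for this example, and the 16-bit pattern is simply a sample word.

```python
def group_bits(bits, size):
    """Split a bit string into groups of `size` bits, counting from the right."""
    groups_needed = -(-len(bits) // size)            # round up
    padded = bits.zfill(groups_needed * size)        # pad on the left with zeros
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def to_digits(bits, size):
    """One digit per group: size 3 gives octal, size 4 gives hexadecimal."""
    return "".join(format(int(group, 2), "X") for group in group_bits(bits, size))


word = "0000001111010000"
print(group_bits(word, 3), to_digits(word, 3))       # six octal digits
print(group_bits(word, 4), to_digits(word, 4))       # four hexadecimal digits
```

Notice that 16 does not divide evenly by 3, so the leftmost octal digit of a word only ever covers one real bit.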

1.8. GENERAL PURPOSE REGISTERS (or ACCUMULATORS)

Because memory is usually quite large, and it takes time for a computer to reference a particular address and manipulate the contents, computer designers created special locations which can be used as a fast memory and can be used easily with most instructions. These are called ACCUMULATORS on some machines and REGISTERS on others. The PDP-11 has 8 such registers, which are numbered %0 to %7. Two of these have highly specialized functions. %6 is known as the Stack Pointer (SP) and %7 is the Program Counter (PC). More on both of these later.

1.9. ADDRESSING MODES

You probably skipped right to this section because you have no patience. Well, good luck… cause if you did you may have missed something very important.

Generally speaking, to manipulate data in the computer, information is obtained from some auxiliary device (INPUT) and moved into memory. Then it is moved from memory to a register, where it is bent, folded, spindled, and mutilated (i.e. added to, subtracted from, etc.), moved back into memory and finally moved to an auxiliary device again (OUTPUT). Often the information must undergo a transformation or conversion from some external form (such as the ASCII code) to an internal representation (such as binary integer). One word of memory only holds 2 characters, maximum. This means that if we enter a 5 digit number from the keyboard, it will take up 2 and 1/2 words of memory in its ASCII form. Remember, each character entered from the keyboard occupies 1 byte of memory.

So, let’s say we entered the string "98760". Each character would be stored in ascending bytes. The octal codes for each digit are: 071, 070, 067, 066, 060, respectively. In memory, represented in octal and binary this would be:

Address:   1          0          3          2          5          4
Octal:     070        071        066        067        000        060
Binary:    00111000   00111001   00110110   00110111   00000000   00110000

By crushing the two halves of each word together we get:

     
Octal:     034071             033067             000060
Binary:    0011100000111001   0011011000110111   0000000000110000

(It is still the same binary value, but when shown as a 6-digit octal value representing the word as a single value, the distinction between the 2 bytes becomes a bit blurred. This would not have happened if DEC had chosen hexadecimal. If the above illustration is not clear, we will have to go over it together.)
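The crushing can be reproduced with a short Python sketch. The function name pack_words is made up here, and the zero byte used to fill out the last word is an assumption that matches the illustration above.

```python
def pack_words(text):
    """Store ASCII text in ascending bytes and return the 16-bit words."""
    data = text.encode("ascii")
    if len(data) % 2:                       # odd number of characters:
        data += b"\x00"                     # fill the last word with a zero byte
    words = []
    for offset in range(0, len(data), 2):
        low = data[offset]                  # even address: low-order byte
        high = data[offset + 1]             # odd address: high-order byte
        words.append((high << 8) | low)
    return words


for word in pack_words("98760"):
    high, low = word >> 8, word & 0o377
    print(format(high, "03o"), format(low, "03o"), "->", format(word, "06o"))
```

The two 3-digit byte values on the left and the 6-digit word on the right are the same bits; only the places where the octal digits are cut differ.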

There are several different ways to access the above bytes. The simplest is to use ABSOLUTE addressing. This means when referencing a location in memory, you give the address of that particular word in the instruction. So, if the 3 words in the above example occupied memory locations 400 to 405 (octal), and we wish to move the first word to another location, we would specify the value 400 as the address. The format for this is “@#400”.

Absolute addressing, though simple, has several drawbacks. First of all, some instructions were implemented by the manufacturer in such a way that they cannot use absolute addressing. Secondly, the addresses can range from 000000 to 177777 (octal). This means that when using absolute addressing in an instruction, one word of computer memory is used for the instruction itself, and another for the absolute address. This eats up space quickly. Modes that work purely through registers do not share this deficiency. And it does not have much flexibility. If you change the location of your data area in memory, all of your program’s absolute addresses must be changed as well.

In both FORTRAN and COBOL you encountered the idea of an indexed table (also called an array or matrix). This is used when you have several locations which are all related in some way (for example a list of department names which you select from based on the department number):

COBOL: MOVE DEPART-LIST (DEPT-NUMBER-IN) TO DEPT-OUT.
FORTRAN: DEPOUT = DEPT(CODE)

The variables in the parentheses are indexes into the arrays. Well, in INDEXED addressing, the same concept applies. The index value is kept in a register. In the above example, the first byte would be referenced by placing a value of 0 into a register (for now let’s use %2) and using an address like “400(%2)”. The “EFFECTIVE” address, i.e. the actual address from which the computer fetches the data, is computed by adding the contents of register %2 to the offset 400. The result is 400. Now if we add 1 to register %2 the effective address becomes 401 but the instruction did not have to be changed, only the contents of the index register. You can choose any register to be your index (except the PC and SP), as long as you are not using it for anything else at that point in your program.
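Here is the effective address calculation as a Python sketch. The dictionary named registers and the function effective_address are stand-ins invented for the example; the only thing taken from the machine is that addresses are 16 bits wide.

```python
def effective_address(offset, register_value):
    """Indexed mode: the offset in the instruction plus the index register."""
    return (offset + register_value) & 0o177777      # keep the result to 16 bits


registers = {2: 0}                                   # %2 starts out at 0

print(format(effective_address(0o400, registers[2]), "o"))   # first byte

registers[2] += 1                                    # only the register changes...
print(format(effective_address(0o400, registers[2]), "o"))   # ...the "instruction" does not
```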

In COBOL or FORTRAN you would have a separate instruction to add one to your index… The same is true for indexed addressing. However there is an addressing mode which will automatically increment the index after it is used in an instruction. As you have probably guessed by now, this is the AUTO-INCREMENT addressing mode and is specified as “(%2)+”. With this mode, however, you are not allowed to use an offset. Therefore, for our example, you would have to set register %2 to a value of 400 before using this mode. Its crude equivalent in COBOL would be:

MOVE DEPART-LIST (DEPT-NUMBER-IN) TO DEPT-OUT.
ADD 1 TO DEPT-NUMBER-IN.

Now the index is automatically set up for the next pass through the loop to reference the next element in the table.

Now it seems appropriate to introduce two topics at once. These are the STACK and AUTO-DECREMENT mode. Suppose you have part of a calculation completed, and you have another quantity to compute. You do not want to lose the first part while calculating the next part. One solution is to move it from the register where you’ve chosen to store it, to a temporary memory location, and remember where you’ve left it… The STACK POINTER remembers it for you, and the STACK is the temporary area where you keep it. And both auto-increment and auto-decrement are used to control this process.

Ok. This is a typical, overused analogy, but it still works well. When you eat at the cafeteria, you often pick up a tray from a STACK of trays. As you pick yours up, a spring beneath the stack pops the next tray up a little. And when you replace your tray, it pushes everything down slightly. Now imagine each tray has a piece of paper with a number on it. Each tray is a word of memory and the paper is the contents of that word. On the PDP-11 the hardware stack grows toward lower addresses, so auto-decrement mode, “-(%6)”, is analogous to PUSHing a tray onto the stack and auto-increment mode, “(%6)+”, corresponds to POPping the stack. (A stack that you build yourself with some other register can just as well grow upward, pushing with auto-increment and popping with auto-decrement.)

Auto-decrement mode works in reverse. It subtracts 1 from the index first, then it uses the result to calculate the effective address. So now, to go back to our example, we could go through our bytes of memory backwards. Starting with a 406 in register %2 and using “-(%2)” would access the byte located at location 405 (the unused filler byte). Using it again would get the byte at 404 (the last character entered), and so on.

Look at the following sequence of code:

MOVE 400 TO STACK-POINTER.
   .
   .    (Calculate something called TOT-1 here)
   .
MOVE TOT-1 TO MEMORY (STACK-POINTER).
ADD  1 TO STACK-POINTER.
   .
   .    (Calculate something called TOT-2 here)
   .
SUBTRACT 1 FROM STACK-POINTER.
ADD MEMORY (STACK-POINTER) TO TOT-2.

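That’s roughly what you should think of auto-inc and auto-dec as doing. And that’s a good approximation of the stack also, with one difference: the real PDP-11 stack grows downward, so the hardware pushes with auto-decrement and pops with auto-increment.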
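For anyone who would rather run it than read it, here is the same sequence as a Python sketch. It mirrors the pseudo-COBOL step for step, so it grows upward one location at a time exactly as written there. The class name Stack and the sample totals 25 and 17 are made up for the illustration.

```python
class Stack:
    """A temporary save area plus the pointer that remembers where things are."""

    def __init__(self, start):
        self.memory = {}                     # address -> contents
        self.pointer = start                 # MOVE 400 TO STACK-POINTER.

    def push(self, value):
        self.memory[self.pointer] = value    # MOVE ... TO MEMORY (STACK-POINTER).
        self.pointer += 1                    # ADD 1 TO STACK-POINTER.

    def pop(self):
        self.pointer -= 1                    # SUBTRACT 1 FROM STACK-POINTER.
        return self.memory[self.pointer]     # fetch MEMORY (STACK-POINTER)


stack = Stack(0o400)

tot_1 = 25                                   # (calculate something called TOT-1)
stack.push(tot_1)                            # tuck it away

tot_2 = 17                                   # (calculate something called TOT-2)
tot_2 += stack.pop()                         # bring TOT-1 back and add it in

print(tot_2, format(stack.pointer, "o"))
```

After the pop the pointer is back where it started, which is the whole point: whatever you save, you retrieve in the reverse order.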

I’ve been misleading you somewhat with these examples. Actually, auto-increment and auto-decrement can add/subtract 1 from a register or it can add/subtract 2 from a register. This depends on the kind of instruction you are using. I mentioned at the beginning of this text that the PDP-11 usually handles memory in 16-bit quantities called words. This means it usually uses only the even addresses and works with the even/odd pair of bytes at that location in memory. However, using the byte instructions, you can force counting by 1 and only access 8 bits at a time.
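The difference between the two modes, and between stepping by 1 and by 2, can be sketched in Python. Everything named here (Register, auto_increment, auto_decrement, the byte_instruction flag) is an assumption for illustration, and the sketch ignores any special treatment of %6 and %7.

```python
class Register:
    """A general purpose register holding a 16-bit value."""

    def __init__(self, value):
        self.value = value


def step(byte_instruction):
    return 1 if byte_instruction else 2      # byte instructions count by 1, word by 2


def auto_increment(reg, byte_instruction=False):
    """(%n)+ : use the register as the address FIRST, then bump it."""
    address = reg.value
    reg.value = (reg.value + step(byte_instruction)) & 0o177777
    return address


def auto_decrement(reg, byte_instruction=False):
    """-(%n) : back the register up FIRST, then use it as the address."""
    reg.value = (reg.value - step(byte_instruction)) & 0o177777
    return reg.value


r2 = Register(0o400)
print([format(auto_increment(r2), "o") for _ in range(3)])                          # words, forward

r2 = Register(0o406)
print([format(auto_decrement(r2, byte_instruction=True), "o") for _ in range(3)])   # bytes, backward
```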

The stack is also used in subroutines to “remember” where the main program is to resume execution when the subroutine finishes.

REGISTER mode addressing is similar to absolute. It’s simple. It does not work with memory at all. Use it when all of your data is in registers already and specify it using “%0” through “%7”.

IMMEDIATE mode. In the previous examples, we were accessing variable information in memory. Occasionally in machine language, as in high level languages, it is necessary to use a constant, for example multiplying some variable by 100 to calculate percentage. Well, it’s silly to store the value 100 in memory somewhere, then when we need it, fetch it from memory for use with the multiply instruction and never use it again. This takes unnecessary time. Instead, it would be much better to have the value available to the instruction without going to some other memory location to get it. So you would have it IMMEDIATELY. Strictly speaking, it’s not an address. But it is classified that way. To use it specify “#100”. (Bear in mind that the assembler normally reads numbers as octal, so a decimal 100 would be written “#144”, or “#100.” with a trailing decimal point.)

RELATIVE addressing and the PC… This one is pretty complicated, so pay attention. (I know, you find them ALL complicated. Sorry ‘bout that. Doing the best I can.) Register 7 is used by the system itself, to point to the location in memory of the next instruction to be executed. Keep in mind that your program and data both reside in memory and there is no way for the machine to distinguish one from the other, except through the fact that the PC (PROGRAM COUNTER) is pointing to the next instruction to be executed. IF YOU MESS THIS UP AND POINT IT TO YOUR DATA AREA THE COMPUTER WILL TRY TO EXECUTE YOUR DATA !!! RELATIVE addressing involves using an offset from the PC. This is transparent to you when you are writing the program. The assembler will convert things properly for you. It will look like you are using absolute addressing. The difference is how the address is stored in memory, when it is assembled. Absolute addressing will use the actual address you type in, but relative does not. Consider the following illustration:

Suppose at location 300 (octal) you have a MOV instruction:

MOV     %2,@#400        will assemble the address as 400
MOV     %2,400          will assemble the address as  74

The second instruction really stores 74 because the relative offset is calculated by adding 4 to the address of the instruction (300 in this case) and subtracting that sum from the address you specified, all in octal: 400 - 304 = 74.
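The arithmetic is easy to get wrong in octal, so here it is as a Python sketch. The function name relative_offset is invented, and the sketch assumes the case described in the text: the relative operand is the only one that needs an extra word, so the PC has moved on by 4 when the offset is used.

```python
def relative_offset(instruction_address, target):
    """Offset the assembler stores for a relative reference to `target`."""
    pc_after_fetch = instruction_address + 4         # instruction word + offset word
    return (target - pc_after_fetch) & 0o177777      # keep it to 16 bits


print(format(relative_offset(0o300, 0o400), "o"))    # the example in the text

# Move program and data together by the same amount: the stored offset is unchanged.
print(format(relative_offset(0o1300, 0o1400), "o"))

# A reference to a lower address comes out as a large (2's-complement) number.
print(format(relative_offset(0o400, 0o300), "o"))
```

The second line of output is the reason relative addressing matters: nothing stored in the instruction depends on where the program ends up in memory.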

The reason for using relative addressing is that it makes the program position-independent (PIC = Position Independent Code). This means, unlike absolute addressing, that the program can be moved to a different memory location and it will still run correctly.

All of the DEFERRED (sometimes referred to as INDIRECT) modes work the same way, so I will just explain the concept. The best example of indirect addressing is the idea of a “jump” table. Suppose you have a program which expects the user to enter a code for the operation he/she wishes to perform.

Example user application menu codes
Code Meaning
0 Exit
1 Add 2 numbers
2 Subtract 2 numbers
3 Multiply 2 numbers
4 Divide 2 numbers

You could set up a table in memory with the address of each subroutine. Then, if the user types a 1 you jump to the CONTENTS of the first ADDRESS in the table. You do NOT jump to the first address of the table, but you use the address of the table to point to another address.

Suppose I tell you to go to the bedroom to get a book for me, and you go to get it and find a note saying the book is in the kitchen. That’s indirect addressing. You went to one location which told you to go to a second location.

You can use indirect addressing in conjunction with most of the other modes, thus producing an auto-increment deferred, etc.
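The jump table idea carries over to almost any language. Here is a Python sketch using the menu codes from the table above; the function names and the list called jump_table are made up, and a list of functions stands in for a table of subroutine addresses.

```python
def do_exit(a, b):
    return None


def add(a, b):
    return a + b


def subtract(a, b):
    return a - b


def multiply(a, b):
    return a * b


def divide(a, b):
    return a // b                  # whole-number division


# Position in the table = menu code; each entry "points to" a routine.
jump_table = [do_exit, add, subtract, multiply, divide]


def dispatch(code, a, b):
    routine = jump_table[code]     # fetch the CONTENTS of the table entry...
    return routine(a, b)           # ...and go where it points


for code in (1, 2, 3, 4):
    print(code, dispatch(code, 6, 3))
```

The program never jumps to the table itself. It looks in the table to find out where to jump.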

1.10. SYMBOLS and MNEMONICS

In the early days of computing, machine language programs were written completely with numbers. Imagine writing a program that looks like the left-hand side of your list (.LST) file… Pretty awful, if you ask me. So, names for each operation were invented. These are known as MNEMONICS. For example, MOV is the mnemonic for the move instruction whose octal code (often called an opcode) is 01####. (The #### portion describes where the data comes from and where it goes: an addressing mode and a register for the source, and the same for the destination.) So, now we can say:

MOV     %2,@#400        instead of     010237 000400

This still has limitations, because you have to remember all the addresses you are using. It becomes much easier when you use a symbol to remember a location. A SYMBOL is essentially a name for a memory location. Thus, you can say:

...
        MOV     %2,@#ANSWER
           .
           .
           .
ANSWER: .BLKW   1

And assuming for the sake of example, that ANSWER: is at location 400 it is identical to the previous method but a heck of a lot clearer than 010237 000400.
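To get a feel for what the assembler is doing on your behalf, here is a Python sketch that builds the numbers for a MOV by hand. The function names encode_operand and encode_mov are invented; the field layout (opcode, then 3 bits of mode and 3 bits of register for each operand) follows the MOV bit pattern in Appendix B.

```python
def encode_operand(mode, register):
    """Six bits: 3 for the addressing mode, 3 for the register."""
    return (mode << 3) | register


def encode_mov(src_mode, src_reg, dst_mode, dst_reg):
    """First word of a MOV: opcode 01, then source field, then destination field."""
    return ((0o01 << 12)
            | (encode_operand(src_mode, src_reg) << 6)
            | encode_operand(dst_mode, dst_reg))


ANSWER = 0o400                                   # pretend the symbol ended up at 400

# MOV %2,@#ANSWER : source is register mode 0 on %2,
# destination is mode 3 on the PC (%7), which is absolute addressing.
words = [encode_mov(0, 2, 3, 7), ANSWER]
print(" ".join(format(word, "06o") for word in words))
```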

1.11. APPENDIX A: Addressing Mode Summary

Addressing Mode Summary

Code  Mode                      Source Code            Machine Code
0     Register                  MOV %3,%4              010304
1     Register Deferred         MOV (%3),(%4)          011314
2     Auto-Increment            MOV (%3)+,(%4)+        012324
3     Auto-Increment Deferred   MOV @(%3)+,@(%4)+      013334
4     Auto-Decrement            MOV -(%3),-(%4)        014344
5     Auto-Decrement Deferred   MOV @-(%3),@-(%4)      015354
6     Indexed                   MOV 10(%3),15(%4)      016364 000010 000015
7     Index Deferred            MOV @10(%3),@15(%4)    017374 000010 000015

The following four modes use the Program Counter (PC) as the register portion of the addressing mode. (Note the fourth digit in the first word of the machine code.)

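Addressing Mode Summary (Program Counter modes)

Code  Mode                Source Code        Machine Code
2     Immediate           MOV #100,%4        012704 000100
3     Absolute            MOV @#400,@#500    013737 000400 000500
6     Relative            MOV 400,500        016767 000074 000172
7     Relative Deferred   MOV @400,@500      017777 000074 000172

(The relative offsets assume the instruction sits at location 300. Each offset is measured from the PC just after that offset word has been fetched: 400 - 304 = 74 and 500 - 306 = 172, all in octal.)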
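Going the other way, from machine code back to mode names, is a matter of slicing the first word into its fields. This Python sketch uses the mode names from the two tables above; the function names and the two lookup tables are otherwise inventions for the example, and it only handles the first word of a MOV.

```python
MODE_NAMES = ["Register", "Register Deferred",
              "Auto-Increment", "Auto-Increment Deferred",
              "Auto-Decrement", "Auto-Decrement Deferred",
              "Indexed", "Index Deferred"]

# The same mode codes get special names when the register is the PC (%7).
PC_MODE_NAMES = {2: "Immediate", 3: "Absolute", 6: "Relative", 7: "Relative Deferred"}


def describe_operand(field):
    """Name the addressing mode held in one 6-bit operand field."""
    mode, register = field >> 3, field & 0o7
    if register == 7 and mode in PC_MODE_NAMES:
        return PC_MODE_NAMES[mode]
    return MODE_NAMES[mode] + " %" + str(register)


def describe_mov(word):
    """Return (source, destination) descriptions for the first word of a MOV."""
    source = (word >> 6) & 0o77
    destination = word & 0o77
    return describe_operand(source), describe_operand(destination)


for word in (0o010304, 0o012704, 0o013737, 0o016767):
    print(format(word, "06o"), describe_mov(word))
```

This is also why octal is at least tolerable for reading instructions: each operand field is exactly two octal digits, mode then register.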

1.12. APPENDIX B: Instruction Format Summary

Instruction Format Summary
Bit Pattern Mnemonic Opcode Group
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 HALT 000000 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 WAIT 000001 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 RTI 000002 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 BPT 000003 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 IOT 000004 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 RESET 000005 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 RTT 000006 0
0 0 0 0 0 0 0 0 0 1 d d d d d d JMP 000100 1
0 0 0 0 0 0 0 0 1 0 0 0 0 r r r RTS 000200 1
0 0 0 0 0 0 0 0 1 0 0 1 1 n n n SPL 000230 4
0 0 0 0 0 0 0 0 1 0 1 0 n z v c CL(N/Z/V/C) 000240 5
0 0 0 0 0 0 0 0 1 0 1 1 n z v c SE(N/Z/V/C) 000260 5
0 0 0 0 0 0 0 0 1 1 d d d d d d SWAB 000300 1
0 0 0 0 0 0 0 1 o o o o o o o o BR 000400 2
0 0 0 0 0 0 1 0 o o o o o o o o BNE 001000 2
0 0 0 0 0 0 1 1 o o o o o o o o BEQ 001400 2
0 0 0 0 0 1 0 0 o o o o o o o o BGE 002000 2
0 0 0 0 0 1 0 1 o o o o o o o o BLT 002400 2
0 0 0 0 0 1 1 0 o o o o o o o o BGT 003000 2
0 0 0 0 0 1 1 1 o o o o o o o o BLE 003400 2
0 0 0 0 1 0 0 r r r d d d d d d JSR 004000 3
b 0 0 0 1 0 1 0 0 0 d d d d d d CLR(B) 005000 1
b 0 0 0 1 0 1 0 0 1 d d d d d d COM(B) 005100 1
b 0 0 0 1 0 1 0 1 0 d d d d d d INC(B) 005200 1
b 0 0 0 1 0 1 0 1 1 d d d d d d DEC(B) 005300 1
b 0 0 0 1 0 1 1 0 0 d d d d d d NEG(B) 005400 1
b 0 0 0 1 0 1 1 0 1 d d d d d d ADC(B) 005500 1
b 0 0 0 1 0 1 1 1 0 d d d d d d SBC(B) 005600 1
b 0 0 0 1 0 1 1 1 1 d d d d d d TST(B) 005700 1
b 0 0 0 1 1 0 0 0 0 d d d d d d ROR(B) 006000 1
b 0 0 0 1 1 0 0 0 1 d d d d d d ROL(B) 006100 1
b 0 0 0 1 1 0 0 1 0 d d d d d d ASR(B) 006200 1
b 0 0 0 1 1 0 0 1 1 d d d d d d ASL(B) 006300 1
0 0 0 0 1 1 0 1 0 0 n n n n n n MARK 006400 4
0 0 0 0 1 1 0 1 0 1 s s s s s s MFPI 006500 1
0 0 0 0 1 1 0 1 1 0 d d d d d d MTPI 006600 1
0 0 0 0 1 1 0 1 1 1 d d d d d d SXT 006700 1
b 0 0 1 s s s s s s d d d d d d MOV(B) 010000 3
b 0 1 0 s s s s s s d d d d d d CMP(B) 020000 3
b 0 1 1 s s s s s s d d d d d d BIT(B) 030000 3
b 1 0 0 s s s s s s d d d d d d BIC(B) 040000 3
b 1 0 1 s s s s s s d d d d d d BIS(B) 050000 3
0 1 1 0 s s s s s s d d d d d d ADD 060000 3
0 1 1 1 0 0 0 r r r s s s s s s MUL 070000 7
0 1 1 1 0 0 1 r r r s s s s s s DIV 071000 7
0 1 1 1 0 1 0 r r r s s s s s s ASH 072000 7
0 1 1 1 0 1 1 r r r s s s s s s ASHC 073000 7
0 1 1 1 1 0 0 r r r d d d d d d XOR 074000 3
0 1 1 1 1 0 1 0 0 0 0 0 0 r r r FADD 075000 1
0 1 1 1 1 0 1 0 0 0 0 0 1 r r r FSUB 075010 1
0 1 1 1 1 0 1 0 0 0 0 1 0 r r r FMUL 075020 1
0 1 1 1 1 0 1 0 0 0 0 1 1 r r r FDIV 075030 1
0 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0 MED 076600 0
0 1 1 1 1 1 0 1 1 1 d d d d d d XFC 076700 1
0 1 1 1 1 1 1 r r r o o o o o o SOB 077000 6
1 0 0 0 0 0 0 0 o o o o o o o o BPL 100000 2
1 0 0 0 0 0 0 1 o o o o o o o o BMI 100400 2
1 0 0 0 0 0 1 0 o o o o o o o o BHI 101000 2
1 0 0 0 0 0 1 1 o o o o o o o o BLOS 101400 2
1 0 0 0 0 1 0 0 o o o o o o o o BVC 102000 2
1 0 0 0 0 1 0 1 o o o o o o o o BVS 102400 2
1 0 0 0 0 1 1 0 o o o o o o o o BCC 103000 2
1 0 0 0 0 1 1 0 o o o o o o o o BHIS 103000 2
1 0 0 0 0 1 1 1 o o o o o o o o BCS 103400 2
1 0 0 0 0 1 1 1 o o o o o o o o BLO 103400 2
1 0 0 0 1 0 0 0 t t t t t t t t EMT 104000 4
1 0 0 0 1 0 0 1 t t t t t t t t TRAP 104400 4
1 0 0 0 1 1 0 1 0 0 s s s s s s MTPS 106400 1
1 0 0 0 1 1 0 1 0 1 s s s s s s MFPD 106500 1
1 0 0 0 1 1 0 1 1 0 d d d d d d MTPD 106600 1
1 0 0 0 1 1 0 1 1 1 d d d d d d MFPS 106700 1
1 1 1 0 s s s s s s d d d d d d SUB 160000 3
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 CFCC 170000 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 SETF 170001 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 SETI 170002 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 LDUB 170003 0
1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 MNS 170004 0
1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 1 MPP 170005 0
1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 MAS 170007 0
1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 1 SETD 170011 0
1 1 1 1 0 0 0 0 0 0 0 0 1 0 1 0 SETL 170012 0
1 1 1 1 0 0 0 0 0 1 s s s s s s LDFPS 170100 1
1 1 1 1 0 0 0 0 1 0 d d d d d d STFPS 170200 1
1 1 1 1 0 0 0 0 1 1 d d d d d d STST 170300 1
1 1 1 1 0 0 0 1 0 0 d d d d d d CLR(F/D) 170400 1
1 1 1 1 0 0 0 1 0 1 fd fd fd fd fd fd TST(F/D) 170500 1
1 1 1 1 0 0 0 1 1 0 fd fd fd fd fd fd ABS(F/D) 170600 1
1 1 1 1 0 0 0 1 1 1 fs fs fs fs fs fs NEG(F/D) 170700 1
1 1 1 1 0 0 1 0 f f fs fs fs fs fs fs MUL(F/D) 171000 7
1 1 1 1 0 0 1 1 f f fs fs fs fs fs fs MOD(F/D) 171400 7
1 1 1 1 0 1 0 0 f f fs fs fs fs fs fs ADD(F/D) 172000 7
1 1 1 1 0 1 0 1 f f fs fs fs fs fs fs LD(F/D) 172400 7
1 1 1 1 0 1 1 0 f f fs fs fs fs fs fs SUB(F/D) 173000 7
1 1 1 1 0 1 1 1 f f fs fs fs fs fs fs CMP(F/D) 173400 7
1 1 1 1 1 0 0 0 f f fd fd fd fd fd fd ST(F/D) 174000 3
1 1 1 1 1 0 0 1 f f fs fs fs fs fs fs DIV(F/D) 174400 7
1 1 1 1 1 0 1 0 f f d d d d d d STEXP 175000 3
1 1 1 1 1 0 1 1 f f d d d d d d STCFI 175400 3
1 1 1 1 1 0 1 1 f f d d d d d d STCFL 175400 3
1 1 1 1 1 0 1 1 f f d d d d d d STCDI 175400 3
1 1 1 1 1 0 1 1 f f d d d d d d STCDL 175400 3
1 1 1 1 1 1 0 0 f f fd fd fd fd fd fd STCFD 176000 3
1 1 1 1 1 1 0 0 f f fd fd fd fd fd fd STCDF 176000 3
1 1 1 1 1 1 0 1 f f s s s s s s LDEXP 176400 7
1 1 1 1 1 1 1 0 f f s s s s s s LDCIF 177000 7
1 1 1 1 1 1 1 0 f f s s s s s s LDCID 177000 7
1 1 1 1 1 1 1 0 f f s s s s s s LDCLF 177000 7
1 1 1 1 1 1 1 0 f f s s s s s s LDCLD 177000 7
1 1 1 1 1 1 1 1 f f fs fs fs fs fs fs LDCDF 177400 7
1 1 1 1 1 1 1 1 f f fs fs fs fs fs fs LDCFD 177400 7

Instruction Argument Groupings

Group  Instruction format
0      opcode
1      opcode source   or   opcode destination
2      opcode offset
3      opcode source,destination
4      opcode octal-number
5      opcode bit-pattern
6      opcode register,offset
7      opcode destination,source

Meanings of abbreviations

Bits      Meaning                      Number of bits
r         Register                     3
s         Source Address               6 (3 for mode, 3 for register)
d         Destination Address          6 (3 for mode, 3 for register)
o         Offset                       8 (except SOB uses 6)
n z v c   Condition Codes              4
t         Trap                         8
f         Floating Point Register      2
fs        Floating Point Source        6
fd        Floating Point Destination   6
n         Octal Number                 3 for SPL, 6 for MARK

1.13. APPENDIX C: The D.O.B.A.S.H. Cheat Sheet

The D.O.B.A.S.H. Cheat Sheet
Decimal Octal Binary ASCII Sixbit Hex
000 000 0 0 0 0 0 0 0 ^@   00
001 001 0 0 0 0 0 0 1 ^A ! 01
002 002 0 0 0 0 0 1 0 ^B " 02
003 003 0 0 0 0 0 1 1 ^C # 03
004 004 0 0 0 0 1 0 0 ^D $ 04
005 005 0 0 0 0 1 0 1 ^E % 05
006 006 0 0 0 0 1 1 0 ^F & 06
007 007 0 0 0 0 1 1 1 ^G ' 07
008 010 0 0 0 1 0 0 0 ^H ( 08
009 011 0 0 0 1 0 0 1 ^I ) 09
010 012 0 0 0 1 0 1 0 ^J * 0A
011 013 0 0 0 1 0 1 1 ^K + 0B
012 014 0 0 0 1 1 0 0 ^L , 0C
013 015 0 0 0 1 1 0 1 ^M - 0D
014 016 0 0 0 1 1 1 0 ^N . 0E
015 017 0 0 0 1 1 1 1 ^O / 0F
016 020 0 0 1 0 0 0 0 ^P 0 10
017 021 0 0 1 0 0 0 1 ^Q 1 11
018 022 0 0 1 0 0 1 0 ^R 2 12
019 023 0 0 1 0 0 1 1 ^S 3 13
020 024 0 0 1 0 1 0 0 ^T 4 14
021 025 0 0 1 0 1 0 1 ^U 5 15
022 026 0 0 1 0 1 1 0 ^V 6 16
023 027 0 0 1 0 1 1 1 ^W 7 17
024 030 0 0 1 1 0 0 0 ^X 8 18
025 031 0 0 1 1 0 0 1 ^Y 9 19
026 032 0 0 1 1 0 1 0 ^Z : 1A
027 033 0 0 1 1 0 1 1 ^[ ; 1B
028 034 0 0 1 1 1 0 0 ^\ < 1C
029 035 0 0 1 1 1 0 1 ^] = 1D
030 036 0 0 1 1 1 1 0 ^^ > 1E
031 037 0 0 1 1 1 1 1 ^_ ? 1F
032 040 0 1 0 0 0 0 0   @ 20
033 041 0 1 0 0 0 0 1 ! A 21
034 042 0 1 0 0 0 1 0 " B 22
035 043 0 1 0 0 0 1 1 # C 23
036 044 0 1 0 0 1 0 0 $ D 24
037 045 0 1 0 0 1 0 1 % E 25
038 046 0 1 0 0 1 1 0 & F 26
039 047 0 1 0 0 1 1 1 ' G 27
040 050 0 1 0 1 0 0 0 ( H 28
041 051 0 1 0 1 0 0 1 ) I 29
042 052 0 1 0 1 0 1 0 * J 2A
043 053 0 1 0 1 0 1 1 + K 2B
044 054 0 1 0 1 1 0 0 , L 2C
045 055 0 1 0 1 1 0 1 - M 2D
046 056 0 1 0 1 1 1 0 . N 2E
047 057 0 1 0 1 1 1 1 / O 2F
048 060 0 1 1 0 0 0 0 0 P 30
049 061 0 1 1 0 0 0 1 1 Q 31
050 062 0 1 1 0 0 1 0 2 R 32
051 063 0 1 1 0 0 1 1 3 S 33
052 064 0 1 1 0 1 0 0 4 T 34
053 065 0 1 1 0 1 0 1 5 U 35
054 066 0 1 1 0 1 1 0 6 V 36
055 067 0 1 1 0 1 1 1 7 W 37
056 070 0 1 1 1 0 0 0 8 X 38
057 071 0 1 1 1 0 0 1 9 Y 39
058 072 0 1 1 1 0 1 0 : Z 3A
059 073 0 1 1 1 0 1 1 ; [ 3B
060 074 0 1 1 1 1 0 0 < \ 3C
061 075 0 1 1 1 1 0 1 = ] 3D
062 076 0 1 1 1 1 1 0 > ^ 3E
063 077 0 1 1 1 1 1 1 ? _ 3F
064 100 1 0 0 0 0 0 0 @   40
065 101 1 0 0 0 0 0 1 A   41
066 102 1 0 0 0 0 1 0 B   42
067 103 1 0 0 0 0 1 1 C   43
068 104 1 0 0 0 1 0 0 D   44
069 105 1 0 0 0 1 0 1 E   45
070 106 1 0 0 0 1 1 0 F   46
071 107 1 0 0 0 1 1 1 G   47
072 110 1 0 0 1 0 0 0 H   48
073 111 1 0 0 1 0 0 1 I   49
074 112 1 0 0 1 0 1 0 J   4A
075 113 1 0 0 1 0 1 1 K   4B
076 114 1 0 0 1 1 0 0 L   4C
077 115 1 0 0 1 1 0 1 M   4D
078 116 1 0 0 1 1 1 0 N   4E
079 117 1 0 0 1 1 1 1 O   4F
080 120 1 0 1 0 0 0 0 P   50
081 121 1 0 1 0 0 0 1 Q   51
082 122 1 0 1 0 0 1 0 R   52
083 123 1 0 1 0 0 1 1 S   53
084 124 1 0 1 0 1 0 0 T   54
085 125 1 0 1 0 1 0 1 U   55
086 126 1 0 1 0 1 1 0 V   56
087 127 1 0 1 0 1 1 1 W   57
088 130 1 0 1 1 0 0 0 X   58
089 131 1 0 1 1 0 0 1 Y   59
090 132 1 0 1 1 0 1 0 Z   5A
091 133 1 0 1 1 0 1 1 [   5B
092 134 1 0 1 1 1 0 0 \   5C
093 135 1 0 1 1 1 0 1 ]   5D
094 136 1 0 1 1 1 1 0 ^   5E
095 137 1 0 1 1 1 1 1 _   5F
096 140 1 1 0 0 0 0 0 `   60
097 141 1 1 0 0 0 0 1 a   61
098 142 1 1 0 0 0 1 0 b   62
099 143 1 1 0 0 0 1 1 c   63
100 144 1 1 0 0 1 0 0 d   64
101 145 1 1 0 0 1 0 1 e   65
102 146 1 1 0 0 1 1 0 f   66
103 147 1 1 0 0 1 1 1 g   67
104 150 1 1 0 1 0 0 0 h   68
105 151 1 1 0 1 0 0 1 i   69
106 152 1 1 0 1 0 1 0 j   6A
107 153 1 1 0 1 0 1 1 k   6B
108 154 1 1 0 1 1 0 0 l   6C
109 155 1 1 0 1 1 0 1 m   6D
110 156 1 1 0 1 1 1 0 n   6E
111 157 1 1 0 1 1 1 1 o   6F
112 160 1 1 1 0 0 0 0 p   70
113 161 1 1 1 0 0 0 1 q   71
114 162 1 1 1 0 0 1 0 r   72
115 163 1 1 1 0 0 1 1 s   73
116 164 1 1 1 0 1 0 0 t   74
117 165 1 1 1 0 1 0 1 u   75
118 166 1 1 1 0 1 1 0 v   76
119 167 1 1 1 0 1 1 1 w   77
120 170 1 1 1 1 0 0 0 x   78
121 171 1 1 1 1 0 0 1 y   79
122 172 1 1 1 1 0 1 0 z   7A
123 173 1 1 1 1 0 1 1 {   7B
124 174 1 1 1 1 1 0 0 |   7C
125 175 1 1 1 1 1 0 1 }   7D
126 176 1 1 1 1 1 1 0 ~   7E
127 177 1 1 1 1 1 1 1 RUBOUT (DEL)   7F

1.14. APPENDIX D: Powers of 2 Cheat Sheet

Powers of 2 Cheat Sheet
Power Decimal Octal Hex K
0 1 1 1 0 K
1 2 2 2 0 K
2 4 4 4 0 K
3 8 10 8 0 K
4 16 20 10 0 K
5 32 40 20 0 K
6 64 100 40 0 K
7 128 200 80 0 K
8 256 400 100 0 K
9 512 1000 200 0 K
10 1024 2000 400 1 K
11 2048 4000 800 2 K
12 4096 10000 1000 4 K
13 8192 20000 2000 8 K
14 16384 40000 4000 16 K
15 32768 100000 8000 32 K
16 65536 200000 10000 64 K
17 131072 400000 20000 128 K
18 262144 1000000 40000 256 K
19 524288 2000000 80000 512 K
20 1048576 4000000 100000 1024 K
21 2097152 10000000 200000 2048 K
22 4194304 20000000 400000 4096 K
23 8388608 40000000 800000 8192 K
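Should you ever need the table extended or double-checked, a few lines of Python will regenerate it. This is only a sketch; the function name power_row is made up, and the K column is taken to mean whole multiples of 1024, as in the table above.

```python
def power_row(power):
    """One row of the table: power, decimal, octal, hex, and K (units of 1024)."""
    value = 2 ** power
    return (power, value, format(value, "o"), format(value, "X"), value // 1024)


print("Power Decimal Octal Hex K")
for power in range(24):
    n, decimal, octal, hexadecimal, k = power_row(power)
    print(n, decimal, octal, hexadecimal, k, "K")
```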