1. Assembler Made Painless (I Hope)

by Kevin Cole

March 14, 1984

Dedicated to

CHICKEN

my favorite student

1.1. PREFACE

Updated 2020.09.10: This was written 36 years ago, at a time when students had access to the Digital Equipment Corporation (DEC) manuals, and in an era before hypertext. New users stumbling onto this may be better served by the PDP-11 Processor Handbook from the University of Calgary, Department of Computer Science, which is a mere 19 years old as of this vintage.

I intend to explain much of the workings of a computer by analogy. The style is somewhat loose and flippant at times, but for the most part it has quite a bit of useful information to be absorbed. I am concentrating on what I consider to be of great importance in learning assembly language, and on the areas with which most people I have tutored had the greatest difficulty. So let’s get straight to work.

1.2. INTRODUCTION

At the present time, most computers store information as electro-magnetic impulses. Many things in the physical world are “bipolar” in nature. That is, they have two states. In the case of magnets, the way in which the electrons are aligned creates “poles” and by using electric current these poles can be reversed. This creates a switch with two conditions which can be interpreted as any of the following: on/off, true/false, 0/1, +/-… You get the picture. It is helpful to remember all of the above, because, depending on the context, the switch can represent any of the above conditions, and more. For convenience, this two-state switch (or binary switch) is referred to as a BIT.

1.3. BITS, BYTES and WORDS

Computer memory is filled with millions of these tiny electromagnetic bits. But, organized as independent elements, they cannot represent much. Therefore, bits are organized into groups of 8, 16, or 32. That is to say the computer will never interpret bits on an individual basis, but rather as part of a group of consecutive bits. This group of bits is referred to as a WORD. The size of a word varies from computer to computer. Most microcomputers have an 8-bit word, the large computers have a 32-bit word, and the PDP-11 minicomputers have 16-bit words. There are other machines which use unusual word sizes, but I won’t go into that.

Sometimes, it is not handy to grab 16 bits at once. Occasionally (actually quite often), it is more useful to get only 8 bits at a time. 8 consecutive bits are referred to as a BYTE.

Word sizes on various machines
Computer	Type	Word size (bytes)	Word size (bits)
APPLE II-e	Micro	1 byte	8 bits
Commodore VIC20	Micro	1 byte	8 bits
PDP-11	Mini	2 bytes	16 bits
Apple Macintosh	Micro	4 bytes	32 bits
IBM-360	Mainframe	4 bytes	32 bits
VAX	Mid-size	4 bytes	32 bits
DEC 10	Mainframe	? bytes	36 bits

The DECsystem 10 uses a concept called variable length bytes. Not discussed here.

Let’s look at some of the more frequent uses for bits, bytes and words.

Here’s a group of 8 bits (1 byte):

0 1 1 0 1 1 0 1

This can be interpreted several different ways. Each bit could be set up as a flag or switch to indicate when a particular condition has occurred. For example, when a calculation is performed the leftmost bit (also referred to as the high order bit) might be set to 1 if the result is negative, and set to 0 for positive results. The second bit from the left might be set when there is an error in the mathematical operation being performed, for example division by zero.
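The flag interpretation can be tried out with a few lines of Python, used here purely as a pocket calculator. This is only a sketch: the names NEGATIVE and ERROR are invented for the illustration and follow the example in the text, they are not real PDP-11 definitions.

```python
# The example byte from the text, written as a binary literal.
byte = 0b01101101

# Hypothetical flag positions, following the example in the text.
NEGATIVE = 0b10000000   # leftmost (high order) bit
ERROR = 0b01000000      # second bit from the left

# ANDing with a mask keeps only the bit we are interested in.
is_negative = (byte & NEGATIVE) != 0
has_error = (byte & ERROR) != 0

print(is_negative)  # False, because the high order bit is 0
print(has_error)    # True, because the second bit from the left is 1
```

The same byte, the same bits, but now each one is read as a separate switch instead of as part of a number.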

Another way to interpret the byte is as a binary number. In this case:

$\begin{gather*} (0 \times 2^7) + (1 \times 2^6) + (1 \times 2^5) + (0 \times 2^4) + (1 \times 2^3) + (1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0) =\\ (0 \times 128) + (1 \times 64) + (1 \times 32) + (0 \times 16) + (1 \times 8) + (1 \times 4) + (0 \times 2) + (1 \times 1) =\\ 0 + 64 + 32 + 0 + 8 + 4 + 0 + 1 =\\ 109 \end{gather*}$
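If you would rather let a machine do the arithmetic, the sum above can be sketched in Python. The variable names are just for illustration.

```python
bits = "01101101"

# Add up bit * 2**position, with position 7 on the left and 0 on the right.
value = 0
for position, bit in zip(range(7, -1, -1), bits):
    value += int(bit) * 2 ** position

print(value)         # 109
print(int(bits, 2))  # 109, using Python's built-in base conversion
```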

1.4. ASCII

Other ways of interpreting it involve using the value as a code. Perhaps, as a kid, you played around with the idea of codes for passing notes between you and someone else, that told your innermost feelings about a member of the opposite sex, or something you did not want read by parents or teachers… In computerland, the code is not exactly used for that reason, but is a way of storing textual information in memory. The standard code used today is referred to as ASCII (American Standard Code for Information Interchange), pronounced “ass-kee”. In this code, every character on your terminal keyboard, or on a printer, or in a file on disk, is represented as an 8-bit code. Actually, only 7 bits are currently needed for the code. The high order bit is not used on most machines. There are 128 different characters in the ASCII code ranging in value from 0 to 127.

The first 32 codes are non-printing control-characters. These are generated by holding the control key as you strike some other key. Actually, some of these have a visible effect, however they are still referred to as non-printing. Some of these characters, because of their frequent use, are also located on a separate key that does not have to be struck in conjunction with the control key. The most useful of these to remember are:

Control-key values and corresponding key name
Control character	Keyboard key name	Notes
Control-[	= ESCAPE	(sometimes marked ALTMODE)
Control-H	= BACKSPACE
Control-I	= TAB
Control-J	= LINE FEED	(sometimes mapped to ENTER)
Control-M	= CARRIAGE RETURN	(sometimes mapped to ENTER)

Try these. (For example, next time you are using the machine type Control-I instead of TAB.)

(I know it may seem that I am belaboring this point, but bear with me.)

The code for a Control-A is 00000001 binary. Control-B is 00000010. Control-C = 00000011… If you understand binary as well as you indicated, you should see the pattern developing. Believe me, it’s important. If you are debugging a machine language program, and you don’t know the ASCII code, you’ll be at it more than twice as long as necessary.
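The pattern can be sketched in Python. The helper control_code is a made-up name, and it assumes the usual terminal convention that the control key keeps only the low 5 bits of the upper case character's code.

```python
def control_code(key):
    """Return the ASCII code produced by holding Control and striking `key`."""
    # Keep only the low 5 bits of the upper case character's code.
    return ord(key.upper()) & 0b0011111

for key in "ABCHIJM[":
    print("Control-" + key, format(control_code(key), "08b"))
```

Control-A comes out as 00000001, Control-B as 00000010, Control-C as 00000011, and Control-[ as 00011011 (27 decimal, the ESCAPE code in the cheat sheet).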

The next group of 32 (codes 32-63) are your digits and special symbols; punctuation, generally speaking. IMPORTANT: The ASCII code for the digit 2 is NOT the same as the value 2 !!! The digits on your keyboard and printer are merely SYMBOLS. The ASCII code for the digit 2 is 00110010 binary (or 50 decimal). If you try to print the value 2 (binary 00000010) it will print as a Control-B. Just as the same sound spoken at different pitches has different meanings in Chinese, so too, two bytes which are identical can have different meanings, depending on the context in which they are used. This is one of the most difficult and most important things to remember about assembler language.
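The difference between the digit and the value is easy to see in Python. This is only an illustration; the variable names are arbitrary.

```python
digit = "2"                  # the SYMBOL 2, as typed on a keyboard

code = ord(digit)            # its ASCII code
value = code - ord("0")      # the number it stands for

print(code)                  # 50
print(format(code, "08b"))   # 00110010
print(value)                 # 2
print(repr(chr(2)))          # '\x02', which is Control-B, not the digit 2
```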

Next are the upper case letters (and a few punctuation symbols). These are codes 64-95. And lastly, the lower case letters (and still a few more punctuation marks) codes 96-127.

See the cheat sheet in Appendix C for a better understanding. I will refer to it often in here.

1.5. INSTRUCTIONS

The most common use for a byte is as part of an instruction to the computer. This too is a code. For instance, the value 00000011 in the high-order byte of an instruction word on the PDP-11 says Branch when Equal to zero (BEQ). The codes for each instruction are so tedious to memorize that it is not worth the trouble. For the most part, you will have access to some reference which allows you to find the code for an instruction quickly. And, since there is NO standard here, it does no good to memorize the instruction codes for a PDP-11 and then go work on an IBM. The codes are completely different.

Therefore, it is better to memorize the scheme used for the mnemonics for the various instructions. For example, a mnemonic which starts with the letter “B” is most likely to be a Branch instruction. This generally holds true for all machines.

1.6. ADDRESSING

Ok, so now you have these millions of bits organized into consecutive groups. Now comes the problem of how to get to a particular group. Well, each byte is assigned an ADDRESS starting at 0 and increasing by 1 until all memory is exhausted. Unfortunately, here’s where the problems start…

I mentioned earlier that the PDP-11 “thinks” in 16-bit words. This means it usually grabs two bytes at a shot. This makes things complicated when you start to think about how memory on the 11 is organized. But I will labor to make it a bit more accessible. (Ha ha. Get it? A BIT more ACCESSible… Never mind.)

Most machines tend to number their bytes consecutively, but because Digital Equipment Corporation wants to give students a hard time, they decided to do things differently… Actually there is a method to their particular brand of madness, and I hope it will become clearer as we continue. Above, I mentioned that the computer we are using here works in units of 16 bits, or 2 bytes. And each byte has an address. Because it uses a 16-bit word, it can work with positive integers between 0 and $2^{16} - 1$ (0000000000000000 to 1111111111111111 which is 10000000000000000 minus 1, take my word for it). Representing negative numbers becomes a bit more complicated. It uses a scheme called 2’s-complement notation. More on that subject later. Because the rightmost 8 bits hold the lower portion of the value, they are considered to be the low-order byte and have the lower address value. This is always the even-addressed byte. And the high-order byte is the odd-addressed byte.

Imagining it as boxes filled with values:

1	0	3	2	5	4
00000011	11010000	10101010	11110000	00000000	00000000	…

The numbers above the boxes represent the address of each byte in memory. The values in the boxes are the contents of each byte. Confused yet? If not, I’ll try harder to confuse you.
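To see how an even/odd pair of bytes becomes one word, here is a small Python sketch using the first two boxes. The variable names are only for illustration.

```python
low_byte = 0b11010000    # contents of address 0 (the even address)
high_byte = 0b00000011   # contents of address 1 (the odd address)

# The odd-addressed byte supplies the upper 8 bits of the word.
word = (high_byte << 8) | low_byte

print(format(word, "016b"))  # 0000001111010000
print(format(word, "06o"))   # 001720

# Splitting the word gives the two bytes back.
print(format(word & 0o377, "08b"))  # 11010000 (low byte)
print(format(word >> 8, "08b"))     # 00000011 (high byte)
```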

1.7. OCTAL

After a while, it becomes very tedious work to look at everything in binary. So on most machines, HEXADECIMAL or BASE 16 is used to represent groups of 4 bits conveniently. Thus, the contents (or the address) of a 16-bit word could be written as 4 hexadecimal digits. However, DEC chose OCTAL or BASE 8. There is no justifiable reason for this. It is one of very few things DEC did which is just plain stupid. You’ll see why as we go on. Anyway, octal represents 3 bits as a single digit. Refer to the cheat sheet for an illustration of how to count in any of the useful computer numbering systems. I am going to assume you either are already sufficiently familiar with it from previous schooling or that your assembly language teacher has covered it in depth.

1.8. GENERAL PURPOSE REGISTERS (or ACCUMULATORS)

Because memory is usually quite large, and it takes time for a computer to reference a particular address and manipulate the contents, computer designers created special locations which can be used as a fast memory and can be used easily with most instructions. These are called ACCUMULATORS on some machines and REGISTERS on others. The PDP-11 has 8 such registers, which are numbered %0 to %7. Two of these have highly specialized functions. %6 is known as the Stack Pointer (SP) and %7 is the Program Counter (PC). More on both of these later.

1.9. ADDRESSING MODES

You probably skipped right to this section because you have no patience. Well, good luck… cause if you did you may have missed something very important.

Generally speaking, to manipulate data in the computer, information is obtained from some auxiliary device (INPUT) and moved into memory. Then it is moved from memory to a register, where it is bent, folded, spindled, and mutilated (i.e. added to, subtracted from, etc.), moved back into memory and finally moved to an auxiliary device again (OUTPUT). Often the information must undergo a transformation or conversion from some external form (such as the ASCII code) to an internal representation (such as binary integer). One word of memory only holds 2 characters, maximum. This means that if we enter a 5 digit number from the keyboard, it will take up 2 and 1/2 words of memory in its ASCII form. Remember, each character entered from the keyboard occupies 1 byte of memory.
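The conversion from external form (ASCII digits) to internal form (a binary integer) can be sketched in Python. The helper name ascii_digits_to_int and the sample string "1234" are invented for this illustration; the sample was picked only because its value fits comfortably in a 16-bit word.

```python
def ascii_digits_to_int(codes):
    """Convert a sequence of ASCII digit codes into the number they spell."""
    value = 0
    for code in codes:
        # Octal 60 is the ASCII code for the digit 0, so subtracting it
        # turns the symbol into the value it stands for.
        value = value * 10 + (code - 0o60)
    return value

codes = [ord(ch) for ch in "1234"]
print([format(c, "03o") for c in codes])  # ['061', '062', '063', '064']
print(ascii_digits_to_int(codes))         # 1234
```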

So, let’s say we entered the string "98760". Each character would be stored in ascending bytes. The octal codes for each digit are: 071, 070, 067, 066, 060, respectively. In memory, represented in octal and binary this would be:

1	0	3	2	5	4
0 7 0	0 7 1	0 6 6	0 6 7	0 0 0	0 6 0
00111000	00111001	00110110	00110111	00000000	00110000

By crushing the two halves of each word together we get:


0 3 4 0 7 1	0 3 3 0 6 7	0 0 0 0 6 0
0011100000111001	0011011000110111	0000000000110000

(It is still the same binary value, but when shown as a 6-digit octal value representing the word as a single value, the distinction between the 2 bytes becomes a bit blurred. This would not have happened if DEC had chosen hexadecimal. If the above illustration is not clear, we will have to go over it together.)
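The crushing-together can be reproduced with a short Python sketch, which also prints each word in hexadecimal so you can compare the two notations. The padding of the unused odd byte with zero follows the illustration above; the variable names are arbitrary.

```python
text = "98760"
data = text.encode("ascii")
if len(data) % 2:
    data += b"\x00"   # pad the unused odd byte with zero, as in the illustration

words = []
for i in range(0, len(data), 2):
    low, high = data[i], data[i + 1]   # even address first, then odd
    words.append((high << 8) | low)

for w in words:
    print(format(w, "06o"), format(w, "04X"), format(w, "016b"))
```

This prints 034071, 033067 and 000060 in octal, alongside 3839, 3637 and 0030 in hexadecimal. In hexadecimal the two bytes of each word (38 and 39, for example) stay visible; in octal they do not.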

There are several different ways to access the above bytes. The simplest is to use ABSOLUTE addressing. This means when referencing a location in memory, you give the address of that particular word in the instruction. So, if the 3 words in the above example occupied memory locations 400 to 405 (octal), and we wish to move the first word to another location, we would specify the value 400 as the address. The format for this is “@#400”.

Absolute addressing, though simple, has several drawbacks. First of all, some instructions were implemented by the manufacturer in such a way that they cannot use absolute addressing. Secondly, the addresses can range from 000000 to 177777 (octal). This means that when using absolute addressing in an instruction, one word of computer memory is used for the instruction itself, and another for the absolute address. This eats up space quickly. Other addressing modes do not share this deficiency. And it does not have much flexibility. If you change the location of your data area in memory, all of your program’s absolute addresses must be changed as well.

In both FORTRAN and COBOL you encountered the idea of an indexed table (also called an array or matrix). This is used when you have several locations which are all related in some way (for example a list of department names which you select from based on the department number):

COBOL:	`MOVE DEPART-LIST (DEPT-NUMBER-IN) TO DEPT-OUT.`
FORTRAN:	`DEPOUT = DEPT(CODE)`

The numbers in the variables in the parentheses are indexes into the arrays. Well, in INDEXED addressing, the same concept applies. The index value is kept in a register. In the above example, the first byte would be referenced by placing a value of 0 into a register (for now let’s use %2) and using an address like “400(%2)”. The “EFFECTIVE” address, i.e. the actual address from which the computer fetches the data, is computed by adding the contents of register %2 to the offset 400. The result is 400. Now if we add 1 to register %2 the effective address becomes 401 but the instruction did not have to be changed, only the contents of the index register. You can choose any register to be your index (except the PC and SP), as long as you are not using it for anything else at that point in your program.
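Here is a rough Python model of the effective address calculation for “400(%2)”. It is a sketch only: the dictionary standing in for memory, the registers list, and the helper name indexed are all invented for the illustration.

```python
# The five bytes of "98760" at octal addresses 400 through 404.
memory = {0o400: 0o071, 0o401: 0o070, 0o402: 0o067, 0o403: 0o066, 0o404: 0o060}
registers = [0] * 8   # stand-ins for %0 through %7

def indexed(offset, reg):
    """Effective address for offset(%reg): the offset plus the register contents."""
    return (offset + registers[reg]) & 0o177777

registers[2] = 0
ea0 = indexed(0o400, 2)
registers[2] += 1                 # only the index register changes
ea1 = indexed(0o400, 2)

print(format(ea0, "o"), format(memory[ea0], "03o"))  # 400 071
print(format(ea1, "o"), format(memory[ea1], "03o"))  # 401 070
```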

In COBOL or FORTRAN you would have a separate instruction to add one to your index… The same is true for indexed addressing. However there is an addressing mode which will automatically increment the index after it is used in an instruction. As you have probably guessed by now, this is the AUTO-INCREMENT addressing mode and is specified as “(%2)+”. With this mode, however, you are not allowed to use an offset. Therefore, for our example, you would have to set register %2 to a value of 400 before using this mode. Its crude equivalent in COBOL would be:

MOVE DEPART-LIST (DEPT-NUMBER-IN) TO DEPT-OUT.
ADD 1 TO DEPT-NUMBER-IN.

Now the index is automatically set up for the next pass through the loop to reference the next element in the table.
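A rough Python model of “(%2)+” walking through the example string, one byte at a time. This is a sketch under assumptions: the bytearray standing in for memory and the helper name autoincrement_byte are made up, and only the byte-sized step is modelled.

```python
memory = bytearray(0o1000)          # a small pretend memory
memory[0o400:0o405] = b"98760"      # the example string at octal 400
registers = [0] * 8                 # stand-ins for %0 through %7

def autoincrement_byte(reg):
    """Model (%reg)+ for a byte access: use the address, then add 1."""
    address = registers[reg]
    registers[reg] = (address + 1) & 0o177777
    return memory[address]

registers[2] = 0o400                # no offset allowed, so start at 400
fetched = [autoincrement_byte(2) for _ in range(5)]

print(bytes(fetched).decode("ascii"))  # 98760
print(format(registers[2], "o"))       # 405
```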

Now it seems appropriate to introduce two topics at once. These are the STACK and AUTO-DECREMENT mode. Suppose you have part of a calculation completed, and you have another quantity to compute. You do not want to lose the first part while calculating the next part. One solution is to move it from the register where you’ve chosen to store it, to a temporary memory location, and remember where you’ve left it… The STACK POINTER remembers it for you, and the STACK is the temporary area where you keep it. And both auto-increment and auto-decrement are used to control this process.

Ok. This is a typical, overused analogy, but it still works well. When you eat at the cafeteria, you often pick up a tray from a STACK of trays. As you pick yours up, a spring beneath the stack pops the next tray up a little. And when you replace your tray, it pushes everything down slightly. Now imagine each tray has a piece of paper with a number on it. Each tray is a word of memory and the paper is the contents of that word. On the PDP-11 the stack grows toward lower addresses, so auto-decrement mode is analogous to PUSHing a tray onto the stack and auto-increment corresponds to POPping the stack.

Auto-decrement mode works in reverse of auto-increment. It subtracts 1 from the index first, then it uses the result to calculate the effective address. So now, to go back to our example, we could go through our 3 words of memory backwards, a byte at a time. Starting with a 406 in register %2 and using “-(%2)” would access the byte located at location 405. Using it again would get the byte at 404, and so on.

Look at the following sequence of code:

MOVE 400 TO STACK-POINTER.
   .
   .    (Calculate something called TOT-1 here)
   .
SUBTRACT 1 FROM STACK-POINTER.
MOVE TOT-1 TO MEMORY (STACK-POINTER).
   .
   .    (Calculate something called TOT-2 here)
   .
ADD MEMORY (STACK-POINTER) TO TOT-2.
ADD  1 TO STACK-POINTER.

That’s roughly what you should think of auto-dec and auto-inc as doing. And that’s a good approximation of the stack also.

I’ve been misleading you somewhat with these examples. Actually, auto-increment and auto-decrement can add/subtract either 1 or 2 from a register. This depends on the kind of instruction you are using. I mentioned at the beginning of this text that the PDP-11 usually handles memory in 16-bit quantities called words. This means it usually uses only the even addresses and works with the even/odd pair of bytes at that location in memory. However, using the byte instructions, you can force counting by 1 and only access 8-bits at a time.
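The difference between stepping by 1 and stepping by 2 can be sketched in Python. The helper name autodecrement and its byte_instruction flag are invented for the illustration, and the model ignores the special case of the SP and PC, which always step by 2.

```python
registers = [0] * 8   # stand-ins for %0 through %7

def autodecrement(reg, byte_instruction):
    """Model -(%reg): subtract the step first, then return the effective address."""
    step = 1 if byte_instruction else 2
    registers[reg] = (registers[reg] - step) & 0o177777
    return registers[reg]

registers[2] = 0o406
byte_addresses = [format(autodecrement(2, True), "o") for _ in range(3)]

registers[2] = 0o406
word_addresses = [format(autodecrement(2, False), "o") for _ in range(3)]

print(byte_addresses)  # ['405', '404', '403']
print(word_addresses)  # ['404', '402', '400']
```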

The stack is also used in subroutines to “remember” where the main program is to resume execution when the subroutine finishes.
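For the curious, here is a rough Python model of saving a partial result on the stack and getting it back. It assumes the PDP-11 hardware convention, in which the stack grows toward lower addresses: a push is an auto-decrement through %6 and a pop is an auto-increment through %6, both in steps of 2. The starting stack pointer value, the sample totals, and the helper names push and pop are all made up for the sketch.

```python
memory = {}               # pretend memory, indexed by address
registers = [0] * 8       # stand-ins for %0 through %7
SP = 6                    # %6 is the Stack Pointer

registers[SP] = 0o1000    # arbitrary starting value (an assumption)

def push(value):
    """Model MOV value,-(%6): subtract 2 from SP, then store the word."""
    registers[SP] -= 2
    memory[registers[SP]] = value & 0o177777

def pop():
    """Model MOV (%6)+,dest: fetch the word, then add 2 to SP."""
    value = memory[registers[SP]]
    registers[SP] += 2
    return value

tot_1 = 0o123             # first partial result (sample value)
push(tot_1)               # tuck it away while we compute something else
tot_2 = 0o10              # second partial result (sample value)
tot_2 += pop()            # bring the first one back and combine

print(format(tot_2, "o"))          # 133
print(format(registers[SP], "o"))  # 1000, back where it started
```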

REGISTER mode addressing is similar to absolute. It’s simple. It does not work with memory at all. Use it when all of your data is in registers already and specify it using “%0” through “%7”.

IMMEDIATE mode. In the previous examples, we were accessing variable information in memory. Occasionally in machine language, as in high level languages, it is necessary to use a constant, for example multiplying some variable by 100 to calculate percentage. Well, it’s silly to store the value 100 in memory somewhere, then when we need it, fetch it from memory for use with the multiply instruction and never use it again. This takes unnecessary time. Instead, it would be much better to have the value available to the instruction without going to some other memory location to get it. So you would have it IMMEDIATELY. Strictly speaking, it’s not an address. But it is classified that way. To use it specify “#100”. (Keep in mind that the assembler reads numbers as octal unless told otherwise, so “#100” is really 64 decimal. Decimal 100 would be written “#144”, or “#100.” with a trailing decimal point.)

RELATIVE addressing and the PC… This one is pretty complicated, so pay attention. (I know, you find them ALL complicated. Sorry ‘bout that. Doing the best I can.) Register 7 is used by the system itself, to point to the location of the current instruction being executed in memory. Keep in mind that your program and data both reside in memory and there is no way for the machine to distinguish one from the other, except through the fact that the PC (PROGRAM COUNTER) is pointing to the next instruction to be executed. IF YOU MESS THIS UP AND POINT IT TO YOUR DATA AREA THE COMPUTER WILL TRY TO EXECUTE YOUR DATA !!! RELATIVE addressing involves using an offset from the PC. This is transparent to you when you are writing the program. The assembler will convert things properly for you. It will look like you are using absolute addressing. The difference is how the address is stored in memory, when it is assembled. Absolute addressing will use the actual address you type in, but relative does not. Consider the following illustration:

Suppose at location 300 (octal) you have a MOV instruction:

MOV     %2,@#400        will assemble the address as 400
MOV     %2,400          will assemble the address as  74

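Really the second instruction will use an address of 74 because the relative address is calculated by adding 4 to the location of the instruction (300 in this case, which gives the value the PC holds once the instruction word and its address word have been fetched) and subtracting that from the address you specified (all in octal): 400 - 304 = 74.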
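The subtraction can be checked with a few lines of Python. The helper name relative_offset is invented for this sketch, and the second pair of addresses (1300 and 1400) is a made-up relocation used only to show that the stored offset does not change when program and data move together.

```python
def relative_offset(pc_after_fetch, target):
    """Offset stored in the instruction: target address minus the updated PC."""
    return (target - pc_after_fetch) & 0o177777

instruction_at = 0o300
pc = instruction_at + 4      # past the instruction word and the offset word
offset = relative_offset(pc, 0o400)
print(format(offset, "o"))   # 74

# Move the whole program and its data up by octal 1000: same offset.
moved = relative_offset(0o1300 + 4, 0o1400)
print(format(moved, "o"))    # 74
```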

The reason for using relative addressing is that it makes the program position-independent (PIC = Position Independent Code). This means, unlike absolute addressing, that the program can be moved to a different memory location and it will still run correctly.

All of the DEFERRED (sometimes referred to as INDIRECT) modes work the same way, so I will just explain the concept. The best example of indirect addressing is the idea of a “jump” table. Suppose you have a program which expects the user to enter a code for the operation he/she wishes to perform.

Example user application menu codes
Code	Meaning
0	Exit
1	Add 2 numbers
2	Subtract 2 numbers
3	Multiply 2 numbers
4	Divide 2 numbers

You could set up a table in memory with the address of each subroutine. Then, if the user types a 1 you jump to the CONTENTS of the table entry for code 1. You do NOT jump to the address of the table entry itself, but you use the table entry to point to another address.

Suppose I tell you to go to the bedroom to get a book for me, and when you get there you find a note saying the book is in the kitchen. That’s indirect addressing. You went to one location which told you to go to a second location.

You can use indirect addressing in conjunction with most of the other modes, thus producing an auto-increment deferred, etc.
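The jump table idea carries over to high level languages too. Here is a loose Python analogy, in which a list of functions stands in for the table of subroutine addresses. The function names and the sample numbers are invented for the illustration.

```python
def leave(a, b):
    return None

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

def divide(a, b):
    return a // b

# One entry per menu code, in the same order as the table of codes.
jump_table = [leave, add, subtract, multiply, divide]

code = 1                    # pretend the user typed a 1
routine = jump_table[code]  # fetch the CONTENTS of the table entry...
result = routine(6, 3)      # ...and go wherever it points
print(result)               # 9
```

The program never jumps to the table itself. It looks in the table to find out where to go, just like the note in the bedroom.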

1.10. SYMBOLS and MNEMONICS

In the early days of computing, machine language programs were written completely with numbers. Imagine writing a program that looks like the left-hand side of your list (.LST) file… Pretty awful, if you ask me. So, names for each operation were invented. These are known as MNEMONICS. For example, MOV is the mnemonic for the move instruction whose octal code (often called an opcode) is 01####. (The #### portion is part of the address.) So, now we can say:

MOV     %2,@#400        instead of     010237 000400

This still has limitations, because you have to remember all the addresses you are using. It becomes much easier when you use a symbol to remember a location. A SYMBOL is essentially a name for a memory location. Thus, you can say:

...
        MOV     %2,@#ANSWER
           .
           .
           .
ANSWER: .BLKW   1

And assuming, for the sake of example, that ANSWER: is at location 400, it is identical to the previous method but a heck of a lot clearer than 010237 000400.
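To see where the digits of 010237 come from, here is a Python sketch that packs the fields of a word-sized two-operand instruction the way the bit patterns in Appendix B lay them out. The helper name encode_double_operand is invented for the illustration.

```python
def encode_double_operand(opcode, src_mode, src_reg, dst_mode, dst_reg):
    """Pack a 4-bit opcode and two 6-bit operand fields into one 16-bit word."""
    return ((opcode << 12) | (src_mode << 9) | (src_reg << 6)
            | (dst_mode << 3) | dst_reg)

MOV = 0o01   # the leading "01" of the MOV opcode

# MOV %2,@#400: source is register mode (0) on %2,
# destination is mode 3 on the PC (%7), followed by the address word.
word = encode_double_operand(MOV, 0, 2, 3, 7)
print(format(word, "06o"), format(0o400, "06o"))  # 010237 000400

# MOV (%3)+,(%4)+ from the addressing mode summary.
print(format(encode_double_operand(MOV, 2, 3, 2, 4), "06o"))  # 012324
```

Each octal digit after the opcode is one 3-bit field: mode, register, mode, register. That part, at least, octal makes easy to read.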

1.11. APPENDIX A: Addressing Mode Summary

Addressing Mode Summary
Code	Mode	Source Code	Machine Code
0	Register	MOV %3,%4	010304
1	Register Deferred	MOV (%3),(%4)	011314
2	Auto-Increment	MOV (%3)+,(%4)+	012324
3	Auto-Increment Deferred	MOV @(%3)+,@(%4)+	013334
4	Auto-Decrement	MOV -(%3),-(%4)	014344
5	Auto-Decrement Deferred	MOV @-(%3),@-(%4)	015354
6	Indexed	MOV 10(%3),15(%4)	016364 000010 000015
7	Index Deferred	MOV @10(%3),@15(%4)	017374 000010 000015

The following four modes use the Program Counter (PC) as the register portion of the addressing mode. (Note the fourth digit in the first word of the machine code.) The relative examples assume the instruction is assembled at location 300 (octal), as in the earlier illustration.

Addressing Mode Summary
Code	Mode	Source Code	Machine Code
2	Immediate	MOV #100,%4	012704 000100
3	Absolute	MOV @#400,@#500	013737 000400 000500
6	Relative	MOV 400,500	016767 000074 000172
7	Relative Deferred	MOV @400,@500	017777 000074 000172

1.12. APPENDIX B: Instruction Format Summary

Instruction Format Summary
Bit Pattern	Mnemonic	Opcode	Group
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0	HALT	000000	0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1	WAIT	000001	0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0	RTI	000002	0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1	BPT	000003	0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0	IOT	000004	0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1	RESET	000005	0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0	RTT	000006	0
0 0 0 0 0 0 0 0 0 1 d d d d d d	JMP	000100	1
0 0 0 0 0 0 0 0 1 0 0 0 0 r r r	RTS	000200	1
0 0 0 0 0 0 0 0 1 0 0 1 1 n n n	SPL	000230	4
0 0 0 0 0 0 0 0 1 0 1 0 n z v c	CL(N/Z/V/C)	000240	5
0 0 0 0 0 0 0 0 1 0 1 1 n z v c	SE(N/Z/V/C)	000260	5
0 0 0 0 0 0 0 0 1 1 d d d d d d	SWAB	000300	1
0 0 0 0 0 0 0 1 o o o o o o o o	BR	000400	2
0 0 0 0 0 0 1 0 o o o o o o o o	BNE	001000	2
0 0 0 0 0 0 1 1 o o o o o o o o	BEQ	001400	2
0 0 0 0 0 1 0 0 o o o o o o o o	BGE	002000	2
0 0 0 0 0 1 0 1 o o o o o o o o	BLT	002400	2
0 0 0 0 0 1 1 0 o o o o o o o o	BGT	003000	2
0 0 0 0 0 1 1 1 o o o o o o o o	BLE	003400	2
0 0 0 0 1 0 0 r r r d d d d d d	JSR	004000	3
b 0 0 0 1 0 1 0 0 0 d d d d d d	CLR(B)	005000	1
b 0 0 0 1 0 1 0 0 1 d d d d d d	COM(B)	005100	1
b 0 0 0 1 0 1 0 1 0 d d d d d d	INC(B)	005200	1
b 0 0 0 1 0 1 0 1 1 d d d d d d	DEC(B)	005300	1
b 0 0 0 1 0 1 1 0 0 d d d d d d	NEG(B)	005400	1
b 0 0 0 1 0 1 1 0 1 d d d d d d	ADC(B)	005500	1
b 0 0 0 1 0 1 1 1 0 d d d d d d	SBC(B)	005600	1
b 0 0 0 1 0 1 1 1 1 d d d d d d	TST(B)	005700	1
b 0 0 0 1 1 0 0 0 0 d d d d d d	ROR(B)	006000	1
b 0 0 0 1 1 0 0 0 1 d d d d d d	ROL(B)	006100	1
b 0 0 0 1 1 0 0 1 0 d d d d d d	ASR(B)	006200	1
b 0 0 0 1 1 0 0 1 1 d d d d d d	ASL(B)	006300	1
0 0 0 0 1 1 0 1 0 0 n n n n n n	MARK	006400	4
0 0 0 0 1 1 0 1 0 1 s s s s s s	MFPI	006500	1
0 0 0 0 1 1 0 1 1 0 d d d d d d	MTPI	006600	1
0 0 0 0 1 1 0 1 1 1 d d d d d d	SXT	006700	1
b 0 0 1 s s s s s s d d d d d d	MOV(B)	010000	3
b 0 1 0 s s s s s s d d d d d d	CMP(B)	020000	3
b 0 1 1 s s s s s s d d d d d d	BIT(B)	030000	3
b 1 0 0 s s s s s s d d d d d d	BIC(B)	040000	3
b 1 0 1 s s s s s s d d d d d d	BIS(B)	050000	3
0 1 1 0 s s s s s s d d d d d d	ADD	060000	3
0 1 1 1 0 0 0 r r r s s s s s s	MUL	070000	7
0 1 1 1 0 0 1 r r r s s s s s s	DIV	071000	7
0 1 1 1 0 1 0 r r r s s s s s s	ASH	072000	7
0 1 1 1 0 1 1 r r r s s s s s s	ASHC	073000	7
0 1 1 1 1 0 0 r r r d d d d d d	XOR	074000	3
0 1 1 1 1 0 1 0 0 0 0 0 0 r r r	FADD	075000	1
0 1 1 1 1 0 1 0 0 0 0 0 1 r r r	FSUB	075010	1
0 1 1 1 1 0 1 0 0 0 0 1 0 r r r	FMUL	075020	1
0 1 1 1 1 0 1 0 0 0 0 1 1 r r r	FDIV	075030	1
0 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0	MED	076600	0
0 1 1 1 1 1 0 1 1 1 d d d d d d	XFC	076700	1
0 1 1 1 1 1 1 r r r o o o o o o	SOB	077000	6
1 0 0 0 0 0 0 0 o o o o o o o o	BPL	100000	2
1 0 0 0 0 0 0 1 o o o o o o o o	BMI	100400	2
1 0 0 0 0 0 1 0 o o o o o o o o	BHI	101000	2
1 0 0 0 0 0 1 1 o o o o o o o o	BLOS	101400	2
1 0 0 0 0 1 0 0 o o o o o o o o	BVC	102000	2
1 0 0 0 0 1 0 1 o o o o o o o o	BVS	102400	2
1 0 0 0 0 1 1 0 o o o o o o o o	BCC	103000	2
1 0 0 0 0 1 1 0 o o o o o o o o	BHIS	103000	2
1 0 0 0 0 1 1 1 o o o o o o o o	BCS	103400	2
1 0 0 0 0 1 1 1 o o o o o o o o	BLO	103400	2
1 0 0 0 1 0 0 0 t t t t t t t t	EMT	104000	4
1 0 0 0 1 0 0 1 t t t t t t t t	TRAP	104400	4
1 0 0 0 1 1 0 1 0 0 s s s s s s	MTPS	106400	1
1 0 0 0 1 1 0 1 0 1 s s s s s s	MFPD	106500	1
1 0 0 0 1 1 0 1 1 0 d d d d d d	MTPD	106600	1
1 0 0 0 1 1 0 1 1 1 d d d d d d	MFPS	106700	1
1 1 1 0 s s s s s s d d d d d d	SUB	160000	3
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0	CFCC	170000	0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1	SETF	170001	0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0	SETI	170002	0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1	LDUB	170003	0
1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0	MNS	170004	0
1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 1	MPP	170005	0
1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1	MAS	170007	0
1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 1	SETD	170011	0
1 1 1 1 0 0 0 0 0 0 0 0 1 0 1 0	SETL	170012	0
1 1 1 1 0 0 0 0 0 1 s s s s s s	LDFPS	170100	1
1 1 1 1 0 0 0 0 1 0 d d d d d d	STFPS	170200	1
1 1 1 1 0 0 0 0 1 1 d d d d d d	STST	170300	1
1 1 1 1 0 0 0 1 0 0 d d d d d d	CLR(F/D)	170400	1
1 1 1 1 0 0 0 1 0 1 fdfdfdfdfdfd	TST(F/D)	170500	1
1 1 1 1 0 0 0 1 1 0 fdfdfdfdfdfd	ABS(F/D)	170600	1
1 1 1 1 0 0 0 1 1 1 fdfdfdfdfdfd	NEG(F/D)	170700	1
1 1 1 1 0 0 1 0 f f fsfsfsfsfsfs	MUL(F/D)	171000	7
1 1 1 1 0 0 1 1 f f fsfsfsfsfsfs	MOD(F/D)	171400	7
1 1 1 1 0 1 0 0 f f fsfsfsfsfsfs	ADD(F/D)	172000	7
1 1 1 1 0 1 0 1 f f fsfsfsfsfsfs	LD(F/D)	172400	7
1 1 1 1 0 1 1 0 f f fsfsfsfsfsfs	SUB(F/D)	173000	7
1 1 1 1 0 1 1 1 f f fsfsfsfsfsfs	CMP(F/D)	173400	7
1 1 1 1 1 0 0 0 f f fdfdfdfdfdfd	ST(F/D)	174000	3
1 1 1 1 1 0 0 1 f f fsfsfsfsfsfs	DIV(F/D)	174400	7
1 1 1 1 1 0 1 0 f f d d d d d d	STEXP	175000	3
1 1 1 1 1 0 1 1 f f d d d d d d	STCFI	175400	3
1 1 1 1 1 0 1 1 f f d d d d d d	STCFL	175400	3
1 1 1 1 1 0 1 1 f f d d d d d d	STCDI	175400	3
1 1 1 1 1 0 1 1 f f d d d d d d	STCDL	175400	3
1 1 1 1 1 1 0 0 f f fdfdfdfdfdfd	STCFD	176000	3
1 1 1 1 1 1 0 0 f f fdfdfdfdfdfd	STCDF	176000	3
1 1 1 1 1 1 0 1 f f s s s s s s	LDEXP	176400	7
1 1 1 1 1 1 1 0 f f s s s s s s	LDCIF	177000	7
1 1 1 1 1 1 1 0 f f s s s s s s	LDCID	177000	7
1 1 1 1 1 1 1 0 f f s s s s s s	LDCLF	177000	7
1 1 1 1 1 1 1 0 f f s s s s s s	LDCLD	177000	7
1 1 1 1 1 1 1 1 f f fsfsfsfsfsfs	LDCDF	177400	7
1 1 1 1 1 1 1 1 f f fsfsfsfsfsfs	LDCFD	177400	7

Instruction Argument Groupings
Group	Instruction format
0	opcode
1	opcode source (or) opcode destination
2	opcode offset
3	opcode source,destination
4	opcode octal-number
5	opcode bit-pattern
6	opcode register,offset
7	opcode destination,source

Meanings of abbreviations
Bits	Meanings	Number of bits
r	Register	3
s	Source Address	6 (3 for mode, 3 for register)
d	Destination Address	6 (3 for mode, 3 for register)
o	Offset	8 (except SOB uses 6)
n z v c	Condition Codes	4
t	Trap	8
f	Floating Point Register	2
fs	Floating Point Source	6
fd	Floating Point Destination	6
n	Octal Number	3 for SPL, 6 for MARK

1.13. APPENDIX C: The D.O.B.A.S.H. Cheat Sheet

The D.O.B.A.S.H. Cheat Sheet
Decimal	Octal	Binary	ASCII	Sixbit	Hex
000	000	0 0 0 0 0 0 0	`^@`		00
001	001	0 0 0 0 0 0 1	`^A`	`!`	01
002	002	0 0 0 0 0 1 0	`^B`	`"`	02
003	003	0 0 0 0 0 1 1	`^C`	`#`	03
004	004	0 0 0 0 1 0 0	`^D`	`$`	04
005	005	0 0 0 0 1 0 1	`^E`	`%`	05
006	006	0 0 0 0 1 1 0	`^F`	`&`	06
007	007	0 0 0 0 1 1 1	`^G`	`'`	07
008	010	0 0 0 1 0 0 0	`^H`	`(`	08
009	011	0 0 0 1 0 0 1	`^I`	`)`	09
010	012	0 0 0 1 0 1 0	`^J`	`*`	0A
011	013	0 0 0 1 0 1 1	`^K`	`+`	0B
012	014	0 0 0 1 1 0 0	`^L`	`,`	0C
013	015	0 0 0 1 1 0 1	`^M`	`-`	0D
014	016	0 0 0 1 1 1 0	`^N`	`.`	0E
015	017	0 0 0 1 1 1 1	`^O`	`/`	0F
016	020	0 0 1 0 0 0 0	`^P`	`0`	10
017	021	0 0 1 0 0 0 1	`^Q`	`1`	11
018	022	0 0 1 0 0 1 0	`^R`	`2`	12
019	023	0 0 1 0 0 1 1	`^S`	`3`	13
020	024	0 0 1 0 1 0 0	`^T`	`4`	14
021	025	0 0 1 0 1 0 1	`^U`	`5`	15
022	026	0 0 1 0 1 1 0	`^V`	`6`	16
023	027	0 0 1 0 1 1 1	`^W`	`7`	17
024	030	0 0 1 1 0 0 0	`^X`	`8`	18
025	031	0 0 1 1 0 0 1	`^Y`	`9`	19
026	032	0 0 1 1 0 1 0	`^Z`	`:`	1A
027	033	0 0 1 1 0 1 1	`^[`	`;`	1B
028	034	0 0 1 1 1 0 0	`^\`	`<`	1C
029	035	0 0 1 1 1 0 1	`^]`	`=`	1D
030	036	0 0 1 1 1 1 0	`^^`	`>`	1E
031	037	0 0 1 1 1 1 1	`^_`	`?`	1F
032	040	0 1 0 0 0 0 0		`@`	20
033	041	0 1 0 0 0 0 1	`!`	`A`	21
034	042	0 1 0 0 0 1 0	`"`	`B`	22
035	043	0 1 0 0 0 1 1	`#`	`C`	23
036	044	0 1 0 0 1 0 0	`$`	`D`	24
037	045	0 1 0 0 1 0 1	`%`	`E`	25
038	046	0 1 0 0 1 1 0	`&`	`F`	26
039	047	0 1 0 0 1 1 1	`'`	`G`	27
040	050	0 1 0 1 0 0 0	`(`	`H`	28
041	051	0 1 0 1 0 0 1	`)`	`I`	29
042	052	0 1 0 1 0 1 0	`*`	`J`	2A
043	053	0 1 0 1 0 1 1	`+`	`K`	2B
044	054	0 1 0 1 1 0 0	`,`	`L`	2C
045	055	0 1 0 1 1 0 1	`-`	`M`	2D
046	056	0 1 0 1 1 1 0	`.`	`N`	2E
047	057	0 1 0 1 1 1 1	`/`	`O`	2F
048	060	0 1 1 0 0 0 0	`0`	`P`	30
049	061	0 1 1 0 0 0 1	`1`	`Q`	31
050	062	0 1 1 0 0 1 0	`2`	`R`	32
051	063	0 1 1 0 0 1 1	`3`	`S`	33
052	064	0 1 1 0 1 0 0	`4`	`T`	34
053	065	0 1 1 0 1 0 1	`5`	`U`	35
054	066	0 1 1 0 1 1 0	`6`	`V`	36
055	067	0 1 1 0 1 1 1	`7`	`W`	37
056	070	0 1 1 1 0 0 0	`8`	`X`	38
057	071	0 1 1 1 0 0 1	`9`	`Y`	39
058	072	0 1 1 1 0 1 0	`:`	`Z`	3A
059	073	0 1 1 1 0 1 1	`;`	`[`	3B
060	074	0 1 1 1 1 0 0	`<`	`\`	3C
061	075	0 1 1 1 1 0 1	`=`	`]`	3D
062	076	0 1 1 1 1 1 0	`>`	`^`	3E
063	077	0 1 1 1 1 1 1	`?`	`_`	3F
064	100	1 0 0 0 0 0 0	`@`		40
065	101	1 0 0 0 0 0 1	`A`		41
066	102	1 0 0 0 0 1 0	`B`		42
067	103	1 0 0 0 0 1 1	`C`		43
068	104	1 0 0 0 1 0 0	`D`		44
069	105	1 0 0 0 1 0 1	`E`		45
070	106	1 0 0 0 1 1 0	`F`		46
071	107	1 0 0 0 1 1 1	`G`		47
072	110	1 0 0 1 0 0 0	`H`		48
073	111	1 0 0 1 0 0 1	`I`		49
074	112	1 0 0 1 0 1 0	`J`		4A
075	113	1 0 0 1 0 1 1	`K`		4B
076	114	1 0 0 1 1 0 0	`L`		4C
077	115	1 0 0 1 1 0 1	`M`		4D
078	116	1 0 0 1 1 1 0	`N`		4E
079	117	1 0 0 1 1 1 1	`O`		4F
080	120	1 0 1 0 0 0 0	`P`		50
081	121	1 0 1 0 0 0 1	`Q`		51
082	122	1 0 1 0 0 1 0	`R`		52
083	123	1 0 1 0 0 1 1	`S`		53
084	124	1 0 1 0 1 0 0	`T`		54
085	125	1 0 1 0 1 0 1	`U`		55
086	126	1 0 1 0 1 1 0	`V`		56
087	127	1 0 1 0 1 1 1	`W`		57
088	130	1 0 1 1 0 0 0	`X`		58
089	131	1 0 1 1 0 0 1	`Y`		59
090	132	1 0 1 1 0 1 0	`Z`		5A
091	133	1 0 1 1 0 1 1	`[`		5B
092	134	1 0 1 1 1 0 0	`\`		5C
093	135	1 0 1 1 1 0 1	`]`		5D
094	136	1 0 1 1 1 1 0	`^`		5E
095	137	1 0 1 1 1 1 1	`_`		5F
096	140	1 1 0 0 0 0 0	`` ` ``		60
097	141	1 1 0 0 0 0 1	`a`		61
098	142	1 1 0 0 0 1 0	`b`		62
099	143	1 1 0 0 0 1 1	`c`		63
100	144	1 1 0 0 1 0 0	`d`		64
101	145	1 1 0 0 1 0 1	`e`		65
102	146	1 1 0 0 1 1 0	`f`		66
103	147	1 1 0 0 1 1 1	`g`		67
104	150	1 1 0 1 0 0 0	`h`		68
105	151	1 1 0 1 0 0 1	`i`		69
106	152	1 1 0 1 0 1 0	`j`		6A
107	153	1 1 0 1 0 1 1	`k`		6B
108	154	1 1 0 1 1 0 0	`l`		6C
109	155	1 1 0 1 1 0 1	`m`		6D
110	156	1 1 0 1 1 1 0	`n`		6E
111	157	1 1 0 1 1 1 1	`o`		6F
112	160	1 1 1 0 0 0 0	`p`		70
113	161	1 1 1 0 0 0 1	`q`		71
114	162	1 1 1 0 0 1 0	`r`		72
115	163	1 1 1 0 0 1 1	`s`		73
116	164	1 1 1 0 1 0 0	`t`		74
117	165	1 1 1 0 1 0 1	`u`		75
118	166	1 1 1 0 1 1 0	`v`		76
119	167	1 1 1 0 1 1 1	`w`		77
120	170	1 1 1 1 0 0 0	`x`		78
121	171	1 1 1 1 0 0 1	`y`		79
122	172	1 1 1 1 0 1 0	`z`		7A
123	173	1 1 1 1 0 1 1	`{`		7B
124	174	1 1 1 1 1 0 0	`\|`		7C
125	175	1 1 1 1 1 0 1	`}`		7D
126	176	1 1 1 1 1 1 0	`~`		7E
127	177	1 1 1 1 1 1 1	`RU`		7F

1.14. APPENDIX D: Powers of 2 Cheat Sheet

Powers of 2 Cheat Sheet
Power	Decimal	Octal	Hex	K
0	1	1	1	0 K
1	2	2	2	0 K
2	4	4	4	0 K
3	8	10	8	0 K
4	16	20	10	0 K
5	32	40	20	0 K
6	64	100	40	0 K
7	128	200	80	0 K
8	256	400	100	0 K
9	512	1000	200	0 K
10	1024	2000	400	1 K
11	2048	4000	800	2 K
12	4096	10000	1000	4 K
13	8192	20000	2000	8 K
14	16384	40000	4000	16 K
15	32768	100000	8000	32 K
16	65536	200000	10000	64 K
17	131072	400000	20000	128 K
18	262144	1000000	40000	256 K
19	524288	2000000	80000	512 K
20	1048576	4000000	100000	1024 K
21	2097152	10000000	200000	2048 K
22	4194304	20000000	400000	4096 K
23	8388608	40000000	800000	8192 K
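If you ever need more rows, or want to check one, a table like this can be generated with a short Python sketch. The K column here is simply the value divided by 1024 and rounded down, which is an assumption about how the column was built.

```python
rows = []
for power in range(24):
    value = 2 ** power
    rows.append((power, value, format(value, "o"), format(value, "X"), value // 1024))

for power, decimal, octal, hexadecimal, k in rows:
    print(power, decimal, octal, hexadecimal, str(k) + " K", sep="\t")
```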