The 12-bit PDP-8 contained:
|
The little machine that could, and did!
Why does the PDP-8's store instruction also clear the accumulator?
Having recently re-learned PDP-8 machine language for the purpose of creating the family of SBC6120 front panel "Toggle Toys", I wanted to take a moment to make some observations about the PDP-8's quirky, small, yet often surprisingly efficient instruction set.
In November of 2009 there was some discussion in the alt.sys.pdp8 public USENET newsgroup regarding DEC's decision to have the store-to-memory operation (DCA - Deposit and Clear Accumulator) clear the accumulator (AC) after storing the AC's value into memory. The question raised was whether this was a wise choice given that a PDP-8 programmer often finds himself immediately reloading the AC with the value just stored.
More than anything, to programmers familiar with contemporary machine language instruction sets, the idea of a forced clearing of the machine's accumulator upon storing its contents into memory seems jarring in the extreme. "But, wait!... I wasn't through with it! I need it back right now!!!" (And, of course, you CAN have it back right now at the cost (and annoyance) of ADDing what you just stored to the just-zeroed accumulator.) The other annoyance is that does make for less transparent and more confusing code.
So what were the PDP-8's designers thinking? Were they right to have DCA and not a simple store?
In my own recent PDP-8 programming experience I found the pro and con of this decision about evenly balanced. I often did wish that the AC had not just been cleared, but just as often it turned out to be just what I wanted ... given that all I had was an add (TAD) instruction and no simple load! And, in fact, that's part of the key. In thinking about this a bit I believe that the storing and clearing decision is closely tied to the fact that there is also no simple "load-from-memory" operation. All we have is "TAD" for "Two's Compliment Add" which adds the contents of memory to the current contents of the AC.
Independent of whether the machine's "store" operation were also to clear the AC, we obviously need to not only be able to load data from memory, but also to perform addition operations. The designers of the instruction set were dealing with just about as sparse a set of opcodes as was imaginable. DEC's preceding machine, the PDP-7, used an 18-bit word length with a comparably roomy 4-bit opcode. It's amazing what a difference just that one bit of opcode space made; 16 basic operations is really ample by comparison! 8 basic operations is really quite tight for a general purpose machine that you'd like to have able to do pretty much anything.
But the designers of a 3-bit opcode machine didn't even have the luxury of giving the machine both a LOAD and an ADD instruction. There just wasn't room for both. And given that we absolutely MUST have addition, and that adding the contents of memory to a zeroed accumulator is the equivalent of a LOAD, the designers chose addition.
So it seems to me that the presence of a TAD (add from memory) instruction without any LOAD, biases the decision about whether to clear the AC after a STORE in favor of doing so. There's a sort of symmetry there. And that way you can get the quite often needed effect of a simple LOAD immediately following a STORE by having the STORE also zero the AC.
Obviously, if the PDP-8's store operation did NOT also clear the accumulator, a programmer wishing to load some other value, when only having a TAD instruction, would have needed to perform a CLA (Clear Accumulator) before the TAD. Being a non-MRI instruction, a CLA executes in half the time of any of the MRI instructions, since another memory cycle is not needed to perform an operation on memory. But the additional CLA would still have consumed another word of memory. This would offset the word saved if the programmer DID wish to retain the AC's value, since with a non-clearing store the AC would not have needed to be reloaded. So memory consumption would likely have been a toss-up either way.
In the end, I think that the designers felt that DCA had a nicely matching symmetry to TAD. Also, the original target market for the PDP-8 was process control applications much more than general purpose computing. So the designers may have had good reason to believe that moving data around, as the TAD/DCA combination does, would be done more often than storing a value and wishing it were still available.
In the core memory world of the 1960's, another clever aspect of the design of the PDP-8 is its use of "zero cost" incrementation of values in core memory.
The act of reading data from an address in core memory inherently erases its contents. Core memory is a "destructive read" memory technology. It works by writing all 0's to the addressed memory location and only those core memory bits that WERE 1's beforehand will generate an inductive "pulse" on their sense wires as those bits are flipped over to 0's. But this means that the act of reading a location also erases it. Therefore, unless the contents of the memory is going to be immediately changed it must be immediately re-written with the original data that was just read from it, since reading that data caused it to be erased.
But this also means that CHANGING the contents of a memory location can be done "for free" if its new value is known immediately, as is the case with incrementation, where the value just read from the location immediately determines its next value.
The PDP-8 system cleverly used this characteristic in two places:
The Increment and Skip of Zero (ISZ) instruction would read, increment, and re-write a value one greater than the specified memory word previously contained. Since the system's core memory system needed to perform a memory rewrite anyway, rewriting the memory with its incremented value was a zero-cost operation - it took no additional time than writing-back the same data would have.
The second place where the PDP-8 system used this was in the auto-incrementing core memory locations 0010 through 0017: Any time a memory reference instruction indirectly referenced any of those eight locations, the location's current contents was first incremented. That incremented value was then used by the program and also written back to the same location. This was very useful for scanning through memory, copying data from one place to another, etc. Once again, since the core memory location needed to be rewritten anyway after being destructively read, this was an economical "zero cost" operation. And by giving these eight locations this "special" auto-incrementation property, additional functionality was "snuck" into the PDP-8's ultra-sparse instruction set.
In reacquainting myself with the PDP-8 on the Internet, I discovered that the PDP-8 had been (and even was now being) used in educational settings to teach the essentials of low-level computer operation and architecture. It was, after all, the machine on which I first learned to program machine and assembly language. But I have seen discussions in newsgroup forums -- even recently -- where those present think that using the PDP-8 today, in the 21st century, still makes a lot of sense.
I'm not so sure I agree.
I have just completed a few small but usefully complex contemporary PDP-8 programs, I'm a lifetime assembly language programmer, and I also have a long history of explaining/teaching technology to others who are curious and receptive (generally without putting anyone to sleep). The bottom line is that I think a simple synthetic machine could have been, and could be, created that would be much more useful for explaining the key concepts of binary computing. For example, this is exactly the path Donald Knuth took when he created the hypothetical teaching machine "MIX" to illustrate algorithms for his famous "Art of Computer Programming" series.
While it's an interesting exercise to work out how to perform OR and XOR operations on a machine that only has AND and NOT (the topic of the next section), as a programmer it's just annoying, and it seems more a topic for a course in digital logic since any and every CPU anyone will ever encounter in real life today will have all of those intrinsic operation built-in. In fact, even a predecessor to the PDP-8, DEC's very first computer, the 18-bit PDP-1, had both OR and XOR functions! DEC's engineers took those functions out when the machine's word length, and thus its opcode space, was reduced from five bits down to just three.
Similarly, a one-register machine is an interesting curio that is effectively reduced to treating all of main memory as registers and the accumulator as a temporary data path between main memory. With a clear trend toward an ever greater number of general purpose registers, a machine sporting 8, 16, or 32 general purpose registers, where they can be used in two- and three-operator expressions with indexing and indirection features, would much more readily teach today's core computing concepts.
Frankly, much as I personally love the PDP-8 (oh, yes I do), the only reason I can see for ever mentioning the PDP-8 in an "introduction to machine architectures" course would be during a brief "history of computing" overview.
The more I think about it, and I've thought about it a lot, the more I believe that the perfect place for the PDP-8 in an instructional setting would be as an extra-credit assignment in a machine language course where the task would be to write a "Towers of Hanoi" solver using the hauntingly elegant, though certainly quite funky, old machine's instruction set.
The presence of inclusive OR and exclusive OR (XOR) “bitwise” logical functions is not just a matter of convenience when programming, it can often be a necessity. (The “Lights Out” toggle toy I wrote for the SBC6120 PDP-8 uses XOR extensively.) The clever designers of the PDP-8's “pared to the bone” instruction set knew that OR and XOR could be synthesized from NOT and AND, so that's what they left us to do.
If you style yourself a puzzle and problem solver, read no further. Grab some paper and a pencil (and an eraser) and work out for yourself how two 12-bit binary words can be "ORed" and "XORed" using just the available AND and NOT functions. It's a pleasant exercise. (There's a clever means, shown below, for obtaining XOR that also uses ADD and a SHIFT instruction, so once you do it the hard way, you might want to keep thinking!)
Truth-table for logical AND:
|
Truth-table for logical OR:
|
Referencing the two “truth tables” shown above, we can see the relationship between the fundamental boolean logical AND and OR operations: The AND function has a single '1' outcome where both inputs are also '1', whereas the OR function has a single '0' outcome where both inputs are also '0'. In other words, either function can be converted into the other if we invert the inputs and the output!
Since the function the PDP-8 has is AND, we can obtain the OR function by inverting the result of AND-ing the boolean NOT (bitwise inverse) of each input. The electrical/logical schematic representation of this would be this:
In the diagram above, the triangles with the circles on their outputs are “inverters” (NOT) and the box with the rounded right edge represents the AND function. So you can see that we're inverting each input, AND-ing those results, then inverting the result of the AND.
Expressed in PDP-8 PAL assembly language source code, this would look like this:
Synthesizing logical 'OR' when all you have is 'AND' and 'NOT'
CLA / clear accumulator (AC) since all we have is add! TAD ArgOne / add (TAD) the first argument to the just-zeroed AC CMA / invert the bits of ArgOne (CMA = Compliment Accumulator) DCA Temp / deposit the intermediate value in a temporary memory cell / (DCA = Deposit and Clear Accumulator) TAD ArgTwo / add (TAD) the second argument to the just-zeroed AC CMA / invert the bits of the second argument, ArgTwo AND Temp / AND the second inverted arg with the first inverted arg CMA / compliment the result of the AND to get the final result / the accumulator is left with the boolean 'OR' of the args |
As you can see, the lack of an 'OR' instruction requires jumping through some hoops on the PDP-8.
Referencing the XOR's “truth table” below, it's quickly apparent that the trick we used for converting the AND into an OR (noticing the one place where each table had one different output) won't work as easily for XOR since there are two '0's and two '1's.
XOR | 0 | 1 |
0 | 0 | 1 |
1 | 1 | 0 |
One way to think of an XOR operation is with the phrase “either of the inputs, but not both”. Given only the PDP-8's available operations of AND & NOT, the following logic diagram implements that phrase using just the PDP-8's two (AND & NOT) logical operations:
In the upper-left, you'll notice the same collection of four logic gates (the two inverters feeding the AND whose output is inverted) that we used above to synthesize the 'OR' operation. That's the “either of the inputs” clause of the phrase. The “but not both” clause is implemented with the lower AND function, whose output will be TRUE only when both inputs are TRUE. Its output is then inverted and ANDed with the output of the upper 'OR' clause. Thus, we get “either of the inputs, but not both”.
To express this as PDP-8 code, we add the AND, NOT & AND instructions to the end of the 'OR' implementation from above:
Synthesizing logical 'XOR' when all you have is 'AND' and 'NOT'
CLA / clear accumulator (AC) since all we have is add! TAD ArgOne / add (TAD) the first argument to the just-zeroed AC CMA / invert the bits of ArgOne (CMA = Compliment Accumulator) DCA Temp / deposit the intermediate value in a temporary memory cell / (DCA = Deposit and Clear Accumulator) TAD ArgTwo / add (TAD) the second argument to the just-zeroed AC CMA / invert the bits of the second argument, ArgTwo AND Temp / AND the second inverted arg with the first inverted arg CMA / compliment the result of the AND to get the final result DCA Temp / save the 'OR' of the two arguments TAD ArgOne / get the first argument again AND ArgTwo / AND it with the second argument CMA / invert the AND of the two arguments AND Temp / and finally, AND the result of the OR with this second half / the accumulator now contains the XOR of ArgOne and ArgTwo |
The code above will indeed perform a boolean logical XOR of the bits in the two arguments. But it is often the case with computers that non-obvious solutions are also possible... and that's also true in this case! Recalling the truth-table for the XOR operation:
XOR | 0 | 1 |
0 | 0 | 1 |
1 | 1 | 0 |
Compare the XOR's results with those of binary addition:
| |||||||||||||||||||||||||||||||||||||||||||||
Consider only the least significant (units) digit of the addition, ignoring the “carry” from the units place that occurs when we add '1' + '1'. The least significant (units) digit produced by adding the two numbers results in '1' when either, but not both, of the inputs is '1'. In other words, the XOR operation is same as “carryless addition”. This fact provides a very nifty, and much smaller and faster means for coding an XOR operation for a PDP-8:
Optimized logical 'XOR' using “carryless addition”
CLA / clear accumulator (AC) since all we have is add! TAD ArgOne / add (TAD) ArgOne to the just-zeroed AC AND ArgTwo / AND ArgTwo to determine where the carrys will be CLL RAL / clear the LINK (CLL) and rotate the accumulator left (RAL) CMA IAC / compliment (CMA) & increment (IAC) the accumulator (negate) TAD ArgOne / add the first argument to the negated accumulator TAD ArgTwo / and add the second argument as well / the accumulator now contains the XOR of ArgOne & ArgTwo |
The “carryless addition” code shown above efficiently implements the following algebraic expression:
ArgOne + ArgTwo - 2*(ArgOne AND ArgTwo)
For the sake of efficiency with the PDP-8's oh-so-limited instruction set, the last term of that algebraic expression is evaluated first. The two arguments are ANDed, then the result of the ANDing is doubled by shifting it one place to the left (by using the PDP-8's rotate accumulator left (RAL) instruction with the LINK bit first cleared (CLL). This is done with a single compound “operate” instruction.). The result is then negated using another single compound operate instruction to compliment the accumulator (CMA) and increment the accumulator (IAC). Finally, the two left-hand terms of the algebraic equation are added to the partial result . . . which yields the XOR function in many fewer steps than synthesizing it out of individual AND and NOT operations (and an intermediate “Temp”orary storage location is not required.
CLICK HERE to learn how YOU can acquire & build one of these complete PDP-8 systems for yourself !!! |
Gibson Research Corporation is owned and operated by Steve Gibson. The contents of this page are Copyright (c) 2024 Gibson Research Corporation. SpinRite, ShieldsUP, NanoProbe, and any other indicated trademarks are registered trademarks of Gibson Research Corporation, Laguna Hills, CA, USA. GRC's web and customer privacy policy. |
Last Edit: Jan 03, 2010 at 13:39 (5,369.50 days ago) | Viewed 7 times per day |