Alas, we do not have much software for the ElectroData/Burroughs 205. Tom Sawyer has the decimal tape dump of the "Burroughs Algebraic Compiler." I have transcribed the EASY and MEASY assemblers, and the Algol-58 compiler that Donald Knuth donated to the Computer History Museum a decade ago. The recovery of these has been described in prior blog posts [1, 2]. Aside from a few demos and test routines Tom and I have written, that is all the software we have, or so I originally thought.
This is Paul Kimpel again, with another post on Tom's 205 blog. This is the first of a two-part post that describes another piece of software we have managed to recover -- the Shell Assembler. This part discusses the general features of the assembler, recovery of the source and object code, and how to operate the assembler in the retro-205 emulator. The second part will discuss the assembler's features in more detail, including its macro capability, and programming with the assembler.
The Shell Assembler
Tom Sawyer had yet another 205 manual in his collection -- a pair of them, actually -- published by Burroughs in March 1960 as Bulletin 3038, and titled Shell Symbolic Assembly Program for the Burroughs 205 Electronic Data-Processing System, Parts I and II [3, 4]. These were republications of documents originally issued in September 1958 as EPR Memorandum 37 by the Shell Development Company, Exploration and Production Research Division, Houston, Texas.
These two documents and the software they describe were authored by Joel C. Erdwinn, G. Clark Oliphint, C. M. Pearcy, and David M. Dahm. Erdwinn, Oliphint, and Dahm later went to Burroughs in Pasadena, California. All three worked on the highly regarded Algol-58 compiler for the Burroughs 220, known as BALGOL. Erdwinn later went on to a distinguished career at Computer Sciences Corporation. Oliphint and Dahm worked on the software for the Burroughs B5000. Dahm continued to work and consult with Burroughs off and on for the rest of his career, and is probably best known as the designer, with Roy Guck, of the DMSII database management system. The Shell Assembler is an example of the talent and promise all three showed early in their careers.
The Part I manual describes the assembler, its coding notation, and operation. It also includes a detailed set of flowcharts. Part II contains an assembly listing of the assembler, written in itself. This listing is in two parts, corresponding to the two "movements" (today we would call them "passes") by which the assembler operated. These are remarkable documents. They give a full disclosure of the documentation, design, and coding of the assembler, apparently with the intent of allowing anyone to reconstruct the software from the documents alone. This effectively was open-source distribution, ca. 1958.
Of all the assemblers for the 205 we presently know about, the Shell Assembler is clearly the most sophisticated. The coding notation is based on a fixed-column input format, as is typical of most assemblers. Like Knuth's EASY and MEASY assemblers, the input format is even more columnar than most, designed for efficient processing by format bands in the 205's Cardatron card reader interface. This approach all but eliminates the need to parse the input text in software.
In addition to supporting a full set of mnemonic operator codes for the 205, the Shell Assembler has an extensive set of pseudo-operators for manipulating the address counter, managing addresses for the high-speed loops on the memory drum, coding numeric and alphanumeric literals, and defining and invoking both macros and library subroutines. The assembler also has the ability to maintain the libraries on magnetic tape. Perhaps its most unusual (and most-needed) pseudo-instructions provide a way to conveniently code the Cardatron format bands used to interface with IBM card readers, card punches, and tabulators (line printers). All of these are discussed in more detail in Part 2.
Perhaps the most unusual checking feature of the assembler is "invalid pair testing." It organizes the 205 machine instructions into a set of classes based on the processor state they require to execute and the state they establish during their execution. As each machine instruction is being assembled, the assembler determines whether that instruction is a reasonable one to follow the prior instruction, based on the states this instruction requires and the prior one leaves behind.
For example, many instructions operate upon the processor's A register, and several instructions clear or invalidate the contents of that register. If one of the former instructions follows one of the latter, the assembler flags that sequence as an invalid pair. Either the second instruction is attempting to use machine state that has been destroyed by the first one, or the second instruction is destroying state left by the first one, e.g., AD (Add) followed by CAD (Clear and Add, i.e., load A from a memory location). In that case, the load destroys the state left by the add.
Another example of invalid-pair testing concerns the processor's Overflow toggle. When execution of an instruction causes the Overflow toggle to be set, the next instruction executed must be one of the conditional branches; if it is not, the processor will halt on the Overflow condition. The assembler will flag as an invalid pair instances where Overflow could be set but the next instruction is not a conditional branch, e.g., AD (Add) followed immediately by SU (Subtract).
The flagging of invalid pairs is just a warning, and the assembly will continue regardless. Section V in the Part I manual describes in detail the instruction classes and how instruction pairs are checked.
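The idea behind invalid-pair testing can be sketched in a few lines of Python. This is only an illustration of the technique, not the assembler's actual logic: the trait names and the values assigned to each mnemonic below are assumptions chosen to reproduce the examples in the text, and the real instruction classes are the ones defined in Section V of the Part I manual.

```python
from typing import NamedTuple

class Traits(NamedTuple):
    leaves_a: bool      # leaves a meaningful value in the A register
    clears_a: bool      # overwrites A without using its old contents
    may_overflow: bool  # can set the Overflow toggle
    cond_branch: bool   # conditional branch that tests Overflow

# Hypothetical trait assignments, for illustration only.
TRAITS = {
    "AD":  Traits(True,  False, True,  False),
    "SU":  Traits(True,  False, True,  False),
    "CAD": Traits(True,  True,  False, False),
    "CC":  Traits(False, False, False, True),
}

def invalid_pairs(mnemonics):
    """Return (previous, current, reason) for each suspicious adjacent pair."""
    warnings = []
    for prev, cur in zip(mnemonics, mnemonics[1:]):
        p, c = TRAITS[prev], TRAITS[cur]
        if p.leaves_a and c.clears_a:
            warnings.append((prev, cur, "A destroyed"))
        elif p.may_overflow and not c.cond_branch:
            warnings.append((prev, cur, "overflow not tested"))
    return warnings

print(invalid_pairs(["AD", "CAD"]))        # AD then CAD: the load destroys A
print(invalid_pairs(["AD", "SU"]))         # AD then SU: Overflow is not tested
print(invalid_pairs(["CAD", "AD", "CC"]))  # nothing flagged
```

As in the assembler, the result is advisory: the function only reports pairs and never rejects the program.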
Reconstructing the Shell Assembler
As mentioned above, the Part II manual consists of complete listings of the two movements of the assembler. Movement 1 is 59 pages and 2109 lines; Movement 2 is 34 pages and 1220 lines. These are assembly listings, with the address and generated object code for each instruction printed to the left of the assembler source.
Transcribing and Bootstrapping the Assembler
As with other software reconstructions for the 205, I transcribed both source and object code from the listings. This was done for two reasons. First was the typical reason -- once the file was transcribed, the source images could be extracted from it, assembled, and the listing from that assembly compared to the transcription. Differences in the addresses or assembled code should indicate transcription errors. I could then correct the discrepancies and reassemble, iterating until the assembler output matched the transcription.
Second, and more importantly in this case, the Shell Assembler is written in itself. Thus we have a chicken-and-egg problem -- where do we get the object code to assemble an assembler that is written in itself? By transcribing the object code along with the assembler source, the assembler can be bootstrapped by extracting the object code from the transcription, formatting it into a load tape, and then used to assemble the source.
I began transcribing in early August 2015 and finished in late October, generally doing a couple of pages a day. Proofing, correction, and the resolution of a couple of frustrating anomalies (discussed below) carried on for a few more weeks. The first successful "round-trip" assembly (assembling the assembler from the source and object code extracted from the transcription, then using the resulting object code to assemble again, with the output of the two assemblies matching) occurred around 20 November.
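The comparison step of the round-trip test amounts to checking two address-to-word maps against each other. The Python sketch below shows that check in its simplest form. The function name and the dictionary representation are assumptions made for illustration, and this is not the procedure or tooling actually used for the project.

```python
def compare_object_code(transcribed, assembled):
    """Compare two {address: word} maps and list every discrepancy.

    Addresses and words are plain digit strings, e.g. "1550" -> "00000026014".
    A value of None in the result means the address is absent on that side.
    """
    differences = []
    for address in sorted(set(transcribed) | set(assembled)):
        expected = transcribed.get(address)
        actual = assembled.get(address)
        if expected != actual:
            differences.append((address, expected, actual))
    return differences

# Illustrative data only: the second word differs in one digit.
transcribed = {"1550": "00000026014", "1551": "00000201200"}
assembled   = {"1550": "00000026014", "1551": "00000201201"}
for address, expected, actual in compare_object_code(transcribed, assembled):
    print(address, "transcribed", expected, "assembled", actual)
```

Each reported address points at either a transcription error or an anomaly in the original listing, which is how the problems described later in this post surfaced.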
Assembly of Movement 1 requires about 26 minutes in the retro-205 emulator. The first pass (Movement 1 being assembled by Movement 1) requires about eight minutes. Its speed is limited by that of the card reader, emulating an IBM 087 collator at 240 cards/minute. The second pass (Movement 1 being assembled by Movement 2) requires about 18 minutes. Its speed is limited by the printer, emulating an IBM 407 tabulator at 150 lines per minute. The card punch (emulating an IBM 523 at 100 cards/minute) is slower than the printer, but only outputs a card every five words of assembled code. Since it is buffered by the Cardatron, its operation is normally overlapped by other processing.
Assembly of Movement 2 requires about 15 minutes in retro-205. The first pass requires about six minutes, with the second requiring about nine minutes.
Copies of the latest transcription, assembly listings, and tape and card outputs from assembling both movements are available in the project repository at [8]. The history of transcription and correction is also available in prior commits to that directory of the repository.
Two utility scripts were written to support the reconstruction of the assembler. They are available in the software/tools/ directory of the project repository:
- Shell-Xscript-Reformatter.wsf : extracts the source and object code for each movement from their transcribed source files. It creates card decks for each movement in the form required for input to the assembler, and outputs a tape image with the object code formatted as a loadable tape. This was originally written to bootstrap the assembler from the transcription and ultimately do a "round-trip" test.
- Shell-LoadTapeBuilder.wsf : takes the output tapes from assembling the two movements of the assembler and reformats them into a loadable tape of the assembler object code.
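To give a feel for what extracting source and object code from a listing involves, here is a minimal Python sketch that splits one listing line into its parts. It is not the logic of Shell-Xscript-Reformatter.wsf. It assumes lines laid out the way listing excerpts are quoted in this post (sequence number, address, sign digit, and three digit groups, followed by the source text), and it makes no attempt to restore the fixed card columns the assembler actually requires.

```python
import re

# Assumed layout: seq, address, sign, 4 digits, 2 digits, 4 digits, source text.
LISTING_LINE = re.compile(
    r"^(?P<seq>\d{4}) (?P<addr>\d{4}) "
    r"(?P<sign>\d) (?P<high>\d{4}) (?P<order>\d{2}) (?P<low>\d{4})"
    r"(?: (?P<source>.*))?$"
)

def split_listing_line(line):
    """Return the parts of one listing line, or None if it has no object code."""
    m = LISTING_LINE.match(line.rstrip())
    if m is None:
        return None
    return {
        "seq": m["seq"],
        "addr": m["addr"],
        # The 11-digit word: sign digit followed by ten digits.
        "word": m["sign"] + m["high"] + m["order"] + m["low"],
        "source": m["source"] or "",
    }

print(split_listing_line("0390 1550 0 0000 02 6014 .B STC TEMP &0007 ZERO TEMP&0007"))
print(split_listing_line("0228 1093 3 3333 33 3333"))   # object code only
print(split_listing_line("P6Z4N2B & NB4NB2NB4N3B"))     # source only
```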
Problems and Anomalies
The process of bootstrapping the assembler uncovered a couple of anomalies in the published listings and one very frustrating problem. That problem was encountered during the first assembly attempt: the pseudo-instructions that construct Cardatron format bands were generating errors. For example, consider this sequence of lines starting at sequence 0110 in Movement 1 (page 6 in the Part II manual):
FMB3 WFB 3¤P10Z¤ & P5AB & P6Z4N3B
P6Z4N2B & NB4NB2NB4N3B
P5A2B & PAB4AB & P5AB
P8ZAB & 2¤P5A¤2B
6¤P5A¤ & 20B
FMB3 is the label for the band data, representing its address in memory. WFB is the "Write Format Band" pseudo-instruction, used to describe how a line of data should be formatted for output to an IBM tabulator. The remainder of the text, continuing on to the next four lines, are the formatting strings themselves. The format band pseudo-instructions will be discussed more fully in Part 2 of this post, but for now accept that the strings between ampersands (&) describe the translation of one word in memory to columns on the tabulator, and the lozenge (¤) brackets a repeating word or group of words.
The assembler was reporting two errors for this pseudo-instruction (see [7] for that version of the assembly listing):
110 PLUS SIGN MISSING
110 WRONG NUMBER OF DIGITS
Note that the plus sign (+) and ampersand are the same character on the IBM 407 tabulator, as are the right parenthesis ()) and lozenge. Which glyphs print depends upon the type of print wheel that is installed in the tabulator.
The first error is indicating there is a missing ampersand in the format; the second is indicating that one of the format phrases between ampersands does not account for all 11 digits of a 205 word. Yet those lines are transcribed exactly as they appear in the Part II manual. I spent quite a few hours staring at those lines, reading the discussion of format band generation in Section VI of the Part I manual, and studying the code in Movement 2 that generates the format bands, trying to understand what was wrong.
Finally, in desperation, I decided to let the error message be my guide. Notice that there are no ampersands between phrases on different lines, so I added them at the start of the continued lines. That fixed the problem. Adding an ampersand at the end of a line being continued also works.
With the additional ampersands in place, assembly of Movement 1 still generated errors for format bands FMB6 (format band too long) and FMB7 (wrong number of columns). What's worse, I noticed that the code for these bands in the Part II manual did not match what you would expect from the phrases in the WFB operand coding. For example, the listing in the manual shows this for FMB7:
0226 1091 3 3333 33 3333 FMB7 RFB B15¤P5A¤ & P4Z3AB
0227 1092 3 3333 33 3333 P4Z6N19B
0228 1093 3 3333 33 3333
0229 1094 3 1313 13 3333
0230 1095 0 0000 31 3131
0231 1096 0 0000 00 1111
0232 1097 0 1111 11 1111
0233 1098 0 1111 11 1111
0234 1099 0 1111 11 1111
0235 1100 0 1111 11 1111
0236 1101 0 1111 11 1111
0237 1102 0 1111 11 1111
0238 1103 0 1111 11 1111
0239 1104 0 1111 11 1111
0240 1105 0 1111 11 1111
0241 1106 0 1111 11 1111
0242 1107 0 0000 31 3131
0243 1108 3 3333 33 0000
0244 1109 3 3333 33 3333
0245 1110 3 3333 33 3333
0246 1111 3 3333 33 3333
0247 1112 3 3333 33 3333
0248 1113 3 3333 33 3333
0249 1114 3 3333 33 3333
0250 1115 3 3333 33 3333
0251 1116 3 3333 33 3333
0252 1117 3 3333 33 3333
0253 1118 3 3333 33 3333
0254 1119 3 3333 33 3333
The assembler error is valid -- the phrases specify a format of one blank column, 15 groups of five alphanumeric columns, three alphanumeric columns, another blank column, six numeric columns, and finally 19 blank columns. That is a total of 105 columns, not the 80 required for a Read Format Band (card input) specification. What the generated code appears to represent is a format band specification like this:
FMB7 RFB P7Z3N & 10¤P5A¤ & P6Z2A
That specifies a layout of three numeric columns, 10 groups of five alphanumeric columns, two alphanumeric columns, 6 numeric columns, and finally 19 blank columns, for the correct total of 80 columns.
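The column arithmetic above is easy to check mechanically. The Python sketch below counts the columns a format band specification describes, under rules inferred only from the examples in this post: each A, N, or B stands for one column, P and Z stand for digits of the word that occupy no column, a leading number is a repeat count, and a lozenge pair brackets a repeated group. It is not how the assembler itself processes these phrases, and it assumes lozenge groups are not nested.

```python
import re

# Columns contributed by each phrase letter (inferred from the text above).
COLUMNS = {"A": 1, "N": 1, "B": 1, "P": 0, "Z": 0}

# An optional repeat count followed by a ¤...¤ group or a single letter.
TOKEN = re.compile(r"(\d*)(?:¤([^¤]*)¤|([PZANB]))")

def count_columns(spec):
    """Count card or print columns described by a format band specification."""
    text = re.sub(r"[&\s]", "", spec)   # ampersands and spacing add no columns
    total = 0
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unrecognized format text at {text[pos:]!r}")
        repeat = int(m.group(1)) if m.group(1) else 1
        if m.group(2) is not None:
            total += repeat * count_columns(m.group(2))   # repeated group
        else:
            total += repeat * COLUMNS[m.group(3)]         # single letter
        pos = m.end()
    return total

# The FMB7 phrases as printed in the manual.
print(count_columns("B15¤P5A¤ & P4Z3AB & P4Z6N19B"))   # prints 105
```

Applied to the five lines of the FMB3 specification quoted earlier, the same function gives 120 columns.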
There is an additional anomaly in the original listing for Movement 1: at sequence 1826 (page 53 in the Part II manual), the listing shows this:
1826 2689 0 0000 00 0000 ERR8 ALFS 0006 MEMORY OVERFLOW
1827 2690 0 4956 55 6200
1828 2691 0 5356 43 4163
1829 2692 0 8480 80 8000
1830 2693 0 6348 41 5500
1831 2694 0 5456 59 4500
ALFS is a pseudo-instruction that will encode up to six words (30 characters) of alphanumeric text into the numeric code used by the Cardatron. Alas, if you decode the generated words (they are stored in reverse order to match the way the Cardatron fetched data), they read "MORE THAN 4000 LOCATIONS," not "MEMORY OVERFLOW."
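The decoding can be reproduced with a short Python sketch. The code table here is an assumption extrapolated from this one message: the two-digit codes for the letters and digits that appear in it are consistent with 00 for a space, 41-49 for A-I, 51-59 for J-R, 62-69 for S-Z, and 80-89 for the digits 0-9. Punctuation and other characters are not covered.

```python
def build_code_table():
    """Build the assumed two-digit code table (space, letters, digits only)."""
    table = {"00": " "}
    for first, chars in ((41, "ABCDEFGHI"), (51, "JKLMNOPQR"),
                         (62, "STUVWXYZ"), (80, "0123456789")):
        for offset, ch in enumerate(chars):
            table[f"{first + offset:02d}"] = ch
    return table

CODES = build_code_table()

def decode_word(word):
    """Decode one word such as '0 5456 59 4500' into five characters."""
    digits = word.replace(" ", "")
    body = digits[1:]                      # drop the sign digit
    return "".join(CODES.get(body[i:i + 2], "?") for i in range(0, 10, 2))

def decode_alfs(words_in_listing_order):
    """Decode ALFS words, which are stored in reverse order in memory."""
    return "".join(decode_word(w) for w in reversed(words_in_listing_order))

ERR8 = [
    "0 0000 00 0000",
    "0 4956 55 6200",
    "0 5356 43 4163",
    "0 8480 80 8000",
    "0 6348 41 5500",
    "0 5456 59 4500",
]
print(decode_alfs(ERR8).rstrip())   # prints MORE THAN 4000 LOCATIONS
```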
There are three other anomalies in the listing for Movement 2. At sequence 0111 (page 68 in the Part II manual), the operation code and address are blank:
0111 1271 0 0000 02 6010 .R STC CLEAR A AND TEMP&0003
From both the comment and the generated code, however, it's clear that line should read:
0111 1271 0 0000 02 6010 .R STC TEMP &0003 CLEAR A AND TEMP&0003
At sequence 0390 (page 75 in the Part II manual), the operand address is invalidly formed (address increments must start in column 31 on the card, not column 30):
0390 1550 0 0000 02 6014 .B STC TEMP&0007 ZERO TEMP&0007
The line assembles correctly if the ampersand is shifted to the right by one position:
0390 1550 0 0000 02 6014 .B STC TEMP &0007 ZERO TEMP&0007
At sequence 0811 (page 87 in the Part II manual), the label field is ".1" but is never referenced. It should be ".A". I discovered this when the operand address assembled for the CNZ instruction two lines above it did not match the listing.
Having identified these problems and anomalies in the original listing, what should be done about them? I finally decided to do the following:
- The missing ampersands in the format band specifications may be the result of my old transcription nemesis on this project, suppression of leading zeroes by the 407 tabulator. "Leading zeroes" to the 407 are any characters that do not have a numeric punch in the IBM card code, so that includes the ampersand. In any case, the ampersands need to be there, thus I simply added them on the continuation lines of the format band specifications.
- The only explanation I can come up with for the problems with FMB6 and FMB7 is that someone cut-and-pasted the listing before reproducing it in the report. The manuals we have are republications by Burroughs of the original Shell Research report, so it is difficult to guess where -- or why -- this change was made. Perhaps a patch to the assembler was released, and only the object code side of the page was updated in the manual. The format band specification phrases are clearly wrong, however, and the generated format band data seems reasonable, so I changed the phrases in the operand field of the pseudo-instructions to match the generated band data.
- Similarly, the difference between the text and generated code for the ERR8 message is puzzling, but may have the same explanation as for the format bands above. Since the difference is not critical to the assembler's operation, and the error message as documented on page 11 of the Part I manual is "MEMORY OVERFLOW," I left the text alone and allowed the generated code to differ from the original listing.
- It's possible the three anomalies in Movement 2 were due to transient hardware errors -- I have seen other listings where the 407 clearly printed incorrect characters -- but it is clear in all three cases what the source text should have been, so I changed it to match the generated code.
- At sequence 0311, MTW (Magnetic Tape Write) follows CADA (Clear and Add Absolute). Magnetic tape instructions use the A register, which would thus destroy whatever CADA loaded into A.
- At sequence 0376, CAD (Clear and Add) follows CADA, which would thus destroy whatever was loaded by CADA.
- At sequence 0709, BA (Transfer B register to A) follows CAD, which would thus destroy whatever was loaded by CAD.
Using the Shell Assembler
The assembler was designed to be used with a 205 that had a Cardatron and magnetic tape equipment. The Cardatron required at least one card reader, card punch, and line printer. The assembler required three magnetic tape drives.
With all of this equipment in play, watching the assembly of a large program, especially one that invokes macros and subroutines from the library, can be quite entertaining. The emulator puts on a classic mainframe show of flashing lights, spinning tapes, and unit record equipment chunking through cards and print lines.
The 205 supported two types of magnetic tape drive, the Model 544 DataReader and Model 560 DataFile. The DataReader was a typical vertically-standing cabinet housing a reel-to-reel drive.
The DataFile was a semi-random access device that used fifty 250-foot strips of magnetic tape arranged in a series of parallel bins. It had a moving tape head assembly that could quickly select and read or write one of the strips at a time. The recording format for both DataReader and DataFile was the same -- tapes were 0.75 inches wide and supported two six-bit data lanes, only one of which could be read or written at a time. Thus, with 50 tape strips, the DataFile supported 100 lanes and a total of 100,000 blocks (10 million characters) of storage.
Data was recorded on the tape in fixed, 20-word blocks. These blocks were addressable, and the tape control unit could search for a block on a drive asynchronously while the processor was doing other work. The blocks could also be overwritten in place.
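The DataFile capacity figures follow from simple arithmetic, which the Python sketch below reproduces. The figure of five characters per word is taken from the ALFS discussion earlier in this post (six words holding 30 characters), and the variable names are purely illustrative.

```python
STRIPS = 50              # tape strips in one DataFile
LANES_PER_STRIP = 2      # data lanes on each strip
TOTAL_BLOCKS = 100_000   # blocks on one DataFile
WORDS_PER_BLOCK = 20     # fixed block size
CHARS_PER_WORD = 5       # from ALFS: six words hold 30 characters

lanes = STRIPS * LANES_PER_STRIP
blocks_per_lane = TOTAL_BLOCKS // lanes
total_chars = TOTAL_BLOCKS * WORDS_PER_BLOCK * CHARS_PER_WORD

print(lanes)             # prints 100
print(blocks_per_lane)   # prints 1000
print(total_chars)       # prints 10000000
```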
Between the search and block-overwrite capabilities, both the DataReader and DataFile could be used somewhat like a slow disk drive. In fact, the whole purpose of the DataFile design was simultaneously to increase capacity and reduce seek time, allowing it to be used as a true random-access device. Moving between tapes required 0.5 to 2.0 seconds; average search time along a tape strip was about 15.3 seconds, so average access time to an arbitrary block was about 16.3 seconds.
That is three orders of magnitude slower than today's disk drives, but about 1.5 orders of magnitude faster than reel-to-reel drives. Moreover, once the starting block was located, sequential access was much faster -- 46 milliseconds per block. In order to provide efficient processing for large data sets, ElectroData, Burroughs, and their customers developed a number of clever file-handling techniques to minimize seeks and maximize sequential access to these devices.
The retro-205 emulator does not yet implement the DataFile, but its development is planned, and for now the assembler works fine with the three DataReader tape drives that the standard emulator configuration supports.
The assembler, as configured in the listings in the Part II manual, assumes Movement 1 starts at block 120 on lane 89 of DataFile unit 0 (also designated as unit 10), with Movement 2 starting at block 290 on that lane. When using a DataReader unit, only the low-order bit of the lane number is significant, so that corresponds to blocks 120 and 270 on lane 1.
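The lane mapping can be expressed as a one-line Python helper. The function name is hypothetical, and the sketch assumes only what is stated above, namely that a DataReader looks at the low-order bit of the lane number and ignores the rest.

```python
def datareader_lane(datafile_lane):
    """Return the DataReader lane (0 or 1) selected by a DataFile lane number."""
    return datafile_lane & 1   # only the low-order bit is significant

print(datareader_lane(89))   # prints 1
```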
In addition to the load tape on unit 0, the assembler requires scratch tapes on units 1 and 2, which would normally be DataReaders. The assembler also requires the card reader as Cardatron input unit 1, the card punch as Cardatron output unit 2, and the line printer as Cardatron output unit 3. The unit numbers and lane/block locations of the movements can be changed by patching the assembler, as described in Section IX of the Part I manual.
Overview of Operation
The assembler is most easily loaded and executed using a one-card bootstrap program, as described in Section XI of the Part I manual. A card image for that program is available in the project repository at [5]. A loadable tape image for the assembler is also available in the repository at [6]. The "===" characters in the first three characters of the bootstrap card specify format-band 6, but also impose reload-lockout on the card reader. This allows the assembler to load and initialize its format bands before the first card of the source deck to be assembled is read. This in turn allows the source deck to follow immediately behind the bootstrap card in the reader. The code for the bootstrap is:
0000 4 0110 44 7000 CRD redirect card data to address 7000
7000 0 8900 42 0120 MTS search tape 0 for lane 89, block 120
7001 0 0000 28 7000 CC if tape busy, branch to 7000 and retry
7002 0 0000 40 0900 MTR read 100 blocks to address 900
7003 0 0000 28 7002 CC if tape busy, branch to 7002 and retry
7004 0 0000 30 0900 CUB block and branch from 900 to loop 7
7005 6 0000 20 7000 CU stop reading bootstrap; branch to 7000
The first and last instructions on this card, with the 4-bit of their sign digit set, are executed at the time the card is read and are not stored in memory. They are not part of the program the card loads into memory.
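The role of the sign digit can be illustrated with a small Python sketch that splits each bootstrap word into its printed digit groups and tests the 4-bit of the sign. The field names other than the sign and address are hypothetical labels for the groups as they appear in the listing, not official terminology.

```python
# (address, word) pairs copied from the bootstrap listing above.
BOOTSTRAP = [
    ("0000", "4 0110 44 7000"),
    ("7000", "0 8900 42 0120"),
    ("7001", "0 0000 28 7000"),
    ("7002", "0 0000 40 0900"),
    ("7003", "0 0000 28 7002"),
    ("7004", "0 0000 30 0900"),
    ("7005", "6 0000 20 7000"),
]

def split_word(word):
    """Split a printed word into (sign, high digits, order code, address)."""
    sign, high, order, address = word.split()
    return int(sign), high, order, address

def executed_on_read(word):
    """True if the 4-bit of the sign digit is set."""
    sign, _high, _order, _address = split_word(word)
    return bool(sign & 4)

for address, word in BOOTSTRAP:
    if executed_on_read(word):
        print(address, word)   # prints the lines for 0000 and 7005
```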
The bootstrap loads Movement 1 from the tape and branches to its entry point at address 0900. Movement 1 reads the source cards, partially assembles them, and writes a 20-word block of data for each card to the scratch tape on unit 1. Any errors detected in that pass are written to the line printer and the Flexowriter. If macros or subroutines are defined or called out from the library, you will see the tape on unit 0 move back and forth as the assembler seeks first to the catalog for the library, then to the lane table, and then to the position for the macro itself in the library.
By default, the library is located starting at block 0000 on lane 80 (lane 0 on a DataReader), but this can be changed by modifying the lane table at addresses 2880-2899 in Movement 1. Much more efficient operation can be obtained by locating the library near the same block address as the lane table (219) but on a different lane, or on a different tape drive altogether.
Once Movement 1 senses the END card in the source deck, it rewinds the tape on unit 1. Then it loads Movement 2 into memory from tape unit 0 and branches to its entry point at address 1200. The symbol table and other data developed by Movement 1 remain in memory at addresses 0000-1150 and 3000-3999. The second movement reads the partially-assembled instructions from tape unit 1, resolves addresses, generates Cardatron format bands, writes an updated block for each assembled word to the tape on unit 2, lists each assembled instruction to the printer, and outputs the assembled code, five words per card, to the card punch. Any errors detected during this movement are written to the Flexowriter.
At the end of Movement 2, the assembler outputs a checksum for the assembled program and the word "LOAD" on the Flexowriter. It then attempts to read one digit from the Console input device (normally the Console numeric keypad).
- If you enter a digit 1-9 at the Console, the assembler rewinds unit 0 and reads 15 blocks (300 words) from lane 89, block 0 to address 3700. From the comments in the assembler listing, this was intended to load an operating system named "SLIM" and transfer control to it. We have no further information on SLIM, which was probably something developed and used locally by Shell Research.
- If you enter a zero digit, the assembler will clear main memory and from tape unit 2 load the program just assembled. It will then rewind unit 0 and read 15 blocks as described above.
To run the assembled program from the deck punched by the second movement, there is a loader program detailed in Section XI of the Part I manual for the five-per-card format punched by the assembler.
Detailed Operating Instructions
To run the assembler in the retro-205 emulator, perform the following steps. See [12] for detailed instructions on operating the Cardatron and card equipment and [13] for instructions on operating the tape drives.
- On the Supervisory Panel, make sure the LOCK/NORMAL and CONTINUOUS/STEP switches are both in the down position.
- Load magnetic tape unit 10 (same as unit 0) with the load tape image for the assembler. You can use the tape image in [6]. Place the unit in REMOTE status.
- Load magnetic tape units 1 and 2 each with blank tapes and place them in REMOTE status.
- On the Control Console, make sure the OUTPUT switch is set to the PAGE (Flexowriter) position.
- On the Control Console, make sure the INPUT switch is set to the KEYBOARD position.
- On the Cardatron Control Unit, click the GENERAL CLEAR and then INPUT SETUP buttons in that order. The ORDER field of the C register will show 44 for a card-read instruction.
- On card reader unit 1:
- Make sure FORMAT SELECT is "By Col" and FORMAT COL is set to 1.
- Load the bootstrap card image into the reader. You can use the image in [5].
- Load the card deck for the program to be assembled after the bootstrap card.
- Click the reader's START button. The orange lights will flash and stop with RLO, FLO, FS4 and FS2 lit, indicating Reload Lockout, Format Lockout, and format band 6 selected.
- Halt the computer by pressing STOP on the Control Console.
- Key the digit 0 on your keyboard to run the program just assembled. Make sure the Control Console window has the focus when you do this, otherwise the keystroke will not be recognized. The program will be loaded from magnetic tape unit 2. At the end, unit 2 will rewind and the Flexowriter will print the highest and lowest addresses loaded into memory. Tape unit 0 will then seek to lane 89, block 0 and read 15 blocks into address 3700 (presumably to load the SLIM operating system as discussed above). Finally, the program will attempt to read a card into address 6000.
- Key any digit 1-9 on your keyboard. Likewise make sure the Control Console window has the focus first. This will skip loading the program just assembled and simply cause tape drive 0 to seek to block 0, read 15 blocks to address 3700, then attempt to read a card, as above.
[1] "Knuth's EASY Assembler":
[2] "Knuth's Algol-58 Compiler":
[3] Shell Symbolic Assembly Program for the Burroughs 205, Part I, Burroughs Corporation, Bulletin 3038, March 1960:
[4] Shell Symbolic Assembly Program for the Burroughs 205, Part II, Burroughs Corporation, Bulletin 3038, March 1960:
[5] One-card bootstrap program to load and execute the Shell Assembler:
[6] Loadable tape image of the Shell Assembler for the retro-205 emulator:
[7] Movement 1 assembly listing from 2015-10-27, showing format band errors and other anomalies:
[8] Latest commit of Shell Assembler files to the retro-205 repository:
[9] Utility script to extract source and object code from the assembler transcription files:
[10] Utility script to convert the output tapes from assembling both movements of the assembler into a loadable tape:
[11] Assembly listing of Movement 2 of the assembler:
[12] "Using the retro-205 Cardatron":
[13] "Magnetic Tape for the retro-205 Emulator":