The Burroughs 205 and 220 Blog

Wednesday, November 30, 2022

Recovering the WRITE subroutine for the Burroughs 205 Algebraic Compiler

(by Tom Sawyer)

WRITE - A Slow Output Method

WRITE was the "Achilles heel" of the ALGOL compiler for the Burroughs 205. While the compiler produced reasonably efficient code, and results from any scientific calculation became available quickly, getting them out onto printed paper was a lengthy process. I had forgotten just how maddeningly slow the output process was until I began testing the recovered WRITE program. The blinking lights in the B-register as a line of print was formed brought back memories of waiting for each line to show up on our IBM 407. Five to ten seconds between lines was typical, but longer lines could easily take 30 seconds.

How It Works

Let's review what WRITE does. The format of a WRITE statement is:

WRITE(;;OUT1,FORM1);

This references a list of variables intended to be printed, such as:
OUTPUT OUT1(I,J,K,X,Y,Z);

and a format description such as:
FORMAT FORM1(*THE ANSWERS ARE*,3I8,3X8.5,W0);

Together, these three statements will deliver a single line containing "THE ANSWERS ARE" text, three integers and three decimal-embedded floating-point numbers each in an 8 character field, preceded by a single line slew on the printer.

One might expect that the FORMAT statement would be used to generate a CARDATRON format band taking advantage of the ability of a Burroughs 205 to move full words over to a CARDATRON cabinet and let that system insert spaces and decimal points, suppress leading zeroes and so on. (See Paul Kimpel's CARDATRON posts here.)

Instead, each line of print is formed in an output buffer and all of the editing is done on the 205. When complete, all 120 characters are sent as a long alphanumeric string to the CARDATRON. To describe this as slow is an understatement. The WRITE process yields a net speed of about 3 to 4 characters per second. Since the Flexowriter could deliver 10 characters per second and the CARDATRON up to 300 characters per second from machine language or assembly language code, the wait for output was agonizing.

The WRITE subroutine is completely driven by the FORMAT statement which is simply stored in-line in the program and is interpreted at run time by the WRITE subroutine. Each time that WRITE completes the interpretation of a numeric field in the FORMAT, a connection is made to the OUTPUT statement and the next variable is retrieved to place in the line at the current position.

Reconstructing WRITE From the Author's Notes

The biggest job I faced in restoring Don Knuth's Algebraic Compiler for the Burroughs 205 was reconstructing this WRITE subroutine. Anticipating a difficult job, I put it off until I had finished the first three arithmetic subroutines, SQRT, SIN and COS.

Fortunately by that time, I had the guidance of Knuth's original coding sheets! By the time I began to work on WRITE, Paul Kimpel had received the package from professor Knuth (described in my Sine post) that included early coding for the compiler and several subroutines.

There were two full pages devoted to WRITE.

I studied the code carefully before beginning to transcribe it into a source deck for Paul Kimpel's copy of the EASY assembler - making a few notes along the way.

Some Clever Code

Most of the code is straightforward, but it is interesting to look at a bit of it to gain insight into the programming style of this early era. Let's take a closer look at the upper left block of code on the sheet above.

The purpose of this snippet is to take a single alphanumeric character contained in the left two digits of the A-register and place it into the current position of an 80 position buffer that contains the line of print being generated. Simple enough in any machine that can address individual characters - but not so simple on our Datatron!

This block of code is written as a subroutine and entered with the B-register holding the return address.

The first thing to notice is that the code module consists of exactly twenty words. Coding a module to fit within a single one of the Burrough's 205 20 word high-speed memory loops was key to optimizing a program.

ZOUT is the variable containing a pointer to the current position in the buffer where the character in the A-register is to be placed. ZOUT is conveniently stored in main memory within this code block so that it is immediately available to us in high-speed memory.

The essential function of the code is to store that character in the right place and some calculations must be done. That requires use of the A-register so the routine must save the current contents (our character to be inserted) someplace and instruction 2 of this block stores the A-reg over the top of instruction 1. Using the 7000-loop itself for temporary storage is a mark of experience. (No harm was done to the content of any location in main memory by this trick.)

The final steps of the code block appear to be three standard Burroughs 205 instructions:

15 0000
72 BUFFR+5016
72 0000

15 is a Normalize (shift left to eliminate leading zeroes) and 72 is a Set B-register.

But, these are not what they appear to be. When this code is executed with ZOUT holding a value of 12, for instance, These three words will have been transformed into:

14 9976
74 0013+BUFFR
02 0013+BUFFR

which will be executed as SHIFT LEFT 16 positions, ADD, and STORE instructions placing the character in the appropriate position of the third word from the left edge of the line being developed at address BUFFR. One need not be a Burroughs 205 expert to appreciate the elegance of this code snippet. Once again, no harm came to any instructions in main memory. Modifying instructions on the fly would not be frowned upon for almost another decade.

The snippet also shows the early stage at which this sheet was prepared since it referenced a 16 word (80 character) buffer and would later actually require 24 words; it also contains a small error as the second SR (shift right) instruction should read SR 10, not SR 8.

Eventually, I found and fixed all of the "small errors" and had a beautifully functioning WRITE routine.

Dealing With the WRITE Speed.

To deal with the slow speed of WRITE, the only alternatives are to use the existing interface to code a simplified output routine, for instance printing a word at a time on the Flexowriter. Knuth discusses this approach with the director of the University of Virginia's computer center in his correspondence, noting that Purdue had already done so.

Another approach would be to skip direct printing and generate output to magnetic tape using the MTWRITE subroutine. Printing can then be accomplished with a separate customized assembler program that processes the output tape.

Sunday, November 6, 2022

Sines and Cosines on the Datatrons - Part II

Recovering the SIN() and COS() Functions

Once again, I had a chance to review the BALGOL 220 implementation of a function versus Don Knuth's implementation of the same function for Clevite Corporation in Cleveland. This time I was hoping to find a 30 word version of the SIN() function.

I first checked out the SIN() implementation in the BALGOL compiler for the Burroughs 220. Uh-oh, - 56 words versus the 31 words on the 205 that calculate both SIN() and COS().

However, the polynomial calculation of Sine involving powers of x only involves 5 multiplications and x raised to the 9th power in the BALGOL version versus 9 multiplications and x raised to the 17th power in the Clevite version.

As with the square root subroutine discussed previously, it is worth examining the two approaches more closely. First though, a bit of theory.

You might recall from calculus class that there is an infinite series that will approximate the Sine of x. This is called a Taylor series (and in this specialized case, a MacLaurin series) and looks like this:

That looks like a series that ought to converge pretty quickly - as long as x is less than 1 (or at least not much bigger than 1) - after all, 11! is about equal to 40 million.

Now, let's think about what value x is going to be in our calculation. First, x will need to be an angle measured in radians, not degrees. That is, a full 360 degrees will be represented as 2.0 Pi radians, or about 6.28 A value of 6 in that series may not lead to quick convergence. And our x could be any value - even in the hundreds, thousands or more.

Looking at a graph, however, the cyclical nature of the sine function suggests that just covering 0 to 360 degrees will cover all possibilities.

Noting that values from 180 to 360 are just the negative of 0 to 180 suggests we could cut that interval in half. And since the upside and downside of that smaller interval are symmetrical, we can maybe get by with just calculating values for 0 to 90 degrees. That is good news: 90 degrees is just Pi/2 in radians or around 1.57 It turns out that for values between 0 and 90 degrees, seven terms of this series will give us eight digit accuracy. This looks like a plan!

Let's use abbreviated versions of this series to calculate the sine of 90 degrees.

The BALGOL SIN() subroutine for the Burroughs 220 first uses a scaling technique to reduce the original input value to a result in the interval -90 to 90 degrees. It does this with the following techniques:

Multiply the input by 1/Pi - effectively changing the radian input into "half cycles." For instance, 126.44910 radians (this is 20 x 2Pi +Pi/4) becomes 40.249999. We throw away the digits before the decimal point and are left with 0.249999 That is one quarter of a half-cycle or 45 degrees.
Now, double the value. The net effect to this point is to have multiplied by 2/Pi which converts our original input value in radians to a fraction of a quarter cycle. If that fraction is greater than 1, we subtract 2 from it. This always yields a decimal fraction between -1 and 1 corresponding to an angle between -90 and 90 degrees as input to the next step.
Use a Power series of 5 terms to calculate a Sine value. 0.70710678 in this case.
Specifically:
SIN(x) = Ax+Bx^3+Cx^5+Dx^7+Ex^9
where
A= 6.2831849
B= -41.341677
C= 81.604783
D= -76.701934
E= 42.040797
Adjust the results if the original input was a value that required the result to be negative.
Return to the calling program.

That seems straightforward. And it is easy enough to check out those coefficients versus the ones in the Maclaurin series. We just factor out the 2/Pi raised to the various powers.

Well, that's interesting. The BALGOL coefficients are similar to the "adjusted" MacLaurin coefficients but differ by increasing amounts in the higher powers. Perhaps this is not a MacLaurin Series. We had better explore a bit more. (In fact, while the five-term MacLaurin series works quite well up to 45 degrees, it breaks down badly as we approach 90 degrees. We would need at least six terms to get 7 digit accuracy.)

Enter Cecil J. Hastings, Jr., The Great Approximator.

As was the case with the Square Root function, it helps to go back to the late 1940s and browse through some of the literature from the IBM Computation Seminar held in Endicott, NY in December of 1949. This was the first of the conferences organized by Cuthbert Hurd after joining IBM. IBM Customers were still in the early stages of learning to compute with the IBM 604 electronic calculating punch and the Card Programmed Calculator (CPC).

Cecil Hastings, Jr. of RAND Corporation contributed a paper on the approximation of various functions with calculations to replace tables on cards. Hastings, over the next few years, would develop a library of approximations that he distributed on loose-leaf sheets to interested subscribers. These, in effect, became the subroutine library for the 604 and CPC.

Willis Ware, in his memoir of the early RAND years notes that Bob Bemer (Mr. ASCII) actually drew some of the error curves for Hasting's charts.

In 1954, RAND published the approximations, with detailed error curves in the form of a book titled Approximations for Digital Computers. Besides the approximations themselves, it included a one hundred and twenty page exposition of approximation theory and practice that had originally been given as a presentation to the Los Angeles Chapter of the Digital Computers Association in 1953.

Hastings was dubbed "The Great Approximator" by Herb Grosch. Lacking a PhD, he never gained the stature of other early Numerical Analysts but his work is embedded in much of the computing of the 1950s and 60s as shown here.

Well, there in chart 16 of Hasting's approximations, we have the formula, the coefficients and the error curve for the BALGOL Sine function:

For anyone interested in pursuing the math in depth, I recommend Hasting's book, or any treatise on Chebyshev polynomial approximation. If you want just a little more information about why this form of approximation is important, you can judge from this chart found at Math Stack Exchange.

Back to the 205 version of ALGOL 58

The Clevite SIN() subroutine from Don Knuth takes a different approach from the BALGOL scheme. Rather than modifying the input angle to fit within a -90 to plus 90 degree range, Knuth found a power series that would converge adequately over a -360 to +360 degree range. This requires a (probably) Chebyshev polynomial with nine terms. The Clevite routine was written expecting the input to be in degrees and takes the following steps:

Multiply by 1/360 - effectively, effectively changing the input into "full cycles." For instance, 7245 degrees (this is 20 x 360 +45) becomes 20.125 We throw away the digits before the decimal point and are left with 0.125 That is one eighth of a full cycle or 45 degrees. No further adjustments are required.
Use a power series of 9 terms to calculate a Sine value. 00.70710678 in this case.
Return to the calling program,

How well does it work? I drew my own error curve and compared it to the BALGOL results. The manual only promises seven digits of accuracy and the routine comes close:

Using a MacLaurin series of equal power would also work very well out to 180 degrees but then definitely goes "off the chart." Where did the coefficients come from for the 205 version? I don't have an answer for that. But it is worth noting that the common method for finding a Sine or Cosine in the 1950s and 1960s, was to get out the Chemical Rubber Company's book of Standard Mathematical Tables and look them up.

This handy publication was produced in Cleveland with the assistance of considerable labor obtained at Case Institute. A commenter at the John Cook Blog left this description of how the labor was obtained:

R E Van Valkenburgh

29 November 2014 at 18:11

This also brings up the question of how the manual labor for number crunching has been acquired.

For those of you who remember the huge, “Handbook of Chemistry and Physics” (still?) published by the former Chemical Rubber Company (now CRC Press)… my dad used to tell us how the students at the then Case Institute of Technology (not Case-Western Reserve University) were conscripted to calculate the large number of large tables. Hey, it never hurt your grade if your professor asked you to volunteer.

Not only is it remarkable how some of these things were calculated; but even more remarkable how accurate they often were. (Case & the CRC had multiple independent calculations to help identify errors, according to dad.)

So there would have been plenty of expertise around the Case campus to support Sine approximations. Still, it might have been better to limit the input to range to just 180 degrees and adjust accordingly.

Of some interest is what happens as the initial input angle grows - for example, into the thousands or more, and we discard the first several digits - what happens to our accuracy? We could lose signifiance. Lop off the leading three or four digits of the angle, and we are left with only five or four digits as our input. The BALGOL manual addresses this - but only when we have an angle greater than ten million and lose all significance. The manual and the compiler note an error message: RESULT ILL-DEFINED FOR SIN On the 205, you are on your own to be aware of the problem.

Calculation of the COSINE

The Cosine function is a simple addition to the subroutine library. Since the Cosine function is equal to the Sine function offset by 90 degrees, this subroutine consists of a single addition instruction placed immediately before the Sine subroutine adding 90 degrees to the submitted angle.

Addendum for the Modern Era

Accuracy is not the only concern when calculating Sines and Cosines on a computer. Speed is also an issue. At about the same time as our two Datatron compilers were being written, a new method for calculating these functions was being developed within Convair Corporation's group working on a navigation computer for the B-58. The method was not published for a while but apparently became very popular among calculators and computers that did not have a hardware multiplication capability. There was a time when "multiplications per second" was a key performance measure for computers. The new method is known as CORDIC, involves no multiplications and is worth reading about either in Wikipedia or a serious paper if you have a math and hardware inclination. The always informative Ken Sherriff Blog has a post on CORDIC as found in the 8087 math coprocessor. He includes photos of the coefficients hard coded into the 8087.

Is speed a concern on either the BALGOL or Clevite implementation of SIN(x)? I don't think so; the Clevite routine runs faster than the square root subroutine on the 205. I will come back to the question, however, in a later blog post about benchmarking these old machines.

Additional Notes and Resources:

An exhaustive set of coefficients of polynomials for approximating SIN(x) carried out to 36 digits can be found under the title Fast Minimax Polynomial Approximations of Sine and Cosine; it can be found here. See the Summary in Appendix B for development details. This may be the best source of approximating polynomials on the Internet but I had no luck mapping them back to Knuth's set - probably because they were optimized on the interval (0, 90) rather than on (0, 360).

The original paper published in this area appears to be the Remez Exchange Algorithm published (in French) in 1934.

Numerical Methods for Scientists and Engineers, first published in 1962 by R. W. Hamming, is likely the definitive work on Approximation Theory from the early years. Hamming, then at Bell Labs, was also a presenter at the 1949 seminar in Endicott.

Modern treatment of the subject can be found in Approximation Theory and Approximation Practice by Lloyd N. Trefethen (SIAM, 2013)

Tuesday, October 18, 2022

Sines and Cosines on the Datatrons - Part I

The reconstruction of SIN() and COS() for the Burroughs Algebraic Compiler on the Burroughs 205 was much easier once the SQRT() effort was complete and working. So this might be a good place to talk about the structure of the subroutines and, more importantly, the invaluable assistance that Paul was able to locate for us.

I'll begin by noting that Paul Kimpel came into possession of copies of professor Don Knuth's initial "rough draft" source code for several of the Burroughs 205 subroutines back in 2015.

Paul now picks up the story of

The Package from the Professor

From Paul:

During 2014-15 I built a software emulator for the ElectroData/Burroughs 205. A major goal of that project was to recover the ALGOL-58 compiler that Professor Knuth wrote for that machine during 1960-61, a PDF scan of a listing for which was among the papers that he had donated to the Computer History Museum in 2005. I transcribed that listing, and after also recovering the EASY and MEASY assemblers that were required, eventually got the compiler to run. Alas, the listing had included only the compiler itself and not its run-time library, so while the compiler appeared to generate good code, there was no way for it to finish and produce an executable program.

Nonetheless, I was quite satisfied with the results of my effort, and wrote Professor Knuth in mid-November 2015, describing what I had done. He replied by email shortly thereafter, expressing enthusiasm for what I had accomplished, and kindly offered to send me some other items from his files. One of those was a copy of a listing of a non-trivial program he had written in 1960 for use by the Clevite Corporation of Cleveland, Ohio, where Professor Knuth was enrolled in undergraduate studies at what is now Case Western Reserve University. Clevite made bearings for automobile engines, and the program appears to be an analysis of forces on such bearings. Since the parts of a reciprocating engine articulate and rotate during operation, such a program would need to use a variety of math functions, particularly trigonometric functions.

I transcribed that listing and got the program to assemble using the EASY assembler, and started to puzzle out the format and units of the input data it would require, but that proved to be quite difficult. I set the program aside at some point and eventually forgot about it.

In July 2020, Tom Sawyer wrote me with a progress report on his effort to reconstruct the compiler's run-time library and raised an issue about the precision of the square root routine he had been working on. In the middle of replying to Tom, I suddenly remembered the Clevite program and its math routines. I suggested to Tom that he try extracting those Clevite routines for use in the ALGOL-58 compiler's library, which he subsequently did.

Once Paul jogged my memory, I immediately took a look at the Clevite program and discovered a set of math subroutines including SQRT, SIN, COS, ARCTAN, EXP and LOG. This was wonderful; genuine "Knuth code" and they appeared to be about the right length for my existing 205 tape (see the final section of this post).

I implemented the SQRT, SIN and COS code from the Clevite program and was ready to move on to WRITE and READ which I knew were going to be difficult. WRITE was around 300 words of rather complicated 205 code. I thought that I would start with READ since there must be come similar code that I could borrow within the already working ALGOL compiler. After all, it handled the processing of statements like "PI = 3.1415927$"

I started browsing the 80 page listing that professor Knuth had sent to Paul and was amazed to find two pages of EASY/MEASY assembler code devoted to the WRITE subroutine! Just prior to this in the listing was source code for both SQRT and SIN. What a wonderful gift to our project.

Here is an illustration of the code from both listings. On the left is the original "rough draaft" source code for the sine calculation in the Burroughs 205 ALGOL-58 subroutines and on the right, the code in the Clevite program. They are identical, except for two constants highlighted in blue.

In addition, the coefficients used in the calculation are omitted in the compiler listing but are present in the Clevite code. We are fortunate to have both. The Clevite listing even displays the coefficients twice, helping with a legibility problem.

The only difference between the two sets of code is that the Clevite subroutine expects the input to be expressed in degrees, and the ALGOL 58 program expects the input to be expressed in radians. Interestingly both routines, as their first step, change the input value into fractions of a circle (or fractions of a sine cycle, if you prefer.)

For a 45 degree input, the Clevite program multiplies by 1/360 = .0027777778 to get 0.125

The ALGOL58 program multiplies by 1/(2Pi) = 0.15915494 to also get 0.125

This step has some very convenient benefits. For example, if the input were to be 7245 degrees (this is 20 x 360 +45) the result becomes 20.125 By discarding the integer and only keeping the fraction, we have an input that has been scaled to a number between 0 and 1 representing a fraction of a circle. The next 3 instructions do exactly that. Let's call that number x.

The programs then proceed to evaluate the following polynomial in x:

SIN(x) = Ax+Bx^3+Cx^5+Dx^7+Ex^9+Fx^11+Gx^13+Hx^15+Ix^17
where
A= 6.2831849
B= -41.341677
C= 81.604783
D= -76.701934
E= 42.040797
F= -15.046596
G= 3.7414651
H= -0.64069755
I= 0.060675276

This polynomial results in quite accurate results at any point in the interval 0,1 (approximately 7 digits of accuracy.) You can plug them into your favorite spreadsheet and check the results.

Note that the cosine subroutine simply consists of an alternate entry point one word previous to the sine entry point. A single instruction there adds 90 degrees to the angle for the Clevite routine or Pi/4 radians to the angle in the ALGOL-58 language routine. This takes advantage of the well-known relationship

COS(x)=SIN(x + 90 degrees)

This is as about as much that might be of interest to a programmer looking at the recovery of SIN(x) and COS(x) on the 205. But what about the mathematics involved? Where do those polynomial coefficients come from? How do other people calculate sines and cosines. And, how does this implementation compare to the one in BALGOL on the Datatron 220? I will take that up in a separate post, Part II.

Meanwhile, lets look at the actual layout of the compiler and subroutines on the magnetic tape that was distributed as Burroughs Algebraic Compiler back in 1960.

The Layout of the Compiler Tape

The full compiler occupies 276 blocks on a 10,000 block lane of a 205 tape. That is 276 blocks times 20 words per block x 11 digits per word equals 60,720 digits. If you count each word as roughly 5 bytes plus sign, it comes to 27,600 bytes. Stuffing a pretty full implementation of ALGOL 58 into that compact an area was quite an accomplishment.

Here is breakdown of the compiler segments on the tape:

Loader: This is a single block at the start which loads the next 200 blocks into memory and runs a checksum to see if everything is good to this point. This takes two tape read commands since 100 blocks is the maximum to be read by a single Burroughs 205 command. The tape reads require about ten seconds. The loader then performs a checksum on the 4000 words that were loaded which takes an additional 35 seconds. (A very useful hack when you are doing multiple compiles is to skip the checksum! There might have been some hazard to this back in the 1960s but is pretty low risk today. Even back then, you were much more likely to simply fail to read the tape than read it incorrectly.)

The Compiler: The next two hundred blocks on the tape, shown in red, comprise the 4000 words that are the actual compiler. You might notice a fair amount of empty space in the last quarter of that area. That is where tables are going to be stored. This code will read all of the source program and generate object code from the first card through your FINISH$ statement. Since all of this work is done in a single pass, address resolution is left as a project for the object code loader later. Some syntax checking takes place during compilation. If you misspell a reserved word, it might be immediately caught - for example, GOTO instead of GO TO or a second OUTPUT statement with the same name as a previous one. Other egregious errors like including a semicolon in front of an END will be caught on the spot and whenever errors are caught, compilation simply stops. It can take multiple tries to get to the end of the program the first time and careful desk-checking is advised!

The Compiler Overlay: Once the compiler has processed a program and detected a FINISH$ statement, the next 22 blocks, shown in orange, are read in overlaying 440 words beginning at location 2200. A check is immediately run to see if the sum of four counters is correct. If your program passes this test, you are on your way! This overlay will then load format bands for the CARDATRON reader, printer and punch onto the object program tape and finally, load any necessary subroutines from the following segments. If your program fails that first test, however, the compiler reports the dreaded ".1001" on the Flexowriter. This means "Unmatched left parenthesis or missing right operand. Probably goes back far into the program where the error actually occurred."

The Subroutine Segments: There were initially six subroutine segments on the compiler tape:

Floating to Fixed and Fixed to Floating. Calls to these subroutines are automatically generated when the compiler sees a statement like "X = 1$" and X has not been declared as an INTEGER.
LOG(X) and EXP(X)
SQRT(X), COS(X), SIN(X) and ARCTAN(X)
WRITE($$OUT,FORM)
READ($$IN)
PTREAD($$IN), PTWRITE($$OUT)

As can be seen in the picture above, my tape dump ran out of data a bit more than halfway through the LOG(X) and EXP(X) segment.

A seventh segment was added to the compiler tape later at the end of 1961 by Burroughs to incorporate magnetic tape access using the MTSEARCH(), MTREAD() and MTWRITE() calls. We have the good fortune to have the actual subroutines and the instructions for adding theses three new subroutines to the compiler because of the initiative of Paul Kimpel reaching out to Professor Knuth. This proved invaluable for dealing with the missing READ and WRITE routines. They also serve as an interesting template for anyone thinking about adding some new input or output subroutines.

Subroutine Lengths and Entry Points Overview

There are three tables within the compiler that relate information about the subroutine lengths. One specifies the length of the tape segments in blocks, a second notes how many words will be loaded from each segment and a third defines the entry point of a specific subroutine relative to the segment. For the SIN() function, it resides on a 7 block segment; 110 words will be loaded from that segment; and

SQRT()        begins at word 0
COS()        begins at word 26
SIN()           begins at word 27
ARCTAN() begins at word 57

all within the 110 word segment. This also tells us that the COS/SIN code should be 31 words in length, extending from word 26 through 56. It was very welcome news to discover that this was the length of the code found in the Knuth papers.