Welcome to the Datatron 205 and 220 Blog

This blog is a companion to T J Sawyer's Web Page that outlines the history of the Burroughs Datatron 205 and Paul Kimpel's incredible 205 and 220 emulators. Please visit those sites and return here to post any comments.

Wednesday, August 8, 2018

BALGOL – The Burroughs Algebraic Compiler

The Burroughs Algebraic Compiler for the 220 computer system was an early dialect of the Algol-58 language. Within Burroughs, it was styled as BAC-220, but was more popularly known as Burroughs Algol, or BALGOL.

In the prior post [1], I briefly summarized the movement in the 1950s toward the development of higher-level programming languages, the events leading up to the development of Algol, and an overview of the preliminary version of Algol, now known as Algol-58.

This post describes the origin of the Burroughs compiler, compares the BALGOL language features to those of the Algol-58 Preliminary Report [2], and briefly discusses compiler performance and influence.

Origin of BALGOL


In 1954, Robert S. ("Bob") Barton took a job at Shell Development Company Research in Houston, Texas, working on programming applications. By 1957, he had a talented group of young programmers working with him, including Joel Erdwinn, Clark Oliphint, C. M. Pearcy, and (as a summer employee) David Dahm.

This young gang of four wrote quite an advanced assembler for the ElectroData/Burroughs 205. The assembler was tape-based, but included a macro facility (including maintenance of a macro library on tape), special support for addressing the 205's high-speed memory loops, warnings for invalid adjacent instruction parings, and most usefully, a pseudo-instruction that would take COBOL PICTURE-like strings and generate the obtuse 319-digit patterns that are needed for Cardatron format bands.This became known as the Shell Assembler [15].

In 1959, Barton left Shell to work for the ElectroData division of Burroughs in Pasadena, California. Erdwinn, Oliphint, and Dahm (still as a summer employee) followed him to Pasadena. They became known as the "arthropods," because, well, they came from Shell. Not long after, Barton became the head of the so-called Automatic Programming Group in Pasadena, which was developing compilers and other programming aids.

By this time, the Burroughs 220 had been available to customers for about a year. While the 220 was a decimal, vacuum-tube machine, it had good (if somewhat slow) facilities for scientific and numerical computation. According to a memorandum Tom Sawyer has uncovered in the Burroughs archives at CBI, dated 12 November 1958, from R. A. Mallet of the Scientific Compiler Project, Burroughs had been working on a FORTRAN compiler with the goal that "the Burroughs 220 would be able to use any program written, in FORTRAN, for the IBM 650, 704, 709, and 7070..." The memo goes on to request a change in the way that arithmetic overflow was originally handled by the 220.

When Barton became head of Automatic Programming, he apparently assumed responsibility for this project, and according him in the 1985 B5000 Conference oral history [3, page 50], "It's correct to a certain extent in that the job that I had taken, under generally misleading conditions, called for doing an impossible FORTRAN which would also include conversion of assembly language from the 7090, or whatever the machine was at the time, automatically. I knew it couldn't be done, but that was my responsibility."

Somewhere in this time frame, the focus for the "scientific compiler" shifted from FORTRAN to Algol-58, the Preliminary Report [2] for which had been published in December 1958. It is not clear whether this happened at Barton's instigation or the young team took this direction on their own, but Dahm tells this amusing – and telling – anecdote from the B5000 Conference [3, page 27]:
I remember when I was a summer employee in the summer of 1959, I was working on the [BALGOL] compiler with another fellow named Joel Erdwin [sic], who was the project leader. And we were busily trying to do our version of IAL, and one day Bob Barton came along and he had a FORTRAN manual in his hand. It was a really nice, well-done FORTRAN manual done by IBM. He said, "Why don't you guys do this language because then you wouldn't have to write a manual?" We rejected that as being not a very good reason for picking a language. [Laughter] So, basically, I would say that the decision, that the compiler we would do would be ALGOL as opposed to FORTRAN was made by a summer employee and a project leader. I don't know that anyone else was really involved in making that decision.
One further clue as to the shift in focus is Barton's statement immediately following his complaint over the "impossible FORTRAN" [3, page 50]:
Erdwin [sic] would never have done a FORTRAN. I mean, he'd been going through this kind of educational experience at Shell, and he was not the sort of guy that would waste his time doing FORTRAN. He knew too much about language.
So it appears the decision to do an Algol-58 compiler was Erdwinn's, but probably with Barton's active approval and support. Barton gives Erdwinn full credit for the design of the BALGOL compiler, saying it was "a brilliant job of programming" and "Erdwin's [sic] masterpiece" [3, page 49].

Over the short term, it would probably have been better for Burroughs to have gone forward with a FORTRAN compiler, albeit one with more realistic goals. Over the longer term, though, the choice to go with Algol-58, plus the experience gained from building BALGOL and having customers use it, helped set Burroughs on the path to the B5000/5500, B6500/6700/7700, and later related systems that continue to be produced and sold today as Unisys ClearPath MCP systems.

According to the BALGOL reference manual [4], the compiler was completed and first installed for customer use in March 1960. We have two scans of listings of the compiler, one dated 1 May 1961 [5], citing in the heading [Joel] Erdwinn, [Jack] Merner, [Fran] Crowder, [Joe] Speroni, and [Donald] Knuth as authors. The second listing, dated 1 February 1962 [6], adds the names [David] Dahm, [Clark] Oliphint, Logemann, and [Toni] Schuman.

Erdwinn left Burroughs shortly after the BALGOL compiler was finished to join fast-growing Computer Sciences Corporation (CSC, now DXC Technologies). He spent the bulk of the next two decades building compilers and system software for CSC's customers, particularly compilers for none other than... FORTRAN.

Merner, Oliphint, and Dahm continued with Burroughs and, with Barton, had major roles in the development of the B5000/5500. Dahm is also known for his work on the B5500 Timesharing MCP and CANDE, and on the DMSII data base management system for the B6700/7700.

BALGOL vs. Algol-58        


BALGOL for the Burroughs 220 is one of several early implementations of the Algol-58 language described in the 1958 Preliminary Report [2]. Other well-known implementations are System Development Corporation's JOVIAL, the U.S. Navy's NELIAC, and the University of Michigan's MAD. The Preliminary Report was a progress report on the current state of a committee's attempt to develop a universal algorithmic language, not a fully-vetted specification. As such, and given the limited capabilities of computer systems at the time, all implementations based on the Report deviated from it, often significantly.

Of the Algol-58 variants, BALGOL is arguably the one that most closely followed the syntax and semantics put forth in the Preliminary Report. The BALGOL compiler is a single-pass design, intended for fast compilation at the expense of highly-optimized code, and oriented to a compile-and-go job shop environment.

The single-pass design imposes some restrictions on the dialect of Algol-58 BALGOL can support, and on the ordering of some language elements in the program text, as is discussed in the following sections. It drops some features of Algol-58, and implements others in slightly different ways, but adds little except for pragmatic facilities such as FORTRAN-like input/output, an open subroutine, program segmentation and overlay facilities, some diagnostic facilities, and the ability to incorporate machine-language routines into the compiled object code.

BALGOL's reference manual even has a detailed set of transliteration rules [4, Appendix E], as called for in the Preliminary Report, describing the differences between the Report and BALGOL's implementation. Going the Preliminary Report one better, the manual has a complete syntax definition for BALGOL in Backus-Naur Form [4, Appendix D].

The remainder of this section discusses the differences between BALGOL and Algol-58 in detail. It follows the overview of Algol-58 features in the prior post [1].

Symbology


Where the Algol-58 reference language uses a 63-character set (not counting upper- and lower-case equivalents), Burroughs chose to stick with the 48-character set available with IBM punched-card equipment, which most sites used through the 220's Cardatron subsystem.

Rather than attempt some sort of delimiter scheme to support the word-symbols of Algol-58 (begin, end, if, for, etc.), in BALGOL these symbols are considered to be reserved words. Spaces are significant and must be used as delimiters between reserved words and identifiers.
  • The 26 English letters: A-Z [upper-case only]
  • The decimal digits: 0-9
  • Arithmetic operators: +     .   /   *
  • Relational operators: LSS   LEQ   EQL   GEQ   GTR   NEQ
  • Logical operators: NOT   OR   AND   EQIV [equivalence]   IMPL [implication]
  • Sequential operators: GO,   GO TO,   RETURN,   STOP,   FOR,   IF,   OR,   EITHER IF,   OR IF,   SWITCH,   UNTIL,   OTHERWISE,   OVERLAY
  • Separators: ,   ..   $   =   **   BEGIN   END
  • Brackets: (   )
  • Declarators: INTEGER,   BOOLEAN,   FLOATING,   REAL,   ARRAY,   FUNCTION,   COMMENT,   PROCEDURE,   SUBROUTINE,   SEGMENT,    FINISH,   INPUT,   OUTPUT,   FORMAT,   MONITOR,   TRACE,   DUMP,   EXTERNAL STATEMENT,   EXTERNAL PROCEDURE
Compare this to the symbology described for Algol-58 in the prior post [1].

The IBM Type 407 tabulator used as a printer with the Cardatron had interchangeable type wheels. Two common wheel sets were "A" (commercial) and "F" (FORTRAN). Four character glyphs differed between the two, with the commercial set shown on the left below and FORTRAN on the right:
#   =
&   +
%   (
¤   )
The internal character codes for these pairs were the same; they simply printed differently on different 407s. In addition, the 407 does not support the semicolon (;), so BALGOL uses the dollar sign ($) in its place.

Aside from these substitutions, BALGOL adopted the following changes in symbology from the Algol-58 Preliminary Report:
  • The period (.) replaces the small cross (×) for multiplication.
  • The asterisk (*) replaces the up/down arrow brackets () for exponentiation.
  • Mnemonic equivalents replace the relational and logical operators, as shown above.
  • Two adjacent periods (..) replace the colon (:).
  • The equal sign (=) replaces the assignment operator (:=). EQL is used as the relational operator for equality.
  • Two adjacent asterisks (**) replace the subscripted (10) to delimit the scale factor in numeric literals for scientific notation.
  • Square brackets, [ ], are not supported by the 407, so parentheses are used to enclose array subscripts.
There are a number of additional declarators and sequential operators (statements), which will be discussed separately below. The complete list of reserved words for BALGOL is in Appendix C of the reference manual [4]. The identifiers for the routines in the compiler library are also considered to be reserved. The standard set of library routines is listed in Appendix G of the reference manual, although the library (and hence the set of reserved identifiers) could be revised during the compiler generation process.

Data Types and Declarations


BALGOL supports only INTEGER, REAL (FLOATING), and BOOLEAN data types. Unlike some other dialects of Algol-58 (e.g., JOVIAL), there is no provision for records, strings, or other composite types.

Literals


Literals are specified the same way as in Algol-58, except that numbers have the general form I.F**±E, where I is a whole integer, .F is a decimal fraction, and **±E is a scaling factor as a power of ten. The fraction and scale factor are optional, as is the sign preceding the exponent of the scaling factor. Unlike Algol-58, a number cannot be expressed as simply a scaling factor without at least an integer prefix to it. In addition, a literal cannot begin or end with a period for the decimal point (.), as this could be confused with the use of the period as a multiply operator. Boolean literals continue to be represented by 0 and 1 for false and true, respectively.

Identifiers and Types


As with Algol-58, BALGOL assumes all identifiers refer to REAL variables by default. A type declaration can specify an identifier to be of type INTEGER or BOOLEAN. Array identifiers within type declaration lists need not be followed by empty subscript lists indicating dimensionality – that is determined by the ARRAY declaration. The parentheses surrounding the entire list of identifiers in a type declaration are also optional in BALGOL.

Identifiers must start with a letter, which may be followed by letters and digits, up to a total length of 50 characters. An identifier may not contain embedded spaces or special characters.

BALGOL also provides a generic type specifier termed a prefix. A type declaration can include an identifier followed by an ellipsis (three adjacent periods, ...). Any identifier not otherwise covered by a type declaration that begins with the same set of characters automatically receives the specified type. Multiple prefixes can be specified with overlapping subsets of the same leading characters. Unless an identifier is fully stated in a type declaration, the longest prefix matching an identifier determines its type. An OTHERWISE clause may specify a globally-default type.
INTEGER I..., J..., K..., L..., M..., N...$
BOOLEAN KEEP, LAST$
REAL MAT..., JOULES$
REAL OTHERWISE$
With these declarations, the identifier MATRIX would be of type REAL, not INTEGER. Because BALGOL is a one-pass compiler, type declarations must precede first use of an identifier in an expression or array declaration.

Array Declarations


Array declarations differ from those in Algol-58. The most significant difference is that custom bounds are not supported. Each dimension of an array is declared simply as its length. All indexing is one-relative. Only literal lengths may be specified.

As an extension, BALGOL permits an array declaration to specify initial values for the array. If the dimension list is followed by an equal sign (=) and a list of literal values in parentheses, the compiler will fill the array with those values at the time the program is loaded. For arrays of multiple dimensions, a linear list of values is specified. These are assigned to the array elements in row-major order, starting with the last dimension. If the list of values is shorter that the total number of elements, the remaining elements are initialized to zero.
ARRAY A(9), B(4,12), C(10, 10, 10)$
ARRAY MONTHDAYS(12) = (31,28,31,30,31,30,31,31,30,31,30,31)$

Function Declarations


BALGOL supports in-line function declarations like Algol-58, but with a slightly different syntax. The function identifier must be preceded by the reserved word FUNCTION, and the parameter list delimited from the function body by an equal sign (=). Use of an in-line function is otherwise the same.
FUNCTION TORADS(DEGREES) = DEGREES . 3.1415926/180$
FUNCTION PYTHAGOREAN(A, B) = SQRT(A*2 + B*2)$

Diagnostic Declarations 


BALGOL supports three declarations for its run-time diagnostic facilities:
  • MONITOR specifies a list of variable identifiers and labels. At run time, whenever one of the named variables is modified, or any variable within the scope of a named label is modified, the first five characters of that variable's identifier are printed along with the variable's new value.
  • DUMP specifies a list of variables or labels. When a dump is triggered, the abbreviated names and the values of the variables are printed. Labels are printed with the count of the number of times they have been passed. A dump is triggered by a TRACE declaration.
  • TRACE specifies a list of label identifiers, each optionally followed by a literal integer in parentheses. If an integer (n) is specified, then every n-th time that label is passed in the program, all DUMP declarations in the program (or if execution is within a procedure, all DUMP declarations within that procedure) are triggered, and the values in their variable lists are dumped. If a label in a TRACE list is not followed by an integer, then the corresponding DUMP declarations are triggered each time the label is passed, as if (1) had been specified.
MONITOR X, Y, Z$
DUMP A, B, D$
TRACE AGAIN, INNERLOOP(100)$

Segmentation and Overlay


BALGOL supports multi-level segmentation of the object code for a program. A segment is simply a compound statement preceded by the reserved word SEGMENT and an identifier that labels the segment. Segments may be nested. The code outside any segment declaration is considered to be a global segment, and is permanently resident while the program is running.

A segment is called into memory using the OVERLAY statement, naming the segment label. Note that this statement does not transfer control to the segment, it merely loads it into memory. A separate GO TO statement must be used to enter the segment, and the segment label cannot be used for this purpose. Typically, the first statement inside the segment's compound statement is given a label, and a manual transfer is made to that label. Similarly, exit from the segment is not automatic, and an explicit transfer of control must be coded to return to the code for some containing segment.

Section 9 of the reference manual [4] gives the rules for overlaying segments and transferring into and out of them.

FINISH Declaration


Every BALGOL program must have a FINISH declaration at its end. This signals to the compiler the end of the BALGOL text and the start of the process to link in any library routines or external machine-language programs. The "$" following FINISH is required.

While the reference manual classifies this as a declaration, it is actually an executable statement, as it causes the compiler to emit a halt instruction with 9669 00 9669 ("XX") displayed in the C register. Pressing START on the Control Console at this point will execute a card or paper-tape read (depending on how the compiler was generated), with the intent of booting the next job in sequence.
FINISH$

Other Declarations


Comment declarations are the same as in Algol-58. After the reserved word COMMENT all text is ignored until the next "$".

The switch declaration of Algol-58 is not supported by BALGOL. Instead there is a more primitive SWITCH statement, as described below.

There are additional declarations to support formatted input and output of data. These are discussed in the section on input-output below.

Expressions


BALGOL follows the scheme of Algol-58 expressions closely, but more completely defines the resulting type of an expression, and adds a few twists.

Arithmetic Expressions


Arithmetic expressions are the same as in Algol-58, with the following exceptions:
  • Array identifiers are indexed using a list of subscripts in parentheses rather than square brackets.
  • Different characters are used to specify multiplication and exponentiation, as described above under "Symbology."
  • Expressions are evaluated left to right, except that default precedence is defined to be exponentiation first, then multiplication, then division, then addition/subtraction. Parentheses, as usual, can override the default precedence.
  • In certain cases, the multiply operator (.) may be omitted, as in standard mathematical notation. The specific rules are stated in Section 3 of the reference manual [4], but basically the operator may be omitted:
    • between a closing parenthesis on the left and an opening parenthesis on the right
    • between a simple variable or literal and a parenthesis on either side
    • between a closing parenthesis on the left and a subscripted or function variable on the right
    • between a literal on the left and a simple, subscripted, or function variable on the right.
     See the examples below.
  • The type of an arithmetic expression is integral if all of its components are integral and real otherwise. Note that inner expressions may be evaluated as integers while the overall expression may result in being of type real. The compiler will coerce literals and some constant expressions to the appropriate destination type at compile time where possible.
  • The integer precision on the 220 is ten decimal digits plus sign, so all integer operations are effectively truncated on the high end to ten digits, e.g., 5000000001 + 5000000002 = 3. The floating-point precision is eight digits.
Note in the second example below the use of implied multiplication with GAMMA. Note in the fourth example not only the implied multiplication of 2 and SQRT(S), but also that the expression is interpreted as (SIGN(A(R,L))) / (2SQRT(S)) since multiplication has a higher precedence than division.
ALPHA + BETA / GAMMA
(ALPHA + BETA) GAMMA
A*2/2*(B-1)
SIGN(A(R,L))/2SQRT(S)

Boolean Expressions


Boolean expressions operate in BALGOL as in Algol-58 with a few enhancements:
  • Parentheses around an arithmetic relation are not required.
  • Array subscripting is the same as for arithmetic expressions.
  • The Algol-58 relational symbols are replaced by mnemonic operators, as described under "Symbology" above.
  • If a relation is between operations of different types, the compiler will coerce the integer operand to floating point. If the integer operand is a literal, this coercion will occur at compile time.
  •  BALGOL adds a Boolean operator for implication (IMPL). A IMPL B evaluates to true unless A is true and B is false.
  • Evaluation of Boolean expressions is from left to right with this operator precedence: NOT > AND > OR > IMPL > EQIV.
(ALPHA GEQ BETA) OR (GAMMA NEQ EPSILON) EQIV NOT OVERRIDE

Statements


The concept of statements in BALGOL is essentially the same as that in Algol-58. The "$" is used to separate statements, not terminate them. The end of a line or record of program text has no syntactic significance. Statements may be split across lines and multiple statements may be grouped on a line.

Since the BALGOL character set does not include the colon (:), statement labels are indicated by two adjacent periods (..). Labels may be either identifiers or literal integers. Leading zeroes in a numeric label are not significant.

Compound statements are identical to those in Algol-58, except that the terminating END of a compound statement may be labeled to provide an exit point. An END may not be immediately preceded by a "$". If the BEGIN of a compound statement is labeled, the label may optionally be repeated after the corresponding END. Curiously, left and right parentheses can be substituted for BEGIN and END, respectively.

As with Algol-58, declarations may be intermixed with statements. Since BALGOL is a single-pass compiler, however, declarations for an identifier after the identifier has been first referenced (and its type determined at that point) are ignored.

The do macro statement of Algol-58 is not supported by BALGOL.

Assignment Statements


The BALGOL assignment statement is the same as for Algol-58, except that the operator "=" is used instead of ":=". Type coercion works the same – integer expressions are converted before being stored in real variables and real expressions are truncated to integer before being stored in integer variables. A Boolean expression will result in a 0 or 1 being stored in a numeric variable. Numeric expressions cannot be stored in Boolean variables.

BALGOL adds the ability to assign an expression to multiple variables. The stores take place in right-to-left order, starting with the variable immediately to the left of the expression. The expression value is coerced as necessary to a destination variable's type before the store, and that coerced valued is then applied to the next store in sequence. Note that this is multiple assignment of a single expression; embedded assignment within an expression (as in Algol-60 and C) is not supported.
ALPHA = BETA = GAMMA = 2DELTA + 1$

GO TO and SWITCH Statements


GO TO statements are more primitive in BALGOL, since it does not support the Algol-58 switch declaration, and thus there is no concept of a designational expression. GO TO may only reference a label. The word TO may be omitted.

To compensate for the lack of the switch declaration, BALGOL supports a SWITCH statement. This is a classic computed-go-to construct, where the value of an expression selects the branch destination from a list of labels. If the value is zero, control continues in sequence after the SWITCH statement. Values starting at 1 select a label from the switch list. If the value is negative or greater than the number of labels in the switch list, the behavior is undefined (internally, the branch table is built and indexed backwards, and an out-of-range index will result in a branch to some arbitrary instruction).
SWITCH CHOICE-1, (CASE1, CASE2, CASE3, CASE4)$

IF Statements and Alternative Statements


The IF statement works identically in BALGOL as in Algol-58.

The alternative (if either... or if...) statement of Algol-58 is supported in BALGOL, but with a slightly different syntax. Instead of starting with if either, the statement starts with EITHER IF. Otherwise, the chaining of OR IF clauses and terminating END are the same.

BALGOL adds an additional feature to the EITHER IF syntax. Instead of the terminating END, you may write "$ OTHERWISE$" followed by another statement, which of course may be a compound statement. This statement will be executed if none of the others in the skip chain have been selected. Thus BALGOL does have an "else." I found to my great chagrin that the "$" before "OTHERWISE" is absolutely necessary – without it, mysterious cascading errors result.
EITHER IF A > B$ FWD = 1$ OR IF A < B$ FWD = 0 END$

EITHER IF A > B$ FWD = 1$ OTHERWISE$ FWD = 0$

FOR and UNTIL Statements


The FOR statement in BALGOL functions identically to that in Algol-58, but with a slight change in syntax. The iteration list may be composed of individual values or triplets that define a sequence of values, but the triplet form is specified as "(initial, increment, final)" rather than Algol-58's "initial (increment) final".

BALGOL adds a new iteration statement, UNTIL, similar to the modern while statement. It evaluates a Boolean expression and iterates until the expression becomes true. If the expression is initially true, the body of the loop does not execute at all.
FOR I = 1, 3, 5, 7, 11, (13, 7, 99), 101, (103, 1, 125)$
    A(I) = I$

UNTIL A > 125 OR OUTTAHERE$
    BEGIN
    OUTTAHERE = A EQL 77$
    A = A + 3
    END$

STOP Statement


BALGOL supports the STOP statement of Algol-58 with two extensions. First, it is a simple halt, and does not terminate execution of the program. Pressing START on the Control Console will resume the program with the next statement in sequence.

Second, the word STOP may be followed by an optional arithmetic expression. The value of the expression will be displayed in the A register on the Control Console. All STOPs display 0137 00 7310 in the C register.

Procedure Declarations and Procedure Calls


Procedure declarations are somewhat simplified in BALGOL compared to Algol-58, but most of the capability is retained.

Procedure Declarations


The first significant difference for BALGOL is that multiple entry points are not supported, so the procedure heading does not have multiple identifiers each with their own parameter list. No label of the same name within the body of the procedure is required.

Second, the input and output parameter lists are not delimited by the "=:" operator. Instead, parameters are arranged in three groups: input, output, and reference parameters. Input parameters may be expressions or arrays. Output parameters may be variables, including arrays. Reference parameters include formal labels, formal functions and procedures, and identifiers for the INPUT, OUTPUT, and FORMAT declarations described below in the section on input-output. The parameter groups are delimited by "$". Any of the groups may be omitted, and the "$" for any trailing empty groups need not be specified. There is no difference between passing an array as an input parameter or as an output parameter.

Formal array identifiers must be specified with an empty subscript list containing n-1 commas to indicate an array of n dimensions. Formal procedure and function identifiers must be followed by an empty parenthesis pair.

Type declarations for the formal parameters must occur immediately after the BEGIN that introduces the body, not before it as in Algol-58. After the final END of the procedure, the procedure identifier may be stated, but it must be followed by an empty parenthesis pair. If the procedure returns a value like a function, the return value is assigned to the procedure's identifier as for Algol-58, but the identifier must be followed by an empty parenthesis pair in this case as well. See the Simpson's Rule example below for an illustration of the syntax.

A RETURN statement is required to exit the procedure.

Once a procedure has been declared, its identifier is available globally throughout the remainder of the program text and may be referenced, even within the body of another procedure.
PROCEDURE CROUT4 ($ N, A(,), B(), Y(), PIVOT(), DET, EX7$
                  SINGULAR, IP())$
    BEGIN
    INTEGER N, PIVOT, EX7$
    ...
    RETURN
    END CROUT4()$
In the example above, A, B, Y, and PIVOT are arrays, SINGULAR is a label, and IP is a function or a value-returning procedure. There are no input parameters in this case, only output and reference parameters.

Procedure Call


The differences for a BALGOL procedure call follow those for the procedure's declaration. Actual parameters are divided into input, output, and reference groups, and the groups delimited by "$". Expressions and scalar variables in the input group are effectively passed by value. Everything else is passed by reference (address).

Array slices can be passed by leaving some subscript positions empty for the actual array parameter, n empty positions specifying an n-dimensional slice. When passing a procedure or function as an actual parameter, the procedure or function identifier must be followed by an empty pair of parentheses. Unlike Algol-58, the number of parameters for the function is not indicated, and currying of the parameters is not supported.
CROUT4 ($ K, MAT(,), RHS(), FACTORS(), PIV(), DET, POWER$
    SINGULARITY, INNERPRODUCT())$

SUBROUTINE Declarations


BALGOL supports an open SUBROUTINE declaration. This is simply a compound statement that is given a name and can be called as a parameter-less procedure from other points in the program. All access to variables is global. Any declarations within the body of the SUBROUTINE are treated as if they had been placed outside its body and do not create local variables. A subroutine is called using the ENTER statement. Exit from the subroutine is by means of a RETURN statement, as for procedures.
SUBROUTINE COMMONCODE$
  BEGIN
  X = Y + Z$
  Y = A(I,J)*2 + Z$
  RETURN$
  END$

ENTER COMMONCODE$

Intrinsic Functions


BALGOL defines a number of intrinsic functions. These are not part of the compiler's library – they instead generate in-line code. All are considered to be reserved identifiers:
  • MOD(X, MODULUS) – remainder division of X by MODULUS
  • MAX(X1, X2, ..., XN) – maximum value from a list of arithmetic expressions
  • MIN(X1, X2, ..., XN) – minimum value from a list of arithmetic expressions
  • SIGN(X) – sign of the argument expression, -1, 0, +1, as the type of the expression
  • ABS(X) – absolute value of the argument expression, as the type of the expression
  • PCS(N) – returns a Boolean value indicating whether the specified Program Control Switch on the 220 Control Console is on, N = 0-9.
 In addition, BALGOL defines a standard set of mathematical functions in its library. These are described in Appendix G of the reference manual [4].

External Routines



PROCEDURE declarations may be prefixed with the reserved word EXTERNAL. In this case the body of the procedure is omitted. An external routine is one whose body will be supplied in relocatable machine language after the FINISH declaration at the end of the BALGOL text. The compiler will assign storage for the machine language instructions and link them into the object code. Instructions for preparing external machine-language routines and the linkage mechanism are in Appendix F of the reference manual [4].

BALGOL supports an additional type of machine-language linkage, EXTERNAL STATEMENT. This is both a declaration and an executable statement. It declares a label identifier that is matched to the identifier of a machine-language routine that will follow the FINISH declaration for the program. The statement branches to the external machine-language code. This is not a subroutine linkage, though, and BALGOL offers no intrinsic means for returning to the point in the program after the EXTERNAL STATEMENT.

Input-Output Facilities


The Algol-58 Preliminary Report does not address the subject of input and output of data for a program. BALGOL provides input-output capabilities somewhat similar to FORTRAN. The default configuration of the compiler and its library assumes input from punched cards and output to a line printer or a card punch, but alternate versions of the compiler could be generated to use other media, such as paper tape.

Input to a BALGOL program is free-form, with values delimited by one or more spaces or the end of the 80-character card or data record. The input formatter accepts integer and floating-point values. Numbers in scientific notation use "," to delimit the power-of-10 scale factor, not "**" as in BALGOL text. Character strings delimited by "$" may be read into a sequence of integer variables or array elements, five characters per word.

Values from the input medium are mapped to variables within the BALGOL program by means of an INPUT declaration. This specifies a named list of variables that is in turn referenced by a READ statement. Iteration through arrays is supported in much the same way as in FORTRAN:
INPUT MATRIX (N, FOR I=(1,1,N)$
                     FOR J=(1,1,N)$
                         (M1(I,J), M2(J,I)))$

READ ($$ MATRIX)$
The INPUT declaration generates a co-routine that supplies the address of the next element in the list on each call to it. The READ statement calls this co-routine and performs the actual formatting and data transfer. It keeps reading from the external medium until the input list is satisfied.

A second form of READ allows it to terminate before satisfying the list and signal that fact to the program:
READ ($ ENDED$ MATRIX)$
where ENDED is a Boolean variable that is set to true (1) if the input is interrupted by a signal in the input data stream. The signal is indicated by the 10-character sequence " SENTINEL " starting in the second character position of an input line or record. This signal is ignored if the first form of READ is used.

Output from a BALGOL program is not free-form, but formatted in a way similar to that of FORTRAN. An OUTPUT declaration defines the mapping of program values to the output medium, analogous to an INPUT declaration. It also generates a co-routine and is in turn referenced by a WRITE statement. While an input list may only name variables, an output list may specify its elements as general expressions.

The data is formatted and arranged on the output medium according to the specifications of a FORMAT declaration. This is like a FORTRAN FORMAT, although the formatting phrases are coded somewhat differently. The WRITE statement performs the actual formatting and data transfer, naming the identifiers for an output list and a format. A WRITE statement may also omit the data list and specify only a format identifier if the format contains only literal data.
OUTPUT MATOUT (N, SUM, FOR I=(1,1,N)$
                       FOR J=(1,1,N)$
                          (M1(I,J), M1(I,J)/M2(I,J)))$
FORMAT F1 (I8, B2, *SUM = *, X12.6, W2, (8F14.8, W0)),
       F2 (B20, *THATS ALL FOLKS*, W2)$

WRITE ($$ MATOUT, F1)$
WRITE ($$ F2)$
The BALGOL input-output facilities are described in Section 8 of the reference manual [4].

Examples


The prior post [1] exhibited near its end an example Algol-58 procedure for computing an integral using Simpson's Rule. The following is that example procedure transformed into BALGOL, and combined with declarations, procedure calls, and input-output statements to make it a complete program:
2 PROCEDURE SIMPS(A, B, DELTA, V$$ F())$
2   BEGIN
2   INTEGER K, N$
2
2   IBAR = V(B-A)$
2   N = 1$
2   H = (B-A)/2$
2   J = H(F(A) + F(B))$
2
2 J1..
2   S = 0$
2   FOR K = (1, 1, N)$
2     S = S + F(A + (2K-1)H)$
2
2   I = J + 4H.S$
2   IF DELTA LSS ABS(I-IBAR)$
2     BEGIN
2     IBAR = I$
2     J = (I+J)/4$
2     N = 2N$
2     H = H/2$
2     GO TO J1
2     END$
2
2   SIMPS() = I/3$
2   RETURN$
2   END SIMPS()$
2
2 FUNCTION TORADS(X) = 3.1415926X/180$
2
2 FUNCTION DARCTAN(X) = 1/(X*2 + 1)$
2
2 PROCEDURE LOGISTICSIGMOID(X)$
2   BEGIN
2   LOGISTICSIGMOID() = 1/(1 + EXP(-X))$
2   RETURN$
2   END LOGISTICSIGMOID()$
2
2 SUM = SIMPS(TORADS(30.0), TORADS(90.0), 0.00001, 2.0$$
2             SIN())$
2 WRITE($$ RESULT, F1)$
2 SUM = SIMPS(0.0, 1.0, 1**-5, 2.0$$ DARCTAN())$
2 WRITE($$ RESULT, F2)$
2 SUM = SIMPS(0.5, 3.0, 1**-5, 2.0$$ LOGISTICSIGMOID())$
2 WRITE($$ RESULT, F3)$
2
2 OUTPUT RESULT(SUM)$
2 FORMAT F1(*SINE INTEGRAL =    *,X10.6,W0),
2        F2(*DARCTAN INTEGRAL = *,X10.6,W0),
2        F3(*LOGISTIC INTEGRAL =*,X10.6,W0)$
2 FINISH$
The "2"s in the leftmost column of the program text are an artifact of the 220's Cardatron punched-card subsystem, and must be present to select an input format when the cards are read. BALGOL uses "2" for source cards, "6" for machine-language code, and "5" for data cards. The output printed by this program is:
SINE INTEGRAL =       .866025
DARCTAN INTEGRAL =    .785398
LOGISTIC INTEGRAL =  2.074510
I have transcribed the example programs from Section 11 of the compiler reference manual [4] and have them running in my retro-220 emulator [7]. I have prepared a few other programs, including the Simpson's Rule example above, and have them running in the emulator as well.

These examples are available from the BALGOL-Examples directory in the emulator's open-source project repository [8]. Each example includes source code, sample data, and compilation listings both with and without the code generated by the compiler. A few examples also include equivalent programs for purpose of comparison, written in the Algol-60 dialects of the Burroughs B5500 or modern Unisys ClearPath MCP ("E-mode") systems.

The emulator is written in Javascript and runs in a standard web browser. It can be run directly from its hosting site at [7], or you can download the files from the project repository and install it on some other web server. For instructions on running the emulator and the compiler, see the introductory blog post [9] and the wiki pages in the project repository.

BALGOL Compiler Performance


BALGOL was designed from the beginning to be a fast, one-pass compiler that generated reasonably good code. It was able to do small local optimizations within statements, and some rather fancy constant folding, but did not attempt to do more global optimizations, such as common expression factoring.

Even though the 220 was not a very fast machine, both the user-friendly nature of BALGOL and the speed of its compilation process gave it an advantage in environments where lots of relatively small programs were being submitted by lots of relatively unsophisticated programmers. For this reason it was quite popular at universities and research institutions. Stanford, Caltech, Georgia Tech, Cornell, Case (now Case Western Reserve), and Stanford Research Institute all had 220s and used BALGOL.

Various claims about the compiler's performance have been made. In his 2013 anecdote [10] on the use of BALGOL at Sanford in the early 1960s, Robert Braden stated "For typical source programs, the BALGOL compiler ran at line printer speed, 150 LPM [lines per minute], close to the card reader speed of 240 CPM [cards per minute]."

In a guest editorial [11] for the May 1977 issue of IEEE Computer devoted to stack machines, David Bulman talks about Bob Barton and the influence of BALGOL on the design of the Burroughs B5000/5500, and says of BALGOL, "About sixty compiler instructions were executed to generate each machine language instruction..." A little later he states, "The fact that Balgol [sic] could be compiled at the read speed of the card reader on the B220 lent some credibility within Burroughs to Barton and his ideas for a high-level language machine" (i.e., the B5000).

I have been unable to confirm Bulman's claim on instructions executed per instruction generated. I implemented an instruction counter in my retro-220 emulator and recorded counts of instructions executed and instructions generated for a few small and medium-size programs. There are some choices on just which instructions should be included (e.g., compiler load time, which takes about seven seconds and generates no instructions, or library linking, which generates lots of instructions without processing any source code), but excluding the library linking process, I mostly see ratios of 400-500:1, not 60:1.

I have also been unable to confirm either Braden's or Bulman's claims on compiler speed. For blank cards, comments, and cards with with relatively little text (say, a trimmed length up to 20 or so characters), the compiler does seem to operate at or near the line printer speed of 150 LPM. For complex statements and declarations, there is a noticeable pause between lines, often 2-3 seconds. The best average speed I have seen is close to 100 CPM, but that is without generating a compilation listing. A more typical average speed with a listing is 80-90 CPM.

The retro-220 emulator is designed to run programs at the speed of a real 220, using instruction times published in its Operational Characteristics manual [12] and the performance of peripheral devices from its Operating Procedures manual [13]. Emulator performance matches predicted performance on some very small benchmarks where it is easy to analyze the instruction timings, but timing is much more difficult to predict on larger, more complex programs that do significant amounts of I/O, such as the BALGOL compiler.

The emulator uses about 80% of an Intel 3.3GHz i3-2120 core when running under Windows 7 in the Firefox 61.0 browser and compiling a non-trivial program. It uses about 60% of a core when executing a program doing no I/O. Most of that processor use is devoted to browser overhead (garbage collection, rendering updates to the control panels and peripheral device displays, etc.), not to emulating 220 instructions, which requires only about 5% of a core. CPU utilization in Google Chrome 68.0 is about 30% less. In any case, the emulator's performance does not appear to be limited by its host environment.

It is entirely possible that the emulator is not regulating its performance correctly and running slower than a real 220, but there is a contradictory report to consider. In a 1960 letter [14] to the Communications of the ACM, Bob Barton briefly described BALGOL (calling it a "nameless compiler") and stated, "1. Compiling Rate: Averages 500 single-address instructions per minute." In the retro-220 emulator, I am seeing instructions generated at rates of 600-700 per minute, compiling the same programs where cards are being read at only 80-100 CPM. So Barton gives us a report of compiler performance that is lower than that observed in the emulator.

From this, I conclude that the claim of compiling at card-reader or line-printer speed is somewhat overstated. This should take nothing away from the compiler's performance, which for its time was remarkable, especially considering the intrinsic speed of the 220.

Significance and Impact of BALGOL


As mentioned above, BALGOL had a significant impact within Burroughs on the design of the B5000/5500 computer system that followed the 220. Not only did the compiler give the Burroughs design team in Pasadena the confidence that a system could be designed to support fast, one-pass compilers and use solely higher-level languages (the B5000/5500 never had a resident assembler, and its modern descendants still don't), but many of the data structure techniques it used internally were given hardware support. The I/O and diagnostic features of the language also migrated into the Burroughs Algol-60 compilers that followed BALGOL.

One striking result of the influence and experience of BALGOL upon the B5000/5500 is that Burroughs did considerably better in terms of performance of its one-pass Algol-60 compiler for that machine. It could easily compile at card-reader speed -- using a 1400 CPM reader.

Braden [10] describes the impact, under the guidance of George Forsythe, that BALGOL had during his time at Stanford on the growth of what he terms "computer literacy." The language's features and ease of use, along with the compiler's operational efficiency created a highly-productive, quick-turnaround environment that encouraged people to develop programming skills and begin using them in their studies and research.

Braden also mentions some students who were stimulated by the environment and went on to notable careers in computing and computer science. Two of those students, Roger Moore and Larry Breed, along with Braden, succeeded in cloning the BALGOL compiler from the 220 to an IBM 7090 after Stanford acquired that latter machine. The cloned language was referred to as SUBALGOL. It was bootstrapped in stages so that the compiler was eventually written in itself. It also offered the same degree of operational efficiency on the much-faster 7090 as had been enjoyed on the 220.

The next blog post will discuss the significant effort to recover the BALGOL compiler source code and get it into running condition, as well as what I have learned over the course of this project about the compiler's internal operation.

References


[1] "Algol-58 – the International Algebraic Language,"
http://datatron.blogspot.com/2018/07/algol-58-international-algebraic.html
[2] "Preliminary Report—International Algebraic Language," Communications of the ACM, vol. 1 #12, December 1958, pp. 8-22,
http://www.softwarepreservation.org/projects/ALGOL/report/Algol58_preliminary_report_CACM.pdf.
[3] "Burroughs B5000 Conference" 1985-09-06, Charles Babbage Institute,
http://hdl.handle.net/11299/107105.
[4] "BAC-220: Burroughs Algebraic Compiler," Revised Edition, 220-21017, Burroughs Corporation, Detroit, Michigan, March 1963,
https://bitsavers.org/pdf/burroughs/Electrodata/220/220-21017_B220_BALGOL_196303.pdf.
[5] BALGOL Listing, May 1961,
https://bitsavers.org/pdf/burroughs/Electrodata/220/BALGOL_listing_196105.pdf.
[6] BALGOL Listing, March 1962,
http://archive.computerhistory.org/resources/text/Knuth_Don_X4100/PDF_index/k-1-pdf/k-1-u2196-balgol220compiler.pdf.
[7] Burroughs 220 Emulator Hosting Site, http://www.phkimpel.us/Burroughs-220/.
[8] retro-220 Emulator Project Repository, https://github.com/pkimpel/retro-220/.
[9] "Introducing the retro-220 Emulator,"
http://datatron.blogspot.com/2018/06/introducing-retro-220-emulator.html.
[10] "Burroughs Algol at Stanford University, 1960-1963," Robert Braden, IEEE Annals of the History of Computing, vol.35 #4, October-December 2013, p.69-73,
https://ieeexplore.ieee.org/document/6674030/.
[11] "Stack Computers," David M. Bulman, IEEE Computer, vol.10 #5, May 1977, p.14-16,
https://www.computer.org/csdl/mags/co/1977/05/01646477.pdf.
[12] Operational Characteristics of the Burroughs 220, Burroughs Corporation, Bulletin 5020A, August 1960,
https://bitsavers.org/pdf/burroughs/Electrodata/220/5020A_B220_OperCharac_196008.pdf.
[13] Handbook of Operating Procedures for the Burroughs 220, Burroughs Corporation, Bulletin 5023, November 1969,
https://bitsavers.org/pdf/burroughs/Electrodata/220/5023_B220_OperatingProcedures_19911.pdf.
[14] "Another (Nameless) Compiler for the Burroughs 220," R. S. Barton, Communcations of the ACM, vol.4 #1, January 1961, p.A11, https://dl.acm.org/citation.cfm?doid=366062.366070.
[15] "The Shell Assembler, Part 1,"
http://datatron.blogspot.com/2016/06/the-shell-assembler-part-1.html.