The Burroughs 205 and 220 Blog: July 2018

Sunday, July 29, 2018

Algol-58 – The International Algebraic Language

By the early 1950s, the number and variety of electronic computers were increasing, the scope of their application to science, business, and government was rapidly expanding, and the demand for the programming those applications required was expanding in concert. This demand stimulated the need for increasingly efficient and effective ways to prepare programs, which in turn led to the development of higher-level programming languages.

One of the first attempts at designing a universal programming language, at least for numeric computation, was the International Algebraic Language (IAL), also known as Algol-58. The Burroughs Algebraic Compiler (BALGOL) for the 220 computer system was an early dialect of Algol-58. To prepare for a detailed discussion of BALGOL on the 220 in the next post [1], this post will discuss the development of Algol-58 and present an overview of the language.

Background

Initially, preparation of the program steps for machine execution was done entirely on the machine's terms. Instructions were prepared directly using the numeric or alphanumeric code required by the hardware. Hence the popular use of "code" and "coding" as synonyms for programs and programming. There is a lot of manual bookkeeping involved in programming at this level, especially in keeping track of addresses within the code for data storage and branch locations. That bookkeeping task is compounded when changes must be made to an existing program.

Mnemonic notations arose fairly quickly to ease the clerical task of coding, along with a number of higher-level abstractions, such as flowcharting, to aid in the overall conception and design of programs. Not long after, people began to realize that the machine itself could assist with the clerical aspects of generating code, especially in keeping track of address assignments. This led to the development of assembly languages and the assembler programs to process them.

From the very beginning of automated computing, though, there has been a strong desire to be able to approach programming more on our terms, in a way that is more in keeping with the way we think about the applications to which we want to apply a computer. Since most early applications involved numerical calculations based on mathematical formulae, there was an obvious desire to express programs using standard notations from mathematics. FORTRAN is one of the earliest and best known realizations of this desire, but as Donald Knuth made clear in his 2003 lecture, "A Dozen Precursors of FORTRAN" [2], attempts to express programs using higher-level abstractions can be traced all the way back to Konrad Zuse during the 1940s.

A parallel desire has been to be able to express programs in a more general way than in the coding for a specific type of machine. The half-life of a given computer design in the 1950s was only a couple of years, so as new designs came along, existing programs had to be completely redone using the instruction coding for each new machine. Being able to express programs in a way that was not specific to any particular machine would ease this burden. It would also open the possibility of more easily adapting programs written for one machine so that they could run on entirely different ones.

Many efforts towards translation of standard arithmetic expressions into machine instructions were going on, starting in the early 1950s. A preliminary report on FORTRAN was available by 1954, but other languages were starting to appear, such as Alick Glennie's Autocode, Alan Perlis' Internal Translator (IT, also known as the Purdue Compiler), and Grace Hopper's A0 and MATH-MATIC. There were also a number of interpretive expression evaluators, such as the Whirlwind Algebraic Interpreter. By the second half of the 1950s, the number of different (and incompatible) programming notations was beginning to proliferate.

In 1955, the German Association for Applied Mathematics and Mechanics (GAMM) held a meeting in Darmstadt on electronic computers. This resulted in the formation of a committee to work on formula translation. Then in May 1957, the Association for Computing Machinery (ACM), SHARE (the IBM user organization), USE (the Sperry Univac user organization), and DUO (the ElectroData/Burroughs user organization) held a conference in Los Angeles to explore the exchange of computing information. The group concluded that a single, universal programming language would be very desirable, and recommended that the ACM form a committee to pursue the idea. A few months later, the GAMM group reached a similar conclusion and contacted the ACM to suggest a joint conference that would make recommendations on a common language.

By April 1958, both the ACM and GAMM committees had independently developed proposals for a universal language and met to exchange them. They found they had much in common, so arranged to hold a joint working session between representatives of each group to combine their proposals and resolve the differences. That joint session was charged with the following objectives [3]:

The new language should be as close as possible to standard mathematical notation and be readable with little further explanation.
It should be possible to use it for the description of computing processes in publications.
The language should be translatable into machine programs.

The joint ACM-GAMM meeting was held in May 1958 at ETH in Zurich, Switzerland. The product of that meeting was the "Preliminary Report—International Algebraic Language" [3]. The language was initially referred to as the IAL, but the name was soon determined to be unsuitable, and Algol, short for "algorithic language," was suggested instead. The successor to the preliminary IAL was known as Algol from the start, so to distinguish them, the IAL eventually became known as Algol-58 and its successor as Algol-60.

FORTRAN had been released by IBM for almost a year by the time of the 1958 ACM-GAMM joint session in Zurich. FORTRAN was being received warmly, especially by the scientific community. It was not exactly an attractive notation, though, and was certainly not an ideal way to exchange definitions of algorithmic processes, especially among people accustomed to the elegance of mathematics. Even John Backus, the driving force behind FORTRAN, could not have been completely pleased with it, as he was one of the ACM representatives to the Zurich joint session. Thus, there was motivation to come up with something much better.

Overview of Algol-58

This section attempts to describe the IAL as presented in the Preliminary Report in somewhat informal terms. It largely follows the structure of that Report.

Symbology

Algol-58 uses a 63-character set, which allowed for a number of traditional mathematical symbols, and may have been influenced by six-bit codes for teletype and punched paper-tape equipment. Early FORTRAN, of course, had been limited by the 48-character set used in IBM punched-card equipment.

The basic symbols defined for the language are:

The 26 English letters: A-Z [including both upper and lower case; the report is silent on the subject of case sensitivity]
The decimal digits: 0-9
Arithmetic operators: + – × /
Relational operators: < ≤ = ≥ > ≠
Logical operators: ¬ [not] ∨ [or] ∧ [and] ≡ [equivalence]
Sequential operators: go to, do, return, stop, for, if, or, if either, or if
Separators: . , : ; := =: → ₁₀ begin end
Brackets: ( ) [ ] ↑ ↓
Declarators: procedure, array, switch, type, comment

The role of spaces is not described in the Preliminary Report. It seems likely the intent is that, as in FORTRAN, spaces can be inserted for readability, but they do not serve as delimiters, and their presence or absence does not affect the interpretation of the program text.

Note that, as with Algol-60, the boldface words in the lists above are considered not to be identifiers or sequences of characters, but atomic symbols in their own right. Thus, "go to" is not two words or five characters, but a unique symbol in the language. The subscripted "₁₀" is used to indicate the power-of-10 scaling factor for numbers in scientific notation.

Also note that this is the symbol set for the reference language, which is intended to specify the language itself and not necessarily be used for coding programs or publishing algorithms. The report anticipated that a more restricted set of characters, a hardware representation, would need to be used by an actual compiler targeting a real machine. It also anticipated that an expanded set of characters might be used in a publication language used to communicate algorithms in the literature. Each instance of a publication language or hardware representation is to be accompanied by a set of transliteration rules for mapping to and from the reference language.

Data Types and Declarations

The Algol-58 report specifies three data types – integer, real (floating-point), and Boolean (logical) values, plus arrays of these entities. It appears to anticipate the use of additional types, but is silent on how they should be handled.

Literals

Numeric literals have the general form I.F₁₀±E, where I is a whole integer, .F is a decimal fraction, and ₁₀±E is a scaling factor as a power of ten. The fraction and scale factor are optional, as is the sign preceding the exponent of the scaling factor. A number could also be expressed as simply a scaling factor by omitting the integer 1 before it, e.g., "₁₀-3" for 0.001.

Boolean literals are represented as the integer literals 0 [false] and 1 [true].

Identifiers and and Types

Identifiers are defined in the usual way, starting with a letter and optionally followed by a string of letters and decimal digits. No maximum length for an identifier is specified.

By default, all identifiers are considered to refer to variables of type real. A type declaration can be used to specify that an identifier refers to an integer or Boolean variable. Note that the type declaration does not define a variable or assign it storage. The declaration merely associates a type with an identifier. Variables are declared upon first reference to their identifiers, which need not appear in any type declaration.

integer (i, j, k); Boolean (on, off);

Array Declarations

Array declarations name a list of identifiers to be considered as arrays and to specify the bounds of each array's dimensions. The bounds notation is a little different from Algol-60 -- all of the lower bounds are specified first in a comma-delimited list, then a colon (:), then the corresponding upper bounds are specified in a second comma-delimited list. The Report specifies the bounds to be "integers," so it appears that dynamic array sizing is not supported. Multiple array identifiers can be associated with a common set of bounds, as in Algol-60. If an array is to be of a type other than real, it must appear in a type declaration, and must be followed by square brackets containing sufficient commas to indicate the dimensionality of the array:

integer i, j, m[], n[,,];
array a, b, c[1:10], d[0, 0:7,19], m[-7:7], n[1,1:9,9];

Switch Declarations

A switch defines a multi-way branch, after the idea of the computed go-to in FORTRAN, but essentially the same as switches in Algol-60. Note that it is a declaration, not a statement. It defines an ordinal list of designational expressions, which are either simple statement labels or indexed switch identifiers. A statement label may be an identifier or a literal integer.

A switch is used by the go to statement, which takes as its target not a label, but the more general designational expression construct. The index used with a switch identifier is evaluated dynamically, not only for a switch named in a go to statement, but also for any switch identifiers used in designational expressions within switch declarations. Thus the evaluation of a designational expression may result in a cascade of switches being evaluated before a label is finally encountered and the destination of a branch determined.

switch s1:= (start, 44, s2[i×j-1], s1[k]);
switch s2:= (5, 4, 3, 2, 1, blastoff);

go to s1[3];

Comment Declarations

Algol-58 considers a comment to be a declaration. It can appear between any instance of other declarations and statements, but otherwise has no effect on the meaning of the program. Like the Algol-60 comment, it starts with the symbol comment and continues until the next semicolon (;) is encountered.

Function Declarations

Algol-58 supports the declaration of an expression as a function, similar to the in-line function of FORTRAN. The function defines a list of parameters, the actual values of which will be substituted into the expression. Any variables in the expression not named in the parameter list are referenced globally in the containing program.

toRads(degrees) := degrees×3.1415926/180;
Pythagorean(a, b) := sqrt(a↑2↓ + b↑2↓);

length := Pythagorean(base, height);

The language also supports procedure declarations and calls. These are a more complex subject, and are discussed in a separate section below.

Expressions

Expressions in Algol-58 are formed in the usual way, from combinations of variable references, function calls, and operators of various types. The values of scalar variables are referenced as simple identifiers. Elements of arrays are referenced as the identifier for the array, followed by a comma-delimited list of subscripts in square brackets ([ ]). Subscripts can be general arithmetic expressions, which are rounded to integers before indexing into the array.

Functions (and procedures that return values as a function) are evaluated by naming the function, followed by a comma-delimited list of expressions in parentheses corresponding by position to the parameters of the function, as shown in the example for the Pythagorean function declaration above.

Arithmetic Expressions

An arithmetic expression, A, may take any of the following forms:

A numeric literal, e.g., "123.456"
A simple identifier, e.g., "alpha"
A subscripted identifier, e.g., "points[i, j]"
A function call, e.g., "Pythagorean(x, y)"
An arithmetic expression prefixed by "+" or "–", e.g., "–A"
Two arithmetic expressions joined by an arithmetic operator (+ - × /), e.g., "A/7.0"
One arithmetic expression followed by a second arithmetic expression in up/down arrow brackets to indicate the first expression raised to the power of the second expression, e.g., "A₁↑A₂↓"
An arithmetic expression in parentheses, e.g., "(A)"

Evaluation of arithmetic expressions is by default from left to right, with multiplication and division having precedence over addition and subtraction. The Report is silent on the precedence of exponentiation, but presumably it is higher than multiplication/division. Parentheses may be used to alter or make clear the precedence by which operators are to be applied to values.

alpha
alpha + beta / gamma
(alpha + beta) / gamma
a↑2↓ / 2↑b-1↓
(m[i] + n[j, k])×(m[j] - n[k, i])

Boolean Expressions

A Boolean expression, B, may take the following forms:

The literal value for false, e.g., 0
The literal value for true, e.g., 1
A simple or subscripted identifier, e.g., "empty" or "flags[k]"
A function call, e.g., "testValid(assertions())"
An arithmetic relation, consisting of two arithmetic expressions (A) separated by a relational operator (< ≤ = ≥ > ≠) and enclosed in parentheses, e.g., (A₁ < A₂)
A Boolean expression prefixed by the not operator, e.g., ¬B
Two Boolean expressions separated by a Boolean operator (∨ ∧ ≡), e.g., B₁ ∧ B₂

Evaluation of Boolean expressions is strictly left to right. Parentheses must be used to specify any operator precedence. The not operator applies only to the expression immediately to its right.

(alpha ≥ beta) ∨ (gamma ≠ epsilon) ≡ override

Statements

The individual operations of a program are defined by statements. As is typical, statements are executed sequentially unless directed otherwise by a containing statement. Algol-58 is a free-form notation, without particular line-length or columnar conventions. Statements are separated by (not terminated by) the semicolon (;). The end of a line or record in the program text has no syntactic significance.

Algol-58 introduced the idea of a compound statement by using the symbols begin and end to bracket a sequence of individual statements. The compound statement may then be used in any place that an individual statement can be used, e.g., as the object of an if or for statement. There is no provision for an empty or null statement, so strictly speaking, a semicolon cannot precede an end.

Statements can be prefixed by a label and a colon (":"), which serves to declare the label. The label can be an identifier or an integer literal. If a compound statement has a label, then the terminating end of the compound statement may optionally repeat the label to indicate the span of the compound statement, e.g.,

start:
        begin
      x:= x+1;
      if (x < 100);
          go to start;
      end start

Declarations may be interspersed with statements. The order and placement of declarations within a program is immaterial. Algol-58 did not support the Algol-60 concept of blocks, where declarations within a compound statement create a nested identifier scope, which has visibility of the identifier scopes of all containing blocks. Declarations embedded within compound statements are interpreted as if they were placed outside any compound statement.

Assignment Statements

The assignment statement works similar to that in most languages. The identifier of a variable (possibly subscripted) on the left is assigned the value of the expression on the right. The assignment operator is ":=", a standalone equal sign ("=") being reserved for use in arithmetic relations.

If the variable on the left of the assignment operator is of an arithmetic type, the expression on the right must yield an arithmetic value. The Report implies that the value of the expression will be coerced as necessary to the type of the assignment variable, but does not explicitly say so. Mixed integer/real expressions are permitted, but the Report does not say how the resulting type of an expression should be determined.

If the variable on the left is arithmetic and the expression on the right is Boolean, the truth value of the expression will be converted to an integer 0 if it is false or 1 if it is true. The converse is not true, however -- only Boolean expressions can be assigned to Boolean variables.

Go To Statements

The go to works as in most languages, except that as noted earlier, the operand of a go to is a designational expression. Thus you can branch to a simple label or to an indexed switch identifier. As also mentioned earlier, all designational expressions in switches are evaluated dynamically using the current values of their index expressions. Index expressions are ordinal (i.e., one-relative). The Report is silent on the behavior of a switch when its index is less than one or greater than the number of designational expressions in its declaration.

If Statements and Alternative Statements

The if statement works as in most languages, except that it serves only as a guard for the next statement. There is no else in Algol-58. The guarded statement may of course be a compound statement. The condition to be tested may be any Boolean expression.

if (i > 0);
if (j < i);
j := i+1;

Algol-58 implements a variant of the if statement that functions as a skip-chain. Each if either or or if is evaluated in sequence, the statement associated with the first condition that evaluates true is executed, and any remaining conditions and statements are skipped. There is no provision for a default or catch-all statement if none of the specified conditions is met:

if either (i = 1);
      j := i+1;
or if (i = 2);
      j := i+2;
or if (i = 3);
      j := j+i;
or if (i > 3);
      begin
      if (j > 0);
          j := i;
      if (j < 0);
      j := 0;
      end
end

For Statements

The only iterative statement in Algol-58 is the for statement, which extended FORTRAN's idea and anticipated the complex for statement of Algol-60. Specification of the sequence of values over which iteration will occur has two forms. The first is a simply comma-delimited list of arithmetic expressions. The control variable is assigned the value of the next expression in the list at the start of each iteration. When the end of the list is reached, the loop terminates:

for x := 2, 4, 5, 9, 16, 27, 31, -1;
cell[x] := cell[x]×3;

The second form is more like that of FORTRAN, with initial, increment, and final values for the control variable. Each value may be an arithmetic expression. The middle expression in parentheses is the increment value:

for x := 1 (1) n-1;
      begin
      if (cell[x] < 0);
          cell[x] := -cell[x];
      if (cell[x] > 0);
          cell[x] := 1
   end

The two forms can be combined to iterate over a complex sequence of values:

for x := 1, 5, 8, 99 (-7) 9, 101, 127, i (j) k, 255;
begin ... end

Do Statements

The do statement is not so much an executable statement as a generator of executable statements. It is more in keeping with the idea of a macro. This is best illustrated with a simple example:

first: A := (B) / (C);
last: go to D;
...
do first, last (result→A, alpha×beta-1→B, gamma→C, top→D)

The do statement names a range of other statements identified by the labels of the first and last statements in the range. A one-statement range may be specified using a single label. The entire do statement is replaced by a copy of that range of statements, less the bracketing labels and any embedded declarations. Within the copy, each of the identifiers on the right of an arrow ("→") is replaced by the corresponding sequence of symbols on the left of the arrow. In the example above, the do statement would be replaced by:

result := (alpha×beta-1) / gamma;
go to top;

The range of statements copied by a do statement can itself contain do statements, in which case any nested do statements are replaced first, so that the final text contains no do statements.

Stop Statements

The stop statement terminates execution of the program. No resumption of execution from the stop is possible.

Procedure Declarations and Procedure Calls

The concept of a procedure in Algol-58 is that of a closed subroutine having possibly multiple entry and exit points. All identifiers referenced within the body of a procedure refer either to locally-declared variables and labels or to the formal parameters of the procedure. Global variables in the containing program are not visible within the procedure unless they are passed as parameters. Variables local to the procedure body are not visible outside the procedure.

Procedure Declaration

Parameters to a procedure are classified as either input parameters or output parameters and are specified in separate parameter lists delimited by a "=:" symbol. The general form of a procedure declaration is:

procedure I₁ (Pi₁) =: (Po₁), I₂ (Pi₂) =: (Po₂), I₃ (Pi₃) =: (Po₃), ...

where the I_n are the identifiers of entry points within the procedure. Each such identifier must also be a label within the body of the procedure that defines the location where that entry point will commence execution.

The Pi_n are lists of formal input parameters that supply values to the procedure, and the Po_n are lists of formal output parameters that receive results from the procedure. The formal variables in the lists can, of course, be different for the different entry points.

Formal input parameters may consist of a scalar identifier, an array identifier followed by some number of empty subscript positions in square brackets (to indicate the dimensionality), or a function or procedure identifier followed by some number of empty parameter positions within parentheses (to indicate the number of parameters to the formal function or procedure).

Formal output parameters may consist of a scalar identifier, an array identifier with empty subscript positions as for input parameters, or a label identifier. The output parameter list and its preceding "=:" can be omitted, in which case the procedure can be called as a function and be used within an expression. The return value is defined by assigning it to the entry label identifier as if it were an ordinary scalar variable, similar to the method used in Algol-60. See the example below for an illustration of this.

Following the procedure heading defining the entry points and parameter lists, there can be a set of declarations supplying type information for the formal input and output parameters. The type for label parameters is deduced from the context of their use in the body of the procedure. Surprisingly, array declarations for formal parameters can have expressions in their bounds lists, provided any identifiers in the expressions are formal parameters Undeclared formal parameters are of type real, as usual. The body of the procedure follows this, and must be enclosed in a begin/end pair. Any declarations within the body of the procedure refer only to local variables.

Procedure Call Statement

A procedure call statement states the identifier of a procedure entry point, followed by input and output lists for the actual parameters, with the two lists again delimited by "=:". The Report specifies that in the call, the replacement of formal parameters by actual parameters "may be considered to be a replacement of every occurrence within the procedure of the symbols (or sets of symbols) listed in formal parameters, by the symbols (or sets of symbols) listed as actual parameters in the corresponding positions of the procedure statement, after enclosing in parentheses every expression not enclosed completely in parentheses already." This seems to be an early form of call by name, which was more fully developed in Algol-60. There is no call by value, even for input parameters.

The report allows array slices to be passed to array parameters. The dimensionality of the actual parameter slice must match that of the formal parameter, i.e., the number of empty subscript positions in the actual and formal parameters must be the same. More interestingly, functions and procedures passed as actual parameters may have a mixture of supplied and empty parameters. Again, the number of empty parameters must match between the actual and formal parameters. This appears to imply support for currying or closures.

Exit from a procedure is accomplished using either the return statement (which can only occur within the body of a procedure) or a go to statement referencing (perhaps indirectly through a switch) a formal label in the output parameter list. Providing multiple exits from a subroutine was a common technique in machine-language programming at the time, so label parameters attempt to provide a similar facility. Each path of execution from an entry point must eventually reach a return or a go to referencing a formal label. The behavior if execution runs off the end of the procedure body without executing a return or a go to referencing a formal label is not defined.

Example Procedure

The Report show an example procedure declaration for Simpson's Rule integration. It is reproduced below, without its comments, with semicolons inserted between statements, and reformatted into a more modern style. Note that "F()" indicates a formal function with one parameter, and "abs()" is an intrinsic function defined by the Report with the customary meaning of absolute value:

procedure Simps (F(), a, b, delta, V);
      begin
Simps:
      Ibar := V×(b-a);
      n := 1;
      h := (b-a)/2;
      J := h×(F(a) + F(b));
J1:
      S := 0;
      for k := 1 (1) n;
      S := S+F(a + (2×k-1)×h);
      I := J + 4×h×S;
      if (delta < abs(I-Ibar));
          begin
              Ibar := I;
            J := (I+J)/4;
          n := 2×n;
          h := h/2;
            go to J1
          end;
      Simps := I/3;
      return;
      integer (k, n)
      end Simps

A call on this procedure might look like this, where "poly()" is some function or value-returning procedure defined elsewhere in the program:

area := Simps(poly(), x, x+20, 2₁₀-5, 5₁₀25);

Impact of the Preliminary Report

Perhaps the most enduring impact of the Preliminary Report was upon the continuing effort to refine the language, resulting two years later in publication of the 1960 "Report on the Algorithmic Language ALGOL 60" [4], followed by the 1963 "Revised Report on the Algorithmic Language Algol-60" [5].

As should be clear from comments in the overview above, the Algol-58 Preliminary Report is not a complete specification for a programming language, nor was it ever intended to be. The first sentence in the report makes this clear:

Note: In the interest of immediate circulation of the results of the ACM-GAMM committee work on an algebraic programming language, this preliminary report is presented. ... This report has not been thoroughly examined for errors and inconsistencies.

The Report leaves many things unsaid, and not a few unclear. It should be viewed for what it was, a progress report on an effort to define a universal algorithmic language. Many ideas that became more fully developed in Algol-60 appear in this Report, and no doubt a great deal was learned from its preparation about how a language should be specified, leading to the development of Backus-Naur Form and the elegant syntax-and-semantics presentation of the Algol-60 report.

Nonetheless, the Report was good enough, and the ideas in it attractive enough, to get a lot of creative juices flowing.

The first compiler for Algol-58 was running by the end of 1958 on a Zuse Z22 computer in Mainz, Germany. It was constructed by the so-called ZMMD group, consisting of Friedrich Bauer, Hermann Bottenbruch, Heinz Rutishauser, and Klaus Samelson, the GAMM representatives to the joint working session that produced the Report. That this compiler was available only six months after the joint session may indicate the extent to which the GAMM proposals formed the core of the preliminary language and how far along the ZMMD group was towards creating a compiler at the time of the joint session.

Availability of the Preliminary Report quickly launched the development of many other hardware implementations of Algol-58, all deviating in greater or lesser degrees from the reference language described by the Report. The best known of these are:

ALGO (Bendix Corporation, Los Angeles, California: Bendix G-15)
ALGOL 30 (Dartmouth College, Hanover, New Hampshire: Librascope LGP-30)
BALGOL (Burroughs Corporation, Pasadena, California: Burroughs 220)
JOVIAL (System Development Corporation, Santa Monica, California: multiple targets)
MAD (University of Michigan, Ann Arbor, Michigan: IBM 7090, Univac 1108)
NELIAC (Naval Electronics Laboratory, San Diego, California: multiple targets)

BALGOL followed the syntax and semantics Preliminary Report quite closely and implemented most of its features, although sometimes with slightly different syntax. In the next post [1], we will examine BALGOL in some detail and see its adaptation of Algol-58.

References

[1] "BALGOL – the Burroughs Algebraic Compiler,"
http://datatron.blogspot.com/2018/08/balgol-burroughs-algebraic-compiler.html
[2] "A Dozen Precursors of FORTRAN," Donald Knuth, lecture on 2003-12-03, Computer History Museum, http://www.computerhistory.org/collections/catalog/102622137.
[3] "Preliminary Report--International Algebraic Language," Communications of the ACM, vol. 1 #12, December 1958, pp. 8-22,
http://www.softwarepreservation.org/projects/ALGOL/report/Algol58_preliminary_report_CACM.pdf.
[4] "Report on the Algorithmic Language ALGOL 60", ed. Peter Naur, Communications of the ACM, vol. 3, #5, May 1960, p.299-314,
http://www.softwarepreservation.org/projects/ALGOL/standards/report/Algol60_report_CACM_1960_June.pdf.
[5] "Revised Report on the Algorithmic Language Algol-60," ed. P. Naur, Communications of the ACM, vol. 6 #1, January 1963, p. 1-17,
http://www.softwarepreservation.org/projects/ALGOL/report/Algol60_report_CACM_1960_June.pdf.

Wednesday, July 18, 2018

retro-220 Emulator Version 1.00 Released

Development of the retro-220 emulator for the Burroughs 220 has reached the point where we can consider it to be functionally complete and reasonably-well debugged. Therefore, with this release I have advanced the version number to 1.00 and announce this to be the first official release of the emulator.

This is Paul Kimpel with another guest post on Tom's 205 and 220 blog. Version 1.00 is a relatively minor release, consisting of some important bug fixes, a bit of general tuning and tweaking, and a couple of minor new features. This post will describe those changes.

The source code and supporting materials for version 1.00 of the emulator are available from the project repository. The wiki pages have been updated with the corrections and new features for this release. You can download the source from the repository and run it from your own web server. Alternatively, feel free to run it from our hosting site. Please see the Getting Started wiki page for instructions.

In addition to version 1.00, I am announcing the first release of the recovered Burroughs Algebraic Compiler for the 220, also known as BAC-220 or BALGOL. That is briefly discussed at the end of this post.

New Features

The most visible new feature is on the Control Console. For the few past releases, there have been some mysterious numbers in the lower-left corner of the panel that have never been explained. This was a preliminary display of run-time statistics that are maintained internally by the emulator's Processor module. With this release, I have added a new statistic and reworked the display in an attempt to be more meaningful. That portion of the Console panel now looks like this:

Control Console Run-time Statistics

The display consists of four numbers:

n.nn=D -- Average Delay Deviation Time (milliseconds). The emulator frequently inserts time delays into its operation, either to throttle performance so it will match that of a real 220, or to model the timing of peripheral device I/Os. Generating accurate delays in a browser environment is difficult, so the emulator has a mechanism that accumulates the difference between requested delays and actual delay times, feeding back the deviations to adjust future delays in an attempt to maintain timing accuracy in the average. The "=D" number is the exponential running average deviation for Processor throttling delays. Numbers less that 4ms are typical; numbers consistently greater than 4ms may indicate that emulated time is falling behind real time. The number may go negative for reasons that are too complex to explain here, but see this this discussion for the retro-205 emulator, which uses a similar delay adjustment mechanism.
nn.nn%S -- Average Processor Slack. This is the exponential running average percentage of real time the emulated 220 Processor is not using the real processor in your workstation. This is another measure of how well emulated time is staying ahead of real time. Values greater than 70% can be considered to be healthy. Values less than 50% can be problematic, as the browser must do things other than execute the emulator's Javascript code, e.g., render updates to the windows and garbage-collect memory.
n.nn=R -- Average Processor Slice Run Time (milliseconds). To throttle performance, the emulated Processor runs in spurts. It typically runs for a time slice of about 10ms emulated, then compares the emulated time it has accumulated for individual instructions against the browser's real-world clock and inserts a delay to allow real-world time to catch up with emulated time. The time slice may terminate early, usually because an I/O was initiated. The number is the exponential running average real time that these slices are running. The value will varying depending upon the speed of your workstation's processor and the nature of the 220 program you are running, but I usually see values in the range 0.01 - 0.03ms.
nnn,nnn=I -- Instruction Counter. This number counts up by one for every 220 instruction that is executed. The count is incremented at the beginning the Execute Phase in the Processor.

The first three statistics accumulate as long as the emulator is "powered up." They can only be reset by powering down the emulator and powering it up again. The Instruction Counter is reset whenever the Interval Timer is reset, either at power-up or by clicking the black ZERO TIMER button to the right of the Instruction Counter.

Related to the revised timing statistics, the mechanism within the Processor that maintains emulated time has been revised to be more accurate, especially during I/O operations. During an I/O, the Processor is idle except to service memory requests from the I/O devices. Thus emulated time continues to accumulate even though the Processor is not executing instructions.

As part of the revisions to the Processor's timing mechanism, the deviation adjustment algorithm mentioned in #1 above has been redesigned for the umpteenth time. This algorithm is implemented in the emulator's global setCallback() function that centrally manages timing delays. That function is common across the retro-220, retro-205 and retro-b5500 emulators, with each project continuing to contribute to its ongoing refinement.

Corrections

If there is one instruction in the 220 that has given me fits, it is Compare Field A/R (CFA/CFR, op code 18). This instruction compares a partial-word field (or optionally the whole word) in the A or R register against a corresponding field of a word in memory, setting the LOW/EQUAL/HIGH comparison toggles depending on whether the field in the register is less than, equal to, or greater than the field in the memory word, respectively.

Comparing values like this is relatively straightforward at the hardware level -- you effectively subtract the values and look to see if the result is zero (meaning equal), and if not, you look at the sign of the result, with negative indicating a low condition. That is the way it was done on the 220, unless the specified field includes the sign digit. In that case, the 220 used a rather bizarre collation, which if you are interested, you can read about in the Operational Characteristics manual on page 2-15, under Remarks.

The people who wrote the BALGOL compiler were very focused on table storage efficiency, and frequently used the sign digit as a one-digit field. They also really knew how CFA/CFR worked, and used it to advantage. I have encountered at least three cases during debugging of the compiler where my knowledge of how that instruction should work proved deficient, resulting in the compilation going off into Never-Never Land. The most recent incident involved the compiler's Overlay module comparing a word with a sign of 7 to a word with a sign of 0, and the emulator producing the wrong answer. That bug is now fixed, but I won't be surprised if this is not the last of my troubles with that instruction.

Other, more minor problems that have been corrected include:

When reading data from a Cardatron card reader or magnetic tape, it is possible to have the Processor add the contents of the B register to the low-order four digits of selected words before storing them in memory. This was generally done to relocate operand addresses in instruction words as a program was being loaded. The emulator was isolating the operand address incorrectly before applying the value in the B register to it, corrupting the value of the word. The emulator now performs this B-register modification correctly.
Tom Sawyer recently sent me some Burroughs correspondence that indicated Processor alarm conditions should be cleared when the Reset/Transfer switch is activated. The emulator was not doing this, and that oversight is now corrected in this release.
The control panel for teletype printers was not correctly displaying the tab stop positions that had been set for the device. It now displays the tab stops properly.
The speed for teletype printers operating in "Whippet" mode has been reduced from 5000 characters/second to 1000. The paper-advance delay has been reduced from an average of 200ms per line to 75ms.
On the Console register displays, you toggle the state of the lamps (and the bits they represent) by clicking on them. On a real 220, you would set bits by pressing the small, white, rectangular button below each lamp. The intent was that the buttons would work this way in the emulator as well, but they were not responding to clicks. This has been fixed, and you can now toggle a lamp by clicking on either the lamp or its white button.
Paper-tape image files for the emulator are ordinary ASCII text files. They represent each 220 word on the tape as a separate line in the file. For numeric words, the line consists of 11 digits. For alphanumeric words -- indicated by a sign digit of 2 -- the word is represented as five alphanumeric characters after the sign digit. This representation has a problem when the word has trailing space characters and those trailing spaces have been trimmed from the line, perhaps by a text editor. The paper tape reader will now supply trailing spaces as necessary if a line with a sign digit of 2 has less than five characters following the sign.

BALGOL – The Burroughs Algebraic Compiler

Recovery and restoration of the BALGOL compiler for the 220 has been one of the main goals of this project from its inception. BALGOL is a dialect of the preliminary International Algorithmic Language, also known as Algol-58, which was a precursor to the better-known Algol-60. Work on transcription of the compiler source code from listings started in the Fall of 2016, actually before work on the emulator itself.

Getting the compiler to run again has involved transcribing over 10,000 lines of assembly coding, writing two cross-assemblers to translate the transcriptions into 220 object code, ferreting out the contents and organization of the magnetic tape on which the compiler was distributed, and inventing out of whole cloth the several small utility card decks that went along with that tape. It has been a lot of work, but in the end very gratifying.

The compiler first started working at the end of 2017. Since then, I have been preparing example programs, testing the compiler against them, and resolving the inevitable errors that have been uncovered -- some of them taking me weeks of part-time effort to resolve. There have been problems with everything -- the transcriptions, the assemblers, the emulator, and not least, my flawed understanding of correct BALGOL syntax.

I am now pleased to announce the availability of BALGOL for general use. The compiler passes all of my tests, including all of the example programs published in Section 11 of its compiler reference manual. Everything necessary to run the compiler in the retro-220 emulator is available from the project repository. Wiki pages are also available describing how to use the compiler and how to build the compiler from scratch. You are welcome to try it out for yourself. Note that the current version of the compiler requires emulator version 1.00 in order to operate properly.

I am preparing a series of blog posts describing the BALGOL dialect, the effort to recover the compiler and make it run, and an overview of its operation. These will be posted over the next weeks, so stay tuned.

The Burroughs 205 and 220 Blog

Welcome to the Datatron 205 and 220 Blog