One of the first attempts at designing a universal programming language, at least for numeric computation, was the International Algebraic Language (IAL), also known as Algol-58. The Burroughs Algebraic Compiler (BALGOL) for the 220 computer system was an early dialect of Algol-58. To prepare for a detailed discussion of BALGOL on the 220 in the next post [1], this post will discuss the development of Algol-58 and present an overview of the language.
Background
Initially, preparation of the program steps for machine execution was done entirely on the machine's terms. Instructions were prepared directly using the numeric or alphanumeric code required by the hardware. Hence the popular use of "code" and "coding" as synonyms for programs and programming. There is a lot of manual bookkeeping involved in programming at this level, especially in keeping track of addresses within the code for data storage and branch locations. That bookkeeping task is compounded when changes must be made to an existing program.
Mnemonic notations arose fairly quickly to ease the clerical task of coding, along with a number of higher-level abstractions, such as flowcharting, to aid in the overall conception and design of programs. Not long after, people began to realize that the machine itself could assist with the clerical aspects of generating code, especially in keeping track of address assignments. This led to the development of assembly languages and the assembler programs to process them.
From the very beginning of automated computing, though, there has been a strong desire to be able to approach programming more on our terms, in a way that is more in keeping with the way we think about the applications to which we want to apply a computer. Since most early applications involved numerical calculations based on mathematical formulae, there was an obvious desire to express programs using standard notations from mathematics. FORTRAN is one of the earliest and best known realizations of this desire, but as Donald Knuth made clear in his 2003 lecture, "A Dozen Precursors of FORTRAN" [2], attempts to express programs using higher-level abstractions can be traced all the way back to Konrad Zuse during the 1940s.
A parallel desire has been to be able to express programs in a more general way than in the coding for a specific type of machine. The half-life of a given computer design in the 1950s was only a couple of years, so as new designs came along, existing programs had to be completely redone using the instruction coding for each new machine. Being able to express programs in a way that was not specific to any particular machine would ease this burden. It would also open the possibility of more easily adapting programs written for one machine so that they could run on entirely different ones.
Many efforts towards translation of standard arithmetic expressions into machine instructions were going on, starting in the early 1950s. A preliminary report on FORTRAN was available by 1954, but other languages were starting to appear, such as Alick Glennie's Autocode, Alan Perlis' Internal Translator (IT, also known as the Purdue Compiler), and Grace Hopper's A0 and MATH-MATIC. There were also a number of interpretive expression evaluators, such as the Whirlwind Algebraic Interpreter. By the second half of the 1950s, the number of different (and incompatible) programming notations was beginning to proliferate.
In 1955, the German Association for Applied Mathematics and Mechanics (GAMM) held a meeting in Darmstadt on electronic computers. This resulted in the formation of a committee to work on formula translation. Then in May 1957, the Association for Computing Machinery (ACM), SHARE (the IBM user organization), USE (the Sperry Univac user organization), and DUO (the ElectroData/Burroughs user organization) held a conference in Los Angeles to explore the exchange of computing information. The group concluded that a single, universal programming language would be very desirable, and recommended that the ACM form a committee to pursue the idea. A few months later, the GAMM group reached a similar conclusion and contacted the ACM to suggest a joint conference that would make recommendations on a common language.
By April 1958, both the ACM and GAMM committees had independently developed proposals for a universal language and met to exchange them. They found they had much in common, so arranged to hold a joint working session between representatives of each group to combine their proposals and resolve the differences. That joint session was charged with the following objectives [3]:
- The new language should be as close as possible to standard mathematical notation and be readable with little further explanation.
- It should be possible to use it for the description of computing processes in publications.
- The language should be translatable into machine programs.
FORTRAN had been released by IBM for almost a year by the time of the 1958 ACM-GAMM joint session in Zurich. FORTRAN was being received warmly, especially by the scientific community. It was not exactly an attractive notation, though, and was certainly not an ideal way to exchange definitions of algorithmic processes, especially among people accustomed to the elegance of mathematics. Even John Backus, the driving force behind FORTRAN, could not have been completely pleased with it, as he was one of the ACM representatives to the Zurich joint session. Thus, there was motivation to come up with something much better.
Overview of Algol-58
This section attempts to describe the IAL as presented in the Preliminary Report in somewhat informal terms. It largely follows the structure of that Report.
Symbology
Algol-58 uses a 63-character set, which allowed for a number of traditional mathematical symbols, and may have been influenced by six-bit codes for teletype and punched paper-tape equipment. Early FORTRAN, of course, had been limited by the 48-character set used in IBM punched-card equipment.
The basic symbols defined for the language are:
- The 26 English letters: A-Z [including both upper and lower case; the report is silent on the subject of case sensitivity]
- The decimal digits: 0-9
- Arithmetic operators: + – × /
- Relational operators: < ≤ = ≥ > ≠
- Logical operators: ¬ [not] ∨ [or] ∧ [and] ≡ [equivalence]
- Sequential operators: go to, do, return, stop, for, if, or, if either, or if
- Separators: . , : ; := =: → 10 begin end
- Brackets: ( ) [ ] ↑ ↓
- Declarators: procedure, array, switch, type, comment
Note that, as with Algol-60, the boldface words in the lists above are considered not to be identifiers or sequences of characters, but atomic symbols in their own right. Thus, "go to" is not two words or five characters, but a unique symbol in the language. The subscripted "10" is used to indicate the power-of-10 scaling factor for numbers in scientific notation.
Also note that this is the symbol set for the reference language, which is intended to specify the language itself and not necessarily be used for coding programs or publishing algorithms. The report anticipated that a more restricted set of characters, a hardware representation, would need to be used by an actual compiler targeting a real machine. It also anticipated that an expanded set of characters might be used in a publication language used to communicate algorithms in the literature. Each instance of a publication language or hardware representation is to be accompanied by a set of transliteration rules for mapping to and from the reference language.
Data Types and Declarations
The Algol-58 report specifies three data types – integer, real (floating-point), and Boolean (logical) values, plus arrays of these entities. It appears to anticipate the use of additional types, but is silent on how they should be handled.
Literals
Numeric literals have the general form I.F10±E, where I is a whole integer, .F is a decimal fraction, and 10±E is a scaling factor as a power of ten. The fraction and scale factor are optional, as is the sign preceding the exponent of the scaling factor. A number could also be expressed as simply a scaling factor by omitting the integer 1 before it, e.g., "10-3" for 0.001.
Boolean literals are represented as the integer literals 0 [false] and 1 [true].
Identifiers and and Types
Identifiers are defined in the usual way, starting with a letter and optionally followed by a string of letters and decimal digits. No maximum length for an identifier is specified.
By default, all identifiers are considered to refer to variables of type real. A type declaration can be used to specify that an identifier refers to an integer or Boolean variable. Note that the type declaration does not define a variable or assign it storage. The declaration merely associates a type with an identifier. Variables are declared upon first reference to their identifiers, which need not appear in any type declaration.
integer (i, j, k); Boolean (on, off);
Array Declarations
Array declarations name a list of identifiers to be considered as arrays and to specify the bounds of each array's dimensions. The bounds notation is a little different from Algol-60 -- all of the lower bounds are specified first in a comma-delimited list, then a colon (:), then the corresponding upper bounds are specified in a second comma-delimited list. The Report specifies the bounds to be "integers," so it appears that dynamic array sizing is not supported. Multiple array identifiers can be associated with a common set of bounds, as in Algol-60. If an array is to be of a type other than real, it must appear in a type declaration, and must be followed by square brackets containing sufficient commas to indicate the dimensionality of the array:
integer i, j, m[], n[,,];
array a, b, c[1:10], d[0, 0:7,19], m[-7:7], n[1,1:9,9];
Switch Declarations
A switch defines a multi-way branch, after the idea of the computed go-to in FORTRAN, but essentially the same as switches in Algol-60. Note that it is a declaration, not a statement. It defines an ordinal list of designational expressions, which are either simple statement labels or indexed switch identifiers. A statement label may be an identifier or a literal integer.
A switch is used by the go to statement, which takes as its target not a label, but the more general designational expression construct. The index used with a switch identifier is evaluated dynamically, not only for a switch named in a go to statement, but also for any switch identifiers used in designational expressions within switch declarations. Thus the evaluation of a designational expression may result in a cascade of switches being evaluated before a label is finally encountered and the destination of a branch determined.
switch s1:= (start, 44, s2[i×j-1], s1[k]);
switch s2:= (5, 4, 3, 2, 1, blastoff);
go to s1[3];
Comment Declarations
Algol-58 considers a comment to be a declaration. It can appear between any instance of other declarations and statements, but otherwise has no effect on the meaning of the program. Like the Algol-60 comment, it starts with the symbol comment and continues until the next semicolon (;) is encountered.
Function Declarations
Algol-58 supports the declaration of an expression as a function, similar to the in-line function of FORTRAN. The function defines a list of parameters, the actual values of which will be substituted into the expression. Any variables in the expression not named in the parameter list are referenced globally in the containing program.
toRads(degrees) := degrees×3.1415926/180;The language also supports procedure declarations and calls. These are a more complex subject, and are discussed in a separate section below.
Pythagorean(a, b) := sqrt(a↑2↓ + b↑2↓);
length := Pythagorean(base, height);
Expressions
Expressions in Algol-58 are formed in the usual way, from combinations of variable references, function calls, and operators of various types. The values of scalar variables are referenced as simple identifiers. Elements of arrays are referenced as the identifier for the array, followed by a comma-delimited list of subscripts in square brackets ([ ]). Subscripts can be general arithmetic expressions, which are rounded to integers before indexing into the array.
Functions (and procedures that return values as a function) are evaluated by naming the function, followed by a comma-delimited list of expressions in parentheses corresponding by position to the parameters of the function, as shown in the example for the Pythagorean function declaration above.
Arithmetic Expressions
An arithmetic expression, A, may take any of the following forms:
- A numeric literal, e.g., "123.456"
- A simple identifier, e.g., "alpha"
- A subscripted identifier, e.g., "points[i, j]"
- A function call, e.g., "Pythagorean(x, y)"
- An arithmetic expression prefixed by "+" or "–", e.g., "–A"
- Two arithmetic expressions joined by an arithmetic operator (+ - × /), e.g., "A/7.0"
- One arithmetic expression followed by a second arithmetic expression in up/down arrow brackets to indicate the first expression raised to the power of the second expression, e.g., "A1↑A2↓"
- An arithmetic expression in parentheses, e.g., "(A)"
alpha
alpha + beta / gamma
(alpha + beta) / gamma
a↑2↓ / 2↑b-1↓
(m[i] + n[j, k])×(m[j] - n[k, i])
Boolean Expressions
A Boolean expression, B, may take the following forms:
- The literal value for false, e.g., 0
- The literal value for true, e.g., 1
- A simple or subscripted identifier, e.g., "empty" or "flags[k]"
- A function call, e.g., "testValid(assertions())"
- An arithmetic relation, consisting of two arithmetic expressions (A) separated by a relational operator (< ≤ = ≥ > ≠) and enclosed in parentheses, e.g., (A1 < A2)
- A Boolean expression prefixed by the not operator, e.g., ¬B
- Two Boolean expressions separated by a Boolean operator (∨ ∧ ≡), e.g., B1 ∧ B2
(alpha ≥ beta) ∨ (gamma ≠ epsilon) ≡ override
Statements
The individual operations of a program are defined by statements. As is typical, statements are executed sequentially unless directed otherwise by a containing statement. Algol-58 is a free-form notation, without particular line-length or columnar conventions. Statements are separated by (not terminated by) the semicolon (;). The end of a line or record in the program text has no syntactic significance.
Algol-58 introduced the idea of a compound statement by using the symbols begin and end to bracket a sequence of individual statements. The compound statement may then be used in any place that an individual statement can be used, e.g., as the object of an if or for statement. There is no provision for an empty or null statement, so strictly speaking, a semicolon cannot precede an end.
Statements can be prefixed by a label and a colon (":"), which serves to declare the label. The label can be an identifier or an integer literal. If a compound statement has a label, then the terminating end of the compound statement may optionally repeat the label to indicate the span of the compound statement, e.g.,
start:Declarations may be interspersed with statements. The order and placement of declarations within a program is immaterial. Algol-58 did not support the Algol-60 concept of blocks, where declarations within a compound statement create a nested identifier scope, which has visibility of the identifier scopes of all containing blocks. Declarations embedded within compound statements are interpreted as if they were placed outside any compound statement.
begin
x:= x+1;
if (x < 100);
go to start;
end start
Assignment Statements
The assignment statement works similar to that in most languages. The identifier of a variable (possibly subscripted) on the left is assigned the value of the expression on the right. The assignment operator is ":=", a standalone equal sign ("=") being reserved for use in arithmetic relations.
If the variable on the left of the assignment operator is of an arithmetic type, the expression on the right must yield an arithmetic value. The Report implies that the value of the expression will be coerced as necessary to the type of the assignment variable, but does not explicitly say so. Mixed integer/real expressions are permitted, but the Report does not say how the resulting type of an expression should be determined.
If the variable on the left is arithmetic and the expression on the right is Boolean, the truth value of the expression will be converted to an integer 0 if it is false or 1 if it is true. The converse is not true, however -- only Boolean expressions can be assigned to Boolean variables.
Go To Statements
The go to works as in most languages, except that as noted earlier, the operand of a go to is a designational expression. Thus you can branch to a simple label or to an indexed switch identifier. As also mentioned earlier, all designational expressions in switches are evaluated dynamically using the current values of their index expressions. Index expressions are ordinal (i.e., one-relative). The Report is silent on the behavior of a switch when its index is less than one or greater than the number of designational expressions in its declaration.
If Statements and Alternative Statements
The if statement works as in most languages, except that it serves only as a guard for the next statement. There is no else in Algol-58. The guarded statement may of course be a compound statement. The condition to be tested may be any Boolean expression.
if (i > 0);Algol-58 implements a variant of the if statement that functions as a skip-chain. Each if either or or if is evaluated in sequence, the statement associated with the first condition that evaluates true is executed, and any remaining conditions and statements are skipped. There is no provision for a default or catch-all statement if none of the specified conditions is met:
if (j < i);
j := i+1;
if either (i = 1);
j := i+1;
or if (i = 2);
j := i+2;
or if (i = 3);
j := j+i;
or if (i > 3);
begin
if (j > 0);
j := i;
if (j < 0);
j := 0;
end
end
For Statements
The only iterative statement in Algol-58 is the for statement, which extended FORTRAN's idea and anticipated the complex for statement of Algol-60. Specification of the sequence of values over which iteration will occur has two forms. The first is a simply comma-delimited list of arithmetic expressions. The control variable is assigned the value of the next expression in the list at the start of each iteration. When the end of the list is reached, the loop terminates:
for x := 2, 4, 5, 9, 16, 27, 31, -1;The second form is more like that of FORTRAN, with initial, increment, and final values for the control variable. Each value may be an arithmetic expression. The middle expression in parentheses is the increment value:
cell[x] := cell[x]×3;
for x := 1 (1) n-1;The two forms can be combined to iterate over a complex sequence of values:
begin
if (cell[x] < 0);
cell[x] := -cell[x];
if (cell[x] > 0);
cell[x] := 1
end
for x := 1, 5, 8, 99 (-7) 9, 101, 127, i (j) k, 255;
begin ... end
Do Statements
The do statement is not so much an executable statement as a generator of executable statements. It is more in keeping with the idea of a macro. This is best illustrated with a simple example:
first: A := (B) / (C);The do statement names a range of other statements identified by the labels of the first and last statements in the range. A one-statement range may be specified using a single label. The entire do statement is replaced by a copy of that range of statements, less the bracketing labels and any embedded declarations. Within the copy, each of the identifiers on the right of an arrow ("→") is replaced by the corresponding sequence of symbols on the left of the arrow. In the example above, the do statement would be replaced by:
last: go to D;
...
do first, last (result→A, alpha×beta-1→B, gamma→C, top→D)
result := (alpha×beta-1) / gamma;The range of statements copied by a do statement can itself contain do statements, in which case any nested do statements are replaced first, so that the final text contains no do statements.
go to top;
Stop Statements
The stop statement terminates execution of the program. No resumption of execution from the stop is possible.
Procedure Declarations and Procedure Calls
The concept of a procedure in Algol-58 is that of a closed subroutine having possibly multiple entry and exit points. All identifiers referenced within the body of a procedure refer either to locally-declared variables and labels or to the formal parameters of the procedure. Global variables in the containing program are not visible within the procedure unless they are passed as parameters. Variables local to the procedure body are not visible outside the procedure.
Procedure Declaration
Parameters to a procedure are classified as either input parameters or output parameters and are specified in separate parameter lists delimited by a "=:" symbol. The general form of a procedure declaration is:
procedure I1 (Pi1) =: (Po1), I2 (Pi2) =: (Po2), I3 (Pi3) =: (Po3), ...where the In are the identifiers of entry points within the procedure. Each such identifier must also be a label within the body of the procedure that defines the location where that entry point will commence execution.
The Pin are lists of formal input parameters that supply values to the procedure, and the Pon are lists of formal output parameters that receive results from the procedure. The formal variables in the lists can, of course, be different for the different entry points.
Formal input parameters may consist of a scalar identifier, an array identifier followed by some number of empty subscript positions in square brackets (to indicate the dimensionality), or a function or procedure identifier followed by some number of empty parameter positions within parentheses (to indicate the number of parameters to the formal function or procedure).
Formal output parameters may consist of a scalar identifier, an array identifier with empty subscript positions as for input parameters, or a label identifier. The output parameter list and its preceding "=:" can be omitted, in which case the procedure can be called as a function and be used within an expression. The return value is defined by assigning it to the entry label identifier as if it were an ordinary scalar variable, similar to the method used in Algol-60. See the example below for an illustration of this.
Following the procedure heading defining the entry points and parameter lists, there can be a set of declarations supplying type information for the formal input and output parameters. The type for label parameters is deduced from the context of their use in the body of the procedure. Surprisingly, array declarations for formal parameters can have expressions in their bounds lists, provided any identifiers in the expressions are formal parameters Undeclared formal parameters are of type real, as usual. The body of the procedure follows this, and must be enclosed in a begin/end pair. Any declarations within the body of the procedure refer only to local variables.
Procedure Call Statement
A procedure call statement states the identifier of a procedure entry point, followed by input and output lists for the actual parameters, with the two lists again delimited by "=:". The Report specifies that in the call, the replacement of formal parameters by actual parameters "may be considered to be a replacement of every occurrence within the procedure of the symbols (or sets of symbols) listed in formal parameters, by the symbols (or sets of symbols) listed as actual parameters in the corresponding positions of the procedure statement, after enclosing in parentheses every expression not enclosed completely in parentheses already." This seems to be an early form of call by name, which was more fully developed in Algol-60. There is no call by value, even for input parameters.
The report allows array slices to be passed to array parameters. The dimensionality of the actual parameter slice must match that of the formal parameter, i.e., the number of empty subscript positions in the actual and formal parameters must be the same. More interestingly, functions and procedures passed as actual parameters may have a mixture of supplied and empty parameters. Again, the number of empty parameters must match between the actual and formal parameters. This appears to imply support for currying or closures.
Exit from a procedure is accomplished using either the return statement (which can only occur within the body of a procedure) or a go to statement referencing (perhaps indirectly through a switch) a formal label in the output parameter list. Providing multiple exits from a subroutine was a common technique in machine-language programming at the time, so label parameters attempt to provide a similar facility. Each path of execution from an entry point must eventually reach a return or a go to referencing a formal label. The behavior if execution runs off the end of the procedure body without executing a return or a go to referencing a formal label is not defined.
Example Procedure
The Report show an example procedure declaration for Simpson's Rule integration. It is reproduced below, without its comments, with semicolons inserted between statements, and reformatted into a more modern style. Note that "F()" indicates a formal function with one parameter, and "abs()" is an intrinsic function defined by the Report with the customary meaning of absolute value:
procedure Simps (F(), a, b, delta, V);A call on this procedure might look like this, where "poly()" is some function or value-returning procedure defined elsewhere in the program:
begin
Simps:
Ibar := V×(b-a);
n := 1;
h := (b-a)/2;
J := h×(F(a) + F(b));
J1:
S := 0;
for k := 1 (1) n;
S := S+F(a + (2×k-1)×h);
I := J + 4×h×S;
if (delta < abs(I-Ibar));
begin
Ibar := I;
J := (I+J)/4;
n := 2×n;
h := h/2;
go to J1
end;
Simps := I/3;
return;
integer (k, n)
end Simps
area := Simps(poly(), x, x+20, 210-5, 51025);
Impact of the Preliminary Report
Perhaps the most enduring impact of the Preliminary Report was upon the continuing effort to refine the language, resulting two years later in publication of the 1960 "Report on the Algorithmic Language ALGOL 60" [4], followed by the 1963 "Revised Report on the Algorithmic Language Algol-60" [5].
As should be clear from comments in the overview above, the Algol-58 Preliminary Report is not a complete specification for a programming language, nor was it ever intended to be. The first sentence in the report makes this clear:
Note: In the interest of immediate circulation of the results of the ACM-GAMM committee work on an algebraic programming language, this preliminary report is presented. ... This report has not been thoroughly examined for errors and inconsistencies.The Report leaves many things unsaid, and not a few unclear. It should be viewed for what it was, a progress report on an effort to define a universal algorithmic language. Many ideas that became more fully developed in Algol-60 appear in this Report, and no doubt a great deal was learned from its preparation about how a language should be specified, leading to the development of Backus-Naur Form and the elegant syntax-and-semantics presentation of the Algol-60 report.
Nonetheless, the Report was good enough, and the ideas in it attractive enough, to get a lot of creative juices flowing.
The first compiler for Algol-58 was running by the end of 1958 on a Zuse Z22 computer in Mainz, Germany. It was constructed by the so-called ZMMD group, consisting of Friedrich Bauer, Hermann Bottenbruch, Heinz Rutishauser, and Klaus Samelson, the GAMM representatives to the joint working session that produced the Report. That this compiler was available only six months after the joint session may indicate the extent to which the GAMM proposals formed the core of the preliminary language and how far along the ZMMD group was towards creating a compiler at the time of the joint session.
Availability of the Preliminary Report quickly launched the development of many other hardware implementations of Algol-58, all deviating in greater or lesser degrees from the reference language described by the Report. The best known of these are:
- ALGO (Bendix Corporation, Los Angeles, California: Bendix G-15)
- ALGOL 30 (Dartmouth College, Hanover, New Hampshire: Librascope LGP-30)
- BALGOL (Burroughs Corporation, Pasadena, California: Burroughs 220)
- JOVIAL (System Development Corporation, Santa Monica, California: multiple targets)
- MAD (University of Michigan, Ann Arbor, Michigan: IBM 7090, Univac 1108)
- NELIAC (Naval Electronics Laboratory, San Diego, California: multiple targets)
References
[1] "BALGOL – the Burroughs Algebraic Compiler,"
http://datatron.blogspot.com/2018/08/balgol-burroughs-algebraic-compiler.html
[2] "A Dozen Precursors of FORTRAN," Donald Knuth, lecture on 2003-12-03, Computer History Museum, http://www.computerhistory.org/collections/catalog/102622137.
[3] "Preliminary Report--International Algebraic Language," Communications of the ACM, vol. 1 #12, December 1958, pp. 8-22,
http://www.softwarepreservation.org/projects/ALGOL/report/Algol58_preliminary_report_CACM.pdf.
[4] "Report on the Algorithmic Language ALGOL 60", ed. Peter Naur, Communications of the ACM, vol. 3, #5, May 1960, p.299-314,
http://www.softwarepreservation.org/projects/ALGOL/standards/report/Algol60_report_CACM_1960_June.pdf.
[5] "Revised Report on the Algorithmic Language Algol-60," ed. P. Naur, Communications of the ACM, vol. 6 #1, January 1963, p. 1-17,
http://www.softwarepreservation.org/projects/ALGOL/report/Algol60_report_CACM_1960_June.pdf.
No comments:
Post a Comment