META II
META II is a domain-specific programming language for writing compilers. It was created in 1963–1964 by Dewey Val Schorre at UCLA. META II uses what Schorre called syntax equations. Its operation is simply explained as:
Each syntax equation is translated into a recursive subroutine which tests the input string for a particular phrase structure, and deletes it if found.[1]
META II programs are compiled into an interpreted byte code language. VALGOL and SMALGOL compilers illustrating its capabilities were written in the META II language.[1][2] VALGOL is a simple algebraic language designed for the purpose of illustrating META II. SMALGOL was a fairly large subset of ALGOL 60.
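As a rough illustration of the quoted description, the following Python sketch shows how one syntax equation might become a recursive subroutine that tests the input and deletes the matched phrase on success. The Scanner class, its test method, and the sample equation nest = '(' $ nest ')' are illustrative assumptions; none of them is taken from Schorre's paper or from the META II byte code.

```python
class Scanner:
    """Holds the remaining input (hypothetical helper for illustration)."""

    def __init__(self, text):
        self.text = text

    def test(self, literal):
        """Test for a literal after skipping blanks; delete it if found."""
        rest = self.text.lstrip()
        if rest.startswith(literal):
            self.text = rest[len(literal):]  # delete the recognized phrase
            return True
        return False  # input is left untouched on failure


def nest(s):
    """Sketch of the assumed equation: nest = '(' $ nest ')' ;"""
    if not s.test('('):
        return False          # first test failed, so the equation fails
    while nest(s):            # $ : zero or more nested groups
        pass
    if not s.test(')'):
        # Assumption: no backtracking, so a failure after a partial
        # match is reported as a syntax error rather than a plain failure.
        raise SyntaxError("expected ')'")
    return True


s = Scanner("(()()) tail")
print(nest(s), repr(s.text))  # prints: True ' tail'
```

The subroutine returns a success flag and consumes input only when its phrase is present, which is what lets equations call one another (and themselves) freely.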
Notation
META II was first written in META I,[3] a hand-compiled version of META II. The history is unclear as to whether META I was a full implementation of META II or only the subset of the language needed to compile the full META II compiler.
In its documentation, META II is described as resembling BNF, which today is explained as a production grammar. META II is an analytical grammar. In the TREE-META document these languages were described as reductive grammars.
For example, in BNF, an arithmetic expression may be defined as:
<expr> ::= <term> | <expr> <addop> <term>
BNF rules are today treated as production rules describing how constituent parts may be assembled to form only valid language constructs. A parser does the opposite, taking language constructs apart. META II is a stack-based functional parser programming language that includes output directives. In META II, the order of testing is specified by the equation. META II, like other programming languages, would overflow its stack attempting left recursion. Instead, META II uses a $ (zero or more) sequence operator. The expr parsing equation written in META II is a conditional expression evaluated left to right:
expr = term
$( '+' term .OUT('ADD')
/ '-' term .OUT('SUB'));
Above, the expr equation is defined by the expression to the right of the '='. Evaluating left to right from the '=', term is the first thing that must be tested. If term returns failure, expr fails. If a term was recognized, we enter the indefinite $ (zero or more) loop, where we first test for a '+'. If that fails, the alternative '-' is attempted. If neither is recognized, the loop terminates and expr returns success; on the first pass this means a single term was recognized. If a '+' or '-' was recognized, term is called, and if it succeeds the loop repeats. The expr equation can also be expressed using nested grouping as:
expr = term $(('+' / '-') term);
The code production elements were left out to simplify the example. Due to the limited character set of early computers, the character / was used as the alternative (or) operator. The $ loop operator is used to match zero or more of something:
expr = term $( '+' term .OUT('ADD')
/ '-' term .OUT('SUB')
);
The above can be expressed in English: An expr is a term followed by zero or more of (plus term or minus term). Schorre describes this as being an aid to efficiency, but unlike a naive recursive descent compiler it will also ensure that the associativity of arithmetic operations is correct:
expr = term $('+' term .OUT('ADD')
/ '-' term .OUT('SUB')
);
term = factor $( '*' factor .OUT('MPY')
/ '/' factor .OUT('DIV')
);
factor = ( .ID
/ .NUMBER
/ '(' expr ')')
( '^' factor .OUT('EXP')
/ .EMPTY);
With the ability to express a sequence with a loop or right ("tail") recursion, the order of evaluation can be controlled.
Syntax rules appear declarative, but are actually made imperative by their semantic specifications.
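The effect of choosing a loop or right recursion can be seen by evaluating the operators directly instead of emitting code. The Python sketch below is an illustration under stated assumptions: it works on a pre-split token list, handles only numbers, and the function names sub_loop, sub_right, and power are hypothetical.

```python
def sub_loop(tokens):
    """Loop form, expr = term $('-' term): groups to the left."""
    value = tokens[0]
    pos = 1
    while pos < len(tokens) and tokens[pos] == '-':
        value -= tokens[pos + 1]   # apply each '-' as soon as it is met
        pos += 2
    return value


def sub_right(tokens, pos=0):
    """Right-recursive form, expr = term ('-' expr / .EMPTY): groups to the right."""
    value = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] == '-':
        return value - sub_right(tokens, pos + 2)
    return value


def power(tokens, pos=0):
    """Right recursion as in factor = ... ('^' factor / .EMPTY)."""
    value = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] == '^':
        return value ** power(tokens, pos + 2)
    return value


print(sub_loop([8, '-', 3, '-', 2]))    # (8 - 3) - 2, prints 3
print(sub_right([8, '-', 3, '-', 2]))   # 8 - (3 - 2), prints 7
print(power([2, '^', 3, '^', 2]))       # 2 ^ (3 ^ 2), prints 512
```

The loop gives the conventional left-to-right grouping for subtraction, while right recursion gives the conventional right-to-left grouping for exponentiation, matching how expr and factor are written in the equations above.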
Operation
META II outputs assembly code for a stack machine. Evaluating this is like using an RPN calculator.
expr = term
$('+' term .OUT('ADD')
/'-' term .OUT('SUB'));
term = factor
$('*' factor .OUT('MPY')
/ '/' factor .OUT('DIV'));
factor = (.ID .OUT('LD ' *)
/ .NUM .OUT('LDL ' *)
/ '(' expr ')')
( '^' factor .OUT('XPN') / .EMPTY);
In the above, .ID and .NUM are built-in token recognizers. The * in the .OUT code production references the last token recognized. On recognizing a number with .NUM, .OUT('LDL ' *) outputs the load-literal instruction followed by the number. An expression:
(3*a^2+5)/b
will generate:
LDL 3
LD a
LDL 2
XPN
MPY
LDL 5
ADD
LD b
DIV
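The translation above can be reproduced with a small hand-written recursive-descent translator. The following Python sketch mirrors the three equations one method per equation and then evaluates the emitted code on a stack, RPN style. It is not META II's generated byte code: the Translator class, the translate and run helpers, the regular-expression tokenizer, and the variable values are all assumptions made for illustration.

```python
import re


class Translator:
    def __init__(self, text):
        # Assumed tokenizer: numbers, identifiers, or single symbols.
        self.toks = re.findall(r"\d+|[A-Za-z]\w*|\S", text)
        self.pos = 0
        self.out = []

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def lit(self, s):
        """Test for a literal and delete it if found."""
        if self.peek() == s:
            self.pos += 1
            return True
        return False

    def must(self, ok):
        if not ok:
            raise SyntaxError("unexpected token %r" % self.peek())

    def binary(self, operand, ops):
        """Shared shape of expr and term: operand $( op operand .OUT(...) )."""
        if not operand():
            return False
        while True:
            for symbol, mnemonic in ops:
                if self.lit(symbol):
                    self.must(operand())
                    self.out.append(mnemonic)
                    break
            else:
                return True  # no alternative matched: the $ loop ends

    def expr(self):
        return self.binary(self.term, [('+', 'ADD'), ('-', 'SUB')])

    def term(self):
        return self.binary(self.factor, [('*', 'MPY'), ('/', 'DIV')])

    def factor(self):
        tok = self.peek()
        if tok is not None and tok[0].isalpha():    # .ID
            self.pos += 1
            self.out.append('LD ' + tok)
        elif tok is not None and tok.isdigit():     # .NUM
            self.pos += 1
            self.out.append('LDL ' + tok)
        elif self.lit('('):
            self.must(self.expr())
            self.must(self.lit(')'))
        else:
            return False
        if self.lit('^'):                           # right recursion for '^'
            self.must(self.factor())
            self.out.append('XPN')
        return True


def translate(text):
    t = Translator(text)
    if not t.expr() or t.peek() is not None:
        raise SyntaxError("not a complete expression")
    return t.out


def run(code, variables):
    """Evaluate the emitted code on a stack, like an RPN calculator."""
    ops = {'ADD': lambda x, y: x + y, 'SUB': lambda x, y: x - y,
           'MPY': lambda x, y: x * y, 'DIV': lambda x, y: x / y,
           'XPN': lambda x, y: x ** y}
    stack = []
    for line in code:
        op, _, arg = line.partition(' ')
        if op == 'LDL':
            stack.append(int(arg))
        elif op == 'LD':
            stack.append(variables[arg])
        else:
            y = stack.pop()
            x = stack.pop()
            stack.append(ops[op](x, y))
    return stack.pop()


code = translate("(3*a^2+5)/b")
print("\n".join(code))
print(run(code, {"a": 2, "b": 4}))  # (3*2^2+5)/4, prints 4.25
```

Each operator is emitted only after both of its operands have been translated, so the output is already in postfix order and the evaluator needs nothing more than a push for loads and a pop-pop-push for operators.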
META II is the first documented version of a metacompiler,[notes 1] as it compiles to machine code for one of the earliest instances of a virtual machine.
"The paper itself is a wonderful gem which includes a number of excellent examples, including the bootstrapping of Meta II in itself (all this was done on an 8K (six bit byte) 1401!)." (Alan Kay)
The original paper is not freely available, but was reprinted in Doctor Dobb's Journal (April 1980). Transcribed source code has at various times been made available (possibly by the CP/M User Group). The paper included a listing of the description of META II; this could in principle be processed manually to yield an interpretable program in virtual machine opcodes. If this ran and produced identical output, then the implementation was correct.
META II was basically a proof of concept, a base from which to work.
META II is not presented as a standard language, but as a point of departure from which a user may develop his own META "language".[1]
Many META "languages" followed. Schorre went to work for System Development Corporation, where he was a member of the Compiler for Writing and Implementing Compilers (CWIC) project. CWIC's SYNTAX language built on META II, adding a backtrack alternative operator, positive and negative look-ahead operators, and programmed token equations. The .OUT and .LABEL operations were removed, and the stack-transforming operations :<node> and !<number> were added. The GENERATOR language, based on LISP 2, processed the trees produced by the SYNTAX parsing language. To generate code, a call to a generator function was placed in a SYNTAX equation. These languages were developed by members of the L.A. ACM SIGPLAN sub-group on Syntax Directed Compilers. It is notable how Schorre thought of the META II language:
The term META "language" with META in capital letters is used to denote any compiler-writing language so developed.[1]
Schorre explains META II as a base from which other META "languages" may be developed.
Notes
1. Ignoring META I, which is only mentioned in passing in the META II document.
References
1. Schorre, Dewey Val (1964). META II: A Syntax-Oriented Compiler Writing Language. UCLA Computing Facility.
2. Schorre, Dewey Val (1963). "A Syntax-Directed SMALGOL for the 1401". ACM Natl. Conf., Denver, Colo.
3. Schorre, Dewey Val (1963). META II: A Syntax-Oriented Compiler Writing Language (PDF). UCLA: UCLA Computing Facility.