.EQ delim $$ .EN .P1 "Miscellaneous literature" %A Noam Chomsky %T Three models for the description of language %J IEEE Trans. Inform. Theory %V 2 %N 3 %D 1956 %P 113-124 %K done .La In an attempt to delineate the set of correct English sentences, the author considers three mechanisms. Finite-state automata .Xr "finite-state automaton" are rejected on the grounds that they cannot cope with arbitrary nesting. Phrase structure .Xr "phrase structure grammar" grammars are considered probably applicable but declared impractical due to their problems in expressing context conditions. Most of these problems can be solved if we augment PS grammars with .Df "transformation rules" , "transformation rule" which specify the rearrangement of parts of the derivation tree. %T On certain formal properties of grammars %A Noam Chomsky %J Inform. Control %V 2 %N %P 137-167 %D 1959 %K done .La This article discusses what later became known as the Chomsky hierarchy. .Xr "Chomsky hierarchy" Chomsky defines type 1 grammars in the \(l"context-sensitive\(r" .Xr "context-sensitive grammar" way. His motivation for this is that it permits the construction of a tree as a structural description. Type 2 grammars exclude $epsilon$-rules, .Xr "$epsilon$-rule" so in Chomsky's system, type 2 grammars are a subset of type 1 grammars. .Lp Next, the so called .Df "counter languages" "" "counter language" are discussed. A counter language is a language recognized by a finite automaton, extended with a finite number of counters, each of which can assume infinitely many values. $ L sub 1 = roman "{" a sup n b sup n | n > 0 roman "}" $ is a counter language, $ L sub 2 = roman "{" xy | x, y \(mo roman "{" a, b roman "}" sup "*",$ $y$ is the mirror image of $x roman "}"$ is not, so there are type 2 languages that are not counter languages. The reverse is not investigated. .Lp The Chomsky Normal Form .Xr "Chomsky Normal Form" is introduced, but not under that name, and a bit different: Chomsky calls a type 2 grammar @i[regular] if production rules have the form $ A -> a $ or $ A -> BC $, with $ B != C$, and if $ A -> alpha A beta $ and $ A -> gamma A eta $ then $ alpha = gamma $ and $ beta = eta $. A grammar is .Df "self-embedding" if there is a derivation $ A "\*(>*" alpha A beta $ with $ alpha != epsilon $ and $ beta != epsilon $. The bulk of the paper is dedicated to the theorem that the extra power of type 2 grammars over type 3 grammars lies in this self-embedding property. %A J.W. Backus %A F.L. Bauer %A J. Green %A C. Katz %A J. McCarthy %A P. Naur\ (Ed.) %A A.J. Perlis %A H. Rutishauser %A K. Samelson %A B. Vauquois %A J.H. Wegstein %A A. van\ Wijngaarden %A M. Woodger %T Report on the algorithmic language ALGOL 60 %J Commun. ACM %V 3 %N 5 %D May 1960 %P 299-314 %K done .Xr "Algol 60" .La First application of a BNF .Xr "Backus-Naur Form" grammar (for the definition of a programming language). Revised report by the same authors: @i[Commun. ACM], vol. 6, no. 1, p. 1-17, Jan 1963. %A R.A. Brooker %A I.R. MacCallum %A D. Morris %A J.S. Rohl %T The compiler-compiler %J Annual Review in Automatic Programming %V 3 %N %D 1960 ???? %P 229-275 %K done .La One of the first extensive descriptions of a compiler-compiler. Parsing is done by a backtracking .Xr "backtracking" non-exhaustive top-down parser using a transduction-like grammar. .Xr "transduction grammar" This grammar is kept in an integrated form and modifications can be made to it while parsing. %A Robert W. Floyd %T A descriptive language for symbol manipulation %J J. ACM %V 8 %N %D Oct 1961 %P 579-584 %K done .La Original paper describing Floyd productions. .Xr "Floyd production" See Section 9.3.1. %A Robert S. Ledley %A James B. Wilson %T Automatic-programming-language translation through syntactical analysis %J Commun. ACM %V 5 %N 3 %D March 1962 %P 145-155 %K done .La An English-to-Korean (!) translation system is described in detail, in which parts of the Korean translation are stored in attributes .Xr "attribute" in the parse tree, to be reordered and interspersed with Korean syntactic markers on output. The parser is Irons' [CF 1961]. .Au Irons CF 1961 %A Melvin E. Conway %T Design of a separable transition-diagram compiler %J Commun. ACM %V 6 %N 7 %D July 1963 %P 396-408 %K done .La .Xr "transition diagram" The first to introduce @i[coroutines] .Xr "coroutine" and to apply them to structure a compiler. The parser is Irons' [CF 1961], .Au Irons CF 1961 made deterministic by a No-Loop Condition and a No-Backup Condition. It follows transition diagrams .Xr "transition diagram" rather than grammar rules. %A Robert W. Floyd %T The syntax of programming languages\-a survey %J IEEE Trans. Electronic Comput. %V EC-13 %N %D 1964 %P 346-353 %K done .La .Xr "syntax" Early analysis of the advantages of and problems with the use of grammars for the specification of programming languages. Contains a bibliography of almost 100 entries. %A Jerome Feldman %A David Gries %T Translator writing systems %J Commun. ACM %V 11 %N 2 %D Feb. 1968 %P 77-113 %K done .La Grand summary of the work done on parsers (with semantic actions) .Xr "semantic action" before 1968. %A D.J. Cohen %A C.C. Gotlieb %T A list structure form of grammars for syntactic analysis %J Computing Surveys %V 2 %N 1 %D 1970 %P 65-82 %K done .La CF rules are represented as linked lists of alternatives, which themselves are linked lists of members. The trick is that both lists end in different null pointers. This representation is very amenable to various backtracking .Xr "backtracking" and non-backtracking top-down and bottom-up parsing methods (by interpreting the grammar). Several practical parsers are given in flowchart form. An algorithm is given to \(l"invert\(r" a grammar, i.e. the linked lists, to create a data structure that will efficiently guide a bottom-up parser. %A A. Birman %A J.D. Ullman %T Parsing algorithms with backtrack %J Inform. Control %V 23 %N 1 %D 1973 %P 1-34 %K done .La .Xr "backtracking" Models classes of recursive descent parsers, .Xr "recursive descent" capable of recognizing all deterministic context-free languages .Xr "deterministic language" and also some non-context-free languages. %A B.W. Kernighan %A L.L. Cherry %T A system for typesetting mathematics %J Commun. ACM %V 18 %N 3 %D March 1975 %P 151-157 %K done .Xr "typesetting" .La A good example of the use of an ambiguous .Xr "ambiguity" grammar to specify the preferred analysis of special cases. %A A.V. Aho %A S.C. Johnson %A J.D. Ullman %T Deterministic parsing of ambiguous grammars %J Commun. ACM %V 18 %N 8 %D 1975 %P 441-452 %K done .La Demonstrates how LL and LR parsers can be constructed for certain classes of ambiguous .Xr "ambiguity" grammars, using simple disambiguating rules, .Xr "disambiguating rule" such as operator-precedence. .Xr "operator-precedence" %A Jacques Cohen %T Experience with a conversational parser generation system %J Softw. Pract. Exper. %V 5 %N %P 169-180 %D 1975 %K done .La Realistic description of the construction of a professional interactive parser generator. .Xr "parser generator" %A Jay Earley %T Ambiguity and precedence in syntax description %J Acta Inform. %V 4 %N %D 1975 %P 183-192 %K done .La .Xr "syntax" .Xr "ambiguity" Informal description of how to use precedence information for disambiguation. %A Michael Marcotty %A Henry F. Ledgard %A Gregor V. Bochmann %T A sampler of formal definitions %J Computing Surveys %V 8 %N 2 %D June 1976 %P 191-276 %K done .La Describes and compares four semantic definition methods: VW grammars, production systems and the axiomatic approach, Vienna Definition Language, and attribute .Xr "attribute grammar" grammars. No clear winner emerges. %A R.M. Wharton %T Resolution of ambiguity in parsing %J Acta Inform. %V 6 %N 4 %D 1976 %P 387-395 %K done .La .Xr "ambiguity" It is proposed that ambiguity be resolved in a bottom-up parser by 1) reducing upon a shift/reduce conflict, 2) reducing the shorter right-hand side upon a reduce/reduce conflict and 3) reducing the textual first right-hand side upon a reduce/reduce conflict with equal lengths. In a top-down parser, criteria similar to 2) and 3) are applied to each LL(1) conflict. .Xr "LL(1) conflict" %A R.B. Hunter %A A.D. McGettrick %A R. Patel %T LL versus LR parsing with illustrations from Algol 68 %J ACM SIGPLAN Notices %V 12 %N 6 %D June 1977 %P 49-53 %K done .La Syntax-improved LL(1) .Xr "LL(1)" (Foster [Transform 1968]) .Xr "transformations on grammars" .Au Foster Transform 1968 and LR(1) .Xr "LR(1)" are equally unsuccessful in handling a CF version of the grammar of Algol 68. .Xr "Algol 68" After hand adaptation .Xr "modifying a grammar" LL(1) has the advantage. %A Niklaus Wirth %T What can we do about the unnecessary diversity of notation for syntactic definitions? %J Commun. ACM %V 20 %N 11 %D Nov 1977 %P 822-823 %K done .La Introduces Wirth's notation for extended CF grammars, .Xr "extended context-free grammar" using @t[{...}] for repetition, @t([...]) for optionality, @t[(...)] for grouping and @t["..."] for quoting. %A Jacques Cohen %A Martin S. Roth %T Analyses of deterministic parsing algorithms %J Commun. ACM %V 21 %N 6 %D June 1978 %P 448-458 %K done .La Gives methods to calculate the average parsing times and their standard deviations from the input grammar, for several parsers. The resulting formulae are finite series, and are sometimes given in closed form. %A Kuo-Chung Tai %T On the implementation of parsing tables %J ACM SIGPLAN Notices %V 14 %N 1 %D Jan 1979 %P 100-101 %K done .La How to implement parsing tables using hashing. %A Peter Deussen %T One abstract accepting algorithm for all kinds of parsers %B Automata, languages and programming %S Lecture Notes in Computer Science #71 %E Hermann A. Maurer %I Springer-Verlag %C Berlin %D 1979 %P 203-217 %K done .La Parsing is viewed as an abstract search problem, for which a high-level algorithm is given. The selection predicate involved is narrowed down to give known linear parsing methods. %A Robert Endre Tarjan %A Andrew Chi-Chih Yao %T Storing a sparse table %J Commun. ACM %V 22 %N 11 %D Nov. 1979 %P 606-611 %K done .La Two methods of storing sparse tables are presented and analysed: trie structure and double displacement. %A Hanan Samet %T A coroutine approach to parsing %J ACM Trans. Prog. Lang. Syst. %V 2 %N 3 %D July 1980 %P 290-306 %K done .La Some inputs consist of interleaved chunks of text conforming to different grammars. An example is programming text interrupted at unpredictable points by macro-processor directives. This situation can be handled by having separate parsers for each grammar, cooperating in coroutine fashion. .Xr "coroutine" %A Anton Nijholt %T Parsing strategies: A concise survey %B Mathematical Foundations of Computer Science %S Lecture Notes in Computer Science #118 %E J. Gruska & M. Chytil %I Springer-Verlag %C Berlin %D 1981 %P 103-120 %K done .La The context-free parser and language field is surveyed in terse prose. Highly informative to the connoisseur. %A Esko Ukkonen %T Lower bounds on the size of deterministic parsers %J J. Comput. Syst. Sci. %V 26 %D 1983 %P 153-170 %K done .Xr "deterministic parser" .La Worst-case lower bounds for the parser sizes are given for the various classes of LL($k$) .Xr "LL($k$)" and LR($k$) .Xr "LR($k$)" parsers for $k = 0, 1$ and $k >= 2$. All LL($k$) .Xr "LL($k$)" lower bounds are polynomial, except the one for full LL($k>1$), .Xr "full LL($k$)" which is exponential; all LR($k$) .Xr "LR($k$)" bounds are exponential. %A Fernando C.N. Pereira %A David H.D. Warren %T Parsing as deduction %B Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics %C Cambridge, Mass. %D 1983 %P 137-144 %K done .La The Prolog .Xr "Prolog" deduction mechanism is top-down depth-first. It can be exploited to do parsing, using Definite Clause .Xr "definite clause" grammars. Parsing can be done more efficiently with Earley's technique. .Xr "Earley parser" The corresponding Earley deduction .Xr "Earley deduction" mechanism is derived and analysed. %A Anton Nijholt %T Deterministic top-down and bottom-up parsing: historical notes and bibliographies %I Mathematisch Centrum %C Amsterdam %D 1983 %P 118 %K done .La Over a 1000 references about LL($k$), .Xr "LL($k$)" LR($k$) .Xr "LR($k$)" and precedence parsing, with short histories and surveys of the three methods. %A Peter Dencker %A Karl D\(:urre %A Johannes Heuft %T Optimization of parser tables for portable compilers %J ACM Trans. Prog. Lang. Syst. %V 6 %N 4 %D Oct 1984 %P 546-572 %K done .La Given an $n times m$ parser table, an $n times m$ bit table is used to indicate which entries are error entries; this table is significantly smaller than the original table and the remaining table is now sparse (typically 90-98% don't-care entries). .Xr "don't-care entry" The remaining table is compressed .Xr "table compression" row-wise (column-wise) by setting up an interference graph in which each node corresponds to a row (column) and in which there is an edge between any two nodes the rows (columns) of which occupy an element in the same position. A (pseudo-)optimal partitioning is found by a minimal graph-colouring heuristic. %A W.M. Waite %A L.R. Carter %T The cost of a generated parser %J Softw. Pract. Exper. %V 15 %N 3 %D 1985 %P 221-237 %K done .La Supports with measurements the common belief that compilers employing generated parsers suffer significant performance degradation with respect to recursive descent .Xr "recursive descent" compilers. Reasons: interpretation of parse tables .Xr "parse table" versus direct execution, attribute .Xr "attribute" storage allocation and the mechanism to determine which action(s) to perform. Then, a parser interface is proposed that simplifies integration of the parser; implementation of this interface in assembly language results in generated parsers that cost the same as recursive descent .Xr "recursive descent" ones. The paper does not consider generated recursive descent parsers. %A Gerard D. Finn %T Extended use of null productions in LR(1) parser applications %J Commun. ACM %V 28 %N 9 %D Sept 1985 %P 961-972 %K done .Xr "LR(1)" .La Extensive account of how to use an LR parser for conversion purposes. .Xr "document conversion" Makes a strong case for the use of parsers for conversion. Contains a good introduction to parsing. %A Robert Gerardy %T Experimental comparison of some parsing methods %J ACM SIGPLAN Notices %V 22 %N 8 %D Aug 1987 %P 79-88 %K done .La Experimental time measurements for recursive descent, .Xr "recursive descent" operator-precedence .Xr "operator-precedence" and SLR(1) .Xr "SLR(1)" parsing show a ratio of 1 : 4 : 10, in that order. All parsers were written in Pascal .Xr "Pascal" and parsed a mini-Pascal language. %A Michael Share %T Resolving ambiguities in the parsing of translation grammars %J ACM SIGPLAN Notices %V 23 %N 8 %D Aug 1988 %P 103-109 %K done .La The UNIX LALR parser generator .Xr "parser generator" @i[yacc] .Xr "@i[yacc]" is extended to accept LALR conflicts and to produce a parser that requests an interactive user decision when a conflict occurs while parsing. The system is used in document conversion. .Xr "document conversion" %A Josef Grosch %T Generators for high-speed front-ends %B Compiler Compilers and High-Speed Compilation %S Lecture Notes in Computer Science #371 %E D. Hammer %I Springer-Verlag %C Berlin %D 1989 %P 81-92 %K done .La A coherent system of lexical scanner generator, LALR(1) .Xr "LALR(1)" parser generator .Xr "parser generator" and LL(1) .Xr "LL(1)" parser generator, .Xr "parser generator" using a uniform input syntax, is presented. The scanner beats UNIX @i[lex] .Xr "@i[lex]" by a factor of 5, the LALR parser beats @i[yacc] .Xr "@i[yacc]" by a factor of 2. %A Vance E. Waddle %T Production trees: a compact representation of parsed programs %J ACM Trans. Prog. Lang. Syst. %V 12 %N 1 %D Jan 1990 %P 61-83 %K done .Xr "production tree" .La Redundant items are removed from a traditional parse tree through a number of techniques: unit productions are contracted, terminals symbols are removed, structure information in links is replaced by a rule number, etc. Each node in the resulting parse tree corresponds to one right-hand side and contains the rule number and a list of pointer to the nodes for the non-terminals in the right-hand side. A space saving .Xr "space requirements" of a factor 20 is achieved on the average. A grammar form that corresponds more closely to this representation is defined. %A Frank G. Pagan %T Comparative efficiency of general and residual parsers %J ACM SIGPLAN Notices %V 25 %N 4 %D April 1990 %P 59-68 %K done .La The switch from table-driven .Xr "table-driven" LL(1) .Xr "LL(1)" to recursive descent .Xr "recursive descent" or from table-driven .Xr "table-driven" LR(1) .Xr "LR(1)" to recursive ascent .Xr "recursive ascent" is viewed as an example of @i[partial computation]. .Xr "partial computation" Underlying theory and examples are given. %A Boris Burshteyn %T On the modification of the formal grammar at parse time %J ACM SIGPLAN Notices %V 25 %N 5 %D May 1990 %P 117-123 %K done .La Modifying the grammar under control of and utilizing information obtained by the parsing process is proposed as a means of handling context-sensitivity. For example, the recognition of the declaration of an array @t[aaa] could cause the introduction of a new grammar rule $ expr -> "@t[aaa]" "@t{[}" expr "@t{]}" $, (generated from a template), thus allowing forms like @t{aaa[3]} to be used. With a correction in the same journal, Vol. 25, No. 8, p 6. .EQ delim $$ .EN .P1 "Unrestricted PS and CS grammars" This section also covers some other non-context-free grammar types, excluding VW (two-level) grammars and affix grammars, .Xr "affix grammar" for which see Section 13.3. %A Alfred V. Aho %T Indexed grammars \- an extension of context-free grammars %J J. ACM %V 15 %N 4 %D Oct 1968 %P 647-671 %K done .La In an .Df "indexed grammar" , each non-terminal $N$ in a sentential form is followed by zero or more \(l"indices\(r", which govern which of the alternatives for $N$ are allowed for this occurrence of $N$. The indices propagate according to specific rules. $L(CF) ~ "\(sb" ~ L(Indexed) ~ "\(sb" ~ L(CS)$. %A William A. Woods %T Context-sensitive parsing %J Commun. ACM %V 13 %N 7 %D July 1970 %P 437-445 %X all references are inaccessible! %K done .La The paper presents a canonical form for context-sensitive ($epsilon$-free) .Xr "$epsilon$-free" derivation trees. Parsing is then performed by an exhaustive guided search over these canonical forms exclusively. This guarantees that each possible parsing will be found exactly once. %A Jacques Loeckx %T The parsing for general phrase-structure grammars %J Inform. Control %V 16 %N %D 1970 %P 443-464 %K done .La The paper sketches two non-deterministic parsers for PS grammars, one top-down, which tries to mimic the production process and one bottom-up, which tries to find a handle. The instructions of the parsers are derived by listing all possible transitions in the grammar and weeding out by hand those that cannot occur. Trial-and-error methods are discussed to resolve the non-determinism, but no instruction selection mechanisms are given. Very readable, nice pictures. %A Daniel A. Walters %T Deterministic context-sensitive languages, Parts I & II %J Inform. Control %V 17 %N 1 %P 14-61 %D 1970 %K done .Xr "deterministic language" .La The definition of LR($k$) .Xr "LR($k$)" grammars is extended to context-sensitive grammars. .Xr "context-sensitive grammar" Emphasis is more on theoretical properties than on obtaining a parser. %A Z.J. Ghandour %T Formal systems and analysis of context-sensitive languages %J Computer J. %V 15 %N 3 %D 1972 %P 229-237 %K done .La Ghandour describes a formal production system that is in some ways similar to but far from identical to a two-level grammar. A hierarchy of 4 classes is defined on these systems, with Class 1 \(ip Class 2 \(sp Class 3 \(sp Class 4, Class 1 \(ip CS and Class 4 \(eq CF. A parsing algorithm for Class 1 systems is given in fairly great detail. %A N.A. Khabbaz %T Multipass precedence analysis %J Acta Inform. %V 4 %N %D 1974 %P 77-85 %K done .La A hierarchy of CS grammars is given that can be parsed using multipass precedence parsing. The parser and the table construction algorithm are given explicitly. %A Eberhard Bertsch %T Two thoughts on fast recognition of indexed languages %J Inform. Control %V 29 %N %D 1975 %P 381-384 %K done .Xr "indexed grammar" .La Proves that parsing with (tree-)unambiguous indexed grammars is possible in $O ( n sup 2 )$ steps. .Xr "quadratic time dependency" %A Robert W. Sebesta %A Neil D. Jones %T Parsers for indexed grammars %J Intern. J. Comput. Inform. Sci. %V 7 %N 4 %D Dec 1978 %P 344-359 %K done .Xr "indexed grammar" .La Very good explanation of indexed grammars. Three classes of indexed grammars are defined, corresponding to strong-LL, .Xr "strong-LL($k$)" .Xr "strong-LL(1)" LL and LR, respectively. It is shown that the flag sets generated by indexed grammars are regular sets. %A C.J.M. Turnbull %A E.S. Lee %T Generalized deterministic left to right parsing %J Acta Inform. %V 12 %D 1979 %P 187-207 %K done .La The LR($k$) .Xr "LR($k$)" parsing machine is modified so that the reduce instruction removes the reduced right-hand side from the stack and pushes an arbitrary number of symbols back into the input stream. (The traditional LR($k$) .Xr "LR($k$)" reduce is a special case of this: it pushes the recognized non-terminal back into the input and immediately shifts it. The technique is similar to that put forward by D\(:om\(:olki [CF 1968].) .Au D\\\\(:om\\\\(:olki CF 1968 The new machine is capable of parsing efficiently a subset of the Type 0 grammars, .Df "DRP grammars" "" "DRP grammar" (for Deterministic Regular Parsable). .Xr "Deterministic Regular Parsable" Membership of this subset is undecidable, but an approximate algorithm can be constructed by extending the LR($k$) .Xr "LR($k$)" parse table .Xr "parse table" algorithm. Details are not given, but can be found in a technical report by the first author. %A Kurt Mehlhorn %T Parsing macro grammars top down %J Inform. Control %V 40 %N 2 %D 1979 %P 123-143 %K done .La .Df "Macro grammars" "" "macro grammar" are defined as follows. The non-terminals in a CF grammar are given parameters, as if they were routines in a programming language. The values of these parameters are strings of terminals and non-terminals (the latter with the proper number of parameters). A parameter can be passed on, possibly concatenated with some terminals and non-terminals, or can be made part of the sentential form. An algorithm to construct a recursive-descent .Xr "recursive descent" parser for a macro grammar (if possible) is given. %A G. Barth %T Fast recognition of context-sensitive structures %J Computing %V 22 %N %D 1979 %P 243-256 %K done .La A .Df "recording grammar" (an RG) .Xs "RG" "recording grammar" is a CF grammar in which each (numbered) production rule belongs to one of three classes: normal, recording and directed. During production, normal rules behave normally and a recording rule records its own occurrence by appending its number to a string called the $pi$-element. When production leaves a \(l"recording\(r" stage, the entire $pi$-element is added to a set called the $OMEGA$-component, which collects all contexts created so far. When production enters a \(l"directed\(r" stage, an element (a context) is retrieved from the $OMEGA$-component, transferred through a mapping $I$ and used to direct the choice of production rules until the element is exhausted. The expressive power of RGs is equal to that of Type 0 grammars. .Lp An LL($k$) .Xr "LL($k$)" version of RGs can be defined, based on LL($k$)-ness of the underlying CF grammar, plus a few simple restrictions on the mapping $I$; the resulting property is called .Df "RLL($k$)" . .Lp For parsing, an LL($k$) .Xr "LL($k$)" parse is performed; during \(l"normal\(r" parsing, nothing special is done, during \(l"recording\(r" parsing the rule numbers are recorded and subsequently added to the $OMEGA$-component; during \(l"directed\(r" parsing, which is actually \(l"checking\(r" parsing, the rule numbers are checked for consistency with the $OMEGA$-component, using a simple finite transducer. The parser (+ checker) works in linear time. .Xr "linear time dependency" .Lp It is not clear how convenient RLL($k$) RGs are; neither of the two examples provided to demonstrate the power of RGs is RLL($k$). %A G.Sh. Vol'dman %T A parsing algorithm for context-sensitive grammars %J Program. Comput. Softw. %V 7 %N %D 1981 %P 302-307 %K done .La .Xr "context-sensitive grammar" Extends Earley's algorithm .Xr "Earley parser" first to length-increasing phrase structure .Xr "phrase structure grammar" grammars and then to non-decreasing PS grammars (= CS grammars). The resulting algorithm has exponential time requirements .Xr "exponential time dependency" but is often much better. %A Lawrence A. Harris %T SLR(1) and LALR(1) parsing for unrestricted grammars %J Acta Inform. %V 24 %N %D 1987 %P 191-209 %K done .Xr "LALR(1)" .La The notion of an LR(0) item can easily be defined for unrestricted grammars: \(l"For each item $ lambda -> mu sub 1 "\*(Dt" X mu sub 2 $ there is a transition on $X$ to the item $ lambda -> mu sub 1 X "\*(Dt" mu sub 2 $ and an $epsilon$-transition .Xr "$epsilon$-transition" to any item $ X delta -> "\*(Dt" eta $\(r", for any symbol $X$. These items are grouped by subset construction .Xr "subset construction" into the usual states, called here .Df "preliminary states" , "preliminary state" since some of them may actually be ineffective. A GOTO .Xr "GOTO-action" function is also defined. If we can, for a given grammar, calculate the FOLLOW sets .Xr "FOLLOW set" of all left-hand sides (undecidable in the general case), we can apply a variant of the usual SLR(1) .Xr "SLR(1)" test and construct a parsing table for a parser as described by Turnbull and Lee [PSCS 1979]. .Au Turnbull PSCS 1979 .Au Lee PSCS 1979 .Lp To obtain the LALR(1) .Xr "LALR(1)" definition, a look-ahead grammar system is defined that will, for each item in each state, generate the (unrestricted) language of all continuations after that item. If we can, for a given grammar, calculate the FIRST .Xr "FIRST set" sets of all these languages (undecidable in the general case), we can apply a variant of the usual LALR(1) .Xr "LALR(1)" test and construct a parsing table for a similar parser. If one of the above constructions succeeds, a linear-time .Xr "linear time dependency" parser is obtained. .Lp The author gives many hand-calculated examples and explores error detection .Xr "error detection" properties. More general definitions of SLR(1) .Xr "SLR(1)" and LALR(1) .Xr "LALR(1)" are possible, encompassing larger sets of grammars, at the cost of a still further reduced chance of decidability. .EQ delim $$ .EN .P1 "Van Wijngaarden grammars and affix grammars" Note that van Wijngaarden grammars and two-level grammars are synonyms; affix .Xr "affix grammar" grammars are different. %A M. Sintzoff %T Existence of a van Wijngaarden syntax for every recursively enumerable set %J Annales de la Soci\('et\('e Scientifique de Bruxelles %V 81 %N II %D 1967 %P 115-118 %K done .La .Xr "syntax" A relatively simple proof of the theorem that for every semi-Thue system we can construct a VW grammar that produces the same set. %A A. van\ Wijngaarden %A B.J. Mailloux %A J.E.L. Peck %A C.H.A. Koster %A M. Sintzoff %A C.H. Lindsey %A L.G.L.T. Meertens %A R.G. Fisker %T Revised report on the algorithmic language ALGOL 68 %J Acta Inform. %V 5 %N %D 1975 %P 1-236 %K done .La VW grammars found their widest application to date in the definition of ALGOL 68. .Xr "Algol 68" Section 1.1.3 of the ALGOL 68 Revised Report contains a very carefully worded description of the two-level mechanism. The report contains many interesting applications. %A C.H.A. Koster %T Affix grammars %B ALGOL 68 Implementation %E J.E.L. Peck %I North-Holland Publ. Co. %C Amsterdam %D 1971 %P 95-109 %K done .Xr "Algol 68" .La Context conditions are expressed inside a context-free grammar by introducing @i[affixes], .Xr "affix" which are divided in @i[derived] .Xr "derived affix" and @i[inherited] .Xr "inherited affix" and which have to fulfill user-defined @i[primitive predicates]. .Xr "primitive predicate" If the affix grammar .Xr "affix grammar" is @i[well-formed], .Xr "well-formed affix grammar" a parser for it can be constructed. %A David Crowe %T Generating parsers for affix grammars %J Commun. ACM %V 15 %N 8 %D Aug 1972 %P 728-734 %K done .La A bounded-context .Xr "bounded-context" (Floyd productions) .Xr "Floyd production" parser is extended with affix .Xr "affix" manipulation. %A A. van\ Wijngaarden %T The generative power of two-level grammars %B Automata, Languages and Programming %S Lecture Notes in Computer Science #14 %E J. Loeckx %I Springer-Verlag %C Berlin %D 1974 %P 9-16 %K done .La .Xr "generative grammar" The generative power of VW grammars is illustrated by creating a VW grammar that simulates a Turing machine; the VW grammar uses only one metanotion, .Xr "metanotion" thus proving that one metanotion suffices. %A Sheila A. Greibach %T Some restrictions on $W$-grammars %J Intern. J. Comput. Inform. Sci. %V 3 %N 4 %D 1974 %P 289-327 %K done .La The consequences of two easily checkable restrictions on the form of the rules in a VW grammar are explored in great detail and are found to be surprising. Although this highly technical paper is not directly concerned with parsing, it is very instructive in that it shows methods of exploring the field. %A C.H.A. Koster %T Two-level grammars %B Compiler Construction: An Advanced Course %S Lecture Notes in Computer Science #21 %E F.L. Bauer & J. Eickel %I Springer-Verlag %C Berlin %D 1974 %P 146-156 %K done .La Easy introduction to two-level (VW) grammars, starting from one-level VW grammars. Examples of practical handling of context in a VW grammar. %A P. Deussen %T A decidability criterion for van Wijngaarden grammars %J Acta Inform. %V 5 %N %D 1975 %P 353-375 %K done .La The criterion, which is given in detail, can be paraphrased very roughly as follows: the language generated by a VW grammar is decidable if (but not only if) there are no $epsilon$-rules and either there are no free metanotions .Xr "metanotion" (occurring on the right-hand side only) or there are no dummy metanotions .Xr "metanotion" (occurring on the left-hand side only). %A David A. Watt %T The parsing problem for affix grammars %J Acta Inform. %V 8 %N %D 1977 %P 1-20 %K done .La A technique is described to convert an affix .Xr "affix grammar" grammar into a CF grammar called a .Df "head grammar" which contains a special kind of non-terminal, .Df "copy-symbols" . "copy-symbol" For the head grammar they are $epsilon$-rules, .Xr "$epsilon$-rule" but for the affix grammar .Xr "affix grammar" they effect affix manipulations on the affix stack. Primitive predicates .Xr "primitive predicate" are also $epsilon$-rules, .Xr "$epsilon$-rule" but do checks on the affixes. Parsing is done by any CF parser, preferably LR(1). .Xr "LR(1)" The affixes are not used to control the parsing but only to declare an input string erroneous: for the technique to work, the affix grammar .Xr "affix grammar" must in effect be an attribute grammar. .Xr "attribute grammar" %A J. Craig Cleaveland %A Robert C. Uzgalis %T Grammars for Programming Languages %I Elsevier %C New York %D 1977 %P 154 %K done .La In spite of its title, the book is a highly readable explanation of two-level grammars, also known as van Wijngaarden grammars or VW grammars. After an introductory treatment of formal languages, the Chomsky hierarchy .Xr "Chomsky hierarchy" and parse trees, it is shown to what extent CF languages can be used to define a programming language. These are shown to fail to define a language completely and the inadequacy of CS grammars is demonstrated. VW grammars are then explained and the remainder of the book consists of increasingly complex and impressive examples of what a VW grammar can do. These examples include keeping a name list, doing type checking and handling block structure in the definition of a programming language. Recommended reading. %A R. Meersman %A G. Rozenberg %T Two-level meta-controlled substitution grammars %J Acta Inform. %V 10 %N %D 1978 %P 323-339 %K done .La The authors prove that the uniform substitution rule is essential for two-level grammars; without it, they would just generate the CF languages. This highly technical paper examines a number of variants of the mechanisms involved. %A Lutz Wegner %T Bracketed two-level grammars \- a decidable and practical approach to language definition %B Automata, languages and programming %S Lecture Notes in Computer Science #71 %E Hermann A. Maurer %I Springer-Verlag %C Berlin %D 1979 %P 668-682 %K done .Xr "bracketed grammar" .La The metanotions .Xr "metanotion" of a VW grammar are partitioned into two blocks, \(l"synthesized\(r" .Xr "synthesized metanotion" and \(l"derived\(r"; .Xr "derived metanotion" they are separated in a hyperrule by special markers, \(l"brackets\(r", and are treated more or less as attributes. .Xr "attribute" Under reasonable conditions parsability can be obtained. The thus restricted VW grammars are very readable. %A Lutz Michael Wegner %T On parsing two-level grammars %J Acta Inform. %V 14 %N %D 1980 %P 175-193 %X almost all references are inaccessible %K done .La The article starts by defining a number of properties a VW grammar may exhibit; among these are \(l"left(right) bound\(r", .Xr "left bound" .Xr "right bound" \(l"free of hidden empty notions\(r", .Xr "free of hidden empty notions" \(l"uniquely assignable\(r" .Xr "uniquely assignable" and \(l"locally unambiguous\(r". .Xr "locally unambiguous" Most of these properties are undecidable, but sub-optimal tests can be devised. For each VW grammar $ G sub VW $, a CF .Df "skeleton grammar" $ G sub SK $ is defined by considering all hypernotions .Xr "hypernotion" in the VW grammar as non-terminals of $ G sub SK $ and adding the cross-references of the VW grammar as production rules to $ G sub SK $. $ G sub SK $ generates a superset of $ G sub VW $. The cross-reference problem for VW grammars is unsolvable but again any sub-optimal algorithm (or manual intervention) will do. Parsing is now done by parsing with $ G sub SK $ and then reconstructing and testing the metanotions. .Xr "metanotion" A long list of conditions necessary for the above to work are given; these conditions are in terms of the properties defined at the beginning. %A Dick Grune %T How to produce all sentences from a two-level grammar %J Inform. Process. Lett. %V 19 %N %D Nov 1984 %P 181-185 %K done .La All terminal productions are derived systematically in breadth-first order. .Xr "breadth-first production" The author identifies pitfalls in this process and describes remedies. A parser is used to identify the hyperrules involved in a given sentential form. This parser is a general CF recursive descent .Xr "recursive descent" parser to which a consistency check for the metanotions .Xr "metanotion" has been added; it is not described in detail. %A A.J. Fisher %T Practical LL(1)-based parsing of van Wijngaarden grammars %J Acta Inform. %V 21 %N %D 1985 %P 559-584 %K done .La Fisher's parser is based on the idea that the input string was generated using only a small, finite, part of the infinite .Df "strict grammar" that can be generated from the VW grammar. The parser tries to reconstruct this part of the strict grammar .Xr "strict grammar" on the fly while parsing the input. The actual parsing is done by a top-down interpretative LL(1) .Xr "LL(1)" parser, called the .Df "terminal parser" . It is driven by a fragment of the strict grammar .Xr "strict grammar" and any time the definition of a non-terminal is found missing by the terminal parser, .Xr "terminal parser" the latter asks another module, the .Df "strict syntax generator" , to try to construct it from the VW grammar. For this technique to work, the VW grammar has to satisfy three conditions: the defining CF grammar of each hyperrule is unambiguous, there are no free metanotions, .Xr "metanotion" and the skeleton grammar .Xr "skeleton grammar" (as defined by Wegner [VW 1980]) .Au Wegner VW 1980 is LL(1). .Xr "LL(1)" The parser system is organized as a set of concurrent processes (written in occam), .Xr "occam" with both parsers, all hyperrule matchers and several other modules as separate processes. The author claims that \(l"this concurrent organization ... is strictly a property of the algorithm, not of the implementation\(r", but a sequential, albeit slower, implementation seems quite possible. The paper gives heuristics for the automatic generation of the cross-reference needed for the skeleton grammar; .Xr "skeleton grammar" gives a method to handle .Df "general hyperrules" , "general hyperrule" hyperrules that fit all hypernotions, .Xr "hypernotion" efficiently; and pays much attention to the use of angle brackets .Xr "angle brackets" in VW grammars. %A Jacques Cohen %A Timothy J. Hickey %T Parsing and compiling using Prolog %J ACM Trans. Prog. Lang. Syst. %V 9 %N 2 %D April 1987 %P 125-164 %K done .La See same paper [CF 1987]. .EQ delim $$ .EN .P1 "General context-free parsers" %A E.T. Irons %T A syntax-directed compiler for ALGOL 60 %J Commun. ACM %V 4 %N 1 %D Jan 1961 %P 51-55 %K done .Xr "Algol 60" .La The first to describe a full parser. It is essentially a full backtracking .Xr "backtracking" recursive descent .Xr "recursive descent" left-corner parser. .Xr "left-corner parsing" The published program is corrected in a Letter to the Editor by B.H. Mayoh, @i[Commun. ACM], vol. 4, no. 6, p. 284, June 1961. %A Itiroo Sakai %T Syntax in universal translation %B Proceedings 1961 International Conference on Machine Translation of Languages and Applied Language Analysis %I Her Majesty's Stationery Office %C London %D 1962 %P 593-608 %K done .La .Xr "syntax" Using a formalism that seems equivalent to a CF grammar in Chomsky Normal .Xr "Chomsky Normal Form" Form and a parser that is essentially a CYK parser, .Xr "CYK parser" the author describes a translation mechanism in which the source language sentence is transformed into a binary tree (by the CYK parser). Each production rule carries a mark telling if the order of the two constituents should be reversed in the target language. The target language sentence is then produced by following this new order and by replacing words. A simple Japanese-to-English example is provided. %A E.T. Irons %T The structure and use of the syntax directed compiler %J Annual Review in Automatic Programming %V 3 %N %D 1962 %P 207-228 %K done .La Extended version of Irons [CF 1961]. .Au Irons CF 1961 %A E.T. Irons %T An error-correcting parse algorithm %J Commun. ACM %V 6 %N 11 %D Nov 1963 %P 669-673 %K done .La Contrary to the title, the most interesting part of this paper is the parser it describes, which is essentially Earley's algorithm .Xr "Earley parser" without look-ahead. The invention of this parser was prompted by the author's dissatisfaction with the error detection .Xr "error detection" properties of backtracking .Xr "backtracking" parsers. This one does not backtrack, it keeps all possible parsings in parallel instead. When the set of possible parsings becomes exhausted due to an error in the input, the last non-empty set is analysed for continuations, both terminal and non-terminal, including all successors and alternatives. Then input symbols are discarded until one is found that is a terminal in the continuation set or the beginning of a non-terminal in that set. Symbols are then inserted to bridge any possible gap thus created, and parsing continues. Note that this is essentially R\(:ohrich's algorithm. The author points out applications for this parser as a pattern matcher. %A Sheila A. Greibach %T Formal parsing systems %J Commun. ACM %V 7 %N 8 %D Aug 1964 %P 499-504 %K done .La \(l"A formal parsing system $ G = ( V , mu , T, R ) $ consists of two finite disjoint vocabularies, $V$ and $T$, a many-to-many map, $ mu $, from $V$ onto $T$, and a recursive set $R$ of strings in $T$ called syntactic sentence classes\(r" (verbatim). This is intended to solve an additional problem in parsing, which occurs often in natural languages: a symbol found in the input does not always uniquely identify a terminal symbol from the language (for instance, @i[will] (verb) versus @i[will] (noun)). On this level, the language is given as the entire set $R$, but in practice it is given through a \(l"context-free phrase structure .Xr "phrase structure grammar" generator\(r", i.e. a grammar. To allow parsing, this grammar is brought into what is now known as Greibach Normal Form: .Xr "Greibach Normal Form" each rule is of the form $ Z -> aY sub 1 ... Y sub m $. Now a .Df "directed production analyser" is defined which consists of an unlimited set of pushdown stores and an input stream, the entries of which are sets of terminal symbols, derived through $ mu $ from the lexical symbols. For each consecutive input entry, the machine scans the stores for a top non-terminal $Z$ for which there is a rule $ Z -> a Y sub 1 ... Y sub m $ with $a$ in the input set. A new store is filled with a copy of the old store and the top $Z$ is replaced by $ Y sub 1 ... Y sub m $; if the resulting store is longer than the input, it is discarded. Stores will contain non-terminals only. For each store that is empty when the input is exhausted, a parsing has been found. This is in effect non-deterministic top-down parsing with a one-symbol look-ahead. This is probably the first description of a parser that will work for any CF grammar. .Lp A large part of the paper is dedicated to undoing the damage done by converting to Greibach Normal Form. .Xr "Greibach Normal Form" %A T.V. Griffiths %A S.R. Petrick %T On the relative efficiencies of context-free grammar recognizers %J Commun. ACM %V 8 %N 5 %D May 1965 %P 289-300 %K done .La To achieve a unified view of the parsing techniques known at that time, the authors define a non-deterministic two-stack machine whose only type of instruction is the replacement of two given strings on the tops of both stacks by two other strings; the machine is started with the input on one stack and the start symbol on the other and it \(l"recognizes\(r" the input if both stacks get empty simultaneously. For each parsing technique considered, a simple mapping from the grammar to the machine instructions is given; the techniques covered are top-down (called top-down), left-corner .Xr "left-corner parsing" (called bottom-up) and bottom-up (called direct-substitution). Next, look-ahead techniques are incorporated to attempt to make the machine deterministic. The authors identify left-recursion .Xr "left-recursion" as a trouble-spot. All grammars are required to be $epsilon$-free. .Xr "$epsilon$-free" The procedures for the three parsing methods are given in a Letter to the Editor, @i[Commun. ACM], vol. 8, no. 10, p. 594, Oct 1965. %A Susumu Kuno %T The predictive analyzer and a path elimination technique %J Commun. ACM %V 8 %N 7 %D July 1965 %P 453-462 %K done .La The author extends his .Df "predictive analyser" (in modern terminology: an exhaustive top-down parser for grammars in Greibach Normal Form) .Xr "Greibach Normal Form" (see Kuno and Oettinger, reprinted by Grosz, Sparck\ Jones and Webber [NatLang 1986]) .Au Grosz NatLang 1986 .Au "Sparck\\\\ Jones" NatLang 1986 .Au Webber NatLang 1986 with a table of well-formed substrings. .Xr "well-formed substring table" Through ingenious bit manipulation the table is made to fit in a small memory. Time gains are considerable (as expected). %A Susumu Kuno %T An augmented predicative analyzer for context-free languages \- its relative efficiency %J Commun. ACM %V 9 %N 11 %D Nov 1966 %P 810-823 %K done .La Converting a CF grammar to Greibach Normal Form .Xr "Greibach Normal Form" often greatly distorts its structure. To keep track of the structure, the right-hand side of each rule in the CF grammar is prefixed with a marker, a special non-terminal which produces $epsilon$. A conversion algorithm is described that results in rules of the form $ A -> M sup "+" a B C ... $, where $ M sup "+" $ is a non-empty sequence of markers. The Kuno predictive analyser .Xr "predictive analyser" (see Kuno [CF 1965]) .Au Kuno CF 1965 is extended with a second stack on which the marker parts of the rules are kept. When a parsing is found, the marker stack allows easy reconstruction of the parsing according to the original CF grammar. The parser is compared to two other parsers, using a large number of criteria. %A D.H. Younger %T Recognition of context-free languages in time $ n sup 3 $ %J Inform. Control %V 10 %N 2 %D Feb 1967 %P 189-208 %K done .Xr "cubic time dependency" .La A Boolean recognition matrix $R$ is constructed in a bottom-up fashion, in which $ R[i, l, p] $ indicates that the segment of the input string starting at position $i$ with length $l$ is a production of non-terminal $p$. This matrix can be filled in $ O ( n sup 3 ) $ actions, where $n$ is the length of the input string. If $ R[0, n, 0] $ is set, the whole string is a production of non-terminal 0. Many of the bits in the matrix can never be used in any actual parsing; these can be removed by doing a top-down scan starting from $ R[0, n, 0] $ and removing all bits not reached this way. If the matrix contains integer rather than Boolean elements, it is easy to fill it with the number of ways a given segment can be produced by a given non-terminal; this yields the ambiguity rate. .Xr "ambiguity rate" %A S.H. Unger %T A global parser for context-free phrase structure grammars %J Commun. ACM %V 11 %N 4 %D April 1968 %P 240-247 %K done .La The Unger parser .Xr "Unger parser" (as described in Section 4.1) is extended with a series of tests to avoid partitionings that could never lead to success. For instance, a section of the input is never matched against a non-terminal if it begins with a token no production of the non-terminal could begin with. Several such tests are described and ways are given to statically derive the necessary information (FIRST sets, .Xr "FIRST set" LAST sets, .Xr "LAST set" EXCLUDE sets) .Xr "EXCLUDE set" from the grammar. Although none of this changes the exponential character of the algorithm, the tests do result in a considerable speed-up in practice. (There is an essential correction to one of the flowcharts given in \fICommun. ACM\fR, vol. 11, no. 6, p. 427, June 1968.) %A B.A. Chartres %A J.J. Florentin %T A universal syntax-directed top-down analyzer %J J. ACM %V 15 %N 3 %D July 1968 %P 447-464 %K done .La The non-deterministic two-stack top-down parser of Griffiths and Petrick [CF 1965] .Au Griffiths CF 1965 .Au Petrick CF 1965 is extended with a third stack and a status variable. One stack holds the rest of the input, the second holds the prediction that should match that input and the third holds a tracing of the outline of the production tree .Xr "production tree" constructed so far; when input and prediction stack .Xr "prediction stack" are empty, the third stack holds the completed parse tree. This three-stack mechanism can be run both forward and backward; the status variable keeps track of the direction. By properly reacting to the values on the tops of the stacks and the direction variable, it is possible to make the mechanism perform a full backtracking .Xr "backtracking" exhaustive search. Much work is spent on handling left recursion and $epsilon$-rules. .Xr "$epsilon$-rule" %A B\('alint D\(:om\(:olki %T A universal compiler system based on production rules %J BIT %V 8 %N 4 %P 262-275 %D Oct 1968 %K done .La The heart of the compiler system described here is a production system consisting of an ordered set of production rules, which are the inverses of the grammar rules; note that the notions \(l"left-hand side\(r" (lhs) and \(l"right-hand side\(r" (rhs) are reversed from their normal meanings in this abstract. The system attempts to derive the start symbol, by always applying the first applicable production rule (first in two respects: from the left in the string processed, and in the ordered set of production rules). This resolves shift/reduce conflicts in favour of reduce, and reduce/reduce conflicts by length and by the order of the production rules. When a reduction is found, the lhs of the reducing rule is offered for semantic processing and the rhs is pushed back into the input stream, to be reread. Since the length of the rhs is not restricted, the method can handle non-CF grammars. .Lp The so-called \(l"Syntactic Filter\(r" uses a bitvector .Xr "bitvector" technique to determine if, and if so which, production rule is applicable: for every symbol $i$ in the alphabet, .Xr "alphabet" there is a bitvector $B[i]$, with one bit for each of the positions in each lhs; this bit set to 1 if this position contains symbol $i$. There is also a bitvector $U$ marking the first symbol of each lhs, and a bitvector $V$ marking the last symbol of each lhs. Now, a stack of bitvectors $Q sub t$ is maintained, with $Q sub 0~=~0$ and $Q sub t~=~( ( Q sub t-1 ">>" 1 )$ \(OR $U )$ \(AN $B [i sub t ]$, where $i sub t$ is the $t$-th input symbol. $Q sub t$ contains the answer to the question whether the last $j$ symbols received are the first $j$ symbols of some lhs, for any lhs and $j$. A 1 \(l"walks\(r" through an lhs part of the $Q$ vector, as this lhs is recognized. An occurrence of a lhs is found if $Q sup t$ \(AN $V~!=~0$. After doing a replacement, $t$ is set back $k$ places, where $k$ is the length of the applied lhs, so a stack of $Q sub t$-s must be maintained. If some $Q sub t~=~0$, we have an error. An interesting implementation of the D\(:om\(:olki algorithm is given by Hext and Roberts [CF 1970]. .Au Hext CF 1970 .Au "Roberts, P.S." CF 1970 %A T. Kasami %A K. Torii %T A syntax-analysis procedure for unambiguous context-free grammars %J J. ACM %V 16 %N 3 %D July 1969 %P 423-431 %K done .La A rather complicated presentation of a variant of the CYK algorithm, .Xr "CYK parser" including the derivation of a $ O ( n sup 2 log ~ n ) $ .Xr "quadratic time dependency" time bound for unambiguous Chomsky Normal Form grammars. .Xr "Chomsky Normal Form" %A J. Earley %T An efficient context-free parsing algorithm %J Commun. ACM %V 13 %N 2 %D Feb 1970 %P 94-102 %K done .La This famous paper gives an informal description of the Earley algorithm. .Xr "Earley parser" The algorithm is compared both theoretically and experimentally with some general search techniques and with the CYK algorithm. .Xr "CYK parser" It easily beats the general search techniques. Although the CYK algorithm has the same worst-case efficiency as Earley's, .Xr "Earley parser" it requires $ O ( n sup 3 ) $ .Xr "cubic time dependency" on any grammar, whereas Earley's requires $ O ( n sup 2 ) $ .Xr "quadratic time dependency" on unambiguous grammars and $ O ( n ) $ on bounded-state grammars. The algorithm is easily modified to handle Extended CF grammars. .Xr "extended context-free grammar" (Also reprinted by Grosz, Sparck\ Jones and Webber [NatLang 1986]) .Au Grosz NatLang 1986 .Au "Sparck\\\\ Jones" NatLang 1986 .Au Webber NatLang 1986 %A J.B. Hext %A P.S. Roberts %T Syntax analysis by Dom\(:olki's algorithm %J Computer J. %V 13 %N 3 %D Aug 1970 %P 263-271 %K done .La D\(:om\(:olki's algorithm is a bottom-up parser in which the item sets are represented as bitvectors. .Xr "bitvector" A backtracking version is presented which can handle any grammar. To reduce the need for backtracking .Xr "backtracking" a 1-character look-ahead is introduced and an algorithm for determining the actions on the look-ahead is given. Since the internal state is recalculated by vector operations for each input character, the parse table .Xr "parse table" is much smaller than usual and its entries are one bit each. This, and the fact that it is all bitvector operations, makes the algorithm suitable for implementation in hardware. %A Bernard Lang %T Parallel non-deterministic bottom-up parsing %J ACM SIGPLAN Notices %V 6 %N 12 %D Dec 1971 %P 56-57 %K done .La The full breadth-first search .Xr "breadth-first search" of an Earley parser .Xr "Earley parser" is limited through the use of weak-precedence .Xr "weak precedence" relations, .Xr "precedence relation" in so far as these are unique. Abstract of a larger technical report. %A F.P. Kaminger %T Generation, recognition and parsing of context-free languages by means of recursive graphs %J Computing %V 11 %N 1 %D 1973 %P 87-96 %K done .La Formal description of the use of recursive graphs instead of CF grammars to describe, generate and parse context-free languages. %A Bernard Lang %T Deterministic techniques for efficient non-deterministic parsers %B Automata, languages and programming %S Lecture Notes in Computer Science #14 %E J. Loeckx %I Springer-Verlag %C Berlin %D 1974 %P 255-269 %K done .Xr "deterministic parser" .La Explores the theoretical properties of doing breadth-first search .Xr "breadth-first search" to resolve the non-determinism in a bottom-up automaton with conflicts. See Tomita .Xr "Tomita parser" [CF 1986] .Au Tomita CF 1986 for a practical realization. %A M. Bouckaert %A A. Pirotte %A M. Snelling %T Efficient parsing algorithms for general context-free parsers %J Inform. Sci. %V 8 %N 1 %D Jan 1975 %P 1-26 %K done .La The authors observe that the Predictor in an Earley parser .Xr "Earley parser" will often predict items that start with symbols that can never match the first few symbols of the present input; such items will never bear fruit and could as well be left out. To accomplish this, they extend the $k$-symbol reduction look-ahead .Xr "reduction look-ahead" Earley parser .Xr "Earley parser" with a $t$-symbol prediction mechanism; this results in very general $ M sub k sup t $ parsing machines, the properties of which are studied, in much formal detail. Three important conclusions can be drawn. Values of $k$ or $t$ larger than one lose much more on processing than they will normally gain on better prediction and sharper reduction; such parsers are better only for asymptotically long input strings. The Earley parser without look-ahead ($ M sub 0 sup 0 $) performs better than the parser with 1 symbol look-ahead; Earley's recommendation to use always 1 symbol look-ahead is unsound. The best parser is $ M sub 0 sup 1 $; i.e. use a one symbol predictive look-ahead and no reduction look-ahead. .Xr "reduction look-ahead" %A L. Valiant %T General context-free recognition in less than cubic time %J J. Comput. Syst. Sci. %V 10 %D 1975 %P 308-315 %K done .Xr "Valiant parser" .Xr "cubic time dependency" .La .Xr "time requirements" Reduces CYK .Xr "CYK parser" to bit matrix multiplication and then applies .Fs Strassen's Volker Strassen, \(l"Gaussian elimination is not optimal\(r", @i[Numerische Mathematik], vol. 13, p. 354-356, 1969. Shows how to multiply two 2\(mu2 matrices using 7 multiplications rather than 8 and extends the principle to larger matrices. .Fe algorithm. %A C.H.A. Koster %T A technique for parsing ambiguous grammars %B GI-4. Jahrestagung %S Lecture Notes in Computer Science #26 %E D. Siefkes %I Springer-Verlag %C New York %D 1975 %P 233-246 %K done .La .Xr "ambiguity" Three recursive-descent .Xr "recursive descent" parsing techniques are described: no backtrack, .Xr "backtracking" partial backtrack and full backtrack. %A B. Sheil %T Observations on context-free parsing %J Statistical Methods in Linguistics %D 1976 %P 71-109 %K done .La The author proves that any CF backtracking .Xr "backtracking" parser will have polynomial .Xr "polynomial time dependency" time requirements if provided with a @i[well-formed substring table] .Xr "well-formed substring table" (WFST), which holds the well-formed substrings recognized so far and which is consulted before each attempt to recognize a substring. The time requirements .Xr "time requirements" of the parser is $ O ( n sup { c + 1 } ) $ where $c$ is the maximum number of non-terminals in any right-hand side. A .Df "2-form grammar" is a CF grammar such that no production rule in the grammar has more than two non-terminals on the right-hand side; nearly all practical grammars are already 2-form. 2-form grammars, of which Chomsky Normal Form .Xr "Chomsky Normal Form" grammars are a subset, can be parsed in $ O ( n sup 3 ) $. .Xr "cubic time dependency" An algorithm for a dividing top-down parser with a WFST is supplied. Required reading for anybody who wants to write or use a general CF grammar. Many practical hints and opinions (controversial and otherwise) are given. %A Susan L. Graham %A Michael A. Harrison %T Parsing of general context-free languages %B Advances in Computing, Vol. 14 %I Academic Press %C New York %D 1976 %P 77-185 %K done .La The 109 page article describes three algorithms in a more or less unified manner: CYK, .Xr "CYK parser" Earley's .Xr "Earley parser" and Valiant's. .Xr "Valiant parser" The main body of the paper is concerned with bounds for time and space requirements. .Xr "space requirements" .Xr "time requirements" Sharper bounds than usual are derived for special grammars, for instance, for linear grammars. .Xr "linear grammar" %A Jaroslav Kr\('al %T A top-down no backtracking parsing of general context-free languages %B Mathematical Foundations of Computer Science %S Lecture Notes in Computer Science #53 %E J. Gruska %I Springer-Verlag %C Berlin %D 1977 %P 333-341 %K done .La .Xr "backtracking" The states of a top-down breadth-first general CF parser are combined whenever possible, resulting in an Earley-like parser .Xr "Earley parser" without the bottom-up component. %A G.K. Manacher %T An improved version of the Cocke-Younger-Kasami algorithm %J Comput. Lang. (Elmsford, NY) %V 3 %D 1978 %P 127-133 %K done .La This paper discusses some modifications to the CYK algorithm .Xr "CYK parser" that make it more efficient. First, the \(l"length induction\(r" iteration of CYK is replaced by an iteration that combines sets of non-terminals that derive strings of length $ j - 1 $ with sets of non-terminals that derive strings of length $ k <= j - 1 $. Then, the recognition table .Xr "recognition table" of CYK .Xr "CYK parser" is replaced by three tables of lists, where each table has a list for each non-terminal/number pair. The first table maps a non-terminal/length pair to a list of positions, indicating where substrings of this length start that are derived by this non-terminal. The second table maps a non-terminal/position pair to a list of lengths, indicating the lengths of the substrings starting at this position that are derived by this non-terminal. The third table maps a non-terminal/position pair to a list of lengths, indicating the lengths of the substrings ending at this position that are derived by this non-terminal. With these modifications a time bound $ O ( s ( n ) ) $ is established for unambiguous grammars, where $ s ( n ) $ is the number of triplets $ ( A , i , j ) $ for which the non-terminal $A$ derives the substring starting at position $i$, with length $j$. This is at worst $ O ( n sup 2 ) $. .Xr "quadratic time dependency" %A W.L. Ruzzo %T On the complexity of general context-free language parsing and recognition %B Automata, Languages and Programming %S Lecture Notes in Computer Science #71 %E Hermann A. Maurer %I Springer-Verlag %C Berlin %D 1979 %P 489-497 %K done .La This is an extended abstract, summarizing some time requirement .Xr "time requirements" results: it is shown that parsing strings of length $n$ is only $ O ( log ~ n ) $ harder than just recognizing them. Also, the time to multiply $ { sqrt n } * { sqrt n } $ Boolean matrices is a lower bound on the time needed to recognize all prefixes of a string, and this, in turn, is a lower bound on the time needed to generate a convenient representation of all parses of a string (basically the CYK recognition table, .Xr "CYK parser" .Xr "recognition table" but reduced so that a non-terminal only is present in the recognition table .Xr "recognition table" if it can be used to derive the sentence). %A S.L. Graham %A M.A. Harrison %A W.L. Ruzzo %T An improved context-free recognizer %J ACM Trans. Prog. Lang. Syst. %V 2 %N 3 %D July 1980 %P 415-462 %K done .La The well-formed substring table .Xr "well-formed substring table" of the CYK parser .Xr "CYK parser" is filled with dotted items as in an LR parser rather than with the usual non-terminals. This allows the number of objects in each table entry to be reduced considerably. Special operators are defined to handle $epsilon$- and unit rules. .Lp The authors do not employ any look-ahead in their parser; they claim that constructing the recognition triangle .Xr "recognition table" is pretty fast already and that probably more time will be spent in enumerating and analysing the resulting parse trees. They speed up the latter process by removing all useless entries before starting to generate parse trees. To this end, a top-down sweep through the triangle is performed, similar to the scheme to find all parse trees, which just marks all reachable entries without following up any of them twice. The non-marked entries are then removed (p. 443). .Lp Much attention is paid to efficient implementation, using ingenious data structures. %A A. Bossi %A N. Cocco %A L. Colussi %T A divide-and-conquer approach to general context-free parsing %J Inform. Process. Lett. %V 16 %N 4 %D May 1983 %P 203-208 %K done .La The proposed parsing method yields for a string $T$ two sets: a set of partial parse trees that may be incomplete at their left edge (which then coincides with the left end of $T$), called $L$, and a similar right-edge set called $R$. To parse a string, it is cut in two pieces, each is parsed and the $R$ set of the left-hand piece is combined with the $L$ set of the right-hand piece. %A Masaru Tomita %T Efficient parsing for natural language %I Kluwer Academic Publishers %C Boston %D 1986 %P 201 %K done .La Tomita .Xr "Tomita parser" describes an efficient parsing algorithm to be used in a \(l"natural-language setting\(r": input strings of some tens of words and considerable but not pathological ambiguity. .Xr "ambiguity" The algorithm is essentially LR, starting parallel parses when an ambiguity is found in the LR-table. .Xr "parse table" Full examples are given of handling ambiguities, lexical elements with multiple meanings and unknown lexical elements. .Lp The algorithm is compared extensively to Earley's algorithm .Xr "Earley parser" by measurement and it is found to be consistently five to ten times faster than the latter, in the domain for which it is intended. Earley's algorithm is better in pathological cases; Tomita's fails on unbounded ambiguity. .Xr "ambiguity" No time bounds are given explicitly, but graphs show a behaviour better than $ O ( n sup 3 ) $. .Xr "cubic time dependency" Bouckaert's algorithm (Bouckaert, Pirotte and Snelling [CF 1975]) .Au Bouckaert CF 1975 .Au Pirotte CF 1975 .Au Snelling CF 1975 is shown to be between Earley's .Xr "Earley parser" and Tomita's in speed. .Lp MacLisp programs of the various algorithms are given and the application in the Nishida and Doshita Machine Translation System is described. %A Eiichi Tanaka %A Mitsuru Ikeda %A Kimio Ezure %T Direct parsing %J Patt. Recog. %V 19 %N 4 %D 1986 %P 315-323 %K done .La Variants of Unger's .Xr "Unger parser" and Earley's parser .Xr "Earley parser" are compared in a chromosome recognition .Xr "chromosome recognition" situation. The possibility of stopping the Unger parser after the first parsing has been found is exploited. %A Jacques Cohen %A Timothy J. Hickey %T Parsing and compiling using Prolog %J ACM Trans. Prog. Lang. Syst. %V 9 %N 2 %D April 1987 %P 125-164 %K done .Xr "Prolog" .La Several methods are given to convert grammar rules into Prolog clauses. .Xr "Prolog clause" In the bottom-up method, a rule $ E -> E + T $ corresponds to a clause $ reduce([n(t),t(+),n(e)|X], [n(e)|X]) $ where the parameters represent the stack before and after the reduction. In the top-down method, a rule $ T prime -> * F T prime $ corresponds to a clause $ rule(n(tprime), [t(*),n(f),n(tprime)]). $ A recursive descent .Xr "recursive descent" parser is obtained by representing a rule $ S -> a S b $ by the clause $ s(ASB) $ $:-$ $ append(A,SB,ASB), append(S,B,SB), a(A), s(S), b(B). $ which attempts to cut the input list $ASB$ into three pieces $A$, $S$ and $B$, which can each be recognized as an $a$, an $s$ and a $b$, respectively. A fourth type of parser results if ranges in the input list are used as parameters: $ s(X1,X4) $ $:-$ $ link(X1,a,X2), s(X2,X3), link(X3,b,X4) $ in which $ link(P,x,Q) $ describes that the input contains the token $x$ between positions $P$ and $Q$. For each of these methods, ways are given to limit non-determinism and backtracking, .Xr "backtracking" resulting among others in LL(1) .Xr "LL(1)" parsers. .Lp By supplying additional parameters to clauses, context conditions can be constructed and carried around, much as in a VW grammar (although this term is not used). It should be noted that the resulting Prolog .Xr "Prolog" programs are actually not parsers at all: they are just logic systems that connect input strings to parsings. Consequently they can be driven both ways: supply a string and it will produce the parsing; supply a parsing and it will produce the string; supply nothing and it will produce all strings with their parsings in the language. .Lp As a separate topic, it is shown that Prolog .Xr "Prolog" is an effective language to do grammar manipulation in: calculation of FIRST .Xr "FIRST set" and FOLLOW sets, .Xr "FOLLOW set" etc. As an equally unrelated topic, examples of code generation in Prolog .Xr "Prolog" are shown. %A Masaru Tomita %T An efficient augmented-context-free parsing algorithm %J Am. J. Comput. Linguist. %V 13 %N 1-2 %D Jan-June 1987 %P 31-46 %K done .La Tomita's parser .Xr "Tomita parser" [CF 1986] .Au Tomita CF 1986 is extended with Boolean functions for the non-terminals that decide if a proposed reduce is applicable given the context. A method for deriving these functions in Lisp from more abstract specifications is given. .EQ delim $$ .EN .P1 "LL parsing" %A R. Kurki-Suonio %T On top-to-bottom recognition and left recursion %J Commun. ACM %V 9 %N 7 %D July 1966 %P 527-528 %K done .La Gives a good account of Greibach's algorithm for the removal of left-recursion .Xr "left-recursion" from a grammar. The resulting distortion of the parsing process is countered by leaving ($epsilon$-producing) markers in the grammar at the original ends of the right-hand sides in a left-recursive rule. This 2-page paper also gives an algorithm for removing $epsilon$-rules. .Xr "$epsilon$-rule" Again, these leave markers behind, which can interfere with the markers from a possibly subsequent removal of left-recursion. .Xr "left-recursion" Rules for solving this interference are given. %A K. \*(vCulik\ II %T Contribution to deterministic top-down analysis of context-free languages %J Kybernetica %V 5 %N 4 %P 422-431 %D 1968 %K done .La This paper introduces .Df LL(f) "" "LL($f$)" grammars where $f$ is a function mapping strings of terminals to an arbitrary range, always uniquely determining a right-hand side. $f$ is called a @i[distinctive function]. %A P.M. Lewis\ II %A R.E. Stearns %T Syntax-directed transduction %J J. ACM %V 15 %N 3 %D 1968 %P 465-488 %K done .La .Xr "transduction grammar" Although this article is about transduction, it is often given as a reference for LL($k$), .Xr "LL($k$)" because it is one of the first articles discussing the LL($k$) .Xr "LL($k$)" property, and it has an appendix on the recognition of LL($k$) languages. %A D. Wood %T The theory of left factored languages, Part I %J Computer J. %V 12 %N 4 %D 1969 %P 349-356 %K done .La A description of a variant of LL(1) .Xr "LL(1)" grammars and parsing. %A R. Kurki-Suonio %T Notes on top-down languages %J BIT %V 9 %N %D 1969 %P 225-238 %K done .La Gives several variants of the LL($k$) .Xr "LL($k$)" condition. Also demonstrates the existence of an LL($k$) language which is not LL($k - 1$). %A D. Wood %T The theory of left factored languages, Part II %J Computer J. %V 13 %N 1 %D 1970 %P 55-62 %K done .La More results about LL(1) .Xr "LL(1)" and LL($k$) .Xr "LL($k$)" grammars, including a recursive-descent .Xr "recursive descent" parser in pseudo-Algol 60. .Xr "Algol 60" %A D.J. Rosenkrantz %A R.E. Stearns %T Properties of deterministic top-down grammars %J Inform. Control %V 17 %N %D 1970 %P 226-256 %K done .La Many formal properties of LL($k$) .Xr "LL($k$)" grammars are derived and tests for LL($k$) and strong-LL($k$) .Xr "strong-LL($k$)" are given. %A Donald E. Knuth %T Top-down syntax analysis %J Acta Inform. %V 1 %N %D 1971 %P 79-110 %K done .La A @i[Parsing Machine] (PM) is defined, which is effectively a set of mutually recursive Boolean functions which absorb input if they succeed and absorb nothing if they fail. Properties of the languages accepted by PMs are examined. This leads to CF grammars, dependency graphs, the null string problem, back-up, LL($k$), .Xr "LL($k$)" follow-function, LL(1), .Xr "LL(1)" s-languages and a comparison of top-down versus bottom-up parsing. The author is one of the few scientists who provide insight in their thinking process. %A Paul W. Abrahams %T A syntax-directed parser for recalcitrant grammars %J Intern. J. Comput. Math. %V A3 %N %D 1972 %P 105-115 %K done .La LL(1) .Xr "LL(1)" parsing with conflict resolvers, .Xr "conflict resolver" called @i[oracles]. %A M. Griffiths %T LL(1) grammars and analyzers %B Compiler Construction: an advanced course %S Lecture Notes in Computer Science #21 %E F.L. Bauer & J. Eickel %I Springer-Verlag %C New York %D 1974 %P 57-84 %K done .La A discussion of the LL(1) .Xr "LL(1)" property, including a decision algorithm and the production of an analyser in the form of executable text. These lecture notes also discuss some grammar transformations, .Xr "transformations on grammars" including elimination of left-recursion, .Xr "eliminating left recursion" factorization, and substitution. Semantic insertions (or hooks for semantic actions) .Xr "semantic action" are also given some attention. %A T. Komor %T A note on left-factored languages %J Computer J. %V 17 %N 3 %D 1974 %P 242-244 %K done .La Points out an error in a paper by Wood on left-factored languages [LL 1970], .Au Wood LL 1970 and suggests an extension to Fosters SID [Transform 1968] .Xr "transformations on grammars" .Au Foster Transform 1968 involving $epsilon$-rules. .Xr "$epsilon$-rule" %A S. Jarzabek %A T. Krawczyk %T LL-regular grammars %J Inform. Process. Lett. %V 4 %N 2 %D 1975 %P 31-37 %K done .La Introduces LL-regular .Xr "LL-regular" (LLR) .Xs "LLR" "LL-regular" grammars: for every rule $A -> alpha sub 1 | ... | alpha sub n$, a partition ($R sub 1 , ... , R sub n$) of disjoint regular sets must be given such that the rest of the input sentence is a member of exactly one of these sets. A parser can then be constructed by creating finite-state .Xr "finite-state automaton" automata for these sets, and letting these finite state automata .Xr "finite-state automaton" determine the next prediction. %A A. Nijholt %T On the parsing of LL-regular grammars %B Mathematical Foundations of Computer Science %S Lecture Notes in Computer Science #45 %E A. Mazurkiewicz %I Springer-Verlag %C Berlin %D 1976 %P 446-452 %K done .La Derives a parsing algorithm for LL-regular .Xr "LL-regular" grammars with a regular pre-scan from right to left that leaves markers, and a subsequent scan which consists of an LL(1)-like .Xr "LL(1)" parser. %A D. Wood %T A bibliography of top-down parsing %J ACM SIGPLAN Notices %V 13 %N 2 %D Feb 1978 %P 71-76 %K done .La Contains some 90 literature references up to 1978 on deterministic top-down parsing and related issues. %A J. Lewi %A K. de\ Vlaminck %A J. Huens %A M. Huybrechts %T The ELL(1) parser generator and the error-recovery mechanism %J Acta Inform. %V 10 %P 209-228 %D 1978 %K done .La See same paper [ErrHandl 1978]. %A V.W. Setzer %T Non-recursive top-down syntax analysis %J Softw. Pract. Exper. %V 9 %N 1 %D 1979 %P 237-245 %K done .La Compares recursive and non-recursive implementations of table-driven .Xr "table-driven" top-down parsers. The introduction of actions is facilitated by implementing the driver and the tables as a loop over a case statement (on the states) over case statements (on the input token). %A Stephan Heilbrunner %T On the definition of ELR($k$) and ELL($k$) grammars %J Acta Inform. %V 11 %N %D 1979 %P 169-176 %K done .La Comparison and analysis of various definitions of extended LL($k$) .Xr "extended LL($k$)" and extended LR($k$), .Xr "extended LR($k$)" based on the transformations involved. %A D.R. Milton %A L.W. Kirchhoff %A B.R. Rowland %T An ALL(1) compiler generator %J ACM SIGPLAN Notices %V 14 %N 8 %D Aug 1979 %P 152-157 %K done .La Presents an LL(1) .Xr "LL(1)" parser generator .Xr "parser generator" and attribute .Xr "attribute" evaluator which allows LL(1) conflicts to be solved by examining attribute values; the generated parsers use the error correction .Xr "error correction" algorithm of Fischer, Milton and Quiring [ErrHandl 1980]. .Au Fischer ErrHandl 1980 .Au Milton ErrHandl 1980 .Au Quiring ErrHandl 1980 %A D.A. Poplawski %T On LL-regular grammars %J J. Comput. Syst. Sci. %V 18 %N %D 1979 %P 218-227 %K done .La Presents proof that, given a regular partition, it is decidable whether a grammar is LL-regular .Xr "LL-regular" with respect to this partition; it is undecidable whether or not such a regular partition exists. The paper then discusses a two-pass parser; the first pass works from right to left, marking each terminal with an indication of the partition that the rest of the sentence belongs to. The second pass then uses these indications for its predictions. %A V.N. Glushkova %T Lexical analysis of LL($k$) languages %J Program. Comput. Softw. %V 5 %N %D 1979 %P 166-172 %K done .La Examines the reduction of LL($k$) .Xr "LL($k$)" grammars to simple-LL(1) .Xr "simple LL(1)" grammars by combining terminal symbols into new terminal symbols. %A J. Cohen %A R. Sitver %A D. Auty %T Evaluating and improving recursive descent parsers %J IEEE Trans. Softw. Eng. %V SE-5 %N 5 %D Sept 1979 %P 472-480 %K done .La Derives formulas which express the execution time of systematically generated recursive descent parsers, .Xr "recursive descent" and uses these formulas to estimate the gain of various optimizations, such as the elimination of some routine calls and merging of common code. %A S. Sippu %A E. Soisalon-Soininen %T On constructing LL($k$) parsers %B Automata, Languages and Programming %S Lecture Notes in Computer Science #71 %E H.A. Maurer %I Springer-Verlag %C Berlin %D 1979 %P 585-595 %K done .La Presents a method for constructing canonical LL($k$) .Xr "LL($k$)" parsers that can be regarded as the dual to the LR($k$) .Xr "LR($k$)" technique of items and viable prefixes. In the LL($k$) .Xr "LL($k$)" method we have LL($k$) items .Xr "LL($k$) item" and viable suffixes. .Xr "viable suffix" Like in the LR case, the LL($k$) method also has LA($p$)LL($k$) and SLL($k$) .Xr "SLL(1) (Simple LL(1))" variants; the SLL($k$) variant coincides with the strong-LL($k$) .Xr "strong-LL($k$)" grammars. Note that, although the S of SLL stands for Simple, this is not the same Simple LL as the simple LL discussed in chapter 8. %A A. Nijholt %T LL-regular grammars %J Intern. J. Comput. Math. %V A8 %N %D 1980 %P 303-318 %K done .La This paper discusses strong-LL-regular grammars, which are a subset of the LL-regular .Xr "LL-regular" grammars, exactly as the strong-LL($k$) .Xr "strong-LL($k$)" grammars are a subset of the LL($k$) .Xr "LL($k$)" grammars, and derives some properties. %A Seppo Sippu %A Eljas Soisalon-Soininen %T On LL($k$) parsing %J Inform. Control %V 53 %N 3 %D June 1982 %P 141-164 %K done .Xr "LL($k$)" .La Theoretical background to Sippu and Soisalon-Soininen [LL 1979]. .Au Sippu LL 1979 .Au Soisalon-Soininen LL 1979 %A K. John Gough %T A new method of generating LL(1) look-ahead sets %J ACM SIGPLAN Notices %V 20 %N 6 %D June 1985 %P 16-19 %K done .La Presents an efficient method for computing the FIRST .Xr "FIRST set" and FOLLOW sets, .Xr "FOLLOW set" using \(l"begun-by\(r", \(l"precedes\(r", and \(l"ends\(r" relations. %A Thomas J. Sager %T A technique for creating small fast compiler front ends %J ACM SIGPLAN Notices %V 20 %N 10 %D Oct 1985 %P 87-94 %K done .La Presents a predictive parser that has its tables compacted .Xr "table compression" .Xs "table compaction" "table compression" through the use of a minimal perfect hash function, thus making them very small. An example is given for the Pascal .Xr "Pascal" language. %A Barry Dwyer %T Improving Gough's LL(1) look-ahead generator %J ACM SIGPLAN Notices %V 20 %N 11 %D Nov 1985 %P 27-29 %K done .La Refer to Gough [LL 1985]. .Au Gough LL 1985 Improves on Gough's algorithm by not computing those FIRST .Xr "FIRST set" and FOLLOW sets .Xr "FOLLOW set" that are not needed for the LL(1) .Xr "LL(1)" parser generation. %A David R. Hanson %T Compact recursive-descent parsing of expressions %J Softw. Pract. Exper. %V 15 %N 12 %D Dec 1985 %P 1205-1212 %K done .La Discusses recursive descent .Xr "recursive descent" parsing of expressions by using a precedence table .Xr "precedence table" for the operators instead of a parsing routine for each precedence level. There is for instance only one routine for expressions involving binary operators; the precedence of the expression to be parsed is a parameter. %A Reinhold Heckmann %T An efficient ELL(1)-parser generator %J Acta Inform. %V 23 %N %D 1986 %P 127-148 %K done .La The problem of parsing with an ELL(1) .Xr "extended LL(1)" grammar is reduced to finding various FIRST .Xr "FIRST set" and FOLLOW sets. .Xr "FOLLOW set" Theorems about these sets are derived and very efficient algorithms for their calculation are supplied. %A Dick Grune %A Ceriel J.H. Jacobs %T A programmer-friendly LL(1) parser generator %J Softw. Pract. Exper. %V 18 %N 1 %D Jan 1988 %P 29-38 %K done .La Presents a practical ELL(1) .Xr "extended LL(1)" parser generator, .Xr "parser generator" called @i[LLgen], that generates fast error correcting recursive descent parsers. .Xr "recursive descent" In addition to the error correction, .Xr "error correction" @i[LLgen] features static as well as dynamic conflict resolvers .Xr "conflict resolver" and a separate compilation facility. The grammar can be viewed as a program, allowing for a natural positioning of semantic actions. .Xr "semantic action" %A Keiichi Yoshida %A Yoshiko Takeuchi %T Some properties of an algorithm for constructing LL(1) parsing tables using production indices %J J. Inform. Process. %V 11 %N 4 %D 1988 %P 258-262 %K done .La Presents an LL(1) .Xr "LL(1)" parse table .Xr "parse table" algorithm that, rather than first computing FIRST .Xr "FIRST set" and FOLLOW sets, .Xr "FOLLOW set" computes a so-called FIRST-table and FOLLOW-table, which are indexed by a (non-terminal, symbol) pair, and deliver a grammar rule number. %A H. Dobler %A K. Pirklbauer %T Coco-2, \- A new compiler compiler %J ACM SIGPLAN Notices %V 25 %N 5 %D May 1990 %P 82-90 %K done .La The authors present an integrated system consisting of a lexical phase using a heavily reduced FS automaton, .Xr "finite-state automaton" and a syntactic phase which uses a table-driven .Xr "table-driven" LL(1) .Xr "LL(1)" parser. Semantic actions .Xr "semantic action" are interspersed in the syntactic phase. .EQ delim $$ .EN .P1 "LR parsing" %A D.E. Knuth %T On the translation of languages from left to right %J Inform. Control %V 8 %D 1965 %P 607-639 %K done .La This is the original paper on LR($k$). .Xr "LR($k$)" It defines the notion as an abstract property of a grammar and gives two tests for LR($k$). .Xr "LR($k$)" The first works by constructing for the grammar a regular grammar which generates all possible already reduced parts (= stacks) plus their look-aheads; if this grammar has the property that none of its words is a prefix to another of its words, the original grammar was LR($k$). .Xr "LR($k$)" The second consists of implicitly constructing all possible item sets (= states) and testing for conflicts. Since none of this is intended to yield a reasonable parsing algorithm, notation and terminology differs from that in later papers on the subject. Several theorems concerning LR($k$) grammars are given and proved. %A A.J. Korenjak %T A practical method for constructing LR($k$) processors %J Commun. ACM %V 12 %N 11 %D Nov 1969 %P 613-623 %K done .Xr "LR($k$)" .La The huge LR(1) .Xr "LR(1)" parsing table is partitioned as follows. A non-terminal $Z$ is chosen judiciously from the grammar, and two grammars are constructed, $ G sub 0 $, in which $Z$ is considered to be a terminal symbol, and $ G sub 1 $, which is the grammar for $Z$ (i.e. which has $Z$ as the start symbol). If both grammars are LR(1) and moreover a master LR(1) parser can be constructed that controls the switching back and forth between $ G sub 0 $ and $ G sub 1 $, the parser construction succeeds (and the original grammar was LR(1) too). The three resulting tables together are much smaller than the LR(1) table for the original grammar. It is also possible to chose a set of non-terminals $ Z sub 1 ... Z sub n $ and apply a variant of the above technique. %A David Pager %T A solution to an open problem by Knuth %J Inform. Control %V 17 %N %D 1970 %P 462-473 %K done .La Highly mathematical paper concerning the properties of certain partitions of the states of an LR(1) .Xr "LR(1)" parser with a view to reducing the size of the LR automaton. %A H. Langmaack %T Application of regular canonical systems to grammars translatable from left to right %J Acta Inform. %V 1 %N %D 1971 %P 111-114 %K done .La Different proof of the decidability of LR($k$). .Xr "LR($k$)" %A Franklin L. DeRemer %T Simple LR($k$) grammars %J Commun. ACM %V 14 %N 7 %D July 1971 %P 453-460 %K done .La SLR($k$) .Xr "SLR($k$)" explained by its inventor. Several suggestions are made on how to modify the method; use a possibly different $k$ for each state; use possibly different lengths for each look-ahead string. The relation to Korenjak's approach [LR 1969] .Au Korenjak LR 1969 is also discussed. %A A.V. Aho %A J.D. Ullman %T Optimization of LR($k$) parsers %J J. Comput. Syst. Sci. %V 6 %N %D 1972 %P 573-602 %K done .La An algorithm is given to determine which entries in an LR($k$) .Xr "LR($k$)" table can never be accessed; the values of these entries are immaterial (so-called .Df "don't-care entries" ) "don't-care entry" and can be merged with other values. A second algorithm is given to determine which error entries could be merged with which reduce entry, with the only result that error detection .Xr "error detection" is postponed. Both algorithms and a merging technique are used to reduce table size. It is proved that using these algorithms, one can produce SLR(1) .Xr "SLR(1)" and LALR(1) .Xr "LALR(1)" tables. It is also proved that SLR(1) .Xr "SLR(1)" is identical to Korenjak's method [LR 1969] .Au Korenjak LR 1969 with all non-terminals selected. See also Soisalon-Soininen [LR 1982]. .Au Soisalon-Soininen LR 1982 %A David S. Wise %T Generalized overlap resolvable grammars and their parsers %J J. Comput. Syst. Sci. %V 6 %N %D Dec 1972 %P 538-572 %K done .La See same paper [Precedence 1972]. %A T. Anderson %A J. Eve %A J.J. Horning %T Efficient LR(1) parsers %J Acta Inform. %V 2 %N %D 1973 %P 12-39 %K done .La Coherent explanation of SLR(1), .Xr "SLR(1)" LALR(1), .Xr "LALR(1)" elimination of unit rules .Xr "unit rule" and table compression, .Xr "table compression" with good advice. %A Karel \*(vCulik\ II %A Rina Cohen %T LR-regular grammars \- an extension of LR($k$) grammars %J J. Comput. Syst. Sci. %V 7 %N %D 1973 %P 66-96 %K done .La The input is scanned from right to left by a FS automaton .Xr "finite-state automaton" which records its state at each position. Next this sequence of states is parsed from left to right using an LR(0) .Xr "LR(0)" parser. If such a FS automaton .Xr "finite-state automaton" and LR(0) parser exist, the grammar is LR-regular. .Xr "LR-regular" The authors conjecture, however, that it is unsolvable to construct this automaton and parser. Examples are given of cases in which the problem can be solved. %A A.V. Aho %A J.D. Ullman %T A technique for speeding up LR($k$) parsers %J SIAM J. Computing %V 2 %N 2 %D June 1973 %P 106-127 %K done .Xr "LR($k$)" .La Describes two detailed techniques to eliminate unit rules, .Xr "unit rule" one by recognizing particular stack configurations and one by merging shifts on non-terminals (GOTO's). .Xr "GOTO-action" %A Shoji Sekimoto %A Kuniaki Mukai %A Masaru Sudo %T A method of minimizing LR($k$) parsers %J Systems, Computers and Control %V 4 %N 5 %D 1973 %P 73-80 %K done .Xr "LR($k$)" .La The states of an LR(1) .Xr "LR(1)" parser are grouped into classes by one of several equivalence relations. The parser records only in which class it is, not in which state. When a reduction is called for, additional computation is required to determine which reduction. The tables for the class transitions are much smaller than those for the state transitions. .Xr "state transition" %A Jaroslav Kr\('al %A Ji\*(vr\('i Demner %T A note on the number of states of the DeRemer's recognizer %J Inform. Process. Lett. %V 2 %N %D 1973 %P 22-23 %K done .La Gives a formula for the number of states of an SLR(1) .Xr "SLR(1)" parser for an LL(1) .Xr "LL(1)" grammar. %A A.V. Aho %A S.C. Johnson %T LR parsing %J Computing Surveys %V 6 %N 2 %D 1974 %P 99-124 %K done .La LR parsing explained in a readable fashion, by the experts. Required reading. %A Matthew M. Geller %A Susan L. Graham %A Michael A. Harrison %T Production prefix parsing %B Automata, languages and programming %S Lecture Notes in Computer Science #14 %E J. Loeckx %I Springer-Verlag %C Berlin %D 1974 %P 232-241 %K done .La The items in a non-deterministic LR(0|1) automaton .Xr "non-deterministic automaton" are simplified in that rather than $A -> beta "\*(Dt" gamma$ only $beta$ (the .Df "production prefix" ) is recorded. If the corresponding deterministic automaton .Xr "deterministic automaton" is free of conflicts and has no compound states (that is, each state contains only one production prefix) .Xr "production prefix" the grammar is a @i[production prefix grammar]. Table size is proportional to grammar size. Production prefix(1) .Xr "production prefix" is between simple precedence .Xr "simple precedence" and SLR(1) .Xr "SLR(1)" in power. %A David Pager %T On eliminating unit productions from LR($k$) parsers %B Automata, languages and programming %S Lecture Notes in Computer Science #14 %E J. Loeckx %I Springer-Verlag %C Berlin %D 1974 %P 242-254 %K done .Xr "LR($k$)" .La The unit rules (and only the unit rules) .Xr "unit rule" of the grammar are collected in a directed graph, which is a set of multi-rooted trees (no cycles allowed). For each leaf, the states of all its predecessors are contracted. %A J.J. Horning %T LR grammars and analyzers %B Compiler Construction, an Advanced Course %S Lecture Notes in Computer Science #21 %E F.L. Bauer & J. Eickel %I Springer-Verlag %C New York %D 1974 %P 85-108 %K done .La These lecture notes present a concise discussion of LR($k$) .Xr "LR($k$)" grammars and LR(0), .Xr "LR(0)" SLR(1) .Xr "SLR(1)" (more restrictive adding of reduce entries by using FOLLOW sets), .Xr "FOLLOW set" LALR(1) .Xr "LALR(1)" (using shift entries to determine state after reduce), and LR(1) .Xr "LR(1)" (adding look-ahead to items) constructor algorithms. Also some attention is given to the representation of LR tables, including some compactification techniques. .Xr "table compression" %A Paul Purdom %T The size of LALR(1) parsers %J BIT %V 14 %N %D 1974 %P 326-337 %K done .Xr "LALR(1)" .La Experimental size analysis for LALR(1) parsers. Although parser size can be exponential in the grammar size, it is found in practice to be linear in the grammar size. %A Hans H. Kron %A Hans-J\(:urgen Hoffman %A Gerhard Winkler %T On a SLR($k$)-based parser system which accepts non-LR($k$) grammars %B GI-4. Jahrestagung %S Lecture Notes in Computer Science #26 %E D. Siefkes %I Springer-Verlag %C New York %D 1975 %P 214-223 %K done .Xr "SLR($k$)" .Xr "LR($k$)" .La For each inadequate state .Xr "inadequate state" in an LR(0) automaton, .Xr "LR(0) automaton" a resolution tree is constructed of maximum depth $k$. If this construction succeeds, the grammar is of type .Df FSLR($k$) \&. If it fails, a parser is generated that performs breadth-first search .Xr "breadth-first search" to resolve the remaining inadequacies. .Xr "inadequate state" Detailed algorithms are given. %A A.J. Demers %T Elimination of single productions and merging of non-terminal symbols in LR(1) grammars %J Comput. Lang. (Elmsford, NY) %V 1 %N 2 %D April 1975 %P 105-119 %K done .Xr "LR(1)" .La The unit rules .Xr "unit rule" are used to define subsets of the non-terminals, the members of which can be treated as equivalent, similar to Aho and Ullman [LR 1973]. .Au Aho LR 1973 .Au Ullman LR 1973 Explicit proofs are given. %A Harry B. Hunt\ III %A Thomas G. Szymanski %A Jeffrey D. Ullman %T On the complexity of LR($k$) testing %J Commun. ACM %V 18 %N 12 %D Dec 1975 %P 707-716 %K done .Xr "LR($k$)" .La .Xr "time requirements" Time bounds as a function of the grammar size are derived for testing many properties of grammars. A practical result is that both the LL($k$) .Xr "LL($k$)" and the LR($k$) .Xr "LR($k$)" properties can be tested in $ O ( n sup { k + 2 } ) $. These and other bounds given in the paper are upper bounds, and actual testing is often much faster. %A O.L. Madsen %A B.B. Kristensen %T LR-parsing of extended context-free grammars %J Acta Inform. %V 7 %N 1 %D 1976 %P 61-73 %K done .La The right parts are allowed to contain choices {$omega sub 1$|$...$|$omega sub n$} and repetitions $roman "{" omega roman "}" sup "*" $. In addition to the dotted items in the LR sets, there are also @i[marked] items, which have a $#$ rather than a \*(Dt. .Xr "dot" The $#$ means one of three things: here starts a repetition, one element of a repetition has just been recognized or one member of a choice has just been recognized. Upon reduction, these marked items will tell how to unstack the entire right-hand side. %A R.C. Backhouse %T An alternative approach to the improvement of LR($k$) parsers %J Acta Inform. %V 6 %N 3 %D 1976 %P 277-296 %K done .Xr "LR($k$)" .La Traditionally, the field of bottom-up parsing is described in terms of handle-finding .Xr "handle" automata. The author describes it in terms of left-contexts, in which a .Df "left-context" is a set of stack configurations of the LR($k$) .Xr "LR($k$)" parser. Other bottom-up techniques are explained as approximations to these sets. %A Thomas G. Szymanski %A John H. Williams %T Non-canonical extensions of bottom-up parsing techniques %J SIAM J. Computing %V 5 %N 2 %D June 1976 %P 231-250 %K done .La Theory of non-canonical .Xr "non-canonical parsing" versions of several bottom-up parsing techniques, with good informal introduction. %A Marc L. Joliat %T A simple technique for partial elimination of unit productions from LR($k$) parsers %J IEEE Trans. Comput. %V C-25 %N 7 %D July 1976 %P 763-764 %K done .Xr "LR($k$)" .La A very simple algorithm is given that alters some of the transitions in an LR parse table .Xr "parse table" to bypass unit rules. .Xr "unit rule" %A M.M. Geller %A M.A. Harrison %T Characteristic parsing: a framework for producing compact deterministic parsers %J J. Comput. Syst. Sci. %V 14 %N 3 %D June 1977 %P 265-317 %K done .Xr "characteristic parsing" .La Given a deterministic LR(1) automaton, .Xr "LR(1) automaton" .Xr "deterministic automaton" suppose we add some (arbitrary) items to some states. This will have two effects: the discriminatory power of the automaton will weaken and its minimum size will decrease (since now some states will coincide). For a large number of grammars there is a @i[characteristic] item addition technique that will minimize automaton size while preserving just enough power. This requires a heavy mathematical apparatus. %A Matthew M. Geller %A Michael A. Harrison %T On LR($k$) grammars and languages %J Theoret. Comput. Sci. %V 4 %N %D 1977 %P 245-276 %K done .Xr "LR($k$)" .La Theoretical groundwork for the \(l"characteristic parsing technique\(r" .Xr "characteristic parsing" of Geller and Harrison [LR June 1977]. %A D. Pager %T The lane-tracing algorithm for constructing LR($k$) parsers and ways of enhancing its efficiency %J Inform. Sci. %V 12 %N %D 1977 %P 19-42 %K done .Xr "LR($k$)" .La An item $A -> beta "\*(Dt" X gamma$ .Xr "dot" in an LR parser (called a \(l"configuration\(r" here) has in general two kinds of successors: .Xr "successor" a set of \(l"immediate successors\(r" .Xs "immediate successor" "successor" $X -> "\*(Dt" xi sub n$ and the \(l"transition successor\(r" .Xs "transition successor" "successor" $A -> beta X "\*(Dt" gamma$. An item together with a sequence of its successive successors is called a .Df "lane" \&. Lanes are used 1) to collect enough look-ahead context to convert an LR(0) automaton .Xr "LR(0) automaton" to LALR(1); .Xr "LALR(1)" 2) to determine which LALR(1) states should be split to resolve remaining LALR(1) conflicts. The required algorithms are of considerable complexity. %A Wilf R. LaLonde %T Regular right part grammars and their parsers %J Commun. ACM %V 20 %N 10 %D Oct 1977 %P 731-741 %K done .La The notion of regular right part .Xr "regular right part grammar" grammars and its advantages are described in detail. A parser is proposed that does LR($k$) .Xr "LR($k$)" parsing to find the right end of the handle .Xr "handle" and then, using different parts of the same table, scans the stack backwards using a look-ahead (to the left!) of $m$ symbols to find the left end; this is called .Df "LR($m$, $k$)" "." The corresponding parse table .Xr "parse table" construction algorithm is given by LaLonde [LR 1979]. .Au LaLonde LR 1979 %A David Pager %T A practical general method for constructing LR($k$) parsers %J Acta Inform. %V 7 %N 3 %D 1977 %P 249-268 %K done .Xr "LR($k$)" .La When during the construction of an LR(1) .Xr "LR(1)" parser a state has to be added, one can consider merging it with an already existing state, if no conflict can arise from this. The problem is that it is not easy to tell whether conflicts may arise from a certain merge. To this end, the notions .Df "weak compatibility" and .Df "strong compatibility" are defined. Algorithms for the efficient construction of conflict-free small full LR(1) parse tables .Xr "parse table" are given. %A D. Pager %T Eliminating unit productions from LR($k$) parsers %J Acta Inform. %V 9 %N %D 1977 %P 31-59 %K done .Xr "LR($k$)" .La Very detailed description of a unit rule .Xr "unit rule" elimination algorithm. %A A. Celentano %T Incremental LR parsers %J Acta Inform. %V 10 %N %D 1978 %P 307-321 %K done .Xr "incremental parsing" .La Very clear exposition of how the Ghezzi and Mandrioli algorithm [LR 1979] .Au Ghezzi LR 1979 .Au Mandrioli LR 1979 can be made to work on parse sequences rather than on parse trees, thus improving efficiency. %A Stephen C. Johnson %T YACC: yet another compiler-compiler %I Bell Laboratories %C Murray Hill, New Jersey 07974 %D 1978 %P 34 %K done .La In spite of its title, @i[yacc] .Xr "@i[yacc]" is one of the most widely used parser generators. .Xr "parser generator" It generates LALR(1) .Xr "LALR(1)" parsers from a grammar with embedded semantic actions .Xr "semantic action" and features a number of disambiguating .Xr "disambiguating rule" and conflict-resolving mechanisms. The generated parser is in C. .Xr "C" %A Akifumi Makinouchi %T On single production elimination in simple LR($k$) environment %J J. Inform. Process. %V 1 %N 2 %D 1978 %P 76-80 %K done .Xr "LR($k$)" .La An SLR(1) .Xr "SLR(1)" parser is extended with the possibility of specifying grammar rules of the form $ \(no roman "{" C sub l roman "}" A \(no roman "{" C sub r roman "}" -> ... $, which can only be applied when the symbol before the $A$ cannot produce a member of $ roman "{" C sub l roman "}" $ as its last token, and the token after $A$ is not in $ roman "{" C sub r roman "}" $. Such rules allow some convenient ambiguities to be resolved without loosing the generative power of the system. %A W.R. LaLonde %T Constructing LR parsers for regular right part grammars %J Acta Inform. %V 11 %N %D 1979 %P 177-193 %K done .La Describes the algorithms for the regular right part .Xr "regular right part grammar" parsing technique explained by LaLonde [LR 1977]. The back scan is performed using so-called .Df "read-back tables" \&. Compression techniques .Xr "table compression" for these tables are given. %A Stephan Heilbrunner %T On the definition of ELR($k$) and ELL($k$) grammars %J Acta Inform. %V 11 %N %D 1979 %P 169-176 %K done .La See same paper [LL 1979]. %A Otto Mayer %T On deterministic canonical bottom-up parsing %J Inform. Control %V 43 %N %D 1979 %P 280-303 %K done .La A general framework is presented for deterministic canonical bottom-up parsers, from which well-known parsers arise as special cases. %A Carlo Ghezzi %A Dino Mandrioli %T Incremental parsing %J ACM Trans. Prog. Lang. Syst. %V 1 %N 1 %D July 1979 %P 58-70 %K done .Xr "incremental parsing" .La The authors observe that when a grammar allows bottom-up parsing using some technique $T$ and is at the same time RL($k$) for any $k$, then any modification to the input text can only affect nodes that produce the modified part. By keeping the entire parse tree in a both left-most and right-most threaded form, these nodes can be located and updated quickly. The case LR(1) \(AN RL(1) .Xr "RL(1)" is treated in full. %A Kai Koskimies %A Eljas Soisalon-Soininen %T On a method for optimizing LR parsers %J Intern. J. Comput. Math. %V A7 %N %D 1979 %P 287-295 %K done .La Defines criteria under which Pager's algorithm for the elimination of unit rules [LR 1977] .Au Pager LR 1977 .Xr "unit rule" can be safely applied to SLR(1) .Xr "SLR(1)" parsers. %A Kuo-Chung Tai %T Noncanonical SLR(1) grammars %J ACM Trans. Prog. Lang. Syst. %V 1 %N 2 %D Oct 1979 %P 295-320 %K done .La A survey of non-canonical parsing methods .Xr "non-canonical parsing" is given and two non-canonical variants of SLR(1) .Xr "SLR(1)" parsing are described. %A Gerald A. Fisher\ Jr. %A Manfred Weber %T LALR(1) parsing for languages without reserved words %J ACM SIGPLAN Notices %V 14 %N 11 %D Nov 1979 %P 26-30 %K done .Xr "reserved word" .La A heuristic is given for designing an LALR(1) .Xr "LALR(1)" programming language without reserved words. .Xr "reserved word" First design the LALR(1) language @i[with] reserved words, using a non-terminal @t[identifier] for the identifiers. Now allow @t[identifier] to also produce all reserved words and modify the grammar (or the language) until the grammar is LALR(1) again, using feedback from an LALR(1) parser generator. .Xr "parser generator" %A Eljas Soisalon-Soininen %T On the space-optimizing effect of eliminating single productions from LR parsers %J Acta Inform. %V 14 %N %D 1980 %P 157-174 %K done .La Improvement of Pager's unit rule .Xr "unit rule" elimination algorithm [LR 1977]. .Au Pager LR 1977 %A Carlo Ghezzi %A Dino Mandrioli %T Augmenting parsers to support incrementality %J J. ACM %V 27 %N 3 %D 1980 %P 564-579 %K done .Xr "incremental parsing" .La The algorithm of Ghezzi and Mandrioli [LR 1979] .Au Ghezzi LR 1979 .Au Mandrioli LR 1979 is extended to all LR($k$) .Xr "LR($k$)" grammars. %A Jacek Witaszek %T The LR(k) parser %B Mathematical Foundations of Computer Science %S Lecture Notes in Computer Science #88 %E P. Dembi\*('nski %I Springer-Verlag %C New York %D 1980 %P 686-697 %K done .La Three size-reducing transformations on LR($k$) .Xr "LR($k$)" tables are defined that leave the LR($k$) property undisturbed. One is similar to minimising a FS automaton, .Xr "finite-state automaton" one removes unused look-ahead and one allows delaying error detection. .Xr "error detection" No full algorithms given, but see Witaszek [LR 1988]. .Au Witaszek LR 1988 %A Bent Bruun Kristensen %A Ole Lehrmann Madsen %T Methods for computing LALR($k$) lookahead %J ACM Trans. Prog. Lang. Syst. %V 3 %N 1 %D Jan 1981 %P 60-82 %K done .La The LALR($k$) .Xr "LALR($k$)" look-ahead sets are seen as the solution to a set of equations, which are solved by recursive traversal of the LR(0) automaton. .Xr "LR(0) automaton" Full algorithms plus proofs are given. %A R. Kemp %T LR(0) grammars generated by LR(0) parsers %J Acta Inform. %V 15 %N %D 1981 %P 265-280 %K done .Xr "LR(0)" .La Theoretical analysis of the set of LR(0) grammars that produce a given LR(0) parser. %A Theodore P. Baker %T Extending look-ahead for LR parsers %J J. Comput. Syst. Sci. %V 22 %N 2 %D 1981 %P 243-259 %K done .La A FS automaton .Xr "finite-state automaton" is derived from the LR automaton as follows: upon a reduce to $A$ the automaton moves to all states that have an incoming arc marked $A$. This automaton is used for analysing the look-ahead as in an LR-regular .Xr "LR-regular" parser (\*(vCulik and Cohen [LR 1973]). %A Stephan Heilbrunner %T A parsing automata approach to LR theory %J Theoret. Comput. Sci. %V 15 %N %D 1981 %P 117-157 %K done .La Parsing is explained in terms of .Df "item grammars" , "item grammar" which describe the stack configurations of the parser. The theory is first developed for LR and then applied uniformly to LL and LC. %A Wilf R. LaLonde %T The construction of stack-controlling LR parsers for regular right part grammars %J ACM Trans. Prog. Lang. Syst. %V 3 %N 2 %D April 1981 %P 168-206 %K done .La Traditional LR parsers shift each input token onto the stack; often, this shift could be replaced by a state transition, .Xr "state transition" indicating that the shift has taken place. Such a parser is called a .Df "stack-controlling LR parser" , and will do finite-state recognition without stack manipulation whenever possible. Algorithms for the construction of stack-controlling LR parse tables .Xr "parse table" are given. The paper is complicated by the fact that the new feature is introduced not in a traditional LR parser, but in an LR parser for regular right parts .Xr "regular right part grammar" (for which see LaLonde [LR 1977]). .Au LaLonde LR 1977 %A Charles Wetherell %A A. Shannon %T LR \- automatic parser generator and LR(1) parser %J IEEE Trans. Softw. Eng. %V SE-7 %N 3 %D May 1981 %P 274-278 %K done .La This short paper discusses a full LR(1) .Xr "LR(1)" parser generator .Xr "parser generator" and parser, written in ANSI 66 Fortran for portability, and using an algorithm by Pager [LR 1977]. .Au Pager LR 1977 %A Augusto Celentano %T An LR parsing technique for extended context-free grammars %J Comput. Lang. (Elmsford, NY) %V 6 %N 2 %D 1981 %P 95-107 %K done .La The results of repetitions or selections are popped off the parsing stack before the entire right-hand side has been recognized. Remarkably, this can be done for any extended LR(1) .Xr "LR(1)" grammar. Explicit algorithms are given. %A Paul W. Purdom %A Cynthia A. Brown %T Parsing extended LR($k$) grammars %J Acta Inform. %V 15 %N %D 1981 %P 115-127 %K done .La An LR state is stacked only at the beginning of a right-hand side; all other work is done on a global state. At a reduce, the reduced non-terminal is already on the top of the stack and needs only to be unstacked. This does not work for all extended LR($k$) .Xr "extended LR($k$)" grammars, but any extended LR($k$) can be converted into one for which the method works. %A Takehiro Tokuda %T Eliminating unit reductions from LR($k$) parsers using minimum contexts %J Acta Inform. %V 15 %N %D 1981 %P 447-470 %K done .Xr "LR($k$)" .La Very densely written analysis of algorithms for the elimination of unit rules .Xr "unit rule" from a special class of LR($k$) parsers. %A Colin Burgess %A Laurence James %T An indexed bibliography for LR grammars and parsers %J ACM SIGPLAN Notices %V 16 %N 8 %D Aug 1981 %P 14-26 %K done .La Useful, detailed and structured bibliography containing around 115 entries. %A David Spector %T Full LR(1) parser generation %J ACM SIGPLAN Notices %V 16 %N 8 %D Aug 1981 %P 58-66 %K done .Xr "LR(1)" .La A heuristic algorithm for enlarging an LR(0) .Xr "LR(0)" table to full LR(1) is given and demonstrated on two examples. With letter of correction (vol. 16, no. 11, Nov 1981, p. 2). See also Ancona, Dodero and Gianuzzi [LR 1982] .Au Ancona LR 1982 .Au Dodero LR 1982 .Au Gianuzzi LR 1982 and Spector [LR 1988]. .Au Spector LR 1988 %A M. Ancona %A V. Gianuzzi %T A new method for implementing LR($k$) tables %J Inform. Process. Lett. %V 13 %N 4/5 %D 1981 %P 171-176 %K done .Xr "LR($k$)" .La For each inadequate state .Xr "inadequate state" there is a separate automaton handling that inadequacy by doing a look-ahead of one token. If this automaton has inadequate states the process is repeated. A tables construction algorithm is given. %A Eljas Soisalon-Soininen %T Inessential error entries and their use in LR parser optimization %J ACM Trans. Prog. Lang. Syst. %V 4 %N 2 %D April 1982 %P 179-195 %K done .La More sophisticated and general algorithms are given for the techniques described by Aho and Ullman [LR 1972]. .Au Aho LR 1972 .Au Ullman LR 1972 %A M. Ancona %A G. Dodero %A V. Gianuzzi %T Building collections of LR($k$) items with partial expansion of lookahead strings %J ACM SIGPLAN Notices %V 17 %N 5 %D May 1982 %P 24-28 %K done .Xr "LR($k$)" .La In addition to the usual terminals, non-terminals are allowed in the look-ahead sets, leading to very substantial savings in the number of states. Only if an inadequate state .Xr "inadequate state" turns up the non-terminals are developed as far as needed to resolve the inadequacy. The algorithm will also work reasonably for $k>1$. %A J.C.H. Park %T A new LALR formalism %J ACM SIGPLAN Notices %V 17 %N 7 %D July 1982 %P 47-61 %K done .La Simplified operators corresponding to Predict and Accept are defined precisely and applied to LR and LALR parser generation. Difficult to read. %A Frank DeRemer %A Thomas J. Pennello %T Efficient computation of LALR(1) look-ahead sets %J ACM Trans. Prog. Lang. Syst. %V 4 %N 4 %D Oct 1982 %P 615-649 %K done .La 1. The LALR(1) .Xr "LALR(1)" look-ahead sets are calculated by four linear sweeps over the LR(0) automaton, .Xr "LR(0) automaton" calculating the sets Direct Read, Read, Follow and Look-Ahead, respectively. 2. An obvious simplification leads to \(l"Not Quite LALR(1)\(r", .Df "NQLALR(1)" , and is shown to be inadequate. 3. The debugging of non-LALR(1) grammars is treated. %A Colin Burgess %A Laurence James %T A revised index bibliography for LR grammars and parsers %J ACM SIGPLAN Notices %V 17 %N 12 %D Dec 1982 %P 18-26 %K nieuw .La A revision of Burgess and James [LR 1981], .Au Burgess LR 1981 .Au James 1981 extending the number of entries to about 160. %A Jorma Tarhio %T LR parsing of some ambiguous grammars %J Inform. Process. Lett. %V 14 %N 3 %D 1982 %P 101-103 %K done .La .Xr "ambiguity" The reduction items in all inadequate states .Xr "inadequate state" are collected. The rules in them are extended at the end with \(l"synchronization symbols\(r", to make the shift/reduce .Xr "shift/reduce conflict" and reduce/reduce conflicts .Xr "reduce/reduce conflict" go away. These synchronization symbols are context-dependent; for instance each identifier could be followed by a token indicating its type. The synchronization symbols are inserted in the input stream by the lexical analyser while parsing. %A Rakesh Agrawal %A Keith D. Detro %T An efficient incremental LR parser for grammars with epsilon productions %J Acta Inform. %V 19 %N 4 %D 1983 %P 369-376 %K done .La A linear time and space .Xr "space requirements" implementation of Celentano's algorithm [LR 1978] .Au Celentano LR 1978 is described, which can also handle $epsilon$-rules. .Xr "$epsilon$-rule" %A Takehiro Tokuda %T A fixed-length approach to the design and construction of bypassed LR($k$) parsers %J J. Inform. Process. %V 6 %N 1 %D 1983 %P 23-30 %K done .La The idea of removing unit reductions is extended to removing @i[all] reductions that do not involve semantic actions; .Xr "semantic action" this leads to .Df "bypassed LR($k$) parsers" \&. "bypassed LR($k$) parser" Full algorithms are given. Some of the literature on removing unit rules .Xr "unit rule" is analysed critically. %A Dashing Yeh %T On incremental shift-reduce parsing %J BIT %V 23 %N 1 %D 1983 %P 36-48 %K done .Xr "incremental parsing" .La The input tokens to an LR parser are stored in a linked list; each node in this list also holds a pointer to a stack pertinent for the token in the node. These stacks can be merged and are in fact also stored in the nodes. This arrangement greatly simplifies incremental parsing. Very clear explanation. %A Kenzo Inoue %A Fukumi Fujiwara %T On LLC($k$) parsing method of LR($k$) grammars %J J. Inform. Process. %V 6 %N 4 %D 1983 %P 206-217 %K done .La Assume an LR($k$) .Xr "LR($k$)" grammar. Start parsing using the (full) LL($k$) .Xr "full LL($k$)" method, until an LL($k$) conflict .Xr "LL(1) conflict" is encountered, say on non-terminal $A$. $A$ is then parsed with the LR($k$) .Xr "LR($k$)" method, using the proper predicted look-ahead set. If during the LR (sub)parsing the number of items narrows down to one, an LL($k$) (sub-sub)parsing is started; etc. Full algorithms for all tables are given. LLC means \(l"Least Left Corner\(r". .Xr "LLC" .Xr "Least Left Corner" %A Lothar Schmitz %T On the correct elimination of chain productions from LR parsers %J Intern. J. Comput. Math. %V 15 %N 2 %D 1984 %P 99-116 %K done .La Rigorous proofs of some claims about unit-free LR($k$) .Xr "LR($k$)" parsers. %A N.P. Chapman %T LALR(1,1) parser generation for regular right part grammars %J Acta Inform. %V 21 %N %D 1984 %P 29-45 %K done .Xr "regular right part grammar" .La Efficient construction algorithm for LALR(1,1) .Xr "LALR(1,1)" parse tables, .Xr "parse table" which find the right end of the handle .Xr "handle" by traditional LALR(1) .Xr "LALR(1)" parsing and then scan the stack backwards using a look-ahead of 1 symbol to find the left end. %A Joseph C.H. Park %A K.M. Choe %A C.H. Chang %T A new analysis of LALR formalisms %J ACM Trans. Prog. Lang. Syst. %V 7 %N 1 %D Jan 1985 %P 159-175 %K done .La The recursive closure .Xr "closure" operator $CLOSURE$ of Kristensen and Madsen [LR 1981] .Au Kristensen LR 1981 .Au Madsen LR 1981 is abstracted to an iterative $delta$-operator such that $ CLOSURE == delta sup "*" $. This operator allows the formal derivation of four algorithms for the construction of LALR look-ahead sets. %A Esko Ukkonen %T Upper bounds on the size of LR($k$) parsers %J Inform. Process. Lett. %V 20 %N 2 %D Feb 1985 %P 99-105 %K done .La Upper bounds for the number of states of an LR($k$) .Xr "LR($k$)" parser are given for several types of grammars. %A S. Heilbrunner %T Truly prefix-correct chain-free LR(1) parsers %J Acta Inform. %V 22 %N 5 %D 1985 %P 499-536 %K done .La A unit-free LR(1) .Xr "LR(1)" parser generator .Xr "parser generator" algorithm, rigorously proven correct. %A Fred Ives %T Unifying view of recent LALR(1) lookahead set algorithms %J ACM SIGPLAN Notices %V 21 %N 7 %D July 1986 %P 131-135 %K done .La A common formalism is given in which the LALR(1) .Xr "LALR(1)" look-ahead set construction algorithms of DeRemer and Pennello [LR 1982], .Au DeRemer LR 1982 .Au Pennello LR 1982 Park, Choe and Chang [LR 1985] .Au Park LR 1985 .Au Choe LR 1985 .Au Chang LR 1985 and the author can be expressed. See also Park and Choe [LR 1987]. .Au Park LR 1987 .Au Choe LR 1987 %A Manuel E. Bermudez %A Karl M. Schimpf %T A practical arbitrary look-ahead LR parsing technique %J ACM SIGPLAN Notices %V 21 %N 7 %D July 1986 %P 136-144 %K done .La To resolve LR(0) .Xr "LR(0)" conflicts at run time, for each conflict state a FS automaton .Xr "finite-state automaton" is developed that will do arbitrary look-ahead. Grammars for which parsers can be constructed by this technique are called .Df "LAM($m$)" where $m$ in some way limits the size of the look-ahead FS automata. .Xr "finite-state automaton" It can handle some non-LR($k$) grammars. See also Baker [LR 1981]. .Au Baker LR 1981 %A Thomas J. Pennello %T Very fast LR parsing %J ACM SIGPLAN Notices %V 21 %N 7 %D July 1986 %P 145-151 %K done .La The tables and driver of a traditional LALR(1) .Xr "LALR(1)" parser are replaced by assembler code performing linear search for small fan-out, binary search for medium and a calculated jump for large fan-out. This modification gained a factor of 6 in speed at the expense of a factor 2 in size. %A Ikuo Nakata %A Masataka Sassa %T Generation of efficient LALR parsers for regular right part grammars %J Acta Inform. %V 23 %N %D 1986 %P 149-162 %K done .Xr "regular right part grammar" .La The stack of an LALR(1) .Xr "LALR(1)" parser is augmented with a set of special markers that indicate the start of a right-hand side; adding such a marker during the shift is called a .Df "stack-shift" "." Consequently there can now be a shift/stack-shift conflict, abbreviated to .Df "stacking conflict" "." The stack-shift is given preference and any superfluous markers are eliminated during the reduction. Full algorithms are given. %A A.M.M. Al-Hussaini %A R.G. Stone %T Yet another storage technique for LR parsing tables %J Softw. Pract. Exper. %V 16 %N 4 %D 1986 %P 389-401 %K done .La Excellent introduction to LR table compression .Xr "table compression" in general. The .Df "submatrix technique" introduced in this paper partitions the rows into a number of submatrices, the rows of each of which are similar enough to allow drastic compressing. The access cost is $O (1)$. A heuristic partitioning algorithm is given. %A Masataka Sassa %A Ikuo Nakata %T A simple realization of LR-parsers for regular right part grammars %J Inform. Process. Lett. %V 24 %N 2 %D Jan 1987 %P 113-120 %K done .Xr "regular right part grammar" .La For each item in each state on the parse stack of an LR parser, a counter is kept indicating how many preceding symbols on the stack are covered by the recognized part in the item. Upon reduction, the counter of the reducing item tells us how many symbols to unstack. The manipulation rules for the counters are simple. The counters are stored in short arrays, one array for each state on the stack. %A Joseph C.H. Park %A Kwang-Moo Choe %T Remarks on recent algorithms for LALR lookahead sets %J ACM SIGPLAN Notices %V 22 %N 4 %D April 1987 %P 30-32 %K done .La Careful analysis of the differences between the algorithms of Park, Choe and Chang [LR 1985] .Au Park LR 1985 .Au Choe LR 1985 .Au Chang LR 1985 and Ives [LR 1986]. .Au Ives LR 1986 See also Ives [LR 1987]. .Au Ives LR 1987 %A Fred Ives %T Response to remarks on recent algorithms for LALR lookahead sets %J ACM SIGPLAN Notices %V 22 %N 8 %D 1987 %P 99-104 %K done .La Remarks by Park and Choe [LR 1987] .Au Park LR 1987 .Au Choe LR 1987 are refuted and a new algorithm is presented that is significantly better than that of Park, Choe and Chang [LR 1985] .Au Park LR 1985 .Au Choe LR 1985 .Au Chang LR 1985 and that previously presented by Ives [LR 1986]. .Au Ives LR 1986 %A Nigel P. Chapman %T LR Parsing: Theory and Practice %I Cambridge University Press %C New York, NY %D 1987 %P 228 %K done .La Detailed treatment of the title subject. Highly recommended for anybody who wants to acquire in-depth knowledge about LR parsing. Good on size of parse tables .Xr "parse table" and attribute grammars. .Xr "attribute grammar" %A Eljas Soisalon-Soininen %A Jorma Tarhio %T Looping LR parsers %J Inform. Process. Lett. %V 26 %N 5 %D Jan 1988 %P 251-253 %K done .La For some (non-LR) grammars it is true that there are ways to resolve the conflicts in an LR parser for them that will make the parser loop on some inputs (executing an endless sequence of reduces). A test is given to detect such grammars. %A Jacek Witaszek %T A practical method for finding the optimum postponement transformation for LR($k$) parsers %J Inform. Process. Lett. %V 27 %N 2 %D Feb 1988 %P 63-67 %K done .La By allowing the LR($k$) .Xr "LR($k$)" automaton to postpone error checking, the size of the automaton can be reduced dramatically. Finding the optimum postponement transformation is, however, a large combinatorial problem. A good heuristic algorithm for finding a (sub)optimal transformation is given. %A Dashing Yeh %A Uwe Kastens %T Automatic construction of incremental LR(1) parsers %J ACM SIGPLAN Notices %V 23 %N 3 %D March 1988 %P 33-42 %K done .Xr "incremental parsing" .La Detailed algorithms for an incremental LR(1) .Xr "LR(1)" parser that allows multiple modifications and $epsilon$-rules. .Xr "$epsilon$-rule" %A Manuel E. Bermudez %A Karl M. Schimpf %T On the (non-)relationship between SLR(1) and NQLALR(1) grammars %J ACM Trans. Prog. Lang. Syst. %V 10 %N 2 %D April 1988 %P 338-342 %K done .La Shows a grammar that is SLR(1) .Xr "SLR(1)" but not NQLALR(1). .Xr "NQLALR(1)" %A Pierpaolo Degano %A Stefano Mannucci %A Bruno Mojana %T Efficient incremental LR parsing for syntax-directed editors %J ACM Trans. Prog. Lang. Syst. %V 10 %N 3 %D July 1988 %P 345-373 %K done .Xr "incremental parsing" .La The non-terminals of a grammar are partitioned by hand into sets of \(l"incrementally compatible\(r" non-terminals, meaning that replacement of one non-terminal by an incrementally compatible one is considered a minor structural change. Like in Korenjak's method [LR 1969], .Au Korenjak LR 1969 for a partitioning in $n$ sets $n+1$ parse tables .Xr "parse table" are constructed, one for each set and one for the grammar that represents the connection between the sets. The parser user is allowed interactively to move or copy the string produced by a given non-terminal to a position where an incrementally compatible one is required. This approach keeps the text (i.e. the program text) reasonably correct most of the time and uses rather small tables. %A George H. Roberts %T Recursive ascent: an LR analog to recursive descent %J ACM SIGPLAN Notices %V 23 %N 8 %D Aug 1988 %P 23-29 %K done .Xr "recursive ascent" .La Each LR state is represented by a subroutine. The shift is implemented as a subroutine call, the reduction is followed by a subroutine return possibly preceded by a return stack adjustment. The latter prevents the generation of genuine subroutines since it requires explicit return stack manipulation. A small and more or less readable LR(0) .Xr "LR(0)" parser is shown, in which conflicts are resolved by means of the order in which certain tests are done, like in a recursive descent parser. .Xr "recursive descent" %A F.E.J. Kruseman\ Aretz %T On a recursive ascent parser %J Inform. Process. Lett. %V 29 %N 4 %D Nov 1988 %P 201-206 %K done .Xr "recursive ascent" .La Each state in an LR automaton is implemented as a subroutine. A shift calls that subroutine. A reduce to $X$ is effected as follows. $X$ and its length $n$ are stored in global variables; all subroutines are rigged to decrement $n$ and return as long as $n > 0$, and to call the proper GOTO .Xr "GOTO-action" state of $X$ when $n$ hits $0$. This avoids the explicit stack manipulation of Roberts [LR 1988]. .Au "Roberts, G.H." LR 1988 %A David Spector %T Efficient full LR(1) parser generation %J ACM SIGPLAN Notices %V 23 %N 12 %D Dec 1988 %P 143-150 %K done .La A relatively simple method is given for extending an LR(0) .Xr "LR(0)" table to full LR(1). .Xr "LR(1)" The method isolates the inadequate states, .Xr "inadequate state" constructs the full look-ahead sets for them and then splits them (and possible predecessor states). The algorithm is described informally. %A Manuel E. Bermudez %A George Logothetis %T Simple computation of LALR(1) look-ahead sets %J Inform. Process. Lett. %V 31 %N 5 %D 1989 %P 233-238 %K done .La The original LALR(1) .Xr "LALR(1)" grammar is replaced by a not much bigger grammar that has been made to incorporate the necessary state splitting through a simple transformation. .Xr "transformations on grammars" The SLR(1) .Xr "SLR(1)" automaton of this grammar is the LALR(1) automaton .Xr "LALR(1) automaton" of the original grammar. %A George H. Roberts %T Another note on recursive ascent %J Inform. Process. Lett. %V 32 %N 5 %D 1989 %P 263-266 %K done .Xr "recursive ascent" .La The fast parsing methods of Pennello [LR 1986], .Au Pennello LR 1986 Kruseman\ Aretz [LR 1988] .Au "Kruseman\\\\ Aretz" LR 1988 and Roberts .Au "Roberts, G.H." LR 1988 are compared. A special-purpose optimizing compiler can select the appropriate technique for each state. %A James Kanze %T Handling ambiguous tokens in LR parsers %J ACM SIGPLAN Notices %V 24 %N 6 %D June 1989 %P 49-54 %K done .La .Xr "ambiguity" It may not always be possible to infer from the appearance of an input symbol the terminal symbol it corresponds to in the parser. In that case a default assumption can be made and the error recovery .Xr "error recovery" mechanism of the parser can be rigged to try alternatives. A disadvantage is that an LALR parser may already have made reductions (or a strong-LL .Xr "strong-LL($k$)" .Xr "strong-LL(1)" parser may have made $epsilon$-moves) .Xr "$epsilon$-move" that have ruined the context. An implementation in UNIX's @i[yacc] .Xr "@i[yacc]" is given. %A Daniel J. Salomon %A Gordon V. Cormack %T Scannerless NSLR(1) parsing of programming languages %J ACM SIGPLAN Notices %V 24 %N 7 %D July 1989 %P 170-178 %K done .La The traditional CF syntax .Xr "syntax" is extended with two rule types: $A "\*(!>" B$, .Xr "\*(!>" which means that any sentential form in which $A$ generates a terminal production of $B$ (with $B$ regular) is illegal, and $A "\*(!-" B$, .Xr "\*(!-" which means that any sentential form in which terminal productions of $A$ and $B$ are adjacent, is illegal. The authors show that the addition of these two types of rules allow one to incorporate the lexical phase of a compiler into the parser. The system uses a non-canonical .Xr "non-canonical parsing" SLR(1) .Xr "SLR(1)" parser. %A J. Heering %A P. Klint %A J. Rekers %T Incremental generation of parsers %J ACM SIGPLAN Notices %V 24 %N 7 %D July 1989 %P 179-191 %K done .Xr "incremental parser generation" .La In a very unconventional approach to parser generation, the initial information for an LR(0) .Xr "LR(0)" parser consists of the grammar only. As parsing progresses, more and more entries of the LR(0) table (actually a graph) become required and are constructed on the fly. LR(0) inadequacies .Xr "inadequate state" are resolved using Tomita's method. .Xr "Tomita parser" All this greatly facilitates handling (dynamic) changes to the grammar. %A R. Nigel Horspool %T ILALR: an incremental generator of LALR(1) parsers %B Compiler Compilers and High-Speed Compilation %S Lecture Notes in Computer Science #371 %E D. Hammer %I Springer-Verlag %C Berlin %D 1989 %P 128-136 %K done .Xr "incremental parser generation" .La Grammar rules are checked as they are typed in. To this end, LALR(1) .Xr "LALR(1)" parse tables .Xr "parse table" are kept and continually updated. When the user interactively adds a new rule, the sets FIRST .Xr "FIRST set" and NULLABLE .Xr "nullable" are recalculated and algorithms are given to distribute the consequences of possible changes over the LR(0) .Xr "LR(0)" and look-ahead sets. Some serious problems are reported and practical solutions are given. %A Daniel J. Salomon %A Gordon V. Cormack %T Corrections to the paper: Scannerless NSLR(1) parsing of programming languages %J ACM SIGPLAN Notices %V 24 %N 11 %D Nov 1989 %P 80-83 %K done .La More accurate time measurements and corrections to the algorithms are supplied. See same authors [LR July 1989]. %A Stylianos D. Pezaris %T Shift-reduce conflicts in LR parsers %J ACM SIGPLAN Notices %V 24 %N 11 %D Nov 1989 %P 94-95 %K done .La It is shown that if an LR(1) .Xr "LR(1)" parser either has no shift/reduce conflicts .Xr "shift/reduce conflict" or has shift/reduce conflicts that have been decided to be solved by shifting, the same parsing behaviour can be obtained from the corresponding LR(0) .Xr "LR(0)" parser (which will have no reduce/reduce conflicts) .Xr "reduce/reduce conflict" in which all shift/reduce conflicts .Xr "shift/reduce conflict" are resolved in favour of the shift. With this resolution principle, for instance the programming language C .Xr "C" can be parsed with an LR(0) parser. %A Gregor Snelting %T How to build LR parsers which accept incomplete input %J ACM SIGPLAN Notices %V 25 %N 4 %D April 1990 %P 51-58 %K done .La When an LR parser finds a premature end-of-file, the incomplete parse tree is completed using some heuristics on the top state of the stack. The heuristics mainly favour reduction over shift and their application is repeated until the parse tree is complete or further completion would involve too much guessing. The technique is explained in the setting of a language-based editor. %A George H. Roberts %T From recursive ascent to recursive descent: via compiler optimizations %J ACM SIGPLAN Notices %V 25 %N 4 %D April 1990 %P 83-89 %K done .La Shows a number of code transformations that will turn an LR(1) .Xr "LR(1)" recursive ascent .Xr "recursive ascent" parser (see Roberts [LR 1988] and [LR 1989]) .Au "Roberts, G.H." LR 1988 .Au "Roberts, G.H." LR 1989 for an LL(1) .Xr "LL(1)" grammar into a recursive descent parser. .Xr "recursive descent" .EQ delim $$ .EN .P1 "Left-corner parsing" .Xr "left-corner parsing" This section also covers a number of related techniques: production-chain, .Xr "production chain" LLP($k$), .Xr "LLP($k$)" PLR($k$), .Xr "PLR($k$)" etc. %A D.J. Rosenkrantz %A P.M. Lewis\ II %T Deterministic left-corner parsing %B IEEE Conference Record 11th Annual Symposium on Switching and Automata Theory %V 11 %D 1970 %P 139-152 %K done .La An LC($k$) .Xr "LC($k$)" parser decides the applicability of a rule when it has seen the initial non-terminal of the rule if it has one, plus a look-ahead of $k$ symbols. Identifying the initial non-terminal is done by bottom-up parsing, the rest of the rule is recognized top-down. A canonical LC pushdown machine .Xr "pushdown automaton" can be constructed in which the essential entries on the pushdown stack are pairs of non-terminals, one telling what non-terminal has been recognized bottom-up and the other what non-terminal is predicted top-down. As with LL, there is a difference between LC and strong-LC. There is a simple algorithm to convert an LC($k$) .Xr "LC($k$)" grammar into LL($k$) .Xr "LL($k$)" form; the resulting grammar may be large, though. %A Y. Eric Cho %T Simple left-corner grammars %B Proc. Seventh Princeton Conference on Information Sciences and Systems %E %C Princeton %D 1973 %P 557 %K done .La LC parsing is simplified by requiring that each right-hand side be recognizable (after LC reduction) by its first two symbols and by handling left recursion as a special case. The required tables are extremely small. %A David B. Lomet %T Automatic generation of multiple exit parsing subroutines %B Automata, languages and programming %S Lecture Notes in Computer Science #14 %E J. Loeckx %I Springer-Verlag %C Berlin %D 1974 %P 214-231 %K done .La A .Df "production chain" is a chain of production steps .Xr "production step" $X sub 0 -> X sub 1 alpha sub 1$, $X sub 1 -> X sub 2 alpha sub 2$, $ ... $ $X sub n-1 -> "@t[t]" alpha sub n$, with $X sub 0 , ... , X sub n-1$ non-terminals and @t[t] a terminal. If the input is known to derive from $X sub 0$ and starts with @t[t], each production chain .Xr "production chain" from $X sub 0$ to @t[t] is a possible explanation of how @t[t] was produced. The set of all production chains .Xr "production chain" connecting $X sub 0$ to @t[t] is called a .Df "production expression" \&. An efficient algorithm for the construction and compression .Xr "table compression" of production expressions .Xr "production expression" is given. Each production expression .Xr "production expression" is then implemented as a subroutine which contains the production expression .Xr "production expression" as a FS automaton. .Xr "finite-state automaton" %A Michael Hammer %T A new grammatical transformation into LL($k$) form %B Proceedings Sixth Annual ACM Symposium on Theory of Computing %D 1974 %P 266-275 %K done .La Each left corner in a left-corner parser .Xr "left-corner parsing" is described as a FS automaton .Xr "finite-state automaton" and implemented as a subroutine. Parsing is then performed by recursive descent .Xr "recursive descent" using these subroutines. The FS automata can be incorporated into the grammar to yield an LL($k$) .Xr "LL($k$)" grammar. %A J. Kr\('al %A J. Demner %T Parsing as a subtask of compiling %B Mathematical Foundations of Computer Science %S Lecture Notes in Computer Science #32 %E J. Be\*(vcv\('a\*(vr %I Springer-Verlag %C Berlin %D 1975 %P 61-74 %K done .La Various considerations that went into the design of a variant of left-corner parsing, .Xr "left-corner parsing" called .Df "semi-top-down" \&. %A E. Soisalon-Soininen %A E. Ukkonen %T A characterization of LL($k$) grammars %B Automata, Languages and Programming %E S. Michaelson & R. Milner %I Edinburgh University Press %C Edinburgh %D 1976 %P 20-30 %K done .La Introduces a subclass of the LR($k$) .Xr "LR($k$)" grammars called predictive LR($k$) (PLR($k$)). .Xr "PLR($k$)" The deterministic LC($k$) .Xr "LC($k$)" grammars are strictly included in this class, and a grammatical transformation .Xr "transformations on grammars" is presented to transform a PLR($k$) .Xr "PLR($k$)" into an LL($k$) .Xr "LL($k$)" grammar. PLR($k$) .Xr "PLR($k$)" grammars can therefore be parsed with the LL($k$) parser of the transformed grammar. .X