Related Content: CS606 - VU Lectures, Handouts, PPT Slides, Assignments, Quizzes, Papers & Books of Compiler Construction
The number of potential handles for the grammar is simply the sum of the lengths of the right-hand side of all the productions. The number of complete handles is simply the number of productions. These two facts lead to the critical insight behind LR parsers:
A given grammar generates a finite set of handles (and potential handles) that the parser must recognize
However, it is not a simple matter of putting the placeholder in right-hand side to generate handles. The parser needs to recognize the correct handle by different right contexts. Consider the parser’s action at step 9.
word | Stack | Handle | Action | |
9 | × | Expr – Term ► | - none - | shift |
10 | id | Expr – Term × ► | - none - | shift |
11 | $ | Expr – Term × id► | 〈Factor → id,5〉 | reduce |
12 | $ | Expr–Term × Factor► | 〈Term→Term×Factor,5〉 | reduce |
13 | $ | Expr – Term ► | 〈Expr → Expr – Term,3〉 | reduce |
14 | $ | Expr ► | 〈Goal → Expr,1〉 | reduce |
15 | $ | Goal | - none - | accept |
The frontier is Expr – Term, suggesting a handle 〈Expr → Expr – Term〉•. However, the parser decides to extend the frontier by shifting × on to the stack rather than reducing frontier to Expr. Clearly, this the correct move for the parser. No potential handle contains Expr followed by ×. At step 9, the set of potential handles is
〈Expr → Expr – Term•〉
〈Term →Term• × Factor〉
〈Term →Term• / Factor 〉
The next input symbol clearly matches the second choice. The parser needs a basis for deciding between first (reduce) and second (shift) choices:
〈Expr → Expr – Term•〉
〈Term →Term• × Factor〉
This requires more context than the parser has in the frontier (stack). To choose between reducing and shifting, the parser must recognize which symbols can occur to the right of Expr and Term in valid phrases.
The LR(1) parsers can recognize precisely those languages in which one-symbol lookahead suffices to determine whether to shift or reduce. The LR(1) construction algorithm builds a handle-recognizing DFA. The parsing algorithm uses this DFA to recognize handles and potential handles on the parse stack
In order to remember the state the DFA goes into on a symbol, the parser stores the DFA state in the stack along with the symbol. Initial entry in the stack will be ‘<dummy,0>’.
Parsers represent DFA as a 2D table. The rows correspond to DFA states and columns correspond to terminals and non-terminals. The columns with terminals and the rows form the action table while the columns with non-terminals and rows are called the goto table. It is customary to show these tables together.
To construct Action and Goto tables, the LR(1) parser generator builds a model of handle-recognizing DFA. The model is used to fill in the tables. The LR(1)-table construction needs a concrete representation for the handles and their associated lookahead symbols. We call this representation an LR(1) item.