  # CS606 - Compiler Construction - Lecture Handout 24

User Rating:     / 0
PoorBest

An LR(1) item is a pair [X → α•β, a] where X → αβ is a production and a∈T (terminals) is look-ahead symbol. The model uses a set of LR(1) items to represent each parser state. The model is called the canonical collection (CC) of set of LR(1) items.

## Canonical Collection

Each set in CC represents a state in the eventual parser DFA. The construction of CC begins by building a model of parser’s initial state. The initial state consists of the set of LR(1) items that represent the parser’s initial state, along with any items that must also hold in the initial state. To simplify the task of building this initial state, the construction requires that the grammar have a unique goal symbol. The convention is to add a new start symbol S to grammar and a production

S → E

This leads to the augmented grammar

S → E
E → E + (E) | int

## The Closure Procedure

The item [S → •E, \$] describes the parser’s initial state. It represents a configuration in which recognizing S followed by \$ would be a valid parse. This item, i.e., [S → •E, \$] becomes the core of the first state in CC, labeled I0. If the grammar has several distinct productions for the start symbol, each of them generates an item in this initial core of I0.
The procedure closure does this.

closure(s) =
repeat
for each [X → α•Yβ, a] ∈ s
for each production Y → α
for each b ∈ FIRST(βa)
s ← s ∪ [Y → • γ, b]
until s is unchanged

Let’s apply this procedure to the augmented grammar.

The first set is I0 = closure({[S → •E, \$] }). Equating the terms in the procedure,
s = {[S → •E, \$]}, [X → α •Yβ, a] ⇔ [S → •E, \$], X = S, α = ε, Y = E,
β = ε, a = \$, Y → γ ⇔E → E + (E) and E → intFIRST(βa) = FIRST(\$) = \$.

This leads to expansion of s.

s = { [S → •E, \$] } ∪ { [E → •E + (E), \$] } ∪ { [E → •int, \$] }
= { [S → •E, \$] , [E → •E + (E), \$] , [E → •int, \$] }

The set s changed so we repeat. The item [S → •E, \$] is already processed. The for loop considers [X → α•Yβ, a] ⇔ [E→•E+(E),\$], which leads to the match up X = E,
α= ε, Y = E, β= +(E), a = \$, Y → γ ⇔ E → E + (E), ⇔ E → int
FIRST(βa) = FIRST(+(E)\$) = +. The set s is extended

s = s ∪ { [E → •E+(E), +] } ∪ { [E → •int, +] }