Why do we need FOLLOW set in LL(1) grammar parser?-CodePudding

In generated parsing function we use an algorithm which looks on a peek of a tokens list and chooses rule (alternative) based on the current non-terminal FIRST set. If it contains an epsilon (rule is nullable), FOLLOW set is checked as well.

Consider following grammar [not LL(1)]:

B : A term
A : N1 | N2
N1 :
N2 :

During calculation of the FOLLOW set terminal term will be propagated from A to both N1 and N2, so FOLLOW set won't help us decide.

On the other hand, if there is exactly one nullable alternative, we know for sure how to continue execution, even in case current token doesn't match against anything from the FIRST set (by choosing epsilon production).

If above statements are true, FOLLOW set is redundant. Is it needed only for error-handling?

CodePudding user response：

Yes, it is not necessary.

I was asked precisely this question on the colloquium, and my answer that FOLLOW set is used

to check that grammar is LL(1)
to fail immediately when an error occurs, instead of dragging the ill-formatted token to some later production, where generated fail message may be unclear
and for nothing else

was accepted

CodePudding user response：

While you can certainly find grammars for which FOLLOW is unnecessary (i.e., it doesn't play a role in the calculation of the parsing table), in general it is necessary.

For example, consider the grammar

S : A | C
A : B a
B : b | epsilon
C : D c
D : d | epsilon

You need to know that

 Follow(B) = {a}
 Follow(D) = {c}

to calculate

 First(A) = {b, a}
 First(C) = {d, c}

in order to make the correct choice at S.