A bison grammar for parsing bison grammars
The ll1 tool performs a small set of analyses on a
context-free grammar
in order to determine whether it is
LL(1), and thus suitable
for implementation via a
predictive parser
algorithm. Running a grammar through ll1 can help you catch
left recursion and other
conflicts
that could ruin your day if they pop up later during parser implementation.
ll1 parses a CFG in an ‘un-adorned’
GNU Bison format. The use of
%empty is absolutely mandatory for writing epsilon productions. Character
literals are also accepted as tokens, so the following grammar:
%%
/* left-recursive! */
expr : expr '+' term
| expr '-' term
| term ;
/* left-recursive! */
term : term '*' factor
| term '/' factor
| factor ;
factor : ID | NUM ;
%%
would be perfectly acceptable input, even though it will cause ll1
to pitch a fit about predict set overlaps between productions of the
same nonterminal. ;)
However, ll1 will be tickled pink to accept this input:
%%
/* tail-recursive... :) */
expr : term expr_next ;
term : factor term_next ;
expr_next : '+' expr | '-' expr | %empty ;
term_next : '*' term | '/' term | %empty ;
factor : ID | NUM ;
%%
In addition to LL(1) conflict reports, ll1 will also output information
about:
%empty
It’s neither perfect nor complete, but it helps provide some basic insights.
I wrote ll1 in a day, and only passed it through valgrind a handful of
times to check for serious errors. Given the fact that most of the data
structures are patterned after the null-termination patterns of C strings,
there has to be a bug or two in there… somewhere…
Anyways, if you run into one, let me know and I’ll push a patch.
The ll1 source code is released under the
MIT license. See the
LICENSE.md file for the complete license terms.
And as always, enjoy!
~ Brad.