Can you help me find the way to solve this problem?
I'm realizing a parser with JavaCC able to read and put in a tree structure this kind of search expression
(keyphrase1[field1] AND (keyphrase2[field2] OR keyphrase3[field3])) OR (keyphrase4 AND keyphrase5*[field5])
I need to set the precedence to the left when I have this kind of ambiguity, for example in a case like
keyphrase[field] AND keyphrase[field] OR keyphrase[field]
I want to obtain a tree that corresponds to
(keyphrase[field] AND keyphrase[field]) OR keyphrase[field]
So, firstly I've tried to write a grammar where Expression() was defined this way
void Expression(): {} {
<LROUND> Expression() <RROUND>
| Expression() Operator() Expression()
| Term()
}
But JavaCC cries for the left-recursion, so I've tried to change it. This is the grammar rules I've implemented now.
SKIP : {
" "
| "\n"
| "\r"
| "\t"
}
TOKEN : {
<LROUND: "(">
| <RROUND: ")">
| <KEYWORD: ( ["a"-"z","0"-"9"] ) >
| <WILDCARD: "*">
| <LSQUARE: "[">
| <RSQUARE: "]">
| <AND: "AND">
| <OR: "OR">
| <NOT: "NOT">
}
ASTStart Start(): {} {
Expression() <EOF>
{ return jjtThis; }
}
void Expression(): {} {
<LROUND> Expression() <RROUND> (Operator() Expression())*
| Term() (Operator() Expression())*
}
void Term(): {} {
KeyPhrase() [Field()]
}
void KeyPhrase():
{
Token k;
Token w;
String keyPhrase = "";
}
{
(k=<KEYWORD> {keyPhrase = (keyPhrase.equals("") ? "" : " ") k.image;}) [w=<WILDCARD> {keyPhrase = w.image;}]
{jjtThis.jjtSetValue(keyPhrase);}
}
void Field():
{
Token f;
}
{
<LSQUARE> f=<KEYWORD> {jjtThis.jjtSetValue(f.image);} <RSQUARE>
}
void Operator():
{
Token op;
}
{
op=<AND>
{
jjtThis.jjtSetValue(op.image);
}
| op=<OR>
{
jjtThis.jjtSetValue(op.image);
}
| op=<NOT>
{
jjtThis.jjtSetValue(op.image);
}
}
The problem is that this set precedence to the right in case of ambiguity, how can I solve this?
P.S. I've had the option LOOKAHEAD=2
EDIT: I've also tried this way, with repetition first, but parser hits EOF attempting to find operator after last expression
ASTStart Start(): {} {
Expression() <EOF>
{ return jjtThis; }
}
void Expression(): {} {
(Operand() Operator())* Operand()
}
void Operand(): {} {
ParenthesizedExpression()
| Term()
}
void ParenthesizedExpression(): {} {
<LROUND> Expression() <RROUND>
}
void Term(): {} {
KeyPhrase() [Field()]
}
CodePudding user response:
This is really a JJTree question. The trick is to use a definite node. ( https://javacc.github.io/javacc/documentation/jjtree.html#introduction .)
void Expression() #void :
{ }
{
SimpleExpression()
(
Operator()
SimpleExpression()
#BinOp(3)
)*
}
void SimpleExpression() #void :
{ }
{
Term()
|
"(" Expression() ")"
}
Why I think this will work: Suppose you have as input
W OR X AND Y NOT Z
where W, X, Y, and Z are terms
When expression is entered it first parses a term, an operator and a term. Now the stack will contain 3 nodes:
(top) X OR W (bottom)
When BinOp(3) is encountered these 3 nodes are popped and new node is pushed. Now the stack is
(top) BinOp(W, OR, X) (bottom)
The loop is restarted and another two nodes will be pushed. The stack is now
(top) Y AND BinOp(W, OR, X) (bottom)
BinOp(3) is encountered. It pops 3 nodes and pushes one to give
(top) BinOp(BinOp(W, OR, X), AND, Y) (bottom)
Another operator and term are parsed, so the stack is
(top) Z NOT BinOp(BinOp(W, OR, X), AND, Y) (bottom)
Again BinOp(3) is encountered and we get
(top) BinOp( BinOp(BinOp(W, OR, X), AND, Y), NOT, Z) (bottom)
See my Java parser at https://github.com/theodore-norvell/the-teaching-machine-and-webwriter/blob/master/trunk/tm/src/tm/javaLang/parser/JavaParser.jjt for a full example. Start at line 1155, but also look at the options at the top.
CodePudding user response:
How about something like this:
void Expression(): {} {
Expression1() (<OR> Expression1())*
}
void Expression1(): {} {
Term() (<AND> Term())*
}
void Term(): {} {
(<NOT>)? Term1()
| <LROUND> Expression() <RROUND>
}
void Term1(): {} {
KeyPhrase() <LSQUARE> Field() <RSQUARE>
}