Home > Net >  How can I handle left association for all operator in JavaCC for this grammar?
How can I handle left association for all operator in JavaCC for this grammar?

Time:09-29

Can you help me find the way to solve this problem?

I'm realizing a parser with JavaCC able to read and put in a tree structure this kind of search expression

(keyphrase1[field1] AND (keyphrase2[field2] OR keyphrase3[field3])) OR (keyphrase4 AND keyphrase5*[field5])

I need to set the precedence to the left when I have this kind of ambiguity, for example in a case like

keyphrase[field] AND keyphrase[field] OR keyphrase[field]

I want to obtain a tree that corresponds to

(keyphrase[field] AND keyphrase[field]) OR keyphrase[field]

So, firstly I've tried to write a grammar where Expression() was defined this way

void Expression(): {} {
  <LROUND> Expression() <RROUND>
| Expression() Operator() Expression()
| Term()
}

But JavaCC cries for the left-recursion, so I've tried to change it. This is the grammar rules I've implemented now.

SKIP : {
  " "
| "\n"
| "\r"
| "\t"
}

TOKEN : {
  <LROUND: "(">
| <RROUND: ")">
| <KEYWORD: ( ["a"-"z","0"-"9"] ) >
| <WILDCARD: "*">
| <LSQUARE: "[">
| <RSQUARE: "]">
| <AND: "AND">
| <OR: "OR">
| <NOT: "NOT">
}

ASTStart Start(): {} {
  Expression() <EOF>
  { return jjtThis; }
}

void Expression(): {} {
  <LROUND> Expression() <RROUND> (Operator() Expression())*
| Term() (Operator() Expression())*
}

void Term(): {} {
  KeyPhrase() [Field()]
}

void KeyPhrase(): 
{
  Token k;
  Token w;
  String keyPhrase = "";
} 
{
  (k=<KEYWORD> {keyPhrase  = (keyPhrase.equals("") ? "" : " ")   k.image;})  [w=<WILDCARD> {keyPhrase  = w.image;}] 
  {jjtThis.jjtSetValue(keyPhrase);}
}

void Field():
{
  Token f;
} 
{
    <LSQUARE> f=<KEYWORD> {jjtThis.jjtSetValue(f.image);} <RSQUARE>
}

void Operator(): 
{
  Token op;
} 
{
  op=<AND>
  {
    jjtThis.jjtSetValue(op.image);
  }
  | op=<OR>
  {
    jjtThis.jjtSetValue(op.image);
  }
  | op=<NOT>
  {
    jjtThis.jjtSetValue(op.image);
  }
}

The problem is that this set precedence to the right in case of ambiguity, how can I solve this?

P.S. I've had the option LOOKAHEAD=2

EDIT: I've also tried this way, with repetition first, but parser hits EOF attempting to find operator after last expression

ASTStart Start(): {} {
  Expression() <EOF>
  { return jjtThis; }
}

void Expression(): {} {
  (Operand() Operator())* Operand()
}

void Operand(): {} {
  ParenthesizedExpression()
| Term()
}

void ParenthesizedExpression(): {} {
  <LROUND> Expression() <RROUND>
}

void Term(): {} {
  KeyPhrase() [Field()]
}

CodePudding user response:

This is really a JJTree question. The trick is to use a definite node. ( https://javacc.github.io/javacc/documentation/jjtree.html#introduction .)

void Expression() #void :
{ }
{
   SimpleExpression()
   (
       Operator()
       SimpleExpression() 
       #BinOp(3)
   )*
}

void SimpleExpression() #void :
{ }
{
   Term()
|
   "(" Expression() ")"
}

Why I think this will work: Suppose you have as input

W OR X AND Y NOT Z 

where W, X, Y, and Z are terms

When expression is entered it first parses a term, an operator and a term. Now the stack will contain 3 nodes:

  (top)   X   OR   W  (bottom)

When BinOp(3) is encountered these 3 nodes are popped and new node is pushed. Now the stack is

(top)  BinOp(W, OR, X)  (bottom)

The loop is restarted and another two nodes will be pushed. The stack is now

(top) Y   AND   BinOp(W, OR, X)  (bottom)

BinOp(3) is encountered. It pops 3 nodes and pushes one to give

(top) BinOp(BinOp(W, OR, X), AND, Y) (bottom)

Another operator and term are parsed, so the stack is

(top) Z   NOT   BinOp(BinOp(W, OR, X), AND, Y) (bottom)

Again BinOp(3) is encountered and we get

(top)   BinOp( BinOp(BinOp(W, OR, X), AND, Y), NOT, Z)  (bottom)

See my Java parser at https://github.com/theodore-norvell/the-teaching-machine-and-webwriter/blob/master/trunk/tm/src/tm/javaLang/parser/JavaParser.jjt for a full example. Start at line 1155, but also look at the options at the top.

CodePudding user response:

How about something like this:

void Expression(): {} {
    Expression1() (<OR> Expression1())*
}

void Expression1(): {} {
    Term() (<AND> Term())*
}

void Term(): {} {
    (<NOT>)? Term1()
    | <LROUND> Expression() <RROUND>
}

void Term1(): {} {
    KeyPhrase() <LSQUARE> Field() <RSQUARE>
}
  • Related