I am attempting to write an interpreter for a language defined by a Backus-Naur Form grammar. The following is the grammar:
<statement> ::= <assignment> | “PRINT” “(” <expression> “)”
<assignment> ::= <variable> “=” <expression>
<expression> ::= <term> <expression*>
<expression*> ::= “+” <expression> | “-” <expression> | “”
<term> ::= <factor> <term*>
<term*> ::= “*” <term> | “/” <term> | “”
<factor> ::= <number> | <variable> | “(” <expression> “)”
<variable> ::= <lowercase> <variable*>
<variable*> ::= <variable> | “”
<number> ::= <digit> <number*>
<number*> ::= <number> | “”
I am attempting to write the code that determines whether a string is an expression. My idea is to take the string and use the split function to separate it into individual words and symbols, with something like this: String[] words = line.split("\\s+");
When I do this, it turns a string like String line = "y = x + 12 * z"
into String[] words = ["y", "=", "x", "+", "12", "*", "z"].
This is not a problem for me. The problem arises when I have an expression without spaces, such as String line = "x=12+z"
When I try to split this with my regex, it gives me String[] words = ["x=12+z"].
Is there any way to split a string into words so that each of the following characters becomes its own word in the array: +, -, *, /, =, (, )?
For example, the string String line = "x = z + 12 * y -(z *var )"
should become String[] words = ["x", "=", "z", "+", "12", "*", "y", "-", "(", "z", "*", "var", ")"]
CodePudding user response:
You can try passing a regex like this to the split() method:
String[] words = line.split("((?=[=+\\-*/()])|(?<=[=+\\-*/()]))");
Example:
"z=x + (y-56)/(4+2*x)" => String[18] { "z", "=", "x ", "+", " ", "(", "y", "-", "56", ")", "/", "(", "4", "+", "2", "*", "x", ")" }
As you can see, the spaces are kept in the result (for example "x " and " "), so you have to remove them after the split.
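One possible way to do that cleanup is sketched below. It is only an illustration, not the one required approach: the class name Tokenizer, the method tokenize, and the constant BOUNDARY are hypothetical names. As a variation on the regex above, it assumes whitespace is added to the character class (\\s), so every whitespace character is split off as its own piece and can then be dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a line into operator, parenthesis, number, and name tokens.
final class Tokenizer {
    // Zero-width split points: the position just before or just after any
    // operator, parenthesis, or whitespace character.
    // Assumption: \\s is added to the character class from the answer above.
    private static final String BOUNDARY = "(?=[=+\\-*/()\\s])|(?<=[=+\\-*/()\\s])";

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        for (String piece : line.split(BOUNDARY)) {
            // Whitespace comes out as separate pieces; skip those and any empty piece.
            if (!piece.trim().isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [x, =, z, +, 12, *, y, -, (, z, *, var, )]
        System.out.println(tokenize("x = z + 12 * y -(z *var )"));
    }
}
```

Because whitespace is itself a split point here, two names separated only by a space (such as "foo bar") end up as two tokens rather than one token containing a space.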