Java Split Regex Names next to Symbols

Time:09-25

I am attempting to write an interpreter for a Backus Naur Form grammar. The following is the grammar:

<statement> ::= <assignment> | “PRINT” “(” <expression> “)”
<assignment>    ::= <variable> = <expression>
<expression>    ::= <term> <expression*>
<expression*>   ::= “+” <expression> | “-” <expression> | “”
<term>  ::= <factor> <term*>
<term*> ::= “*” <term> | “/” <term> | “”
<factor>    ::= <number> | <variable> | “(” <expression> “)”
<variable>  ::= <lowercase> <variable*>
<variable*> ::= <variable> | “”
<number>    ::= <digit> <number*>
<number*>   ::= <number> | “”

I am attempting to write the code that determines whether a string is an expression. My idea is to take the string and use the split function to separate it into individual words and symbols. I do that with something like this: String[] words = line.split("\\s+");

When I do this, a string like String line = "y = x + 12 * z" becomes String[] words = ["y", "=", "x", "+", "12", "*", "z"]. That works fine. The problem arises when an expression has no spaces around its operators, such as String line = "x=12+z". Splitting that on whitespace gives String[] words = ["x=12+z"]. Is there a way to split a string into words so that each of the following characters becomes its own element of the array: +, -, *, /, =, (, )?

For example, the string String line = "x = z + 12 * y -(z *var )" should become String[] words = ["x", "=", "z", "+", "12", "*", "y", "-", "(", "z", "*", "var", ")"]

CodePudding user response:

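You can try passing a regex like this to the split() method. It uses a lookahead and a lookbehind, so it splits at the position before and after each operator character without consuming the operator itself: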

String[] words = line.split("((?=[=+\\-*/()])|(?<=[=+\\-*/()]))");

Example:

"z=x + (y-56)/(4+2*x)" => String[18] { "z", "=", "x ", "+", " ", "(", "y", "-", "56", ")", "/", "(", "4", "+", "2", "*", "x", ")" }

As you can see, the split keeps whitespace (note the "x " and " " elements), so you have to trim the tokens and drop the blank ones after the split.
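
The split and the whitespace cleanup could be combined roughly as follows. This is only a minimal sketch, not part of the original answer: the class name Tokenizer and the method name tokenize are made up for illustration, and it assumes that any run of whitespace between tokens can simply be discarded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is an assumption for this sketch.
class Tokenizer {
    // Zero-width split points: just before or just after any of = + - * / ( )
    private static final String SPLIT_REGEX = "(?=[=+\\-*/()])|(?<=[=+\\-*/()])";

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        for (String piece : line.split(SPLIT_REGEX)) {
            // A piece may carry surrounding whitespace or hold several
            // space-separated words, so split it again on whitespace.
            for (String word : piece.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    tokens.add(word); // skip the blank pieces left by the first split
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [x, =, z, +, 12, *, y, -, (, z, *, var, )]
        System.out.println(tokenize("x = z + 12 * y -(z *var )"));
    }
}
```

Splitting each piece a second time on \\s+ also handles input such as "a b+c", where two names separated only by a space would otherwise stay together in one element.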
