I built a Java parser using ANTLR in Python. Here is the main code I use to parse Java code.
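import antlr4
from JavaLexer import JavaLexer    # assumed module name of the ANTLR-generated lexer
from JavaParser import JavaParser  # assumed module name of the ANTLR-generated parser

def ASTconversion(file_path):
    with open(file_path, 'r') as f:
        code = f.read()
    lexer = JavaLexer(antlr4.InputStream(code))
    stream = antlr4.CommonTokenStream(lexer)
    parser = JavaParser(stream)
    tree = parser.compilationUnit()
    is_syntax_errors = tree.parser._syntaxErrors  # number of syntax errors (0 if none)
    return tree.toStringTree(recog=parser), is_syntax_errors

ast, is_syntax_errors = ASTconversion(code_path)
print(ast)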
The following contains the output of the above code snippet.
(compilationUnit (typeDeclaration (classOrInterfaceModifier public) (classDeclaration class Number (classBody { (classBodyDeclaration (block { (blockStatement (statement (expression (expression (expression (primary System)) . out) . (methodCall println ( (expressionList (expression (primary (literal "Printing Numbers")))) ))) ;)) (blockStatement (statement for ( (forControl (forInit (localVariableDeclaration (typeType (primitiveType int)) (variableDeclarators (variableDeclarator (variableDeclaratorId i) = (variableInitializer (expression (primary (literal (integerLiteral 1))))))))) ; (expression (expression (primary i)) <= (expression (primary (literal (integerLiteral 10))))) ; (expressionList (expression (expression (primary i)) ))) ) (statement (block { (blockStatement (statement (expression (expression (expression (primary System)) . out) . (methodCall println ( (expressionList (expression (primary i))) ))) ;)) })))) })) }))) )
Based on this output, I have two questions:
1. How can I visualize this parser output as a graphical AST?
2. If the code contains a syntax error, I can detect that as shown in the code above. But how can I track the syntax error itself?
CodePudding user response:
It does not appear that the Python target lets you produce the GUI parse tree view from within your Python code. This makes sense, as that view is a Java application in the Java runtime.
If you follow the ANTLR Quick Start, you should have the grun command installed, and (assuming you have no Python-specific semantic predicates, headers, etc. in your grammar) you can use the -gui option with grun to see a graphical representation of your parse tree.
Also, both the Visual Studio Code and IntelliJ ANTLR plugins let you test input against a grammar and see a visualization of your parse tree in the IDE.
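If you would rather stay in Python, one possible workaround (not part of the ANTLR runtime, just a sketch) is to walk the parse tree yourself and emit Graphviz DOT text, which an external tool can then render. The function name to_dot is made up for illustration; it relies only on the getChildCount(), getChild(), getRuleIndex() and getText() methods of the tree nodes, and assumes that only rule contexts have getRuleIndex().

```python
def to_dot(tree, rule_names):
    """Return Graphviz DOT text for an ANTLR parse tree.

    `tree` is the object returned by a rule method such as
    parser.compilationUnit(); `rule_names` is parser.ruleNames.
    """
    lines = ["digraph ParseTree {", "  node [shape=box];"]
    counter = 0

    def label_of(node):
        # Assumption: rule contexts expose getRuleIndex(), terminal nodes do not.
        if hasattr(node, "getRuleIndex"):
            return rule_names[node.getRuleIndex()]
        return node.getText()

    def visit(node):
        nonlocal counter
        node_id = counter
        counter += 1
        # Escape backslashes and double quotes so the label stays valid DOT.
        label = label_of(node).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  n{node_id} [label="{label}"];')
        for i in range(node.getChildCount()):
            child_id = visit(node.getChild(i))
            lines.append(f"  n{node_id} -> n{child_id};")
        return node_id

    visit(tree)
    lines.append("}")
    return "\n".join(lines)
```

With the tree and parser from the question, you could write to_dot(tree, parser.ruleNames) to a file such as tree.dot and render it with Graphviz, for example dot -Tpng tree.dot -o tree.png. The walk is recursive, so very deeply nested input may hit Python's recursion limit.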
To get your errors back from a parse, you should probably take a look at the ctest.py test for the Python runtime. It shows how to implement your own ErrorListener. An excerpt:
class ErrorListener(antlr4.error.ErrorListener.ErrorListener):
    def __init__(self):
        super(ErrorListener, self).__init__()
        self.errored_out = False

    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        self.errored_out = True

def sub():
    # Parse the input file
    input_stream = antlr4.FileStream("c.c")
    lexer = CLexer(input_stream)
    token_stream = antlr4.CommonTokenStream(lexer)
    parser = CParser(token_stream)
    errors = ErrorListener()
    parser.addErrorListener(errors)
Of course, you'd substitute your own code to gather and retain errors in your extension of the ErrorListener class. The key takeaway is that you need to implement your own error listener and then register it with addErrorListener() (you may also want to call removeErrorListeners() first to remove the default listener, which just prints each error to stderr).
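As a rough sketch of what "gather and retain errors" could look like, the listener below stores the line, column and message of every syntax error instead of only setting a flag. The names CollectingErrorListener and attach are invented for this example. The try/except fallback to object exists only so the snippet can be run without the ANTLR runtime installed; real use needs the runtime's ErrorListener base class.

```python
try:
    from antlr4.error.ErrorListener import ErrorListener
except ImportError:
    # Fallback so the sketch can be exercised without the ANTLR runtime;
    # with a real parser the runtime base class is required.
    ErrorListener = object


class CollectingErrorListener(ErrorListener):
    """Keeps (line, column, message) for every reported syntax error."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        # Called by the lexer or parser once per syntax error.
        self.errors.append((line, column, msg))


def attach(recognizer):
    """Replace the recognizer's listeners with a collecting one and return it."""
    listener = CollectingErrorListener()
    recognizer.removeErrorListeners()   # drop the default console listener
    recognizer.addErrorListener(listener)
    return listener
```

In the question's ASTconversion function, you would call listener = attach(parser) before parser.compilationUnit(), then inspect listener.errors afterwards; an empty list means no syntax errors were reported. Attaching a second listener to the lexer in the same way would also capture token-level errors.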