Parsing python files using ANTLR

Time:02-05

I've generated a Python 3 parser with ANTLR4 from the Python3.g4 grammar file, and I'm trying to produce a parse tree for Python 3 code. But I'm unsure how to pass in a Python file, as InputStream doesn't accept one.

I've currently managed to pass it in as a text file:

def main():
    with open('testingText.txt') as file:
        data = file.read()

    input_stream = InputStream(data)
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()
    print(tree.toStringTree(recog=parser))

But I get errors such as 'mismatched input <EOF>' and sometimes 'no viable alternative at input <EOF>'.

I would like to pass in a .py file, and I'm not sure how to resolve the <EOF> errors.

Answer:

To be sure, I'd need to know what testingText.txt contains, but I'm pretty sure the parser expects a line break at the end of the input, and testingText.txt does not contain a trailing line break.

You could try this:

with open('testingText.txt') as file:
    data = f'{file.read()}\n'
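
The same idea works for a .py file, since it is just text. As a minimal sketch (the helper name read_source and the UTF-8 encoding are assumptions, not part of the original answer), the newline can be appended only when it is actually missing:

```python
from pathlib import Path


def read_source(path: str) -> str:
    """Return the text of a source file, guaranteed to end with a newline.

    Hypothetical helper: the name and the UTF-8 assumption are illustrative.
    """
    # Python source files are UTF-8 by default, so decode explicitly.
    text = Path(path).read_text(encoding="utf-8")
    # Append a line break only if the file does not already end with one.
    if not text.endswith("\n"):
        text += "\n"
    return text
```

The returned string can then be handed to the parser pipeline in place of data, e.g. InputStream(read_source('example.py')), where example.py is a placeholder file name.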

EDIT

And if testingText.txt contains:

class myClass:
    x=5
    print("hello world")

parse it like this:

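from antlr4 import CommonTokenStream, InputStream
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser


def main():
    # Append a line break so the input always ends with one.
    with open('testingText.txt') as file:
        data = f'{file.read()}\n'

    # Lex the text, then parse the resulting token stream.
    input_stream = InputStream(data)
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    # file_input is the entry rule for a whole file.
    tree = parser.file_input()
    print(tree.toStringTree(recog=parser))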


if __name__ == '__main__':
    main()

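I.e., use tree = parser.file_input() and not tree = parser.single_input(). In the Python3.g4 grammar, single_input is the entry rule for one interactive (REPL-style) statement, while file_input matches a whole file of statements up to <EOF>.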
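
CPython's built-in compile() draws a similar distinction between its 'single' and 'exec' modes, so the difference can be seen without ANTLR. This is only an analogy: the helper name parses_in_mode is hypothetical, and the ANTLR-generated parser reports its errors differently.

```python
def parses_in_mode(source: str, mode: str) -> bool:
    """Return True if CPython accepts `source` in the given compile mode.

    Hypothetical helper for illustration; 'single' roughly mirrors the
    single_input rule and 'exec' roughly mirrors file_input.
    """
    try:
        compile(source, "<demo>", mode)
        return True
    except SyntaxError:
        return False


two_statements = "x = 1\ny = 2\n"

# A whole-file entry point accepts any number of statements.
print(parses_in_mode(two_statements, "exec"))
# An interactive entry point accepts exactly one statement.
print(parses_in_mode(two_statements, "single"))
print(parses_in_mode("x = 1\n", "single"))
```

Here 'exec' mode accepts the two-statement source, while 'single' mode rejects it and accepts only the one-statement source.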
