Preserving whitespace in Rascal when transforming Java code

Time:12-19

I am trying to add instrumentation (e.g. logging some information) to the methods in a Java file. I am using the following Rascal code, which mostly seems to work:

import ParseTree;
import lang::java::\syntax::Java15;
// .. more imports

// project is a loc
M3 model = createM3FromEclipseProject(project);
set[loc] projectFiles = { file | file <- files(model)} ;

for (pFile <- projectFiles) {
    CompilationUnit cunit = parse(#CompilationUnit, pFile);

    cUnitNew = visit(cunit) { 
        case (MethodBody) `{<BlockStm* post>}` 
            => (MethodBody) `{
            'System.out.println(new Throwable().getStackTrace()[0]);
            '<BlockStm* post>
            '}`
    }
            
    writeFile(pFile, cUnitNew);
}

I am running into two issues regarding whitespace, which might be unrelated.

The line of code that I am inserting does not pick up the indentation that was there previously: if the original line started with a tab character, that tab is now gone. The same is true for the line directly following the inserted line and for the closing brace. How can I 'capture' whitespace in my pattern?

Example before transforming (all lines start with a tab character, lines 2 and 3 with two):

    void beforeFirst() throws Exception {
            rowIdx = -1;
            rowSource.beforeFirst();
    }

Example after transforming:

    void beforeFirst() throws Exception {
System.out.println(new Throwable().getStackTrace()[0]);
rowIdx = -1;
            rowSource.beforeFirst();
}

A second whitespace issue: if a file ends with a newline character, the parse function throws a ParseError without further details. Removing this newline from the original source fixes the issue, but I'd rather not have to fix code manually before parsing. How can I work around this?

CodePudding user response:

Alas, capturing whitespace with a concrete pattern is not a feature of the current version of Rascal. We used to have it, but now it's back on the TODO list. I can point you to papers about the topic if you are interested. So for now you have to deal with this "damage" later.

You could write a Tree-to-Tree transformation on the generic level (see ParseTree.rsc) to fix indentation issues in a parse tree after your transformation, or to re-insert the comments that you lost. This means matching on the Tree data type and its appl constructors. The Tree format is a form of reflection on Rascal's parse trees that allows any kind of transformation, including transformations of whitespace and comments.
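
As a rough illustration of matching on that generic level, here is a minimal sketch. It assumes the Tree, Production and Symbol definitions from ParseTree.rsc; layoutTexts is a hypothetical helper name, not part of the library. It only collects the text of the layout nodes (whitespace and comments). An actual indentation fix would rewrite those nodes in a visit instead of listing them.

```rascal
import ParseTree;

// Hypothetical helper: collect the source text of every layout node
// (whitespace and comments) found anywhere inside a parse tree.
// Layout nodes are applications of productions that define a layouts(...) symbol.
list[str] layoutTexts(Tree t)
    = [ "<l>" | /l:appl(prod(layouts(_), _, _), _) := t ];
```

Calling layoutTexts on a freshly parsed file shows which whitespace the parser kept, which is a reasonable starting point before attempting to repair the indentation around the inserted statement.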

The parse error you talked about is caused by not using the start non-terminal. If you use parse(#start[CompilationUnit], ...) then whitespace and comments before and after the CompilationUnit are accepted.
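
A minimal sketch of that change, assuming the same Java15 grammar as in the question; parseJavaFile and unitOf are hypothetical helper names used only for illustration:

```rascal
import ParseTree;
import lang::java::\syntax::Java15;

// Parse with the start non-terminal, so that layout (whitespace, comments)
// before and after the compilation unit is accepted, including a final newline.
start[CompilationUnit] parseJavaFile(loc pFile)
    = parse(#start[CompilationUnit], pFile);

// The wrapped CompilationUnit is available through the top field.
CompilationUnit unitOf(start[CompilationUnit] tree) = tree.top;
```

The visit from the question can be applied to the start[CompilationUnit] tree directly. Writing that whole tree back, rather than only tree.top, should also keep the leading and trailing layout of the file.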
