Perl regex for matching multiline calls to a C function

Time:08-17

I'm looking for a regex that matches all calls, potentially spanning multiple lines, to a variadic C function. The end goal is to print the file, line number, and the fourth parameter of each call, but unfortunately I'm not there yet. So far, I have this:

 perl -ne 'print if s/^.*?(func1\s*\(([^\)\(,]+||,|\((?2)\))*\)).*?$/$1/s' test.c

with test.c:

int main() {
        func1( a, b, c, d);
        func1( a, b,
               c, d);
        func1( func2(), b, c, d, e );
        func1( func2(a), b, c, d, e );
        return 1;
}

-- which does not match the second call. The reason it doesn't match is that the s at the end of the expression allows . to match newlines, but doesn't seem to allow [..] constructs to match newlines. I'm not sure how to get past this.

I'm also not sure how to reference the fourth parameter in this... the $2, $3 do not get populated in this (and even if they did I imagine I would get some issues due to the recursive nature of the regex).

CodePudding user response:

Not Perl but perhaps simpler:

$ cat >test2.c <<'EOD'
int main() {
    func1( a, b, c, d1);
    func1( a, b,
           c, d2);
    func1( func2(), "quotes\"),(", /*comments),(*/ g(b,
c), "d3", e );
    func1( func2(a), b, c, d4(p,q,r), e );
    func1( a, b, c, func2( func1(a,b,c,d5,e,f) ), g, h);
    return 1;
}
EOD

$ cpp -D'func1(a,b,c,d,...)=SHOW(__FILE__,__LINE__,d)' test2.c |
  grep SHOW
    SHOW("test2.c",2,d1);
    SHOW("test2.c",3,d2)
    SHOW("test2.c",5,"d3")
    SHOW("test2.c",7,d4(p,q,r));
    SHOW("test2.c",8,func2( SHOW("test2.c",8,d5) ));
$

As the final line shows, a bit more work is needed if the function can take itself as an argument.

CodePudding user response:

This should catch your functions, with caveats:

perl -0777 -wnE'@f = /(func1\s*\( [^;]* \))\s*;/xg; s/\s+/ /g, say for @f' tt.c

I rely on the fact that a statement must be terminated by ;. This rules out handling a stray ; inside a comment, and it rules out calls to func1 nested inside another call. If either can occur, quite a bit more needs to be done to parse the code.

However, further parsing the captured calls, presumably by commas, is complicated by the fact that a nested call may well, and realistically, contain commas. How about

func1( a, b, f2(a2, b2), c );

This becomes a far more interesting little parsing problem. Or, how about macros?

Can you clarify what kinds of things one doesn't have to account for?
