How to use Perl to replace multiple lines containing character '/' and new line?

Time:09-26

I'm trying to modify a block of several lines in several files. Initially I tried sed, but read that Perl might be a better choice. However, my Perl is very basic and I'm not sure how to deal with an empty (new) line and the special character '/'. To sum things up, I'd like to have a one-liner, something like (perl -i -pe ...), to convert

(new line)
#include <item_b/item_bC.h>

into

#include <item_a/item_aC.h>
#include <item_b/item_bC.h>

Thanks.

CodePudding user response:

One way -- slurp the file into a string, then match a line that contains at most whitespace, followed by a line starting with #include, and replace what's matched with that #include line twice

perl -0777 -wpe's{ ^\s*\n ( \#include.*\n ) }{$1$1}mxg' file.c

With -0777 it slurps the whole file into $_, and with -p it prints $_ once for every line (just once when used with -0777, since the whole file is in $_ and so there is only one such "line"); see switches in perlrun. The /m modifier makes ^ (and $) also match line boundaries inside a (multiline) string.

Or, with the same general approach (slurp the file), use a lookahead

perl -0777 -wpe's{ ^\s*\n (?= (\#include.*\n) ) }{$1}mxg' file.c

This matches an empty line after which a lookahead finds a line starting with #include; that line is also captured, so that the empty line can be replaced with it. Since lookarounds don't consume anything, there is no need to replace the #include line (with itself).

Note, the .* is greedy and matches as much as possible up to the pattern that follows it, and here we have the whole file ahead of it, so it may appear that .*\n will match all the way to the very last \n in the file! However, . doesn't match a line-feed (with the /s modifier it does), so .*\n here stops at the first newline and matches only the rest of the line.

If a more specific include statement needs to be matched, add details after the #include pattern.

Otherwise, one can process line by line, by copying the current line and printing it when on the next line, depending on what's on the saved and next line. There are some picky details to straighten there, not super amenable to one-liners.

Both tested with input file.c

#include<item_b/item_bC.h>
#include<item_a/item_aC.h>

#include<item_c/item_cC.h>

int main() {

    return 1;
}

where we end up with two item_b (given an empty first line in the file), one item_a, and two item_c includes, with no empty lines among the includes, and the rest of the file is unaffected.


Special characters are mentioned in the question, so I'll comment on them. But please consult more complete resources, like the tutorial perlretut and the reference perlre. See also perlrebackslash.

Characters special for regex can mostly be matched as literal characters in a pattern when escaped with \. But in this case that's not needed: the role of / in a regex is only to delimit the pattern, commonly given as /.../, but here I use {}{} as delimiters; so / isn't special here and can be used freely. For example

perl -0777 -wpe's{ ^\s*\n (?= (\#include<item_./.*\n) ) }{$1}mxg' file.c

matches lines from the input file I used, shown above.

In the actual problem there is clearly something more general in place of item, and it's a filename. Most characters that are allowed in a filename can be used literally in a regex. Exceptions, like the dot, can be escaped: \. matches a literal dot.

For example, a string item_bC.h, where bC characters vary but item and .h are always the same, can be matched with the pattern /item_..\.h/.
