I'm trying to modify a block of several lines in several files. Initially I tried sed, but read that Perl might be a better choice. However, my Perl is very basic and I'm not sure how to deal with an empty (new) line and the special character '/'. To sum things up, I'd like to have a one-liner, something like (perl -i -pe ...), to convert
(new line)
#include <item_b/item_bC.h>
into
#include <item_a/item_aC.h>
#include <item_b/item_bC.h>
Thanks.
CodePudding user response:
One way: slurp the file into a string, then match a line that contains nothing but (possibly) whitespace followed by a line starting with #include, and replace what's matched with that #include line twice
perl -0777 -wpe's{ ^\s*\n ( \#include.*\n ) }{$1$1}mxg' file.c
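With -0777 it slurps the whole file into $_, and with -p it prints $_ once for every line (only once when used with -0777: since the whole file is in $_ there is just one such "line"); see switches in perlrun. The /m modifier makes ^ (and $) also match at line boundaries inside a (multiline) string. The /x modifier allows spaces inside the pattern for readability, which is also why # is escaped (under /x an unescaped # starts a comment).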
Or, with the same general approach (slurp the file), use a lookahead
perl -0777 -wpe's{ ^\s*\n (?= (\#include.*\n) ) }{$1}mxg' file.c
This matches an empty line after which a lookahead finds a line starting with #include, which is also captured so that the empty line can be replaced with it. Since lookarounds don't consume anything, there is no need to replace that line (with itself).
Note that .* is greedy and matches as much as possible up to the pattern that follows it. Here we have the whole file ahead of it, so it may appear that .*\n will match all the way to the very last \n in the file! However, . doesn't match a line-feed (with the /s modifier it does), so .*\n here stops at the first newline and matches just the rest of the line.
If a more specific include statement needs to be matched, add details following the #include pattern.†
Otherwise, one can process line by line, by copying the current line and printing it when on the next line, depending on what's on the saved and next line. There are some picky details to straighten there, not super amenable to one-liners.
Both were tested with this input file.c

#include<item_b/item_bC.h>
#include<item_a/item_aC.h>

#include<item_c/item_cC.h>
int main() {
    return 1;
}
where we end up with two item_b, one item_a, and two item_c includes and no empty lines, while the rest of the file is unaffected.
† Special characters are mentioned, so I'll comment. But please consult more complete resources, like the tutorial perlretut and the reference perlre. See also perlrebackslash.
Characters special for regex can mostly be matched as literal characters in a pattern when escaped with \. But in this case that's not needed: the role of / in a regex is only to delimit the pattern, commonly given as /.../, but here I use {}{} as delimiters, so / isn't special here and can be used freely. For example
perl -0777 -wpe's{ ^\s*\n (?= (\#include<item_./.*\n) ) }{$1}mxg' file.c
matches lines from the input file I used, shown above.
There is clearly a more general pattern instead of item in the actual problem, and it's a filename. Most characters that are allowed in a filename can be used literally in a regex. Exceptions, like ., can be escaped, like \. to match a literal dot.
For example, a string item_bC.h, where the bC characters vary but item and .h are always the same, can be matched with the pattern /item_..\.h/.