I have a set of files with sample data like the one below. I need to transform the data to put all the object files, following -o, in the 1st column and the linked libs, following -l, in the 2nd column. This format is consistent in the entire make output.
hello there -o one two three four -labc -lfoo -lbar
something useless -o abc doo zoo -lkoo -lfoo -lmoo
I am trying to parse it to a simpler format for further processing:
one two three four, abc foo bar
abc doo zoo, koo foo moo
This is what I tried; clearly the output is not what I was trying to get:
perl -ne '/-o(.*?)-/m; @libs = /-l([^ ]+)/gs; printf "%s %s\n", $1 , join(", ", @libs);' inputfile
bar
abc, foo, bar
moo
koo, foo, moo
Here, I am trying to store all the objects in $1 and all the libs in the @libs array. Only the libs are printed correctly; the objects are wrong. Can someone help fix it? I separately verified that $1 holds the correct value.
perl -wne '/-o(.*?)-/m; printf "%s %s\n", $1, " "' inputfile
one two three four
abc doo zoo
Similarly, when I print the 2nd part (libs) separately, it also works.
perl -ne '@libs = /-l([^ ]+)/gs; printf "%s\n", join(", ", @libs);' x
abc, foo, bar
koo, foo, moo
So, it only messes up when I combine the two together.
CodePudding user response:
perl -wnlE'
($o, @l) = /-(?:o|l) \s* ([^-]+) /gx;
s/\s+$// for $o, @l;
say join ", ", $o, "@l"
' file
On the file with the given two lines it prints
one two three four, abc foo bar
abc doo zoo, koo foo moo
For this to work as intended it is critical that there is first one -o option, followed by the -l ones (possibly multiple), so that -(o|l) captures in the right order and ($o, @l) stores them correctly.†
Since multiple files can be listed after -o, with spaces in between, we have to allow spaces in the pattern, and so it catches the trailing ones as well; that is why the trailing-space cleanup is necessary.
(I'd expect that by tweaking the pattern one should be able to avoid the post-capture cleanup, but I can't see it right now.)
† This format has been confirmed in a comment, but if there are multiple -o entries or the order is different then the easiest way is probably to break it up into two regexes
# Capture all `-o` entries, then all `-l` entries (order doesn't matter)
@o = /-o\s+([^-]+)/g; @l = /-l\s*([^-]+)/g;
or, perhaps, for libs rather use
@l = /-l(\S+)/g;
Then print them all as
say join ", ", "@o", "@l";
Comments on the code in the question, which practically gets it right except for one snag.
Why it doesn't work: The second regex fills its own capture variables, so $1 from the first one is overwritten. A simple way to fix that is to assign the capture in the first regex, the way it's done in the second: ($o) = /-o(.*?)-/ (or so). The () around $o are needed to impose list context, so that the captures are returned and not just success/failure (1 or '').
A few other notes
There is no need for /m or /s. Those matter only for multi-line strings: /m changes where ^ and $ match, and /s lets . match a newline.
[^ ] can be written as \S (non-space), which I think is clearer. So /-l(\S+)/g. Note that \S also excludes the newline, while [^ ]+ swallows the newline at the end of the line.
printf is extremely powerful and useful when we need to format the output. Here you don't, so there's no reason for it, and it is slower and more error-prone; you can do print join(...), "\n";
Or use say, enabled by -E as opposed to -e. Since -E enables all other features as well, and may not be future-proof, it's really better to use CORE::say with -e. In a program you'd put use feature 'say'; at the beginning.