I'm looking for a regex that removes line X when it is immediately followed by line Y.
I've tried this in a foreach loop (the files were gathered earlier):
foreach my $file (@files) {
    say "Processing $file in $dir";
    open( my $fh, "<", "$file" )
        or die "Can't open < $file: $!";
    my $data = {};
    my $start = "X:";
    my $end = "Y:";
    my $contents = do { local $/; <$fh> };
    my $count = 1;
    my $transformed = $contents;
    while ( $contents =~ /$start(.*?)$end/sg ) {
        say $1 if $1;
        my $formatted = $1;
        $formatted =~ s/\s //g if $formatted;
        $data->{$file}->{$formatted} = $count if $formatted;
        $transformed =~ /($start.*?$end)/sg;
        my $removed = quotemeta($1) if $1;
        $transformed =~ s/$removed/$end/sg if $removed;
    }
    push @results, $data if ($data);
    path($file)->spew_utf8($transformed);
}
but I cannot get it to ignore the "Text to ignore" example below.
Text to process (Y immediately after X):
{
X: John Smith,
Y: 1234 Main Street
}
becomes
{
Y: 1234 Main Street
}
Text to ignore (Y not immediately after X):
{
X: John Smith,
X2: 1234567890,
Y: 1234 Main Street
}
CodePudding user response:
You can use the following:
s/^X:.*\n(?=Y:)//m;
This is short for the following, since . does not match a newline unless the s modifier is used:
s/^X:[^\n]*\n(?=Y:)//m;
The g modifier can be added to remove multiple occurrences.
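As a minimal sketch of how this behaves on the two samples from the question, the substitution can be wrapped in a small helper. The name remove_x_before_y is hypothetical and not part of the original post, and the sample strings assume that X: and Y: sit at the very start of their lines, as the pattern requires.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): deletes every "X:" line
# that is immediately followed by a line starting with "Y:".
sub remove_x_before_y {
    my ($text) = @_;
    # /m makes ^ match at the start of each line; /g repeats the removal.
    # The lookahead (?=Y:) checks the next line without consuming it.
    $text =~ s/^X:.*\n(?=Y:)//mg;
    return $text;
}

# Sample inputs copied from the question (assumed to have no indentation).
my $process = "{\nX: John Smith,\nY: 1234 Main Street\n}\n";
my $ignore  = "{\nX: John Smith,\nX2: 1234567890,\nY: 1234 Main Street\n}\n";

print remove_x_before_y($process);   # the X: line is dropped
print remove_x_before_y($ignore);    # left unchanged, because X2: intervenes
```

In the second sample the line after X: starts with X2:, so the lookahead fails and nothing is removed. The X2: line itself is never a candidate, since the pattern requires a colon directly after the X. If the real files indent these lines, the pattern would need to allow leading whitespace, which is not covered here.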