/* start of maker a_b.c[0] */
/* start of maker a_b.c[1] */
maker ( "a_b.c[0]" )
maker ( "a_b.c[1]" )
How to extract the strings inside double quotes and store them into an array? I I have no idea of how to do this using perl. Any help would be appreciated. Here's what i have tried.
open(file, "P2.txt");
@A = (<file>) ;
foreach $str(@A)
{
if($str =~ /"a_b.c"/)
{
print "$str \n";
}
}
Note: Only content inside double quotes have to be stored into an array. If you see the 1st line of example you'll be able to see the same String in double quotes. So i can't use matching operator here. It'll also print that String from the 1st line.
CodePudding user response:
It's not about looking for strings in double quotes. It's about defining a pattern (a regular expression) that matches the lines that you want to find.
Here's the smallest change that I can make to your code in order to make this work:
open(file, "P2.txt");
@A = (<file>) ;
foreach $str(@A)
{
if($str =~ /"a_b.c/) # <=== Change here
{
print "$str \n";
}
}
All I've done is to remove the closing double-quote from your match expression. Because you don't care what comes after that, you don't need to specify it in the regular expression.
I should point out that this isn't completely correct. In a regular expression, a dot has a special meaning (it means "match any character here") so to match an actual dot (which is what you want), you need to escape the dot with a backslash. So it should be:
if($str =~ /"a_b\.c/)
Rewriting to use a few more modern Perl practices, I would do something like this:
# Two safety nets to find problems in your code
use strict;
use warnings;
# say() is a better print()
use feature 'say';
# Use a variable for the filehandle (and declare it with 'my')
# Use three-arg version of open()
# Check return value from open() and die if it fails
open(my $file, '<', "P2.txt") or die $!;
# Read data directly from filehandle
while ($str = <$file>)
{
if ($str =~ /"a_b\.c/)
{
say $str;
}
}
You could even use the implicit variable ($_
) and statement modifiers to make your loop even simpler.
while (<$file>) {
say if /"a_b\.c/;
}
CodePudding user response:
Looking at the sample input you provided, the task can be paraphrased as "extract single string arguments to things that look like function invocations". It seems like there is the added complication not matching in C-style comments. For that, note perlfaq -q comment.
As the FAQ entry demonstrates, ignoring content in arbitrary C-style comments is generally not trivial. I decided to try C::Tokenize to help:
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use C::Tokenize qw( tokenize );
use Const::Fast qw( const );
use Path::Tiny qw( path );
sub is_open_paren {
($_[0]->{type} eq 'grammar') && ($_[0]->{grammar} eq '(');
}
sub is_close_paren {
($_[0]->{type} eq 'grammar') && ($_[0]->{grammar} eq ')');
}
sub is_comment {
$_[0]->{type} eq 'comment';
}
sub is_string {
$_[0]->{type} eq 'string';
}
sub is_word {
$_[0]->{type} eq 'word';
}
sub find_single_string_args_in_invocations {
my ($source) = @_;
my $tokens = tokenize(path( $source )->slurp);
for (my $i = 0; $i < @$tokens; $i) {
next if is_comment( $tokens->[$i] );
next unless is_word( $tokens->[$i] );
next unless is_open_paren( $tokens->[$i 1] );
next unless is_string( $tokens->[$i 2] );
next unless is_close_paren( $tokens->[$i 3]);
say $tokens->[$i 2]->{string};
$i = 3;
}
}
find_single_string_args_in_invocations($ARGV[0]);
which, with your input, yields:
C:\Temp> perl t.pl test.c
"a_b.c[0]"
"a_b.c[1]"