I am new to Perl and was wondering if you could help me pass more than one set of files to the code below:
my @files=<data/j*.*.txt>;
if (@ARGV) {
my $test=$ARGV[0];
$test=lc($test);
print "Using $test instead\n";
@files=</data/$test*.*.txt>;
print "Found @files instead\n";
}
my $outfile='/data/w_c.txt';
my $lotfile='/data/completed.txt';
if (-e $outfile) {
unlink $outfile;
}
In the code above, my @files=<data/j*.*.txt>; currently picks up every file matching j*.*.txt, but I would like to pass only the files matching the patterns below:
j*.1.txt
c*.3.1.txt
a*.a.b.txt
- etc.
How could I pass the list of files in the program itself? I am trying to read all those files and extract information from them.
Thank you in advance.
CodePudding user response:
You can use something like this:
<data/j*.1.txt data/c*.3.1.txt data/a*.a.b.txt>
There comes a point where it might be best to use <data/*.txt> and use a regex to filter out all but those you want.
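As a rough illustration of that second approach, here is a minimal sketch that globs every .txt file and then keeps only the names matching the three schemes from the question. The helper name filter_wanted and the exact regex are assumptions made for this example, not part of the original code, and the relative data/ directory is taken from the question.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Anchored regex covering the three naming schemes from the question:
# j*.1.txt, c*.3.1.txt and a*.a.b.txt (assumed translation of the globs).
my $wanted = qr/^(?:j.*\.1|c.*\.3\.1|a.*\.a\.b)\.txt$/;

# Hypothetical helper: keep only paths whose file name matches $wanted.
sub filter_wanted {
    return grep { basename($_) =~ $wanted } @_;
}

# glob('data/*.txt') is the function form of <data/*.txt>.
my @files = filter_wanted(glob('data/*.txt'));
print "Found @files\n";
```

Anchoring the regex with ^ and $ keeps a name such as j1.1.txt.bak from slipping through, which an unanchored pattern would allow.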
CodePudding user response:
Rather than using globs this way, I'd be tempted to switch to opendir and readdir and to use an array of patterns in a regex with alternation to select my files. That way you're not using two different text wildcard syntaxes (for glob and for regex) in the same short snippet of code, which I've seen confuse programmers new to Perl before.
# Set your data directory.
my $dir = '/data';
# Take the whole array of arguments on the command line as patterns to
# match in the regex, or default to a short list of patterns if there
# are none.
# (Consider using an options library later rather than messing
# with @ARGV directly if the program becomes more complex.)
my @filespecs = ( scalar @ARGV ? @ARGV : qw( j.*?\.1\.txt c.*?\.3\.1\.txt ) );
# Join the multiple patterns with the regex alternation character.
# This makes them multiple matching options in a single regex.
my $re = join '|', @filespecs;
# Open the directory for reading, or terminate with an error.
opendir my $d, $dir or die "Cannot open directory $dir : $!\n";
# Select into the @files array things read from the directory
# that are regular files (-f), do not start with '.', and match
# the regex. readdir returns bare names, so -f is tested against
# the path under $dir rather than the current working directory.
my @files = grep { (-f "$dir/$_") && (!/^\./) && (/$re/) } readdir $d;
# Close the directory handle now that we're done using it.
closedir $d;
Without the overly verbose comments, that boils down to just this.
my $dir = '/data';
my @filespecs = ( scalar @ARGV ? @ARGV : qw( j.*?\.1\.txt c.*?\.3\.1\.txt ) );
my $re = join '|', @filespecs;
opendir my $d, $dir or die "Cannot open directory $dir : $!\n";
my @files = grep { (-f "$dir/$_") && (!/^\./) && (/$re/) } readdir $d;
closedir $d;
I elided the last few lines of your original code because they don't seem directly related to your question.
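Since the goal is to read the selected files, here is a small sketch of one way to do that. The helper name read_all is hypothetical, and what you extract from each line is left open because the question doesn't say. Note that readdir returns bare file names, so the directory has to be prepended before opening.

```perl
use strict;
use warnings;

# Hypothetical helper: return the chomped lines of every named file
# under $dir, warning about (and skipping) files that cannot be opened.
sub read_all {
    my ($dir, @names) = @_;
    my @lines;
    for my $name (@names) {
        my $path = "$dir/$name";    # readdir gives names without the directory
        my $fh;
        unless (open $fh, '<', $path) {
            warn "Cannot open $path: $!\n";
            next;
        }
        chomp(my @chunk = <$fh>);
        push @lines, @chunk;
        close $fh;
    }
    return @lines;
}
```

With the variables from the snippet above, a call might look like my @lines = read_all($dir, @files);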
Some sources for you to read that may help make sense of this solution:
- perldoc perlop for the Conditional Operator https://perldoc.perl.org/perlop#Conditional-Operator and for qw() https://perldoc.perl.org/perlop#qw/STRING/
- perldoc perlre to learn about Perl regexes, especially in this case alternation https://perldoc.perl.org/perlre#Metacharacters
- perldoc perlfunc for the -f file test https://perldoc.perl.org/perlfunc#-X-FILEHANDLE , opendir https://perldoc.perl.org/perlfunc#opendir-DIRHANDLE,EXPR , readdir https://perldoc.perl.org/perlfunc#readdir-DIRHANDLE , closedir https://perldoc.perl.org/perlfunc#closedir-DIRHANDLE , and grep https://perldoc.perl.org/perlfunc#grep-BLOCK-LIST