Filtering a list of files with a perl one-liner

Time:09-17

I'm trying to filter a file listing at the command line using grep's -P option, which is supposed to use Perl's regex syntax.

ls | grep -P ZZZZZTYT.vcf.gz works

but

ls | grep -P ZZZZZTYT.vcf.gz$

does not work. It appears that anchors don't work with grep -P for GNU grep 3.4.

These examples are trivial, of course.

I've also tried filtering with one-liners, along the lines of the question "perl one-liner like grep?":

ls | perl -ne 'print $1 if not $_ =~ m/\.gz$/'

but that didn't work either.

ls | perl -ne 'print $1 if not /\.gz$/'

My guess is that the best bet is the perl one-liner.

How can I re-write the above one-liner to grep on a list of files?

CodePudding user response:

Despite some issues in your examples, I couldn't reproduce your problem.

As for ls | grep -P ZZZZZTYT.vcf.gz working while ls | grep -P ZZZZZTYT.vcf.gz$ does not, my first guess is that you have whitespace or other "invisible" characters at the end of your file names. You can try ls | cat -A (or cat -veT) to see whether there is in fact more there than you can see. In any case, your regex would be better written with literal dots (\.), since . alone matches any character.
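To illustrate that guess, here is a minimal sketch that feeds grep a made-up list of names instead of real ls output. The file names are hypothetical, and it assumes GNU grep built with -P support and GNU cat for -A. The second name carries a trailing space, which is enough to defeat the $ anchor:

```shell
# Hypothetical names standing in for `ls` output; the second has a trailing space.
names=$(printf '%s\n' 'ZZZZZTYT.vcf.gz' 'ZZZZZTYT.vcf.gz ' 'notes.txt')

# Anchored, with escaped dots and a quoted pattern: only the clean name matches.
anchored=$(printf '%s\n' "$names" | grep -P '\.vcf\.gz$')

# Unanchored: both .gz names match, trailing space or not (-c counts matching lines).
unanchored=$(printf '%s\n' "$names" | grep -cP '\.vcf\.gz')

# cat -A marks each line end with $, so the stray space shows up as "gz $".
printf '%s\n' "$names" | cat -A
```

If the cat -A output for your real directory shows anything between gz and the closing $, that is what the anchor is tripping over.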

In your perl one-liners, you are trying to print $1, and this variable is empty. From perldoc perlvar:

$<digits> ($1, $2, ...)
     Contains the subpattern from the corresponding set of capturing
     parentheses from the last successful pattern match, not counting
     patterns matched in nested blocks that have been exited already.

     These variables are read-only and dynamically-scoped.

     Mnemonic: like \digits.

I think you want print $_; this variable holds the content of the current line when you use the -n switch (see perlvar and perlfunc). Since print with no arguments prints $_, you could then rewrite your perl one-liner as:

ls | perl -ne'/\.gz$/ or print' # for not .gz files

or

ls | perl -ne'/\.gz$/ and print' # list .gz files

Using your own examples, it is enough to remove the $1 from the one-liner.

As already pointed out, you need to check whether there is something at the end of your filenames.

If there are "bad characters" at the end of your filenames, this one-liner will work for listing .gz files:

ls | perl -ne'/\.gz.*$/ and print'