gzip files based on regular expression

Time:10-04

I would like to gzip all files in a directory if the file name matches a regular expression. Is there a way I could do something similar to:

gzip \b[^2\W]{2,}\b

Right now, when I do that it gives me an error because it does not know that I want to match a regex.

CodePudding user response:

It's not clear which shell you are asking about. Bash has extended glob patterns, but those are still not regular expressions. For a proper regex, you will want to loop over the files and test each name:

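pat='\b[_[:alpha:]013-9]{2,}\b'
for file in ./*; do
    if [[ "$file" =~ $pat ]]; then
        gzip "$file"
    fi
done

Bash regexes are POSIX extended regular expressions, where \W has no special meaning inside a bracket expression, so [^2\W] cannot be used as-is. It means "a word character other than 2", which I spelled out as the POSIX bracket expression [_[:alpha:]013-9]; that is basically equivalent. The \b word boundary is a GNU extension and may not be available on every platform. Perhaps see also Bash Regular Expression -- Can't seem to match any of \s \S \d \D \w \W etc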
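
Before compressing anything, it can help to see which files a pattern would select. What follows is only a sketch of such a dry run, assuming Bash; the function name list_matches is made up for illustration, and unlike the loop above it tests only the base name rather than the ./-prefixed path.

```shell
#!/usr/bin/env bash
# list_matches DIR PATTERN (hypothetical helper, not part of any tool):
# print the regular files directly inside DIR whose base name matches
# PATTERN, an extended regular expression. Nothing is compressed.
list_matches() {
    local dir=$1 pat=$2 file
    for file in "$dir"/*; do
        [[ -f $file ]] || continue      # skip directories and an unmatched glob
        if [[ ${file##*/} =~ $pat ]]; then
            printf '%s\n' "$file"
        fi
    done
}

# Example: list names containing a run of two or more letters.
# list_matches . '[[:alpha:]]{2,}'
```

Once the list looks right, the printf can be swapped for gzip "$file".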

In the general case, you can always use a separate regular expression tool.

printf '%s\0' ./* |
perl -0lne 'print if /\b[^2\W]{2,}\b/' |
xargs -0 gzip

The shenanigans with \0 bytes are there to support arbitrary file names robustly (think file names with newlines in them, etc.); see also http://mywiki.wooledge.org/BashFAQ/020
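
The same NUL-delimited idea can probably be kept inside Bash, without Perl. Below is a rough sketch, assuming Bash (read -d '' is not POSIX sh); filter_nul is a hypothetical name, and the patterns are Bash extended regular expressions rather than Perl ones.

```shell
#!/usr/bin/env bash
# filter_nul PATTERN (hypothetical helper): read NUL-delimited names on
# stdin and write those matching PATTERN to stdout, still NUL-delimited,
# so names containing newlines stay intact as single records.
filter_nul() {
    local pat=$1 name
    while IFS= read -r -d '' name; do
        if [[ $name =~ $pat ]]; then
            printf '%s\0' "$name"
        fi
    done
}

# Example (dry run: echo prints the gzip command instead of running it):
# printf '%s\0' ./* | filter_nul '[[:alpha:]]{2,}' | xargs -0 echo gzip
```

Dropping the echo from the example would run gzip for real.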

CodePudding user response:

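find . -maxdepth 1 -type f -regextype posix-extended -regex '.*\b[_[:alpha:]013-9]{2,}\b.*' -exec gzip {} +

The -maxdepth 1 prevents find from traversing subdirectories, which is otherwise its default behavior and primary purpose. The -type f keeps directories from being passed to gzip.

The -regex argument needs to match the whole path, not just part of the file name, so I added .* on both sides. The -regextype posix-extended option is specific to GNU find; it selects the same extended syntax that Bash uses, with [_[:alpha:]013-9] standing in for "a word character other than 2".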
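
The whole-path requirement is easy to trip over, so here is a small dry-run sketch that only prints what find would select. It assumes a find that supports -maxdepth and -regex (GNU and BSD find do); find_by_regex is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# find_by_regex DIR REGEX (hypothetical helper): list the regular files
# directly inside DIR whose whole path matches REGEX. Nothing is compressed.
find_by_regex() {
    find "$1" -maxdepth 1 -type f -regex "$2" -print
}

# find_by_regex . 'ab'      # prints nothing: the path is ./ab, not ab
# find_by_regex . '.*/ab'   # matches, because the regex covers the whole path
```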
