Bash: match in a list of files, cases including and excluding pattern

Time:02-12

I need the cleanest way in bash to pick out, from hundreds of files, those that match some patterns AND do NOT match some others.

For instance:

  for transaction in "TXNA" "TXNB" "TXNC" "TXND" "TXNE" ; do
      echo "--> ${transaction}"
      grep -L "EXCLUDE_PATTERN1" $(grep -L "EXCLUDE_PATTERN2" $(grep -Rl --include \*.txt " ${transaction}:" myDir/)) >> myReport.txt
  done

So here:

grep -Rl --include \*.txt " ${transaction}:" myDir/

recursively lists all the .txt files under myDir that contain the current transaction (TXNA, TXNB, and so on).

Then

grep -L "EXCLUDE_PATTERN2" $(grep -Rl --include \*.txt " ${transaction}:" myDir/)

removes from that list the files that contain EXCLUDE_PATTERN2

and finally:

grep -L "EXCLUDE_PATTERN1"

removes from the remaining list the files that contain EXCLUDE_PATTERN1.

This is quite ugly, and since I have around 10 patterns to exclude, it will become completely unreadable.

Any ideas for making this code more readable and easier to debug?

Thanks a lot.

CodePudding user response:

I'm not sure I fully understand your question; however, seeing pattern matching while searching for files definitely suggests the use of find.

# yes_pattern and no_pattern are placeholders for file-name globs; set them before the loop
for transaction in "TXNA" "TXNB" "TXNC" "TXND" "TXNE" ; do
    find ./myDir -name "${yes_pattern}" ! -name "${no_pattern}" -print >> myReport.txt
done

find is a sophisticated tool designed for this kind of search; see man find for additional options, including the -exec switch.
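
Note that -name tests the file name, not the file contents, so a content-based filter still needs grep. Here is a minimal sketch of combining the two through -exec. The function name files_with_but_without is hypothetical, and the directory and patterns in the usage line are the ones from the question:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the *.txt files under a directory that contain
# one pattern and do not contain another.
# Usage: files_with_but_without DIR INCLUDE_PATTERN EXCLUDE_PATTERN
files_with_but_without() {
    local dir=$1 include=$2 exclude=$3
    # "-exec grep -q ... \;" acts as a test: it is true only when grep
    # exits 0, i.e. when the pattern was found. "!" negates the next test.
    find "$dir" -type f -name '*.txt' \
        -exec grep -q -e "$include" {} \; \
        ! -exec grep -q -e "$exclude" {} \; \
        -print
}
```

It could be called as: files_with_but_without myDir " TXNA:" "EXCLUDE_PATTERN1". This starts one grep process per file and per test, so it favours readability over speed.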

CodePudding user response:

You could chain grep and xargs to get your result, e.g.:

grep -lHZR -e 'firstpattern' yourdir | xargs -0 grep -lHZ 'secondpattern' | xargs -0 grep -lHZ 'thirdpattern' | xargs -0 grep -LHZ 'firstantipattern' ... | xargs -0 grep -LH 'lastantipattern'

All but the first and last grep use the same switches (-lHZ, or -LHZ for anti-patterns). The first one also has -R, to search your directory recursively, and the last one omits -Z, so that your final output is not null-terminated.

The -Z option makes grep terminate each file name it prints with a null byte, which is what xargs -0 expects; this lets the pipeline handle file names containing blanks. -H forces grep to print the file name even when it is given only one file.
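
With around 10 exclusion patterns, the same chain can be driven by a loop instead of being written out by hand. The following is only a sketch, assuming GNU grep and GNU xargs (for -Z, --include and -r). The function name filter_files and the excludes array are hypothetical, while the patterns and myDir come from the question:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the *.txt files under DIR (recursively) that
# contain INCLUDE and contain none of the EXCLUDE patterns.
# Usage: filter_files DIR INCLUDE [EXCLUDE...]
filter_files() {
    local dir=$1 include=$2
    shift 2
    local list next pattern
    list=$(mktemp) || return 1
    next=$(mktemp) || { rm -f "$list"; return 1; }

    # Step 1: NUL-separated list of files containing the include pattern.
    # grep exits non-zero when nothing matches; that is not an error here.
    grep -RlZ --include='*.txt' -e "$include" "$dir" > "$list" || true

    # Step 2: one pass per exclude pattern; -L keeps files WITHOUT a match.
    # xargs -r does not run grep at all when the list is already empty.
    for pattern in "$@"; do
        xargs -0 -r grep -LZ -e "$pattern" < "$list" > "$next" || true
        mv "$next" "$list"
    done

    # Step 3: turn the NUL separators into newlines for the report.
    tr '\0' '\n' < "$list"
    rm -f "$list" "$next"
}
```

The loop from the question would then become:

    excludes=("EXCLUDE_PATTERN1" "EXCLUDE_PATTERN2")
    for transaction in "TXNA" "TXNB" "TXNC" "TXND" "TXNE" ; do
        echo "--> ${transaction}"
        filter_files myDir " ${transaction}:" "${excludes[@]}" >> myReport.txt
    done

Adding an eleventh exclusion is then a one-word change to the array, and each pass can be checked on its own by calling filter_files with fewer patterns.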
