I need to find the cleanest way in bash to pick out, from hundreds of files, the ones that match some patterns AND do NOT match some others.
For instance:
for transaction in "TXNA" "TXNB" "TXNC" "TXND" "TXNE" ; do
    echo "--> ${transaction}"
    grep -L "EXCLUDE_PATTERN1" $(grep -L "EXCLUDE_PATTERN2" $(grep -Rl --include \*.txt " ${transaction}:" myDir/)) >> myReport.txt
done
Here,
grep -Rl --include \*.txt " ${transaction}:" myDir/
recursively searches myDir for all the .txt files that contain the transaction (TXNA, TXNB, ...).
Then
$(grep -L "EXCLUDE_PATTERN2" $(grep -Rl --include \*.txt " ${transaction}:" myDir/))
removes from that list the files containing EXCLUDE_PATTERN2,
and finally
grep -L "EXCLUDE_PATTERN1"
removes from the remaining list the files containing EXCLUDE_PATTERN1.
This is quite ugly, and since I have around 10 patterns to exclude, it will become completely unreadable.
Any ideas for making this code more readable and easier to debug?
Thanks a lot.
CodePudding user response:
I'm not sure I fully understand your question; however, seeing pattern matching while searching for files definitely suggests the use of find.
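# yes_pattern and no_pattern are placeholder file-name globs;
# -name tests the file name, not the file contents.
for transaction in "TXNA" "TXNB" "TXNC" "TXND" "TXNE" ; do
    find ./myDir -name "${yes_pattern}" ! -name "${no_pattern}" -print >> myReport.txt
done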
find is a sophisticated tool designed for this kind of job; see man find for additional options, including the -exec switch.
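Since -name only looks at file names, testing file contents with find means going through -exec. Here is a minimal sketch of that idea, assuming the exclude patterns are passed as arguments; filter_txn and its parameter names are made up for illustration and are not part of the question. Each -exec grep -q acts as a test that succeeds when the file matches, and a leading ! negates it.

```shell
#!/usr/bin/env bash
# Sketch: list the *.txt files under a directory that contain " <transaction>:"
# and contain none of the exclude patterns.
# filter_txn and its arguments are hypothetical names used for illustration.
filter_txn() {
    local dir=$1 transaction=$2
    shift 2

    # Build one negated "-exec grep -q" test per exclude pattern.
    local -a tests=()
    local pattern
    for pattern in "$@"; do
        tests+=( ! -exec grep -q -e "$pattern" {} \; )
    done

    # find evaluates the tests left to right and prints a file only if all pass.
    find "$dir" -type f -name '*.txt' \
        -exec grep -q -e " ${transaction}:" {} \; \
        "${tests[@]}" -print
}

# Possible usage (paths and patterns are the question's placeholders):
#   filter_txn myDir TXNA "EXCLUDE_PATTERN1" "EXCLUDE_PATTERN2" >> myReport.txt
```

This handles file names with blanks without extra effort, but it starts one grep process per file and per pattern, so it may be slow on a large tree.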
CodePudding user response:
You could use the grep and xargs commands to get your result, e.g.:
grep -lHZR -e 'firstpattern' yourdir |xargs -0 grep -lHZ 'secondpattern' |xargs -0 grep -lHZ 'thirdpattern' |xargs -0 grep -LHZ firstantipattern ... |xargs -0 grep -LH lastantipattern
All but the first and last grep take the same switches (-lHZ, or -LHZ for antipatterns). The first one also has R, to search your directory recursively, and the last one does not have Z, so that your final output is not null-terminated.
The Z option makes grep terminate each file name in its output with a null byte, which allows working with files whose names contain blanks, and H forces grep to print the file name even if only one file is given.
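With around 10 antipatterns, the chain can be built from an array instead of being typed out. The following is only a sketch, assuming GNU grep and GNU xargs; report_txn, drop_excluded and exclude_patterns are names invented for illustration. It also passes -r to xargs, so that grep is not started on an empty file list, where grep -L would otherwise read standard input and report it.

```shell
#!/usr/bin/env bash
# Sketch of the grep | xargs chain with the antipatterns kept in one array.
# Assumes GNU grep and GNU xargs (-Z, -0 and -r are GNU options).
# report_txn, drop_excluded and exclude_patterns are hypothetical names.
exclude_patterns=( "EXCLUDE_PATTERN1" "EXCLUDE_PATTERN2" )

# Read a null-delimited file list on stdin and drop every file that
# matches one of the patterns given as arguments.
drop_excluded() {
    if (( $# == 0 )); then
        cat                 # no pattern left: pass the list through unchanged
        return
    fi
    local pattern=$1
    shift
    xargs -0 -r grep -LZ -e "$pattern" | drop_excluded "$@"
}

# Print the *.txt files under a directory that contain " <transaction>:"
# and none of the exclude patterns, one per line.
report_txn() {
    local dir=$1 transaction=$2
    grep -RlZ --include='*.txt' -e " ${transaction}:" "$dir" |
        drop_excluded "${exclude_patterns[@]}" |
        tr '\0' '\n'        # final output is newline-terminated
}

# Possible usage:
#   for transaction in TXNA TXNB TXNC TXND TXNE; do
#       echo "--> ${transaction}"
#       report_txn myDir/ "$transaction" >> myReport.txt
#   done
```

Adding or removing an antipattern then only means editing the array, and each stage can be debugged on its own by shortening the array.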