I'm learning about grep
commands.
I want to make a program that when a user enters more than one word, outputs a line containing the word in the data file.
So I connected the words that the user typed with '|' and put them in the grep
command to create the program I intended.
But this is OR operation. I want to make AND operation.
So I learned how to use AND operation with grep
commands as follows.
cat <file> | grep 'pattern1' | grep 'pattern2' | grep 'pattern3'
But I don't know how to put the user input in the 'pattern1', 'pattern2', 'pattern3' position. Because the number of words the user inputs is not determined.
As user input increases, grep
must be executed using more and more pipes, but I don't know how to build this part.
The user input is as follows:
$ [the name of my program] 'pattern1' 'pattern2' 'pattern3' ...
I'd really appreciate your help.
CodePudding user response:
In principle, what you are asking could be done with a loop with output to a temporary file.
file=inputfile
temp=$(mktemp -d -t multigrep.XXXXXXXXX) || exit
trap 'rm -rf "$temp"' ERR EXIT
for regex in "$@"; do
grep "$regex" "$file" >"$temp"/output
mv "$temp"/output "$temp"/input
file="$temp"/input
done
cat "$temp"/input
However, a better solution is probably to arrange for Awk to check for all the patterns in one go, and avoid reading the same lines over and over again.
Passing the arguments to Awk with quoting intact is not entirely trivial. Here, we simply pass them as command-line arguments and process those into an array within the Awk script itself.
awk 'BEGIN { for(i=1; i<ARGC; i) a[i]=ARGV[i];
ARGV[1]="-"; ARGC=1 }
{ for(n=1; n<=i; n) if ($0 !~ a[n]) next; }1' "$@" <file
In brief, in the BEGIN
block, we copy the command-line arguments from ARGV
to a
, then replace ARGV
and ARGC
to pass Awk a new array of (apparent) command-line arguments which consists of just -
which means to read standard input. Then, we simply iterate over a
and skip to the next line if the current input line from standard input does not match. Any remaining lines have matched all the patterns we passed in, and are thus printed.
CodePudding user response:
suggesting to use awk
pattern logic:
awk '/RegExp-pattern-1/ && /RegExp-pattern-2/ && /RegExp-pattern-3/ 1' input.txt
The advantages: you can play with logic operators &&
||
on RegExp patterns. And your are scanning the whole file once.
The disadvantages: must provide files list (can't traverse sub directories), and limited RegExp syntax compared to grep -E
or grep -P