Pass command-line arguments to grep as search patterns and print lines which match them all

Time:03-19

I'm learning about the grep command. I want to write a program that, when the user enters one or more words, prints the lines of a data file that contain those words. I joined the words the user typed with '|' and passed the result to grep, but that gives an OR operation: a line is printed if it contains any of the words. I want an AND operation, where a line is printed only if it contains all of them.

So I learned how to use AND operation with grep commands as follows.

cat <file> | grep 'pattern1' | grep 'pattern2' | grep 'pattern3'

But I don't know how to put the user input in the 'pattern1', 'pattern2', 'pattern3' positions, because the number of words the user enters is not known in advance. As the number of words grows, grep must be run with more and more pipes, and I don't know how to build that part.

The user input is as follows:

$ [the name of my program] 'pattern1' 'pattern2' 'pattern3' ...

I'd really appreciate your help.

CodePudding user response:

In principle, what you are asking could be done with a loop with output to a temporary file.

file=inputfile
temp=$(mktemp -d -t multigrep.XXXXXXXXX) || exit
trap 'rm -rf "$temp"' ERR EXIT
for regex in "$@"; do
    grep "$regex" "$file" >"$temp"/output
    mv "$temp"/output "$temp"/input
    file="$temp"/input
done
cat "$temp"/input
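
The temporary file can also be avoided by building the pipeline from the question directly. The following is only a sketch, and the function name and_grep is a hypothetical name chosen for illustration: it calls itself once per pattern, so each argument adds one more grep to the pipe.

```shell
# and_grep (hypothetical helper): filter standard input through one grep
# per argument, so only lines matching every pattern survive.
and_grep() {
    if [ "$#" -eq 0 ]; then
        cat                     # no patterns left: pass the lines through
    else
        pattern=$1
        shift
        # -e keeps a pattern that starts with "-" from being read as an option
        grep -e "$pattern" | and_grep "$@"
    fi
}
```

It would be called as and_grep 'pattern1' 'pattern2' <file. This still starts one grep process per pattern.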

However, a better solution is probably to arrange for Awk to check for all the patterns in one go, and avoid reading the same lines over and over again.

Passing the arguments to Awk with quoting intact is not entirely trivial. Here, we simply pass them as command-line arguments and process those into an array within the Awk script itself.

awk 'BEGIN { for(i=1; i<ARGC; ++i) a[i]=ARGV[i];
        ARGV[1]="-"; ARGC=2 }
{ for(n=1; n<i; ++n) if ($0 !~ a[n]) next; }1' "$@" <file

In brief, in the BEGIN block, we copy the command-line arguments from ARGV to a, then replace ARGV and ARGC to pass Awk a new array of (apparent) command-line arguments which consists of just - which means to read standard input. Then, we simply iterate over a and skip to the next line if the current input line from standard input does not match. Any remaining lines have matched all the patterns we passed in, and are thus printed.
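
To get the calling convention from the question, the same idea can be wrapped in a shell function. This is a minimal sketch under some assumptions: the name multigrep and the choice to take the data file as the first argument are made up for illustration, and the array and counter names inside the Awk script are arbitrary.

```shell
#!/bin/sh
# multigrep (hypothetical name): print the lines of FILE that match every PATTERN.
# Usage: multigrep FILE PATTERN...
multigrep() {
    file=$1
    shift
    awk 'BEGIN {
             # Save the patterns, then hide them from awk so that they are
             # not opened as input files.
             for (i = 1; i < ARGC; ++i) pattern[i] = ARGV[i]
             count = ARGC - 1
             ARGV[1] = "-"; ARGC = 2    # "-" means: read standard input
         }
         {
             # Skip the line as soon as one pattern fails to match.
             for (n = 1; n <= count; ++n) if ($0 !~ pattern[n]) next
             print
         }' "$@" <"$file"
}
```

For example, multigrep data.txt 'pattern1' 'pattern2' would print the lines of data.txt that match both patterns, reading the file only once.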

CodePudding user response:

I suggest using awk pattern logic:

 awk '/RegExp-pattern-1/ && /RegExp-pattern-2/ && /RegExp-pattern-3/' input.txt

The advantages: you can combine RegExp patterns with the logic operators && and ||, and you scan the whole file only once.

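The disadvantages: you must provide an explicit list of files (awk cannot traverse subdirectories the way grep -r can), and awk regular expressions are POSIX extended regular expressions, comparable to grep -E, so the Perl-style syntax of grep -P is not available.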
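
As a small illustration of mixing the operators, here is a sketch that combines AND, OR, and negation in one pattern. The sample lines and the variable names are invented for the example.

```shell
# Sample data, made up for illustration.
sample=$(mktemp) || exit
printf '%s\n' 'error: disk full' \
              'error: network down (debug)' \
              'warning: disk slow' \
              'error: out of memory' >"$sample"

# Keep lines with "error" AND ("disk" OR "network"), but NOT "debug".
result=$(awk '/error/ && (/disk/ || /network/) && !/debug/' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Only the first sample line satisfies all three conditions, so that is the only line printed.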
