Can I do a Bash wildcard expansion (*) on an entire pipeline of commands?

Time:01-26

I am using Linux. I have a directory containing many files, and I want to use grep, tail and wildcard expansion (*) together to print the last occurrence of <pattern> in each file:

Input: <some command>
Expected Output: 
<last occurrence of pattern in file 1>
<last occurrence of pattern in file 2>
...
<last occurrence of pattern in file N>

What I am trying now is grep "pattern" * | tail -n 1, but the output contains only one line: the last occurrence of the pattern in the last file. I assume this is because the * wildcard is expanded before the pipeline is set up, so tail runs only once.

Does there exist some Bash syntax so that I can achieve the expected outcome, i.e. let tail run for each file?

  • I know I can always use a for-loop to solve the problem. I'm just curious if the problem can be solved with a more condensed command.

I've also tried grep -m1 "pattern" <(tac *), and the same reasoning seems to apply: wildcard expansion applies only to the command it is attached to, and the "outer" command runs only once.

CodePudding user response:

Wildcards are expanded on the command line before any command runs. For example, if you have files foo and bar in your directory and run grep pattern * | tail -n1, then bash transforms this into grep pattern bar foo | tail -n1 (the expanded names are sorted) and runs that. Since there's only one stream of output from grep, there's only one stream of input to tail, and it prints the last line of that stream.
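To see the expansion for yourself, here is a small sketch in a scratch directory. The directory and the file contents are made up for the demonstration; only the file names foo and bar come from the example above.

```shell
# Scratch directory with two illustrative files, foo and bar (contents are made up).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'pattern in foo\n' > foo
printf 'pattern in bar\n' > bar

# echo receives the already-expanded (and sorted) file names, just as grep would.
expanded=$(echo grep pattern *)
echo "$expanded"

# grep reads both files but writes a single stream, so tail keeps a single line.
grep pattern * | tail -n 1
```

Putting echo in front of a command is a quick way to check what a wildcard turns into before the command actually runs.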

If you want to search each file and print the last line of grep's output separately you can use a loop:

for file in * ; do
  grep pattern "${file}" | tail -n1
done

The problem with non-loop solutions is that tail doesn't inherently know where the output for one file ends and the output for the next begins, or indeed that there are files involved at the other end of the pipe at all. It just knows that input is coming in from somewhere and that it has to print the last line of that input. If you didn't want a loop, you'd have to use a more powerful tool like awk, and perhaps use the fact that grep prepends the names of matched files (when multiple files are searched, or with -H) to delimit the output from each file. However, writing an awk program that keeps track of the current file, so that it knows when that file's output ends and can print its last line, is probably more effort than it is worth when the loop solution is so simple.
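For the curious, a minimal sketch of the awk idea follows. It is a variation, not exactly what is described above: awk reads the files directly and uses its own FNR/NR counters to notice when a new file starts, instead of parsing grep's file name prefixes. The function name last_match_per_file is hypothetical, and the pattern is treated as an awk regular expression, which differs slightly from grep's default syntax.

```shell
# Hypothetical helper: print the last line matching an awk regex in each file.
# Usage: last_match_per_file PATTERN FILE...
last_match_per_file() {
  pattern=$1
  shift
  awk -v pat="$pattern" '
    # First line of every file after the first: flush the previous file.
    FNR == 1 && NR > 1 { if (found) print last; found = 0 }
    # Remember the most recent matching line of the current file.
    $0 ~ pat { last = $0; found = 1 }
    # Flush the final file.
    END { if (found) print last }
  ' "$@"
}
```

It would be called as last_match_per_file 'pattern' *, and files without a match print nothing. One known caveat: awk treats an argument of the form name=value as a variable assignment, so passing ./* instead of * is safer if file names may contain an equals sign.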

CodePudding user response:

You can achieve what you want using xargs. For your example it would be:

printf '%s\0' * | xargs -0 -n 1 sh -c 'grep "pattern" "$0" | tail -n 1'

This saves you from having to write a loop.
