I have input files foo1.txt, foo2.txt, foo3.txt, etc. I have some command munge that processes the input files, but (for reasons that are not relevant here) the command can only process one input file at a time. I want to combine the output into a single out.txt.
I know I can do cat foo*.txt to concatenate all the input files, but as mentioned munge can only work on each separate file. That is, munge will not like it if I do cat foo*.txt | munge > out.txt. Instead I need to perform the processing on each file before the outputs are concatenated.
I'm sure I could loop over the input files using for, but then how could I combine the output?
Basically I'm looking for something like the equivalent of this, without enumerating all the input files beforehand.
cat foo1.txt | munge > out1.txt
cat foo2.txt | munge > out2.txt
cat foo2.txt | munge > out2.txt
cat out*.txt > out.txt
I'll bet there is some extremely simple command that can do this for me in a single line, perhaps with nested piping and wildcards. Any ideas?
CodePudding user response:
Use a loop and redirect the output of the whole loop to out.txt. There's also no need to pipe from cat; you can simply redirect input from the file.
for file in foo*.txt; do
    munge < "$file"
done > out.txt
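To try the loop without the real munge, here is a minimal sketch that uses a stand-in shell function. The function body (upper-casing with tr) and the sample files are assumptions made purely for illustration.

```shell
#!/bin/sh
# Stand-in for the real munge (assumption): reads stdin, writes stdout.
munge() { tr 'a-z' 'A-Z'; }

# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'one\n'   > foo1.txt
printf 'two\n'   > foo2.txt
printf 'three\n' > foo3.txt

# A single redirection on the loop collects the output of every iteration.
for file in foo*.txt; do
    munge < "$file"
done > out.txt

cat out.txt
```

Since out.txt does not match foo*.txt, the loop never reads its own output. If no file matches, the pattern is left unexpanded and the input redirection fails with a "No such file" error.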
CodePudding user response:
Firstly, you have useless uses of cat (UUoC):
cat foo1.txt | munge > out1.txt
cat foo2.txt | munge > out2.txt
cat foo3.txt | munge > out3.txt # I assume you wanted 3 here
cat out*.txt > out.txt
is done more simply as:
< foo1.txt munge > out1.txt
< foo2.txt munge > out2.txt
< foo3.txt munge > out3.txt
cat out*.txt > out.txt
More usually, all redirections appear after the command, but this is not required:
munge < foo1.txt > out1.txt
munge < foo2.txt > out2.txt
munge < foo3.txt > out3.txt
cat out*.txt > out.txt
In Bash, process substitution could be used to combine these together, like this, but that would be overcomplicated when the final combinator is just catenation:
# useless use of process substitution plus cat (uuopspcat)
cat <(munge < foo1.txt) <(munge < foo2.txt) <(munge < foo3.txt) > out.txt
This can be done instead: run the programs in a subshell via the parentheses operator, and redirect the output of that subshell into the combined file.
(munge < foo1.txt ; munge < foo2.txt ; munge < foo3.txt) > out.txt
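A brace group should give the same result without starting a subshell. A minimal sketch, with a stand-in munge function and made-up sample files (both are assumptions):

```shell
#!/bin/sh
# Stand-in for the real munge (assumption): reads stdin, writes stdout.
munge() { tr 'a-z' 'A-Z'; }

workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'one\n'   > foo1.txt
printf 'two\n'   > foo2.txt
printf 'three\n' > foo3.txt

# Braces group the commands in the current shell; the space after "{"
# and the ";" before "}" are required by the shell grammar.
{ munge < foo1.txt; munge < foo2.txt; munge < foo3.txt; } > out.txt

cat out.txt
```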
If munge is a "linear operator under catenation", loosely speaking, then it should be possible to do this:
cat foo1.txt foo2.txt foo3.txt | munge > out.txt
For instance, if munge is something like grep foo, then this transformation is valid. Catenating the grep outputs is the same as grepping the catenated inputs.
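One way to probe whether a command has this property is to compare both orderings on sample data. The sketch below tries grep foo, which should pass, and sort, which should not. The helper name commutes_with_cat and the sample files are made up for illustration, and a match on one data set does not prove the property in general.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'foo b\nbar\n' > foo1.txt
printf 'baz\nfoo a\n' > foo2.txt

# Hypothetical helper: succeeds if running the command per file and
# concatenating the results gives the same bytes as running it once
# on the concatenated input.
commutes_with_cat() {
    for f in foo*.txt; do "$@" < "$f"; done > per_file.out
    cat foo*.txt | "$@" > combined.out
    cmp -s per_file.out combined.out
}

if commutes_with_cat grep foo; then grep_result=same; else grep_result=differs; fi
if commutes_with_cat sort;     then sort_result=same; else sort_result=differs; fi

echo "grep foo: $grep_result"
echo "sort: $sort_result"
```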
If munge can be extended to take multiple filename arguments and iterate over them, then it can just be:
munge foo1.txt foo2.txt foo3.txt > out.txt
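If munge itself cannot be changed, a small wrapper function could provide the multi-file interface instead. A sketch only: munge_all is a hypothetical name, and the stand-in munge and sample files are assumptions.

```shell
#!/bin/sh
# Stand-in for the real munge (assumption): reads stdin, writes stdout.
munge() { tr 'a-z' 'A-Z'; }

# Hypothetical wrapper: run munge once per file-name argument and
# stop at the first failure, passing its exit status along.
munge_all() {
    for file in "$@"; do
        munge < "$file" || return
    done
}

workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'one\n'   > foo1.txt
printf 'two\n'   > foo2.txt
printf 'three\n' > foo3.txt

munge_all foo*.txt > out.txt
cat out.txt
```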
CodePudding user response:
Assuming munge can take a file name argument:
printf '%s\0' foo*.txt | xargs -0 -n 1 munge > out.txt
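If munge only reads standard input, xargs can start a small sh -c wrapper that does the redirection. A sketch, assuming munge is an executable on PATH (a shell function would not be visible to the child shell) and that xargs supports -0, which GNU and BSD versions do. The stand-in munge script and sample files are made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Stand-in munge executable (assumption): upper-cases stdin.
printf '#!/bin/sh\ntr a-z A-Z\n' > munge
chmod +x munge
PATH=$workdir:$PATH

printf 'one\n'   > foo1.txt
printf 'two\n'   > foo2.txt
printf 'three\n' > foo3.txt

# Each file name arrives as "$1" in the child shell; the trailing
# "sh" fills "$0". Output of all runs goes to the single out.txt.
printf '%s\0' foo*.txt | xargs -0 -n 1 sh -c 'munge < "$1"' sh > out.txt

cat out.txt
```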