I have a set of > 100,000 files to which I want to apply a sed script. Reading the accepted answer to this question:
Applying sed with multiple files
I see the suggestion involves invoking sed for every single one of the files one is interested in, e.g.:
find $root_path -type f -name "whatever" -exec sed -i "somecommands" {} \;
but while this works - it's a bit silly. After all, sed is willing to work on many files, e.g.:
sed -i "somecommads" input_file_1 input_file_2
So, I would tend to prefer:
sed -i "somecommads" $(find $root_path -type f -name "whatever")
and this does work.
... except when you have a lot of files. Then, bash tells you "Argument list too long".
Is there an alternative to these two approaches for applying the same sed script to tens, or hundreds, of thousands of files?
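For what it's worth, the limit that triggers this error can be inspected with the command below; the exact value varies by system, so this is just an illustrative check:
getconf ARG_MAX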
CodePudding user response:
Do:
find $root_path -type f -name "whatever" -print0 | xargs -0 sed -i "somecommads"
The -print0 argument to find causes file paths to be printed with a trailing \0 character rather than a newline, and the corresponding -0 argument to xargs makes it use \0 as the separator on its input. This will allow for filenames which contain a newline.
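If the run is slow, xargs can also split the work into batches and run several sed processes at once. A sketch assuming GNU xargs, where -n sets the batch size and -P the number of parallel processes (both values here are arbitrary):
find "$root_path" -type f -name "whatever" -print0 | xargs -0 -n 500 -P 4 sed -i "somecommands"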
CodePudding user response:
You could pipe to xargs instead, but most find implementations support another flavour of -exec, terminated with + instead of \;, which runs the command as few times as possible:
find "$root_path" -type f -name "whatever" -exec sed -i "somecommands" {} +
You have to make sure sed still treats the files individually, which is indicated using the -s/--separate flag – but it's implied by -i, so you should be good.
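A minimal self-contained demo of the {} + form, assuming GNU sed as in the rest of the thread (the directory, file names and sed expression are invented for illustration):
mkdir -p /tmp/sed_demo
printf 'hello\n' > /tmp/sed_demo/a.txt
printf 'hello\n' > /tmp/sed_demo/b.txt
# a single sed invocation edits both files in place
find /tmp/sed_demo -type f -name "*.txt" -exec sed -i 's/hello/goodbye/' {} +
cat /tmp/sed_demo/a.txt /tmp/sed_demo/b.txt   # both files now read "goodbye"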