sed on many files - can we do better than invoke-once-per-file?

Time:08-22

I have a set of > 100,000 files to which I want to apply a sed script. Reading the accepted answer to this question:

Applying sed with multiple files

I see the suggestion involves invoking sed for every single one of the files one is interested in, e.g.:

find "$root_path" -type f -name "whatever" -exec sed -i "somecommands" {} \;

but while this works - it's a bit silly. After all, sed is willing to work on many files, e.g.:

sed -i "somecommands" input_file_1 input_file_2

So, I would tend to prefer:

sed -i "somecommands" $(find "$root_path" -type f -name "whatever")

and this does work.

... except when you have a lot of files. Then bash tells you "Argument list too long".

Is there an alternative to these two approaches for applying the same sed script to tens, or hundreds, of thousands of files?
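For context, the "Argument list too long" error comes from the kernel, not from sed or find: there is a hard limit, ARG_MAX, on the combined size of the argument list and environment passed to exec(). You can check the limit on your system (the value varies by platform):

```shell
# Kernel limit on the total size of exec() arguments plus environment, in bytes.
# A command line expanding to 100,000+ paths easily exceeds this.
getconf ARG_MAX
```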

CodePudding user response:

Do:

find "$root_path" -type f -name "whatever" -print0 | xargs -0 sed -i "somecommands"

The -print0 argument to find causes each file path to be terminated with a \0 (NUL) character rather than a newline, and the corresponding -0 argument to xargs makes it use \0 as the separator on its input. This makes the pipeline safe for filenames containing newlines, spaces, or other special characters. xargs then packs as many paths as fit under the argument-size limit into each sed invocation, so sed is called only a handful of times rather than 100,000 times.
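To see the pipeline in action, here is a minimal sketch in a throwaway directory. The 's/foo/bar/' script and the file names are stand-ins, and -i with no backup suffix assumes GNU sed (BSD/macOS sed would need -i ''):

```shell
tmp=$(mktemp -d)
printf 'foo\n' > "$tmp/a.txt"
printf 'foo\n' > "$tmp/b
c.txt"                            # a filename containing an embedded newline
# NUL-separated pipeline: handles the awkward filename above correctly
find "$tmp" -type f -name "*.txt" -print0 | xargs -0 sed -i 's/foo/bar/'
cat "$tmp/a.txt"                  # prints "bar"
rm -rf "$tmp"
```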

CodePudding user response:

You could pipe to xargs instead, but most find implementations support another flavour of -exec, terminated with + instead of \;, which batches as many file names as possible into each command invocation, so the command is run as few times as possible:

find "$root_path" -type f -name "whatever" -exec sed -i "somecommands" {} +

You have to make sure sed still treats the files individually, which is indicated using the -s/--separate flag – but it's implied by -i, so you should be good.
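You can verify the per-file behaviour directly: with -i, line addresses restart at each file, whereas a plain multi-file invocation without -i or -s treats all inputs as one continuous stream. A quick sketch, assuming GNU sed and using made-up file names:

```shell
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/one.txt"
printf 'x\ny\n' > "$tmp/two.txt"
# With -i, address 1 matches line 1 of *each* file, not just the first file.
sed -i '1s/x/HEADER/' "$tmp/one.txt" "$tmp/two.txt"
head -1 "$tmp/two.txt"            # prints "HEADER"
rm -rf "$tmp"
```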
