How to create symlinks in a single directory with the lowest number of forks?

Time:03-15

How to create symlinks in a single directory when:

  1. The common way fails:
ln -s /readonlyShare/mydataset/*.mrc .
-bash: /bin/ln: Argument list too long
  2. The find command doesn't allow the following syntax:
find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -exec ln -s {} . +
  3. Forking once per file takes hours to complete:
find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -exec ln -s {} . ';'

CodePudding user response:

I was in a rush when I needed this, so I didn't explore every possibility, but I have since worked something out.

For GNU and BSD you can use find ... -print0 | xargs -0 -I {} ...:

find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -print0 |
xargs -0 -I {} -- ln -s {} .

CodePudding user response:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -exec sh -c 'ln -s "$@" .' sh {} +

or if you prefer xargs:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -print0 |
  xargs -0 -P0 sh -c 'ln -s "$@" .' sh

If you are using BSD xargs instead of GNU xargs, it can be simpler:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -print0 |
  xargs -0 -J@ -P0 ln -s @ .

Why {} +?

Quoted from man find:

-exec utility [argument ...] {} +
             Same as -exec, except that “{}” is replaced with as many pathnames as possible for each invocation of utility.  This behaviour is similar
             to that of xargs(1).  The primary always returns true; if at least one invocation of utility returns a non-zero exit status, find will
             return a non-zero exit status.

find is good at splitting a large number of arguments into batches:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -exec ruby -e 'pp ARGV.size' '{}' +
15925
15924
15925
15927
1835
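
If you don't have ruby at hand, here is a rough sketch of the same comparison in plain sh. The scratch directory and the file names are made up for the demo, and it assumes a find that supports -maxdepth (GNU and BSD both do):

```shell
#!/bin/sh
# Hypothetical scratch dataset standing in for the real share.
src=$(mktemp -d)
i=1
while [ "$i" -le 500 ]; do
  : > "$src/file$i.mrc"
  i=$((i + 1))
done

# ';' runs the utility once per file, so this prints one line per file.
per_file=$(find "$src" -maxdepth 1 -name '*.mrc' -exec sh -c 'echo x' sh {} ';' | wc -l | tr -d ' ')

# '+' packs as many paths as fit into each invocation, so far fewer lines.
batched=$(find "$src" -maxdepth 1 -name '*.mrc' -exec sh -c 'echo x' sh {} + | wc -l | tr -d ' ')

echo "invocations with ';': $per_file"
echo "invocations with '+': $batched"
rm -rf "$src"
```

The first count equals the number of files; the second depends on the path lengths and on the system's argument size limit.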

Why not xargs -I?

It is slow because -I runs the utility once per argument, for example:

printf 'foo\0bar' | xargs -0 -I@ ruby -e 'pp ARGV' @
["foo"]
["bar"]
printf 'foo\0bar' | xargs -0 ruby -e 'pp ARGV'
["foo", "bar"]

xargs is also good at splitting a large number of arguments:

seq 65536 | tr '\n' '\0' | xargs -0 ruby -e 'pp ARGV.size'
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
536

Why sh -c?

Only BSD xargs has the -J flag, which puts the arguments in the middle of the command. With GNU xargs, we need the combination of sh -c and "$@" to do the same thing.
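
A small sketch of what sh -c does with the arguments xargs appends, using made-up items instead of real paths. The trailing sh fills $0, so every item lands in "$@", and the literal . after "$@" plays the role of the target directory:

```shell
#!/bin/sh
# Two NUL-separated items, one containing a space, to show "$@" keeps them intact.
out=$(printf 'a\0b c\0' | xargs -0 sh -c 'printf "[%s]" "$@" .; echo' sh)
echo "$out"
# prints [a][b c][.]
```

Without the trailing sh, the first item would be consumed as $0 and silently dropped from "$@".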

find -exec vs find | xargs

It depends, but I would suggest xargs when you want to use all your CPUs: xargs can run the utility in parallel with -P, while find can't.
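
For what it's worth, here is an end-to-end sketch of the parallel variant on a throwaway dataset. The temporary directories and file names are invented for the demo, and the -n 100 and -P 4 values are arbitrary assumptions on my part; -n is there only so that a small input is split into several batches that -P can actually run side by side:

```shell
#!/bin/sh
src=$(mktemp -d)   # hypothetical stand-in for the read-only dataset
dst=$(mktemp -d)   # directory that receives the symlinks
i=1
while [ "$i" -le 300 ]; do
  : > "$src/f$i.mrc"
  i=$((i + 1))
done

cd "$dst" || exit 1
# Batches of up to 100 paths, at most 4 ln processes at a time.
find "$src" -maxdepth 1 -name '*.mrc' -print0 |
  xargs -0 -n 100 -P 4 sh -c 'ln -s "$@" .' sh

links=$(find . -maxdepth 1 -type l | wc -l | tr -d ' ')
echo "created $links symlinks"

cd / && rm -rf "$src" "$dst"
```

On a real dataset you would probably drop -n and let xargs pick the batch size, since each batch is already large.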
