How to create symlinks in a single directory when:

The common way fails:

ln -s /readonlyShare/mydataset/*.mrc .

-bash: /bin/ln: Argument list too long

The find command doesn't accept the following syntax, because {} must come immediately before the +:

find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -exec ln -s {} . +

Forking one ln process per file takes hours to complete:

find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -exec ln -s {} . ';'

Answer:

I was in a rush when I needed this, so I didn't explore all the possibilities, but I worked something out in the meantime.

For GNU and BSD you can use find ... -print0 | xargs -0 -I {} ...:

find /readonlyShare/mydataset -maxdepth 1 -name '*.mrc' -print0 |
xargs -0 -I {} -- ln -s {} .

Answer:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -exec sh -c 'ln -s "$@" .' sh '{}' '+'

or if you prefer xargs:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -print0 |
  xargs -0 -P0 sh -c 'ln -s "$@" .' sh

If you are using BSD xargs instead of GNU xargs, it can be simpler:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -print0 |
  xargs -0 -J@ -P0 ln -s @ .

Why `'{}' '+'`?

Quoted from man find:

-exec utility [argument ...] {} +
             Same as -exec, except that “{}” is replaced with as many pathnames as possible for each invocation of utility.  This behaviour is similar
             to that of xargs(1).  The primary always returns true; if at least one invocation of utility returns a non-zero exit status, find will
             return a non-zero exit status.

find is good at splitting a large number of arguments into batches:

find readonlyShare/mydataset -name '*.mrc' -maxdepth 1 -exec ruby -e 'pp ARGV.size' '{}' '+'
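If ruby is not at hand, the same batching can be observed with plain sh. The following is only a rough sketch: it assumes a throwaway directory created with mktemp, and the file names are made up for illustration. It counts how many times the utility runs with ';' and with '+'.

```shell
# Throwaway directory with 200 empty files (hypothetical names, illustration only).
dir=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
  : > "$dir/f$i.mrc"
  i=$((i + 1))
done

# ';' runs the utility once per file, so sh prints one line per file.
runs_semicolon=$(find "$dir" -name '*.mrc' -exec sh -c 'echo "$#"' sh {} ';' |
  wc -l | tr -d ' ')

# '+' packs as many paths as fit into each run; every output line is one batch size.
runs_plus=$(find "$dir" -name '*.mrc' -exec sh -c 'echo "$#"' sh {} + |
  wc -l | tr -d ' ')
total_plus=$(find "$dir" -name '*.mrc' -exec sh -c 'echo "$#"' sh {} + |
  awk '{ s += $1 } END { print s }')

echo "';' runs: $runs_semicolon, '+' runs: $runs_plus, files seen by '+': $total_plus"
rm -rf "$dir"
```

The number of '+' runs depends on the system's argument size limit, so only the totals are guaranteed to match.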

Why not `xargs -I`?

It is slow because -I runs the utility once per argument, for example:

printf 'foo\0bar' | xargs -0 -I@ ruby -e 'pp ARGV' @

["foo"]
["bar"]

printf 'foo\0bar' | xargs -0 ruby -e 'pp ARGV'

["foo", "bar"]

xargs is also good at splitting a large number of arguments into batches:

seq 65536 | tr '\n' '\0' | xargs -0 ruby -e 'pp ARGV.size'

Why `sh -c`?

Only BSD xargs has the -J flag, which puts the arguments in the middle of a command. With GNU xargs, we need to combine sh -c and "$@" to get the same effect.
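To see what sh -c actually receives, here is a small sketch with made-up input names. The trailing sh becomes $0, the names appended by xargs become "$@", and the literal word dest stands in for the final . of the ln command.

```shell
# Feed two NUL-terminated names (one contains a space) to xargs.
# xargs appends them after the trailing "sh", so sh sees $0=sh, $1=a, $2="b c".
out=$(printf 'a\0b c\0' |
  xargs -0 sh -c 'printf "[%s]" "$0" "$@" dest; echo' sh)
echo "$out"
```

This prints [sh][a][b c][dest]: the names land in the middle of the argument list, and the name containing a space stays a single argument.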

`find -exec` vs `find | xargs`

It depends, but I would suggest xargs when you want to use all your CPUs. xargs can run the utility in parallel with -P, while find can't.
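As an end-to-end sketch of the parallel variant, the following creates 200 dummy files in a temporary source directory and links them into a temporary destination. The directories, the file names, and the values -n 50 and -P 4 are assumptions chosen for the demonstration, not tuned recommendations.

```shell
# Temporary source and destination directories (hypothetical, for illustration).
src=$(mktemp -d)
dst=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
  : > "$src/f$i.mrc"
  i=$((i + 1))
done

# -n 50 caps each ln call at 50 paths so that -P 4 has several batches to run
# at once; without -n, xargs may pack everything into a single batch.
(
  cd "$dst" || exit 1
  find "$src" -maxdepth 1 -name '*.mrc' -print0 |
    xargs -0 -n 50 -P 4 sh -c 'ln -s "$@" .' sh
)

# Count the symlinks that were created in the destination.
linked=$(find "$dst" -maxdepth 1 -type l | wc -l | tr -d ' ')
echo "created $linked symlinks in $dst"
```

Because find prints absolute paths here, the links stay valid no matter where the destination directory lives.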
