get unique root links from a directory of symlinks

Time:07-27

I have a largish directory filled with symlinks (created using ln -s), about 1 million of them. They look like so:

--img_dir
  -- img.jpg --> /path/to/some/img.jpg
  -- imgc.jpg --> /path/to/some/imgc.jpg
  -- imgd.jpg --> /path/to/some/imgd.jpg
  -- img2.jpg --> /path2/to2/some2/img2.jpg
  -- img3.jpg --> /path3/to3/some3/img3.jpg
  -- img21.jpg --> /path21/to21/some21/img2.jpg
  -- img31.jpg --> /path31/to31/some31/img3.jpg
<snip>

For record-keeping purposes, I would like a list of the unique base_dirs (the root directories) that the symlinks point into.

So, I would like the following output:

/path/to/some
/path2/to2/some2
/path3/to3/some3
/path21/to21/some21
/path31/to31/some31

I tried googling around to see how one can achieve this in bash, but I was not able to find anything useful.

Any help or pointers would be much appreciated.

CodePudding user response:

  • find can list symlinks
  • realpath resolves symlinks to the absolute paths of their targets
  • dirname strips the filename from the end of a path
  • sort sorts lines and can dedupe

Putting these together:

find img_dir -type l | xargs realpath | xargs dirname | sort -u

Or, logging errors:

find img_dir -type l 2>find-errs     |
xargs realpath       2>realpath-errs |
xargs dirname        2>dirname-errs  |
sort -u               >basedir-list

Some implementations of realpath and dirname may only allow a single argument. In that case, do

... | xargs -I@ realpath @ | xargs -I@ dirname @ | ...

The code above assumes no really weird paths (e.g. they mustn't contain newlines; with plain xargs, whitespace and quote characters in file names also cause trouble).
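
If unusual file names are a concern, the same pipeline can be run with NUL-delimited records instead of newline-delimited ones. The following is only a sketch: it assumes GNU findutils and GNU coreutils (the -print0, -0, -r and -z options are not available in every implementation), and the function name unique_link_dirs is made up for illustration.

```shell
# Sketch: print the unique parent directories of the targets of all
# symlinks under the directory given as $1.
# Assumes GNU find, xargs, realpath, dirname and sort.
unique_link_dirs() {
    find "$1" -type l -print0 |      # NUL-terminated list of symlinks
        xargs -0 -r realpath -z |    # resolve each link to its target
        xargs -0 -r dirname -z |     # drop the file name, keep the directory
        sort -zu |                   # sort and dedupe NUL-terminated records
        tr '\0' '\n'                 # back to one directory per line for display
}
```

Usage would be something like unique_link_dirs img_dir > basedir-list. The final tr step turns the records back into lines, so a directory name that itself contains a newline would still look ambiguous in the output; drop that step if the list is going to be consumed by another NUL-aware tool.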
