How do I get the list of all items in dir1 which don't exist in dir2?

Time:08-01

I want to compute the difference between two directories - but not in the sense of diff, i.e. not of file and subdirectory contents, but rather just in terms of the list of items. Thus if the directories have the following files:

dir1: f1 f2 f4
dir2: f2 f3

I want to get f1 and f4.

CodePudding user response:

You can use comm to compare two listings:

comm -23 <(ls dir1) <(ls dir2)
  • Process substitution with <(cmd) passes the output of cmd as if it were a file name. It's similar to $(cmd), but instead of capturing the output as a string it generates a dynamic file name (usually /dev/fd/###).
  • comm prints three columns of information: lines unique to file 1, lines unique to file 2, and lines that appear in both. -23 hides the second and third columns and shows only lines unique to file 1. comm expects both inputs to be sorted, which the output of ls already is.
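
To see both pieces working on the example from the question, here is a minimal sketch. The scratch directory and the function name only_in_first are illustrative assumptions, not part of the answer, and process substitution needs bash (or ksh/zsh) rather than plain sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list names present in the first directory but not the second.
only_in_first() {
    # ls sorts its output, which comm requires; LC_ALL=C makes sure both
    # tools agree on the sort order.
    LC_ALL=C comm -23 <(LC_ALL=C ls "$1") <(LC_ALL=C ls "$2")
}

# Recreate the layout from the question in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/f1" "$tmp/dir1/f2" "$tmp/dir1/f4"
touch "$tmp/dir2/f2" "$tmp/dir2/f3"

only_in_first "$tmp/dir1" "$tmp/dir2"   # prints f1 and f4, one per line
```

Swapping the two arguments would list what exists only in dir2 instead (f3 here).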

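You could extend this to do a recursive diff using find. If you do that you'll need to suppress the leading directories from the output, which can be done with a couple of strategic cds. Unlike ls, find does not sort its output, so each listing also has to go through sort before comm sees it.

comm -23 <(cd dir1 && find . | sort) <(cd dir2 && find . | sort)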
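
A small sketch of the recursive variant on a nested layout. comm expects sorted input and find guarantees no particular order, so each listing is piped through sort here; the helper name and the sub/a, sub/b files are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recursively list paths under $1 that have no counterpart under $2.
only_in_first_recursive() {
    # cd into each tree so the paths are relative (./sub/a) and comparable.
    LC_ALL=C comm -23 \
        <(cd "$1" && find . | LC_ALL=C sort) \
        <(cd "$2" && find . | LC_ALL=C sort)
}

tmp=$(mktemp -d)
mkdir -p "$tmp/dir1/sub" "$tmp/dir2/sub"
touch "$tmp/dir1/sub/a" "$tmp/dir1/sub/b" "$tmp/dir2/sub/b"

only_in_first_recursive "$tmp/dir1" "$tmp/dir2"   # prints ./sub/a
```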

CodePudding user response:

Edit: A naive diff-based solution, improved thanks to @JohnKugelamn:

diff --suppress-common-lines <(\ls dir1) <(\ls dir2) | grep "^<" | cut -c3-

Instead of comparing the directories themselves, we compare their listings as if they were files. A regular diff marks the lines that appear only in the first listing with a leading <, so we keep just those lines and finally strip the marker.
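
The stages of that pipeline can be tried on two saved listings. This is only a sketch: the list1 and list2 files are stand-ins for the output of ls on each directory.

```shell
# Stand-in listings for dir1 and dir2 from the question.
tmp=$(mktemp -d)
printf '%s\n' f1 f2 f4 > "$tmp/list1"
printf '%s\n' f2 f3    > "$tmp/list2"

# Raw diff output: lines only in list1 start with "< ", lines only in list2 with "> ".
diff "$tmp/list1" "$tmp/list2" || true   # diff exits with status 1 when the inputs differ

# Keep the "<" lines, then cut away the two-character "< " marker.
left_only=$(diff "$tmp/list1" "$tmp/list2" | grep '^<' | cut -c3-)
printf '%s\n' "$left_only"               # prints f1 and f4, one per line
```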


Naturally, one could polish the above by checking for errors, verifying that exactly two arguments were given, printing usage information otherwise, and so on.
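
One possible shape for such a wrapper, as a rough sketch rather than the answer's own script: the function name list_missing, its messages, and its return codes are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the diff pipeline with basic argument checks.
list_missing() {
    local d
    if [ "$#" -ne 2 ]; then
        echo "usage: list_missing DIR1 DIR2" >&2
        return 2
    fi
    for d in "$1" "$2"; do
        if [ ! -d "$d" ]; then
            echo "list_missing: not a directory: $d" >&2
            return 1
        fi
    done
    # \ls bypasses any alias; grep keeps lines only in DIR1; cut strips "< ".
    diff <(\ls "$1") <(\ls "$2") | grep '^<' | cut -c3-
}
```

Calling list_missing dir1 dir2 on the example from the question would print f1 and f4, while calling it with a single argument prints the usage line to standard error.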
