I want to compute the difference between two directories, but not in the sense of `diff`, i.e. not of file and subdirectory contents, but rather just in terms of the list of items. Thus, if the directories have the following files:
dir1 | dir2 |
---|---|
f1 f2 f4 | f2 f3 |
I want to get `f1` and `f4`.
CodePudding user response:
You can use `comm` to compare two listings:
comm -23 <(ls dir1) <(ls dir2)
- Process substitution with `<(cmd)` passes the output of `cmd` as if it were a file name. It's similar to `$(cmd)`, but instead of capturing the output as a string it generates a dynamic file name (usually `/dev/fd/###`).
- `comm` prints three columns of information: lines unique to file 1, lines unique to file 2, and lines that appear in both. `-23` hides the second and third columns and shows only lines unique to file 1.
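To see this on the example from the question, here is a minimal sketch that recreates the two directories in a scratch location and runs the same `comm` command. The `tmp` and `only_in_dir1` variable names are just illustrative, and the script assumes bash (process substitution is not available in plain POSIX `sh`).

```shell
#!/usr/bin/env bash
# Recreate the question's example in a throwaway directory.
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/f1" "$tmp/dir1/f2" "$tmp/dir1/f4"
touch "$tmp/dir2/f2" "$tmp/dir2/f3"

# Keep only column 1 of comm's output: names that exist in dir1 but not in dir2.
only_in_dir1=$(comm -23 <(ls "$tmp/dir1") <(ls "$tmp/dir2"))
printf '%s\n' "$only_in_dir1"
```

This should print `f1` and `f4`, one per line. Swapping `-23` for `-13` would list the names unique to `dir2` instead.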
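You could extend this to do a recursive diff using `find`. If you do that you'll need to suppress the leading directories from the output, which can be done with a couple of strategic `cd`s. Unlike `ls`, `find` does not sort its output, and `comm` expects sorted input, so pipe each listing through `sort`:
comm -23 <(cd dir1 && find . | sort) <(cd dir2 && find . | sort)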
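As a rough illustration of the recursive variant, the sketch below builds two small trees and lists the paths that exist only under the first one. The directory layout and variable names are made up for the example. Each `find` listing is passed through `sort`, because `comm` requires sorted input and `find` makes no ordering guarantee.

```shell
#!/usr/bin/env bash
# Build two small trees that differ only in one nested file.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1/sub" "$tmp/dir2/sub"
touch "$tmp/dir1/sub/a" "$tmp/dir1/sub/b"
touch "$tmp/dir2/sub/b"

# cd into each tree so that paths are relative (./sub/a) and comparable.
only_in_dir1=$(comm -23 \
    <(cd "$tmp/dir1" && find . | sort) \
    <(cd "$tmp/dir2" && find . | sort))
printf '%s\n' "$only_in_dir1"
```

Here the only path unique to the first tree is `./sub/a`.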
CodePudding user response:
Edit: A naive `diff`-based solution, with an improvement due to @JohnKugelamn:
diff --suppress-common-lines <(\ls dir1) <(\ls dir2) | grep "^<" | cut -c3-
Instead of working on directories, we switch to working on files (the two listings); then we use regular `diff`, keeping only the lines that appear in the first file, which `diff` marks with `<`, and finally removing that marking.
Naturally one could beautify the above by checking for errors, verifying we've gotten two arguments, printing usage information otherwise etc.
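One possible shape for such a wrapper is sketched below. The function name `list_only_in_first`, its messages, and its exit codes are hypothetical choices for illustration, not part of the original answer; it assumes bash.

```shell
#!/usr/bin/env bash
# list_only_in_first DIR1 DIR2
# Print the names that appear in DIR1 but not in DIR2 (top level only).
# Hypothetical helper: the name, messages, and return codes are illustrative.
list_only_in_first() {
    # Require exactly two arguments; otherwise print usage.
    if [ "$#" -ne 2 ]; then
        echo "usage: list_only_in_first DIR1 DIR2" >&2
        return 2
    fi

    # Both arguments must be existing directories.
    local d
    for d in "$1" "$2"; do
        if [ ! -d "$d" ]; then
            echo "list_only_in_first: not a directory: $d" >&2
            return 1
        fi
    done

    # 'command ls' bypasses any alias or function named ls.
    # diff marks lines found only in the first listing with "< ";
    # grep keeps those lines and cut strips the two-character marker.
    diff <(command ls "$1") <(command ls "$2") | grep '^<' | cut -c3-
}
```

With the directories from the question, `list_only_in_first dir1 dir2` should print `f1` and `f4`.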