How to eliminate lines from a file while comparing two files

Time:05-18

I have two files. I need an output file that contains everything that is in the second file but not in the first. The second file contains everything in the first file plus some additional entries. I tried:

for j in `cat first`; do sed '/"$j"/d' second; done
# cat first
a
b
c
d
e
f
# cat second
a
1
b
22
33
c
44
d
11
e
44
f

CodePudding user response:

Converting my comment to answer so that solution is easy to find for future visitors.

You may use this grep:

grep -vFxf first second

1
22
33
44
11
44

Options are:

  • -v: Invert the match: select lines that match none of the specified patterns
  • -F: Treat patterns as fixed strings, not regular expressions
  • -x: Match whole lines only
  • -f: Read the patterns from a file, one per line
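
To try this without touching real data, here is a minimal sketch that recreates the sample files from the question in a scratch directory and runs the same grep. The names workdir, pat, and data are made up for the demo; the second half uses a hypothetical two-line file to show why -x matters.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$workdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$workdir/second"

# Print the lines of second that are not exactly equal to any line of first.
result=$(grep -vFxf "$workdir/first" "$workdir/second")
printf '%s\n' "$result"

# Hypothetical extra case: without -x, the pattern "1" also matches "11"
# as a substring, so "11" is dropped together with "1".
printf '%s\n' 1 > "$workdir/pat"
printf '%s\n' 1 11 > "$workdir/data"
without_x=$(grep -vFf "$workdir/pat" "$workdir/data" || true)
with_x=$(grep -vFxf "$workdir/pat" "$workdir/data")
printf 'without -x: [%s]  with -x: [%s]\n' "$without_x" "$with_x"

rm -rf "$workdir"
```

Note that duplicate lines in second (44 here) are printed as often as they occur, since grep filters line by line.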

CodePudding user response:

@anubhava's comment is a great answer.

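With comm, suppress the lines unique to first (-1) and the lines common to both files (-3). comm expects sorted input; the GNU option --nocheck-order turns off the check for that: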

comm --nocheck-order -13 first second
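
If relying on unsorted input feels fragile, a possible variant is to sort both files first. This is only a sketch: the scratch directory and the .sorted file names are made up for the demo, and the output comes out in sorted order rather than in the original order of second.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$workdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$workdir/second"

# comm compares sorted input; LC_ALL=C makes sort and comm agree on the order.
LC_ALL=C sort "$workdir/first" > "$workdir/first.sorted"
LC_ALL=C sort "$workdir/second" > "$workdir/second.sorted"

# -1 drops lines found only in first, -3 drops lines found in both,
# which leaves the lines found only in second.
result=$(LC_ALL=C comm -13 "$workdir/first.sorted" "$workdir/second.sorted")
printf '%s\n' "$result"

rm -rf "$workdir"
```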

There's a straightforward solution too.

CodePudding user response:

[m/n/g]awk '
BEGIN { FS="^$" } NR==1 { 
   do { __[$-_] } while ((getline)<=(FNR==NR))

} ($-_ in __)!=!___[$-_]-- ' test_first_file.txt test_second_file.txt

————————————————————————————————

1
22
33
44
11
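
The one-liner above is deliberately terse ([m/n/g]awk presumably means it runs under mawk, nawk, or gawk). As far as I can tell, it loads the first file into an array and then prints each line of the second file that is not in that array, at most once. Here is a more readable sketch of the same idea; the array names in_first and printed are my own, and this is my reading of the code rather than the answerer's version.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$workdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$workdir/second"

result=$(awk '
    # First file (NR == FNR): remember every line as an array key.
    NR == FNR { in_first[$0]; next }
    # Second file: print lines that are not in first, each only once.
    !($0 in in_first) && !printed[$0]++
' "$workdir/first" "$workdir/second")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Unlike the grep answer, this prints 44 only once. One caveat of the NR == FNR idiom: if the first file is empty, the condition is also true while the second file is being read, so nothing is printed.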

CodePudding user response:

I prefer the answer from @anubhava; it's great for scripting. However, if you'd just like a visual aid to see the difference between the two files, the good old diff command can be a great help.

$ diff -y first second
a                               a
                                  > 1
b                               b
                                  > 22
                                  > 33
c                               c
                                  > 44
d                               d
                                  > 11
e                               e
                                  > 44
f                               f

The -y (or --side-by-side) option prints the two files in two columns.
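
If you want to script this rather than just look at it, one possible approach is to take the default diff output and keep only the lines marked with "> ", which are the lines diff reports as added in second. This is a rough sketch with a made-up scratch directory. Keep in mind that diff compares by position, so it only matches the grep result when, as here, first appears in the same order inside second.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$workdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$workdir/second"

# In the default diff format, lines present only in the second file
# start with "> "; sed prints those lines with the marker stripped.
result=$(diff "$workdir/first" "$workdir/second" | sed -n 's/^> //p')
printf '%s\n' "$result"

rm -rf "$workdir"
```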

I've seen this great one as well (full credit to @Kent):

$ awk 'NR==FNR{a[$0]; next} !($0 in a)' first second
1
22
33
44
11
44

There are more commands like these:

  • colordiff - like diff but with color
  • cmp - compare files bytewise
  • vimdiff - diff using the vim editor

There are probably lots of other great ways to do this; these are just a few of them.
