shell script remove duplicate column from output

Time:11-14

Shell script only!

I have the following output with 2 columns. I would like to eliminate the duplicates in column 2.

Example output now:

1    Sample1
1    Sample2
1    Sample3
2    Sample1
2    Sample2
2    Sample3
3    Sample1
3    Sample4

Desired output:

1    Sample1
1    Sample2
1    Sample3
3    Sample4

Thank you!

CodePudding user response:

This compact one-liner will give you that output:

awk '!a[$2]++' input
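
To unpack the one-liner: awk prints a line whenever the pattern evaluates to true. `a[$2]++` yields the number of times the value in column 2 has been seen so far and then increments that count, so `!a[$2]++` is true only the first time a value appears. A minimal long-form sketch of the same logic follows; the names `seen`, `data` and `result` are only illustrative and are not part of the original answer.

```shell
# Sample data copied from the question.
data='1    Sample1
1    Sample2
1    Sample3
2    Sample1
2    Sample2
2    Sample3
3    Sample1
3    Sample4'

# Long form of the one-liner: print a line only the first time
# the value in its second column is encountered.
result=$(printf '%s\n' "$data" | awk '{
    if (seen[$2] == 0) print   # count is still 0: first occurrence, so print the line
    seen[$2]++                 # record that this column-2 value has appeared
}')

printf '%s\n' "$result"
```

Unlike a sort-based approach, this keeps the lines in their original input order.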

CodePudding user response:

You could use sort for this task if it is acceptable for the output to be sorted lexicographically by the 2nd column, as is the case in your example. Let the content of file.txt be

1    Sample1
1    Sample2
1    Sample3
2    Sample1
2    Sample2
2    Sample3
3    Sample1
3    Sample4

then

sort -k2 -u file.txt

output

1    Sample1
1    Sample2
1    Sample3
3    Sample4

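Explanation: -k2 makes the sort key start at the 2nd field (it runs to the end of the line, which here is just that field), and -u prints only one line for each distinct key.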
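
One caveat worth illustrating: because a key given as -k2 extends to the end of the line, extra columns would take part in the comparison. Writing the key as -k2,2 restricts it to the 2nd field alone. The sketch below assumes a hypothetical third column that is not present in the question's data, and only counts the surviving lines, since POSIX does not specify which line of a set with equal keys is kept.

```shell
# Hypothetical three-column variant of the data (third column invented for illustration).
data='1 Sample1 x
2 Sample1 y
3 Sample4 z'

# Key runs from field 2 to the end of the line, so "Sample1 x" and "Sample1 y" count as different.
whole_tail=$(printf '%s\n' "$data" | sort -k2 -u | wc -l | tr -d ' ')

# Key is limited to field 2, so the two Sample1 lines collapse into one.
field_only=$(printf '%s\n' "$data" | sort -k2,2 -u | wc -l | tr -d ' ')

echo "lines kept with -k2:   $whole_tail"
echo "lines kept with -k2,2: $field_only"
```

For the two-column file in the question both forms give the same result, so the command above works as written.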
