Shell script: remove duplicate column values from output

Shell script only!

I have the following two-column output and would like to remove the rows whose column 2 value has already appeared, keeping only the first occurrence.

Example output now:

1    Sample1
1    Sample2
1    Sample3
2    Sample1
2    Sample2
2    Sample3
3    Sample1
3    Sample4

Desired output:

1    Sample1
1    Sample2
1    Sample3
3    Sample4

Thank you!

CodePudding user response：

This compact one-liner will give you that output:

awk '!a[$2]++' input
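
To see why this works, here is a minimal sketch run against the sample data from the question. The function name dedupe_col2 and the array name seen are illustrative choices, not part of the answer above; the logic is the same `!a[$2]++` idiom.

```shell
# Hypothetical helper (the name is an assumption): keep only the first row
# for each distinct value in column 2, preserving input order.
dedupe_col2() {
  # seen[$2]++ evaluates to how many times this value has appeared so far
  # (0 on first sight) and then increments the counter. !0 is true, so awk's
  # default action (print the line) fires only the first time a value appears.
  awk '!seen[$2]++' "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' \
  '1    Sample1' '1    Sample2' '1    Sample3' \
  '2    Sample1' '2    Sample2' '2    Sample3' \
  '3    Sample1' '3    Sample4' |
  dedupe_col2
```

This approach keeps rows in their original order and does not need the input to be sorted, at the cost of holding every distinct column 2 value in memory.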

CodePudding user response：

You could use sort -u for this task, provided it is acceptable for the output to be sorted lexicographically by the 2nd column, as is the case in your example. Let the content of file.txt be

1    Sample1
1    Sample2
1    Sample3
2    Sample1
2    Sample2
2    Sample3
3    Sample1
3    Sample4

then

sort -k2 -u file.txt

output

1    Sample1
1    Sample2
1    Sample3
3    Sample4

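Explanation: -k2 makes sort use a key that starts at the 2nd field and runs to the end of the line (write -k2,2 to restrict the key to the 2nd field alone); -u prints only one line for each distinct key.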
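
As a hedged variation on the command above, the sketch below restricts the key to the 2nd field with -k2,2 and adds -b so that leading blanks are not counted as part of the key. The function name dedupe_col2_sorted is an assumption made for illustration. Note that POSIX does not say which of several rows with equal keys survives -u; GNU sort documents keeping the first of each run, so the value in column 1 may differ on other implementations.

```shell
# Hypothetical helper (the name is an assumption): one row per distinct
# column 2 value, with the output sorted by that column.
dedupe_col2_sorted() {
  # -b      ignore leading blanks when locating the key
  # -k2,2   the key starts and ends at field 2
  # -u      keep a single line for each distinct key
  sort -b -k2,2 -u "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' \
  '1    Sample1' '1    Sample2' '1    Sample3' \
  '2    Sample1' '2    Sample2' '2    Sample3' \
  '3    Sample1' '3    Sample4' |
  dedupe_col2_sorted
```

If the original row order matters, or the data is not meant to be sorted, the sort-based approach is not a good fit.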