How do I remove duplicate lines using Sed without sorting?

Time:04-05

I've been trying to delete duplicate lines using only sed, and I can't work out how to do it.

So far I've tried this, but it hasn't worked:

sed '$!N; /^\(.*\)\n\1$/!P; D'

file:

APPLE

ORANGES

BANANA

BANANA

COOKIES

FRUITS

What I got:

APPLE

ORANGES

BANANA

BANANA

COOKIES

FRUITS

What I want:

APPLE

ORANGES

BANANA

COOKIES

FRUITS

I'd like to avoid going through the file by hand and deleting each duplicate myself.

In this example, the goal is to delete the second instance of BANANA.

Can anyone point me in the right direction?

Thanks

CodePudding user response:

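Hmm, that's odd; it works for me. Could it be because you have an empty line between each line of text? That command only compares each line with the one directly after it, so a blank line between the two BANANA lines would hide the duplicate.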

~$ cat test.txt
APPLES
ORANAGES
BANANA
BANANA
COOKIES
FRUITS

~$ cat test.txt | sed '$!N; /^\(.*\)\n\1$/!P; D'
APPLES
ORANAGES
BANANA
COOKIES
FRUITS
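
To check the blank-line theory, here is a small sketch, assuming GNU sed and a POSIX shell. The `input` variable is made-up sample data modelled on the question's file, and `unchanged` and `deduped` are just illustrative names. It runs the original command on the spaced-out input, then again after stripping the blank lines first.

```shell
# Hypothetical sample data: the question's entries with a blank line between each.
input=$(printf '%s\n\n' APPLE ORANGES BANANA BANANA COOKIES FRUITS)

# The original command only compares a line with the next one, so the
# blank line between the two BANANA entries keeps them from matching.
unchanged=$(printf '%s\n' "$input" | sed '$!N; /^\(.*\)\n\1$/!P; D')

# Deleting blank lines first makes the duplicates adjacent again.
deduped=$(printf '%s\n' "$input" | sed '/^$/d' | sed '$!N; /^\(.*\)\n\1$/!P; D')

printf '%s\n' "$deduped"
```

If the blank lines in the real file matter, they would have to be added back afterwards; this sketch simply drops them.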

CodePudding user response:

Using sed

$ sed -n '/^$/d;G;/^\(.*\n\).*\n\1$/d;H;P;a\ ' input_file
APPLE

ORANGES

BANANA

COOKIES

FRUITS

Delete blank lines, then append the hold space to the pattern space (G). If the current line already appears in the hold space, delete it; otherwise append it to the hold space (H), print it (P), and add a blank line after it (a).
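
For comparison, here is a sketch of a different, widely circulated sed one-liner for dropping non-adjacent duplicates. It is not the command from the answer above. It replaces the hold space with `h` instead of appending with `H`, so the hold space keeps one copy of each line seen so far. It assumes GNU sed, where `.` can match the newlines embedded in the pattern space. `LC_ALL=C` is there so that `[ -~]` means printable ASCII, which also means lines containing tabs or non-ASCII characters are not handled. The sample data and the `result` name are made up for illustration.

```shell
# Hypothetical sample data with blank lines and a non-adjacent duplicate.
input=$(printf '%s\n\n' APPLE ORANGES BANANA APPLE BANANA)

# /^$/d      skip blank lines
# G          append the lines seen so far (hold space) to the current line
# s/\n/&&/   double the first newline so the check below can also match
#            the most recently stored line
# /.../d     drop the line if the same text occurs again later in the buffer
# s/\n//     undo the doubled newline
# h; P       store the updated list and print the current line
result=$(printf '%s\n' "$input" |
  LC_ALL=C sed -n '/^$/d; G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P')

printf '%s\n' "$result"
```

Unlike the command above, this variant does not put the blank separator lines back into the output.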
