I'm trying to delete duplicate lines using only sed, but I can't get it to work.
So far I've tried this command, and the duplicate is still there:
sed '$!N; /^\(.*\)\n\1$/!P; D'
Input file:
APPLE
ORANGES
BANANA
BANANA
COOKIES
FRUITS
What I got:
APPLE
ORANGES
BANANA
BANANA
COOKIES
FRUITS
What I want:
APPLE
ORANGES
BANANA
COOKIES
FRUITS
I'd like to avoid going through each line of the file and deleting the duplicates by hand.
The goal is for the command to delete the second instance of BANANA.
Can anyone point me in the right direction?
Thanks
CodePudding user response:
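Hmm, that's odd; it works for me. Is it because you have an empty line between each line of text? The command only compares a line with the one directly after it, so anything between the two BANANA lines would keep them from matching.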
~$ cat test.txt
APPLES
ORANAGES
BANANA
BANANA
COOKIES
FRUITS
~$ cat test.txt | sed '$!N; /^\(.*\)\n\1$/!P; D'
APPLES
ORANAGES
BANANA
COOKIES
FRUITS
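One way to test the blank-line theory is to run the same command on two inputs. This is only a sketch: the function name dedupe_adjacent and the sample data are made up for illustration, and it assumes a sed that accepts semicolon-separated commands (GNU sed does).

```shell
# Wrap the command from the question so it can be reused.
dedupe_adjacent() {
  sed '$!N; /^\(.*\)\n\1$/!P; D'
}

# Case 1: the duplicates sit on adjacent lines, so the second BANANA is dropped.
printf 'APPLE\nBANANA\nBANANA\nCOOKIES\n' | dedupe_adjacent

# Case 2: a blank line separates every line, so no two neighbours are equal
# and both BANANA lines survive.
printf 'APPLE\n\nBANANA\n\nBANANA\n\nCOOKIES\n' | dedupe_adjacent
```

If the second case looks like what you are getting, the blank lines are the likely cause. Trailing spaces or Windows line endings on one of the two lines would break the match in the same way.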
CodePudding user response:
Using sed
$ sed -n '/^$/d;G;/^\(.*\n\).*\n\1$/d;H;P;a\ ' input_file
APPLE
ORANGES
BANANA
COOKIES
FRUITS
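How it works: /^$/d removes blank lines. G appends the hold space, which holds the lines seen so far, to the current line. If the current line is already in there, it is deleted; otherwise H adds it to the hold space, P prints it, and a\ inserts a blank line after it.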
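The hold-space idea described above can also catch duplicates that are not next to each other. The following is a different, widely circulated one-liner rather than the exact command from this answer, shown as a sketch. The function name dedupe_all and the sample data are hypothetical, and LC_ALL=C is set on the assumption that the input is plain ASCII, so that the [ -~] range means "printable ASCII".

```shell
dedupe_all() {
  # G         append the hold space (lines kept so far) to the current line
  # s/\n/&&/  double the first newline so the regex always has a separator
  # /.../d    drop the line if it already occurs among the kept lines
  # s/\n//    undo the doubling
  # h         save "current line + kept lines" as the new hold space
  # P         print only the current line
  LC_ALL=C sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'
}

# The repeated APPLE and BANANA are not adjacent to their first occurrence.
printf 'APPLE\nBANANA\nAPPLE\nBANANA\nCOOKIES\n' | dedupe_all
```

Two caveats with this sketch: every unique line is kept in the hold space, so memory use grows with the file, and a line containing tabs or non-ASCII characters never matches the pattern, so it is printed even when it is a duplicate.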