Duplicate entries in file

Time:02-21

I have a file with the following content:

123 ABC
12345 ABC-test

In a shell script, I need to retrieve one exact entry, but my search returns both lines. For example, grep "ABC" returns both entries. When I search for "ABC", I want to get only "123 ABC" and not the other entry.

CodePudding user response:

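You need a regular expression that matches ABC only when it is not surrounded by other characters. A word-boundary pattern is the usual first attempt:

grep -e '\bABC\b' file

Here -e only marks the next argument as the pattern; it is -E that enables extended regular expressions. \b is not part of POSIX, but GNU grep supports it. Note that - is not a word character, so there is a word boundary between C and - and this pattern still matches ABC-test in the sample file. To require whitespace or the start/end of the line around ABC, use:

grep -E '(^|[[:space:]])ABC([[:space:]]|$)' file

Check also some regex tutorials, e.g. https://www.regular-expressions.info/tutorial.html.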
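
To see how a word-boundary pattern and a whitespace-delimited pattern behave on the sample data, here is a minimal sketch. It assumes GNU grep (for \b); the temporary file and the variable names are only for illustration.

```shell
#!/bin/bash
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' '123 ABC' '12345 ABC-test' > "$tmp"

# \b is a word boundary; '-' is not a word character,
# so "ABC-test" also satisfies \bABC\b.
boundary_matches=$(grep '\bABC\b' "$tmp")

# Requiring whitespace or line start/end around ABC
# keeps only the whitespace-delimited entry.
exact_matches=$(grep -E '(^|[[:space:]])ABC([[:space:]]|$)' "$tmp")

printf 'word boundary:\n%s\n' "$boundary_matches"
printf 'whitespace-delimited:\n%s\n' "$exact_matches"
rm -f "$tmp"
```

The first pattern reports both lines, the second only "123 ABC", which is why the whitespace-delimited form fits this file better.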

CodePudding user response:

Since you consider words to be whitespace-separated chunks, it is easier to use awk here since it reads lines (records) and splits them into fields (non-whitespace chunks) by default:

awk '$2=="ABC"' file > newfile
awk '/([[:space:]]|^)ABC([[:space:]]|$)/' file > newfile

Here, the first awk command prints all lines whose second field is exactly ABC. The second prints all lines in which ABC is preceded by whitespace or the start of the line and followed by whitespace or the end of the line.

See the online demo:

#!/bin/bash
s='123 ABC
12345 ABC-test'
awk '$2=="ABC"' <<< "$s"
awk '/([[:space:]]|^)ABC([[:space:]]|$)/' <<< "$s"

Output (one line from each awk command):

123 ABC
123 ABC
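
In a script the search term usually arrives in a variable, so the field comparison can be wrapped in a small function. This is only a sketch: lookup_entry is a hypothetical helper name, and it assumes the key is always the second whitespace-separated field, as in the sample file.

```shell
#!/bin/bash
# lookup_entry KEY FILE
# Print the lines of FILE whose second whitespace-separated field equals KEY.
# (Hypothetical helper; the key is passed to awk with -v instead of being
# pasted into the awk program text.)
lookup_entry() {
  awk -v key="$1" '$2 == key' "$2"
}

# Sample data from the question.
data=$(mktemp)
printf '%s\n' '123 ABC' '12345 ABC-test' > "$data"

lookup_entry 'ABC' "$data"        # prints: 123 ABC
lookup_entry 'ABC-test' "$data"   # prints: 12345 ABC-test
rm -f "$data"
```

Passing the key with -v keeps it out of the regex engine entirely, so characters such as . or * in the key are compared literally.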