Using -E, -Fx and -w in grep in a loop

I have a file with IDs:

cat id.csv
154,art
155,art.br

and another with TLDs:

cat tld.csv
art
abc

When I run a loop like

while read -r tld
do

cat id.csv | grep -w "$tld" >> final.csv

done < tld.csv

it catches the art.br line as well.

I tried

grep -E "^$tld$"

and

grep -Fx $tld

Both give an empty result.

I only want the art line in the result.

CodePudding user response：

cat id.csv | grep -E ",$tld$" >> final.csv

CodePudding user response：

The -w option matches whole words only. Since . and , are not word characters, art is a whole word in both 154,art and 155,art.br.

You could add a $ to the search string (grep -w "$tld$" id.csv), but that may not be enough: it would still match 150,foo.art, for example.
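The behaviour described above can be reproduced with a small sketch. It also shows why the attempts in the question print nothing: -x (like ^...$) has to match the entire line, and no line in id.csv consists of art alone. The 150,foo.art line is a hypothetical addition to the sample data, and the script works in a temporary directory so it does not touch real files.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
# Sample data from the question, plus one hypothetical line (150,foo.art)
printf '154,art\n155,art.br\n150,foo.art\n' > id.csv

# -w alone: "." and "," are non-word characters, so "art" is a whole word on every line
w_only=$(grep -w 'art' id.csv)

# -w plus $: drops 155,art.br but still accepts 150,foo.art
w_anchored=$(grep -w 'art$' id.csv)

# -Fx: the whole line would have to be exactly "art", so nothing matches
whole_line=$(grep -Fx 'art' id.csv)

printf -- '-w:\n%s\n-w with $:\n%s\n-Fx:\n%s\n' "$w_only" "$w_anchored" "$whole_line"
```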

The best option is simply to match the full text after the last comma (which makes -w unnecessary):

grep ",${tld}$" id.csv

Or, to also catch lines with trailing blanks:

grep ",${tld}[[:blank:]]*$" id.csv

Note: avoid the useless cat file | command construct; prefer command < file or command file.
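Putting the two points of this answer together, the loop from the question might look like the sketch below. It assumes the sample files from the question, runs in a temporary directory, and redirects once for the whole loop, so final.csv is overwritten rather than appended to. The empty-line check is an extra precaution not in the original loop. Each TLD is still interpreted as a regular expression here.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
# Sample data from the question
printf '154,art\n155,art.br\n' > id.csv
printf 'art\nabc\n' > tld.csv

while IFS= read -r tld; do
    # Skip empty lines: they would otherwise yield a pattern matching any line ending in a comma
    [ -n "$tld" ] || continue
    # Anchor on the last comma and the end of line; grep reads id.csv directly (no cat)
    grep ",${tld}[[:blank:]]*$" id.csv
done < tld.csv > final.csv

cat final.csv
```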

CodePudding user response：

This may be what you want:

tlds=$(<tld.csv)
grep -E ",(${tlds//$'\n'/'|'})\$" id.csv

This assumes tld.csv contains no regexp metacharacters. Note that . is one, so an entry such as art.br would also match artXbr.
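If tld.csv may contain dots or other metacharacters, one possible workaround is to escape them before building the alternation. The sketch below is an assumption-laden variant, not the answer's own method: it uses sed and paste instead of bash pattern substitution, and the 156,artXbr line and the art.br entry are hypothetical test data chosen to show the difference.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
# Hypothetical data: 156,artXbr would match an unescaped "art.br" pattern
printf '154,art\n155,art.br\n156,artXbr\n' > id.csv
printf 'art.br\nabc\n' > tld.csv

# Backslash-escape ERE metacharacters on each line, then join the lines with "|"
alternation=$(sed 's/[[\.*^$+?(){|]/\\&/g' tld.csv | paste -sd '|' -)

# Match only when the text after the last comma is exactly one of the TLDs
matches=$(grep -E ",(${alternation})\$" id.csv)

printf 'pattern: ,(%s)$\n%s\n' "$alternation" "$matches"
```

With this data the pattern becomes ,(art\.br|abc)$ and only 155,art.br is printed.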

CodePudding user response：

If the idea is that lines in tld.csv must exactly match (the entirety of) the second column of lines in id.csv, then an awk solution is quite efficient:

$ awk -F, 'NR==FNR{tld[$1];next} $2 in tld' tld.csv id.csv
154,art

-F, - delimit input fields by comma (records are delimited by newline)
NR==FNR - while reading first file:
- tld[$1] - remember column 1 value (add as array index)
- next - skip remaining commands; read next line
$2 in tld - if the 2nd column (of the second file) is an index of tld (i.e. exactly matches a value stored from the first file), then perform the default action (i.e. print the line)

This assumes that lines in tld.csv cannot contain commas and that any whitespace (in either file) is significant.
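If trailing whitespace should not be significant, a variant along the following lines may help. It is only a sketch under assumptions: the trailing spaces and CRLF line endings in the sample files are invented for the demonstration, and the extra sub() call is an addition to the command above. Note that it prints the cleaned-up line rather than the original one.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
# Hypothetical messy input: trailing spaces in id.csv, CRLF line endings in tld.csv
printf '154,art  \n155,art.br\n' > id.csv
printf 'art\r\nabc\r\n' > tld.csv

result=$(awk -F, '
    { sub(/[ \t\r]+$/, "") }       # strip trailing blanks and carriage returns from every line
    NR == FNR { tld[$1]; next }    # first file: remember each TLD as an array index
    $2 in tld                      # second file: print lines whose column 2 is a stored TLD
' tld.csv id.csv)

printf '%s\n' "$result"
```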