Grep a substring in a column and print the rows that contain that substring in that column

example:

a,bee,a bee
c,fee,c dee
e,dee,e bee 
g,hee,d deen
h,aee,t Dee

If the block above is a data set with 5 rows and 3 columns, I want to print the rows that contain 'dee' in the third column. The match should be case-insensitive and should only find the exact word (for example, 'deen' is not acceptable). The output should be as follows:

c,fee,c dee
h,aee,t Dee

What would the Bash command look like?

What I have tried is:

awk -F"," '{print $3}' filename | grep -iw 'dee'

But this prints only the third column, and I still need the data in column 2.

CodePudding user response：

Assuming your data is in a file named dat, try this:

sed -ne '/[dD]ee$/p' dat

Or if you like to use awk:

awk '/[dD]ee$/' dat

Or if you like to use grep:

grep -i 'dee$' dat

The output is then:

c,fee,c dee
h,aee,t Dee

Try exploring how regular expressions match patterns. In your case the regular expression is [dD]ee$, which matches dee or Dee at the end ($) of a line.
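
Note that [dD]ee$ on its own does not check word boundaries, so a line ending in a longer word such as 'indee' would also match. A minimal sketch of a stricter variant, assuming a grep that supports -w (GNU and BSD grep do). The helper names and the extra 'x,yee,z indee' row are hypothetical, added only for illustration:

```shell
# Sample rows from the question, plus one hypothetical extra row
# ("x,yee,z indee") that ends in "dee" without being the whole word.
sample_rows() {
    printf '%s\n' 'a,bee,a bee' 'c,fee,c dee' 'e,dee,e bee ' \
        'g,hee,d deen' 'h,aee,t Dee' 'x,yee,z indee'
}

# Hypothetical helper: keep lines that end with the whole word "dee".
# -i ignores case; -w only accepts matches that form a whole word.
match_dee_at_end() {
    grep -iw 'dee$'
}

sample_rows | match_dee_at_end
```

This still relies on the third column being the last one on the line; it does not look at field separators at all.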

CodePudding user response：

c.f. https://www.gnu.org/software/gawk/manual/html_node/Regexp.html

$: awk -F, 'BEGIN{IGNORECASE=1} $3~/\<dee\>/' file # \< & \> are word boundaries
c,fee,c dee
h,aee,t Dee
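
IGNORECASE and the \< \> word-boundary operators are gawk extensions, so the command above may not behave the same in other awk implementations. A rough portable sketch, assuming only POSIX awk features (tolower and split with a regular expression separator). The helper names are hypothetical:

```shell
# Sample rows from the question.
sample_rows() {
    printf '%s\n' 'a,bee,a bee' 'c,fee,c dee' 'e,dee,e bee ' \
        'g,hee,d deen' 'h,aee,t Dee'
}

# Hypothetical helper: print rows whose third comma-separated field
# contains the given word as a whole word, ignoring case.
rows_with_word_in_col3() {
    awk -F, -v word="$1" '
        {
            # Lowercase the field, then split it on runs of non-word characters.
            n = split(tolower($3), parts, /[^a-z0-9_]+/)
            for (i = 1; i <= n; i++) {
                if (parts[i] == tolower(word)) { print; next }
            }
        }'
}

sample_rows | rows_with_word_in_col3 dee
```

The row 'e,dee,e bee ' is skipped even though its second column is 'dee', because only $3 is inspected.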

Or just with grep:

$: grep -Ei '([^,]*,){2}.*\<dee\>' file # constrain string to 3rd field
c,fee,c dee
h,aee,t Dee
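
The .* in the grep pattern allows the match to fall in any later field if a row has more than three columns. A sketch of a stricter variant, assuming GNU grep (for the \< and \> operators). The helper names and the four-column row 'i,jee,k lee,m dee' are hypothetical, added only to show the difference:

```shell
# Sample rows from the question, plus one hypothetical four-column row
# whose "dee" sits in the fourth field.
sample_rows() {
    printf '%s\n' 'a,bee,a bee' 'c,fee,c dee' 'e,dee,e bee ' \
        'g,hee,d deen' 'h,aee,t Dee' 'i,jee,k lee,m dee'
}

# Hypothetical helper: match "dee" as a whole word inside the third field only.
# ^([^,]*,){2} skips exactly two fields; [^,]* keeps the match inside field 3.
dee_in_third_field() {
    grep -Ei '^([^,]*,){2}[^,]*\<dee\>'
}

sample_rows | dee_in_third_field
```

Anchoring with ^ and replacing .* with [^,]* means the word has to appear before the next comma, which is what ties the match to the third field.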