AWK column matching pattern

Time:12-21

I'm trying to write a command that finds lines in a CSV file where a specific column matches a pattern. I'm struggling with the pattern matching for that column.

Task: Print lines where the 5th column (col5date) is in June, July or August 2022

Sample csv file:

col1 col2 col3 col4 col5date col6
abcd asdd 2022 asdd 7/4/22 something
abcd asdd 2022 asdd 10/9/22 something
abcd asdd 2022 asdd 12/12/20 something
abcd asdd 2020 asdd 9/1/19 something
abcd asdd 2020 asdd 9/1/22 something
abcd asdd 2021 asdd 9/22/19 something
abcd asdd 2021 asdd 2/16/22 something
abcd asdd 2021 asdd 6/16/22 something

Expected output after the command: the first and last data lines, since their dates are in July and June.

My awk command:

cat file | awk -F'|' '$5 ~ /(6|7|8)\/*\/22$/'

In the pattern "/(6|7|8)\/*\/22$/" I'm trying to say
m/d/Y - m is either 6, 7 or 8
* - for day
22$ - for year and column ends

CodePudding user response:

Like this, assuming the file is CSV (comma-separated) and not TSV (tab-separated) or pipe-separated (|):

awk -F, '$5 ~ /^(6|7|8)\/.*\/22$/' file

But there's no matching line in your sample input
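
That is presumably because the sample as posted is separated by spaces, so with -F, the whole line is one field and $5 is empty. As a minimal sketch, assuming a few of the question's rows are rewritten with commas (the temp file and the `matched` variable are only for illustration), the command behaves like this:

```shell
# Hypothetical sample: some of the question's rows rewritten with commas.
sample=$(mktemp)
cat > "$sample" <<'EOF'
col1,col2,col3,col4,col5date,col6
abcd,asdd,2022,asdd,7/4/22,something
abcd,asdd,2022,asdd,10/9/22,something
abcd,asdd,2020,asdd,9/1/22,something
abcd,asdd,2021,asdd,2/16/22,something
abcd,asdd,2021,asdd,6/16/22,something
EOF

# ^ anchors the month at the start of the field, .* stands in for the day,
# and 22$ requires the field to end with the two-digit year.
matched=$(awk -F, '$5 ~ /^(6|7|8)\/.*\/22$/' "$sample")
printf '%s\n' "$matched"
rm -f "$sample"
```

Only the 7/4/22 and 6/16/22 rows are printed. The ^ anchor is what keeps a date like 10/9/22 from matching. In the question's pattern, \/* means "zero or more slashes" rather than "any day", which is why .* (or something stricter) is needed between the two slashes.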

CodePudding user response:

I would use something like

awk -F, '$5 ~ "^[6-8]/[^/]+/22$"' file

so you don't have to escape '/', and it also rejects malformed dates
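
A quick way to see the "malformed dates" point is to feed the [^/]+ form (one or more non-slash characters for the day) a few made-up rows. This is only a sketch: the input lines and the `matched` variable are hypothetical, not from the question.

```shell
# Hypothetical input: two well-formed dates and two malformed ones.
matched=$(printf '%s\n' \
  'a,b,c,d,7/4/22,ok' \
  'a,b,c,d,8/31/22,ok' \
  'a,b,c,d,7/4/5/22,extra-slash' \
  'a,b,c,d,7//22,empty-day' |
  awk -F, '$5 ~ "^[6-8]/[^/]+/22$"')
printf '%s\n' "$matched"
```

Only the first two rows are printed. A pattern that uses .* for the day would also accept 7/4/5/22 and 7//22, because .* can match slashes and the empty string.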
