I'm trying to write a command that finds the lines in a CSV file where a specific column matches a pattern. I'm struggling with the pattern matching for that column.
Task: print the lines where the 5th column (col5date) is in June, July or August 2022.
Sample csv file:
col1 | col2 | col3 | col4 | col5date | col6 |
---|---|---|---|---|---|
abcd | asdd | 2022 | asdd | 7/4/22 | something |
abcd | asdd | 2022 | asdd | 10/9/22 | something |
abcd | asdd | 2022 | asdd | 12/12/20 | something |
abcd | asdd | 2020 | asdd | 9/1/19 | something |
abcd | asdd | 2020 | asdd | 9/1/22 | something |
abcd | asdd | 2021 | asdd | 9/22/19 | something |
abcd | asdd | 2021 | asdd | 2/16/22 | something |
abcd | asdd | 2021 | asdd | 6/16/22 | something |
Expected output after the command: the first and last data lines, since their dates are in July and June 2022.
My awk command:
cat file | awk -F'|' '$5 ~ /(6|7|8)\/*\/22$/'
In the pattern /(6|7|8)\/*\/22$/ I'm trying to say:
(6|7|8) - the month (m in m/d/y) is either 6, 7 or 8
* - any day
22$ - the year, at the end of the column
CodePudding user response:
Like this, assuming the file is CSV (commas) and not TSV (tabs) or even a | (pipe) separated file:
awk -F, '$5 ~ /^(6|7|8)\/.*\/22$/' file
But no line in your sample input matches as posted, because it contains no commas, so $5 is always empty.
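If the file really is pipe-separated like the sample in the question, a sketch along these lines might work. It assumes the fields are padded with spaces around each | exactly as shown; the temporary file is only there to make the example self-contained.

```shell
# Assumption: the data looks like the sample in the question,
# with " | " between fields and a trailing "|".
sample=$(mktemp)
cat > "$sample" <<'EOF'
col1 | col2 | col3 | col4 | col5date | col6 |
abcd | asdd | 2022 | asdd | 7/4/22 | something |
abcd | asdd | 2022 | asdd | 10/9/22 | something |
abcd | asdd | 2022 | asdd | 12/12/20 | something |
abcd | asdd | 2020 | asdd | 9/1/19 | something |
abcd | asdd | 2020 | asdd | 9/1/22 | something |
abcd | asdd | 2021 | asdd | 9/22/19 | something |
abcd | asdd | 2021 | asdd | 2/16/22 | something |
abcd | asdd | 2021 | asdd | 6/16/22 | something |
EOF

# The field separator is a regex: a pipe plus any surrounding spaces,
# so $5 is "7/4/22" rather than " 7/4/22 " and the anchors can match.
out=$(awk -F' *[|] *' '$5 ~ /^(6|7|8)\/.*\/22$/' "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

On this sample it prints only the 7/4/22 and 6/16/22 rows. Without the padding-aware separator, the spaces around the date would stop ^ and $ from matching.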
CodePudding user response:
I would use something like
awk -F, '$5 ~ "^[6-8]/[^/]+/22$"' file
so you don't have to escape '/', and it also rejects some malformed dates
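To see what "rejecting malformed dates" could mean here, a small comparison of the two patterns on made-up input (the three dates below are invented for illustration, not taken from the question):

```shell
# Three hypothetical dates: one valid, one with an extra part, one with an empty day.
dates='7/4/22
7/4/5/22
7//22'

# ".*" also matches slashes and the empty string, so all three lines pass.
loose=$(printf '%s\n' "$dates" | awk '$0 ~ /^(6|7|8)\/.*\/22$/')

# "[^/]+" requires one or more non-slash characters between the two slashes.
strict=$(printf '%s\n' "$dates" | awk '$0 ~ "^[6-8]/[^/]+/22$"')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The stricter pattern keeps only 7/4/22. It still accepts a day such as 99 or xx, so it checks the shape of the date rather than its validity.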