How to extract a specific year from a column of timestamps in a CSV file

I have a large CSV file with many columns, and multiple columns that have timestamps. I want to filter the data for a specific year based on only 1 of those columns.

Example of what some of my input CSV looks like: (there are no headers)


17263847	11/20/2018 3:00:13 PM	11/23/2018 6:45:00 AM	Approved
19483742	12/22/2019 4:00:12 PM	1/10/2020 4:50:11 AM	Approved
38274938	10/10/2018 2:02:19 PM	02/07/2019 1:04:15 PM	Approved

I want to extract all the rows that have 2019 in the second column; so for the example here, I would want to extract the 2nd row but not the 3rd row. Then, I want all of those rows to be put into a new CSV file.

Is there a simple way to do this using grep on the command line? I tried this awk command, but it's not working:

awk -F, '$1=="2019"' file1.csv > file2.csv

Any help would be appreciated!

CodePudding user response：

First, in awk the second column is $2, not $1 (remember that $0 refers to the whole line, i.e. the record).
Second, instead of the literal comparison ==, use the regex match operator ~. A field such as 12/22/2019 4:00:12 PM is never equal to the string 2019, but it does match the pattern /2019/.

The command you need is:

awk -F, '$2 ~ /2019/' file1.csv > file2.csv
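
If you want to be a bit stricter, here is a minimal sketch that matches 2019 only as the year part of the date. It assumes the file really is comma-separated and that the dates always look like M/D/YYYY followed by a space, as in your sample. The sample rows written to file1.csv below are my own reconstruction of your example, not your real data.

```shell
# Hypothetical sample data, comma-separated, mirroring the rows in the question.
cat > file1.csv <<'EOF'
17263847,11/20/2018 3:00:13 PM,11/23/2018 6:45:00 AM,Approved
19483742,12/22/2019 4:00:12 PM,1/10/2020 4:50:11 AM,Approved
38274938,10/10/2018 2:02:19 PM,02/07/2019 1:04:15 PM,Approved
EOF

# Keep only rows whose second field contains "/2019 " (slash, year, space),
# so the third column is never consulted.
awk -F, '$2 ~ /\/2019 /' file1.csv > file2.csv

# If the columns are actually separated by tabs, change only the separator
# (file names here are placeholders):
# awk -F'\t' '$2 ~ /\/2019 /' file1.tsv > file2.tsv
```

With the sample above, file2.csv ends up with just the 19483742 row. The 38274938 row is skipped because its 2019 is in the third column, not the second.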