I have a CSV file in which one of the columns is a date. I am trying to extract all lines whose date falls in a specific year and save them to a new file.
The file looks like this, with the date and time in the second column:
000000000,10/04/2021 02:10:15 AM,.....
So far I have tried:
grep -E ^2020 data.csv >> temp.csv
But it just produced an empty temp.csv file. Any ideas on how I can do this?
CodePudding user response:
One potential solution is with awk:
awk -F"," '$2 ~ /\/2020 /' data.csv > temp.csv
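To see why the original attempt came back empty while the field-based match works, here is a minimal sketch. The two sample rows and the names workdir, anchored.csv and by_field.csv are made up for illustration; only the column layout comes from the question. It also assumes no field contains a quoted comma, since awk -F, splits on every comma.

```shell
# Hypothetical sample data in the layout from the question: ID, then date and time.
workdir=$(mktemp -d)
cat > "$workdir/data.csv" <<'EOF'
000000001,10/04/2020 02:10:15 AM,a
000000002,10/04/2021 02:10:15 AM,b
EOF

# Original attempt: ^ anchors at the start of the line, where the ID sits,
# so no line matches. "|| true" ignores grep's exit status 1 on no match.
grep -E '^2020' "$workdir/data.csv" > "$workdir/anchored.csv" || true

# Field-based match: test only column 2 for "/2020 ".
awk -F, '$2 ~ /\/2020 /' "$workdir/data.csv" > "$workdir/by_field.csv"

# Count the rows each command kept.
anchored_rows=$(wc -l < "$workdir/anchored.csv" | tr -d ' ')
field_rows=$(wc -l < "$workdir/by_field.csv" | tr -d ' ')
echo "anchored: $anchored_rows, by field: $field_rows"
```

On this sample the anchored grep keeps no rows and the awk command keeps the single 2020 row.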
Another potential option is with grep:
grep "/2020 " data.csv > temp.csv
However, the grep solution may match "/2020 " elsewhere on the line, rather than only in column 2.
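The false-positive risk can be illustrated with a small sketch. The sample rows below are invented so that one of them carries a 2020 date only in its third column; the file and variable names are likewise hypothetical.

```shell
# Hypothetical sample data: the second row has "/2020 " only in column 3.
tmpdir=$(mktemp -d)
cat > "$tmpdir/data.csv" <<'EOF'
000000001,10/04/2020 02:10:15 AM,ok
000000002,10/04/2021 02:10:15 AM,renewed 01/01/2020 by phone
000000003,11/12/2020 09:00:00 PM,ok
EOF

# Field-restricted match: only rows whose second column contains "/2020 ".
awk -F, '$2 ~ /\/2020 /' "$tmpdir/data.csv" > "$tmpdir/awk_out.csv"

# Whole-line match: also picks up the second row because of its third column.
grep '/2020 ' "$tmpdir/data.csv" > "$tmpdir/grep_out.csv"

# Count the rows each command kept.
awk_count=$(wc -l < "$tmpdir/awk_out.csv" | tr -d ' ')
grep_count=$(wc -l < "$tmpdir/grep_out.csv" | tr -d ' ')
echo "awk: $awk_count, grep: $grep_count"
```

Here awk keeps two rows while the unanchored grep keeps all three, including the row dated 2021.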
CodePudding user response:
Although the awk solution is best here, e.g.
awk -F, 'index($2, "/2021 ")' file
grep can also be used here:
grep '^[^,]*,[^,]*/2021 ' file
Notes:
- awk -F, 'index($2, "/2021 ")' splits the lines (records) into fields on commas (see -F,), and if there is a /2021 followed by a space in the second field ($2), the line is printed.
- The ^[^,]*,[^,]*/2021 pattern (with a trailing space) in the grep command matches:
  - ^ - start of string
  - [^,]* - zero or more non-comma chars
  - ,[^,]* - a , and zero or more non-comma chars
  - /2021 followed by a space - a literal substring.
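As a rough check of the two commands, the sketch below runs both with the year held in a shell variable and compares the results. The variable names (year, needle, demo), the sample rows and the output file names are assumptions added for illustration; passing the year through awk -v is one possible way to parameterize it, not something the answer prescribes.

```shell
# Hypothetical sample data; rows 1 and 2 carry a second date in column 3.
demo=$(mktemp -d)
cat > "$demo/file" <<'EOF'
000000001,10/04/2021 02:10:15 AM,note 05/05/2020 here
000000002,03/15/2020 11:45:00 PM,note 06/06/2021 here
000000003,12/31/2021 08:00:00 AM,ok
EOF

year=2021   # the year to extract (illustrative variable)

# awk: literal substring search restricted to the second field.
awk -F, -v needle="/$year " 'index($2, needle)' "$demo/file" > "$demo/awk.csv"

# grep: [^,]* cannot cross a comma, so the match stays inside column 2.
grep "^[^,]*,[^,]*/$year " "$demo/file" > "$demo/grep.csv"

# Compare the two result files byte for byte.
if cmp -s "$demo/awk.csv" "$demo/grep.csv"; then same=yes; else same=no; fi
matched=$(wc -l < "$demo/awk.csv" | tr -d ' ')
echo "same output: $same, rows: $matched"
```

On this sample both commands keep the first and third rows and skip the second, whose only 2021 date sits in column 3.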