My CSV file looks as follows:
Date Value
September 03, 2021 220.32
September 02, 2021 220.43
September 01, 2021 219.74
August 31, 2021 219.34
August 30, 2021 229.55
August 27, 2021 218.33
August 26, 2021 216.26
August 25, 2021 208.34
August 24, 2021 207.43
...
May 11, 2021 200.41
May 10, 2021 302.71
May 07, 2021 304.89
May 06, 2021 303.34
May 05, 2021 301.67
May 04, 2021 301.55
May 03, 2021 302.91
April 30, 2021 302.39
April 29, 2021 303.86
April 28, 2021 302.48
April 27, 2021 302.64
April 26, 2021 302.69
April 23, 2021 302.33
April 22, 2021 300.14
...
How can I parse that CSV file in Python so that the (new) CSV looks as follows?
Date;Value
03.09.2021;220.32
02.09.2021;220.43
01.09.2021;219.74
...
...
I don't have any attempts because I don't know how to do.
CodePudding user response:
you can use inbuilt datetime module or pandas to read your csv with parsed dates.
Datetime Module:
from datetime import datetime
dates = []
values = []
with open(path_to_csv) as csv: lines = csv.readlines()
for l in lines[1:]:
dates = [datetime.strptime(' '.join(l.split()[:3]),"%B %d, %Y")]
values = [l.split()[-1]]
with open(path_to_csv,'w') as csv:
csv.write('Date;Value\n')
for d,v in zip(dates,values):
csv.write(f'{d.strftime("%d.%m.%Y")};{v}\n')
Link to format keywords: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
You can try panda read_csv (much more convenient) by yourself here: Pandas read_csv
CodePudding user response:
This function will help you to get the each valid line to the format you want. You may need to improve this function to do the validations and maybe to handle the 1st line which contains column names
from datetime import datetime
def convert(line):
arr = line.split(" ", 3)
datestr = "{d}{m},{y}".format(d=arr[1],m=arr[0],y=arr[2])
dateobj = datetime.strptime(datestr, '%d,%B,%Y')
newdatestr = dateobj.strftime("%d.%m.%Y")
valuestr = arr[3].strip()
return "{date};{value}".format(date=newdatestr, value=valuestr)