Home > other >  Parse String Date and Value from CSV in Python [closed]
Parse String Date and Value from CSV in Python [closed]

Time:09-17

My CSV file looks as follows:

Date Value
September 03, 2021  220.32
September 02, 2021  220.43
September 01, 2021  219.74
August 31, 2021 219.34
August 30, 2021 229.55
August 27, 2021 218.33
August 26, 2021 216.26
August 25, 2021 208.34
August 24, 2021 207.43
...
May 11, 2021    200.41
May 10, 2021    302.71
May 07, 2021    304.89
May 06, 2021    303.34
May 05, 2021    301.67
May 04, 2021    301.55
May 03, 2021    302.91
April 30, 2021  302.39
April 29, 2021  303.86
April 28, 2021  302.48
April 27, 2021  302.64
April 26, 2021  302.69
April 23, 2021  302.33
April 22, 2021  300.14
...

How can I parse that CSV file in Python so that the (new) CSV looks as follows?

Date;Value
03.09.2021;220.32
02.09.2021;220.43
01.09.2021;219.74
...
...

I don't have any attempts because I don't know how to do.

CodePudding user response:

you can use inbuilt datetime module or pandas to read your csv with parsed dates.

Datetime Module:

from datetime import datetime

dates = []
values = []

with open(path_to_csv) as csv: lines = csv.readlines()

for l in lines[1:]:
    dates  = [datetime.strptime(' '.join(l.split()[:3]),"%B %d, %Y")]
    values  = [l.split()[-1]]

with open(path_to_csv,'w') as csv:
    csv.write('Date;Value\n')
    for d,v in zip(dates,values):
        csv.write(f'{d.strftime("%d.%m.%Y")};{v}\n')

Link to format keywords: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior

You can try panda read_csv (much more convenient) by yourself here: Pandas read_csv

CodePudding user response:

This function will help you to get the each valid line to the format you want. You may need to improve this function to do the validations and maybe to handle the 1st line which contains column names

from datetime import datetime

def convert(line):
    arr = line.split(" ", 3)
    datestr = "{d}{m},{y}".format(d=arr[1],m=arr[0],y=arr[2])
    dateobj = datetime.strptime(datestr, '%d,%B,%Y')
    newdatestr = dateobj.strftime("%d.%m.%Y")
    valuestr = arr[3].strip()
    return "{date};{value}".format(date=newdatestr, value=valuestr)

  • Related