What's wrong with this code for sorting csv file according to date?-CodePudding

I have this csv file which look like this;

datetime_period,value
01 Apr 2021 00:00,92.85
01 Apr 2021 00:30,91.73
11 Feb 2021 19:30,88.58
11 Feb 2021 20:00,88.58
13 Jan 2021 13:00,80.49
13 Jan 2021 13:30,81.95

I want to sort this csv according to the datetime in column datetime_period.

I used this link Sorting a csv object by dates in python as a guide.

Here is my code;

data = csv.reader(open(csv_file_path, 'r'))
data = sorted(data, key=lambda row: datetime.strptime(row[0], "%d %b %Y %H:%M"))

I get the error;

ValueError: time data 'ï»¿time_period' does not match format '%d %b %Y %H:%M'
python-BaseException

I looked at the code and think my code does match the datetime format. Can someone tell me what is wrong with the code or provide an alternative solution? Thank you.

I am using python 3.8.5

CodePudding user response：

You're trying to sort your header line, which does not contain datetime values.

A way around this is to pop off the headers prior to sorting, and add them back after.

data = [line for line in csv.reader(open('test.csv', 'r'))]

headers = data.pop(0)

data = sorted(data, key=lambda row: datetime.strptime(row[0], "%d %b %Y %H:%M"))
#[['13 Jan 2021 13:00', '80.49'], ['13 Jan 2021 13:30', '81.95'], ['11 Feb 2021 19:30', '88.58'], ['11 Feb 2021 20:00', '88.58'], ['01 Apr 2021 00:00', '92.85'], ['01 Apr 2021 00:30', '91.73']]

out = [headers]   data

#[['datetime_period', 'value'], ['13 Jan 2021 13:00', '80.49'], ['13 Jan 2021 13:30', '81.95'], ['11 Feb 2021 19:30', '88.58'], ['11 Feb 2021 20:00', '88.58'], ['01 Apr 2021 00:00', '92.85'], ['01 Apr 2021 00:30', '91.73']]

CodePudding user response：

You need to skip the header row at the beginning of the file. You can easily do that by passing the csv.reader object to the next() built-in function to iterate over it. The code below shows how to do that and properly close the file after its contents have been read.

import csv
from datetime import datetime

csv_file_path = 'time_data.csv'

with open(csv_file_path, 'r', newline='') as file:
    reader = csv.reader(open(csv_file_path, 'r'))
    next(reader)  # Skip header.
    data = sorted(reader, key=lambda row: datetime.strptime(row[0], "%d %b %Y %H:%M"))

print(data)