Losing Minutes in Column

Time:09-22

I am analysing data in CSV files sorted by a combined date-and-time column (Sdate), as below (note: this is all one column):

Sdate
01/01/2016 00:00
01/01/2016 01:00
01/01/2016 02:00
etc.

However, some of the data to be analysed is split into 15-minute intervals, for example:

Sdate
01/01/2016 00:00
01/01/2016 00:15
01/01/2016 00:30
etc.

The output then groups this data hourly anyway, and the intermediate 15-minute rows go missing as it continues.

Currently I am reading in all the CSV files in the directory and sorting them. I used the pd.to_datetime function, which worked for the hourly intervals but not the 15-minute ones:

list_ = []
for file_ in allFiles:
    df = pd.read_csv(file_, index_col=None, header=0, low_memory=False)
    df['Sdate'] = pd.to_datetime(df['Sdate'])
    # reset_index() returns a new frame, so assign the result back
    df = df.reset_index(drop=True)
    list_.append(df)

Does anyone know whether this is an issue with pd.to_datetime, or with the way I have grouped the contents hourly (see below)?

hourly = grouped.aggregate(np.sum).reset_index()
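For what it's worth, pd.to_datetime itself does not drop minutes, so the loss is more likely in how `grouped` was built (the snippet above is not shown, so this is an assumption). A minimal check that parsing preserves 15-minute timestamps:

```python
import pandas as pd

# Sample 15-minute timestamps in the dd/mm/yyyy format shown in the question
s = pd.Series(['01/01/2016 00:00', '01/01/2016 00:15', '01/01/2016 00:30'])

# dayfirst=True makes the dd/mm/yyyy interpretation explicit
parsed = pd.to_datetime(s, dayfirst=True)
print(parsed.dt.minute.tolist())  # [0, 15, 30]
```

If the minutes survive parsing, the hourly collapse is happening in the groupby, not in pd.to_datetime.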

Any help would be greatly appreciated. Thank you!

CodePudding user response:

A pandas way of solving this: the pandas.read_csv() function has a keyword argument called parse_dates.

Using this, you can convert strings, floats, or integers into datetimes on the fly using the default date_parser (dateutil.parser.parser):

pd.read_csv(file, header=None, names=headers, dtype=dtypes, parse_dates=['Sdate'])

Note that parse_dates takes a list of column names, not a bare string.
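Putting it together, here is a minimal sketch using an in-memory CSV in place of one of the files (the `Value` column and the data are hypothetical stand-ins). It parses Sdate at read time and then groups with pd.Grouper at 15-minute resolution, so the quarter-hour rows are kept instead of being collapsed into hours:

```python
import io
import pandas as pd

# Hypothetical 15-minute data standing in for one of the CSV files
csv = io.StringIO(
    "Sdate,Value\n"
    "01/01/2016 00:00,1\n"
    "01/01/2016 00:15,2\n"
    "01/01/2016 00:30,3\n"
    "01/01/2016 00:45,4\n"
    "01/01/2016 01:00,5\n"
)

# parse_dates takes a list of column names; dayfirst handles dd/mm/yyyy
df = pd.read_csv(csv, parse_dates=['Sdate'], dayfirst=True)

# Group at 15-minute resolution instead of hourly, so no rows collapse
out = df.groupby(pd.Grouper(key='Sdate', freq='15min')).sum().reset_index()
print(out)
```

Changing `freq='15min'` to `freq='H'` reproduces the hourly collapse described in the question, which suggests the original grouping frequency is the culprit.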