Home > Blockchain >  Issue combining 5 minute time intervals into hourly time intervals
Issue combining 5 minute time intervals into hourly time intervals

Time:07-21

This is my first time posting so excuse me if my post isn't the best...

I previously looked at this to combine my 5 minute intervals into hourly intervals and show the volumes per every hour. Combine 5 minute timeframes into hourly.

Here is a snippet of the CSV data I am working with. The delimiter is a comma. Data goes all the way from June 20117 through to the end of 2021.

datestamp timestamp det_vol
2017-06-01 13:45 582
2017-06-01 13:50 592
2017-06-01 13:55 578
2017-06-01 14:00 544
2017-06-01 14:05 521

I read the data in using pd.read_csv() and sorted through the values to obtain all the volume readings from the month of August for 2017

vds2 = vds2[(vds2['datestamp'].dt.year == 2017) & (vds2['datestamp'].dt.month == 8)]
vds2 = vds2.rename(columns={'det_vol': 'Volume'})
vds2['Date'] = pd.to_datetime(vds2['datestamp']).dt.date
vds2['Time'] = pd.to_datetime(vds2['timestamp']).dt.time
vds2 = vds2[['Date','Time','Volume']]
vds2 = vds2.reset_index(drop=True)

This got me to my final dataframe:

          Date       Time    Volume
0   2017-08-01      00:00        57
1   2017-08-01      00:05        97
2   2017-08-01      00:10        79
3   2017-08-01      00:15        63
4   2017-08-01      00:20        87
... ... ... ...

However when I try to implement the following I get an error on the first line

vds2.index = pd.to_datetime(vds2['Date']   ' '   vds2['Time'])
vds2 = vds2.resample('H').agg({'Volume':'sum'})

TypeError: unsupported operand type(s) for  : 'DatetimeArray' and 'str'

I thought maybe the issue was the format of the Date and Time whic is why I tried implementing

vds2['Date'] = pd.to_datetime(vds2['Date']).dt.date
vds2['Time'] = pd.to_datetime(vds2['Time']).dt.time

But I still got a similar error...

TypeError: unsupported operand type(s) for  : 'datetime.date' and 'str'

I'm sure this is quite simple but would appreciate any help. Thanks!

CodePudding user response:

>>> df.index = pd.to_datetime(df['datestamp']   df['timestamp'])
>>> df.resample('H').det_vol.sum()
2017-06-01 13:00:00    1752
2017-06-01 14:00:00    1065
Freq: H, Name: det_vol, dtype: int64

You need to have the datetime in your index for resample to work.

CodePudding user response:

After testing out you additional code and creating a small "CSV" file out of your sample, I got the same unwanted error. I am not certain, but in searching what the appeared to be happening is that the following line of code was storing your timestamp as a string.

vds2['Time'] = pd.to_datetime(vds2['timestamp']).dt.time

The subsequent call to the "pd.to_datetime" function was falling over.

When I had been trying out various scenarios and getting output, I was not converting the date stamp or timestamp fields. I just left them as is. So what I did was just rename the columns after importing the data from the "CSV" file. Following is my sample code, based upon your program but not doing any data element conversion. I am just renaming the columns.

import pandas as pd

vds2=pd.read_csv('DateTime.csv')

print(vds2)

#vds2 = vds2[(vds2['datestamp'].dt.year == 2017) & (vds2['datestamp'].dt.month == 8)]
vds2 = vds2.rename(columns={'det_vol': 'Volume'})
vds2 = vds2.rename(columns={'datestamp': 'Date'})
vds2 = vds2.rename(columns={'timestamp': 'Time'})
vds2 = vds2[['Date','Time','Volume']]
vds2 = vds2.reset_index(drop=True)

vds2.index = pd.to_datetime(vds2['Date']   ' '   vds2['Time'])
vds2.index.name = 'Date and Time'
vds2 = vds2.resample('H').agg({'Volume':'sum'})

rslt_df = vds2[vds2['Volume'] > 0]

print(rslt_df)

Running this program over the sample data resulted in the following output on my terminal.

@Una:~/Python_Programs/DateTime$ python3 DateTime.py 
   index datestamp timestamp  det_vol
0      0  6/1/2017     13:45      582
1      1  6/1/2017     13:50      592
2      2  6/1/2017     13:55      578
3      3  6/1/2017     14:00      544
4      4  6/1/2017     14:05      521
                     Volume
Date and Time              
2017-06-01 13:00:00    1752
2017-06-01 14:00:00    1065

Perhaps someone above my pay grade can better explain what was happening when your program attempted to convert the time data from your data frame to a timestamp, but it looks like you don't need to even go through the datetime function at that point. It looks like you can just rename the columns to your liking and continue summarizing and processing.

Hope that helps.

Regards.

  • Related