I have been trying to sum the hours by activity in a DataFrame, but it doesn't work.
The code:
import pandas as pd
fileurl = r'https://docs.google.com/spreadsheets/d/1WuvvsZCfbcioYLvwwHuSunUbs4tjvv05/edit?usp=sharing&ouid=105286407332351152540&rtpof=true&sd=true'
df = pd.read_excel(fileurl, header=0)
df.groupby('Activity').sum()
CodePudding user response:
You have to force the hours column to be read as strings; otherwise you will get datetime.time instances, which cannot be added together.
df = pd.read_excel(fileurl, header=0, dtype={'hours': str})
out = (df.assign(hours=pd.to_timedelta(df['hours']))
         .groupby('Activity', as_index=False)['hours'].sum())
print(out)
# Output
      Activity           hours
0  bushwalking 0 days 04:45:00
1      cycling 0 days 11:30:00
2     football 0 days 03:42:00
3          gym 0 days 07:00:00
4      running 0 days 14:00:00
5     swimming 0 days 13:15:00
6      walking 0 days 04:00:00
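If the file has already been loaded and the column holds datetime.time objects, a similar conversion can probably be done after the fact. The following is only a minimal sketch: the small in-memory DataFrame is made-up sample data standing in for the spreadsheet, and it assumes the column is named hours as above.

```python
import datetime as dt

import pandas as pd

# Made-up sample data (an assumption, not the real spreadsheet contents)
df = pd.DataFrame({
    "Activity": ["gym", "running", "gym"],
    "hours": [dt.time(1, 30), dt.time(2, 0), dt.time(0, 45)],
})

# datetime.time objects do not support addition, so render them as
# "HH:MM:SS" strings and parse those strings into Timedelta values
df["hours"] = pd.to_timedelta(df["hours"].astype(str))

# Timedelta values can be summed per group
out = df.groupby("Activity", as_index=False)["hours"].sum()
print(out)
```

Going through strings mirrors the dtype={'hours': str} approach above; it just moves the conversion to after the read.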
CodePudding user response:
You can try it with the 'raw' version of the file's URL on GitHub, for example:
import pandas as pd
fileurl = r'https://github.com/Marcos-zoo/datasets/blob/master/time_sports.xlsx?raw=true'
df = pd.read_excel(fileurl)
# groupby returns a new DataFrame; df itself is not modified
df.groupby('Activity').sum()
df
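As a small illustration of what the 'raw' version means here, the sketch below adds raw=true to the query string of a GitHub blob URL. The helper name github_raw_url is hypothetical, and the code only rewrites the URL string; it does not download anything.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def github_raw_url(blob_url: str) -> str:
    """Return blob_url with raw=true set in its query string (hypothetical helper)."""
    parts = urlsplit(blob_url)
    # Keep any existing query parameters and set (or overwrite) raw=true
    query = dict(parse_qsl(parts.query))
    query["raw"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


url = github_raw_url(
    "https://github.com/Marcos-zoo/datasets/blob/master/time_sports.xlsx"
)
print(url)
```

The resulting string can then be passed to pd.read_excel as in the snippet above.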