How do I groupby activity with hours data type in pandas

Time:05-12

I have been trying to sum the hours by activity in a DataFrame, but it doesn't work.

The code:

import pandas as pd

fileurl = r'https://docs.google.com/spreadsheets/d/1WuvvsZCfbcioYLvwwHuSunUbs4tjvv05/edit?usp=sharing&ouid=105286407332351152540&rtpof=true&sd=true'
df = pd.read_excel(fileurl, header=0)
df.groupby('Activity').sum()

Excel link: https://docs.google.com/spreadsheets/d/1WuvvsZCfbcioYLvwwHuSunUbs4tjvv05/edit?usp=sharing&ouid=105286407332351152540&rtpof=true&sd=true

CodePudding user response:

You have to force the hours column to be read as strings; otherwise you get datetime.time instances, which cannot be added together.

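# Read the hours column as plain strings instead of datetime.time objects
df = pd.read_excel(fileurl, header=0, dtype={'hours': str})

# Convert the strings to timedeltas, then sum them per activity
out = (df.assign(hours=pd.to_timedelta(df['hours']))
         .groupby('Activity', as_index=False)['hours'].sum())
print(out)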

# Output
      Activity           hours
0  bushwalking 0 days 04:45:00
1      cycling 0 days 11:30:00
2     football 0 days 03:42:00
3          gym 0 days 07:00:00
4     running  0 days 14:00:00
5     swimming 0 days 13:15:00
6     walking  0 days 04:00:00
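
If the DataFrame has already been loaded and the column holds datetime.time objects, a similar conversion can probably be done after the fact by going through strings. This is only a minimal sketch: the sample rows below are made up for illustration and are not taken from the spreadsheet, and the hours_decimal column is a hypothetical addition.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample data standing in for the spreadsheet contents
df = pd.DataFrame({
    "Activity": ["gym", "cycling", "gym", "cycling"],
    "hours": [dt.time(1, 30), dt.time(2, 0), dt.time(0, 45), dt.time(1, 15)],
})

# datetime.time objects cannot be added, so turn each one into its
# 'HH:MM:SS' string and parse that as a timedelta
df["hours"] = pd.to_timedelta(df["hours"].astype(str))

# Sum the durations per activity
out = df.groupby("Activity", as_index=False)["hours"].sum()

# Optional: express the totals as decimal hours
out["hours_decimal"] = out["hours"].dt.total_seconds() / 3600

print(out)
```

The decimal column can be easier to read once a total passes 24 hours, since a timedelta is then displayed with a day count.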

CodePudding user response:

You can try the 'raw' version of the file's URL on GitHub, so that read_excel receives the file itself rather than an HTML page.

For example:

import pandas as pd
fileurl = r'https://github.com/Marcos-zoo/datasets/blob/master/time_sports.xlsx?raw=true'
df = pd.read_excel(fileurl)
df.groupby('Activity').sum()
df