Groupby Agg Mean of DateTimeIndex with other agg functions

Time:12-30

I want to group the following dataframe with agg, using count and sum, and also get the average time difference in minutes between consecutive values of the dataframe's DatetimeIndex. When I use df.index as a key I get the error TypeError: unhashable type: 'DatetimeIndex'

data = df.groupby('letter').agg({'letter': 'count', 'occurences': 'sum', df.index: lambda x: x.diff().mean().dt.minutes})

If I reset the index so that 'timestamp' becomes a column:

data = df.reset_index().groupby('letter').agg({'letter': 'count', 'occurences': 'sum', 'timestamp': lambda x: x.diff().mean().dt.minutes})

I get the error AttributeError: 'NaTType' object has no attribute 'dt'. I believe that is because letter 'D' has only one row, so its mean difference is NaT. I was hoping to treat that case as 0 minutes.

timestamp                  letter  occurences
2021-11-29 12:00:00+00:00  A       100
2021-11-29 12:30:00+00:00  B       500
2021-11-29 13:00:00+00:00  B       300
2021-11-30 13:30:00+00:00  A       1000
2021-11-30 14:00:00+00:00  C       250
2021-11-29 14:30:00+00:00  A       100
2021-11-29 15:00:00+00:00  D       500
2021-11-29 15:30:00+00:00  A       300
2021-11-30 16:00:00+00:00  A       1000
2021-11-30 16:30:00+00:00  C       250

CodePudding user response:

Please try dividing by np.timedelta64(1, 'm'), which converts the mean Timedelta to minutes:

import numpy as np
data = df.reset_index().groupby('letter').agg({'letter': 'count', 'occurences': 'sum', 'timestamp': lambda x: x.diff().mean() / np.timedelta64(1, 'm')})
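
For reference, here is a self-contained sketch of that approach. It rebuilds the sample frame from the question (the UTC timezone and the helper name mean_gap_minutes are my assumptions, not from the original post) and maps the NaT you get for a single-row letter such as 'D' to 0 minutes, as the question asks. Note that x.diff().mean() is already a scalar Timedelta (or NaT), which is why the .dt accessor is not available on it.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the UTC timezone is an assumption.
stamps = [
    "2021-11-29 12:00", "2021-11-29 12:30", "2021-11-29 13:00", "2021-11-30 13:30",
    "2021-11-30 14:00", "2021-11-29 14:30", "2021-11-29 15:00", "2021-11-29 15:30",
    "2021-11-30 16:00", "2021-11-30 16:30",
]
df = pd.DataFrame(
    {
        "letter": list("ABBACADAAC"),
        "occurences": [100, 500, 300, 1000, 250, 100, 500, 300, 1000, 250],
    },
    index=pd.DatetimeIndex(stamps, tz="UTC", name="timestamp"),
)


def mean_gap_minutes(ts):
    """Mean difference between consecutive timestamps in minutes (0 for one row)."""
    gap = ts.diff().mean()  # scalar Timedelta, or NaT when the group has a single row
    return 0.0 if pd.isna(gap) else gap / np.timedelta64(1, "m")


data = df.reset_index().groupby("letter").agg(
    {"letter": "count", "occurences": "sum", "timestamp": mean_gap_minutes}
)
print(data)
```

With the sample data this should give 420, 30, 150 and 0 minutes for A, B, C and D. The differences are taken in row order, so sort by timestamp first if your frame is not already ordered.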

CodePudding user response:

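You can call total_seconds() on the mean Timedelta. Since 'timestamp' is the index in your dataframe, reset the index first so it is available as a column:

data = df.reset_index().groupby('letter').agg(
    {'letter': 'count',
     'occurences': 'sum',
     'timestamp': lambda x: x.diff().mean().total_seconds() / 60.0
    })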
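
If you would rather not reset the index, one possible variation is to group the index values themselves. This is only a sketch: the sample frame is rebuilt from the question with an assumed UTC timezone, and the output column names (count, avg_minutes) are made up for illustration.

```python
import pandas as pd

# Sample data from the question; the UTC timezone is an assumption.
stamps = [
    "2021-11-29 12:00", "2021-11-29 12:30", "2021-11-29 13:00", "2021-11-30 13:30",
    "2021-11-30 14:00", "2021-11-29 14:30", "2021-11-29 15:00", "2021-11-29 15:30",
    "2021-11-30 16:00", "2021-11-30 16:30",
]
df = pd.DataFrame(
    {
        "letter": list("ABBACADAAC"),
        "occurences": [100, 500, 300, 1000, 250, 100, 500, 300, 1000, 250],
    },
    index=pd.DatetimeIndex(stamps, tz="UTC", name="timestamp"),
)

# Group the index values by letter. Passing the letters as an array avoids
# any label alignment, which matters if timestamps are ever duplicated.
avg_minutes = (
    df.index.to_series()
    .groupby(df["letter"].to_numpy())
    .agg(lambda ts: ts.diff().mean().total_seconds() / 60.0)  # NaN for a single row
    .fillna(0.0)  # treat single-row letters as 0 minutes
)

# Count and sum via named aggregation, then attach the average gap by label.
data = df.groupby("letter").agg(
    count=("occurences", "size"),
    occurences=("occurences", "sum"),
)
data["avg_minutes"] = avg_minutes
print(data)
```

The lambda returns plain floats, so the NaN for a single-row letter can be replaced with fillna afterwards instead of being special-cased inside the function.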