How to calculate the mean of gap between dates for different groups in a dataframe

Time:07-14

So I have a dataset that looks like this:


class    date
  A   2018-01-01
  B   2018-03-05
  A   2018-01-03
  A   2018-01-05
  B   2018-03-10
  A   2018-01-07

I want to calculate the mean difference between consecutive dates for each class using pandas. For example, for class A we have:

2018-01-01, 2018-01-03, 2018-01-05 and 2018-01-07

The difference between each consecutive pair of these dates is 2 days, so the mean is also 2 days.

What I expect to get is a grouped dataframe, like the following:


class  mean
  A     2
  B     5

I have tried df.groupby('class')['date'].diff().fillna(pd.Timedelta(seconds=0)).mean(), but it doesn't return the expected output.

CodePudding user response:

You can try something like this:

df.groupby('class', as_index=False)['date']\
  .apply(lambda x: x.diff().mean())

Output:

  class   date
0     A 2 days
1     B 5 days
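
If you want to run this end to end, here is a minimal sketch built on the sample data from the question. It assumes the date column may still hold strings, so it converts it with pd.to_datetime first, and it sorts within each class so that diff() compares consecutive dates. The names mean_gap and result are only illustrative, and the last step (turning the Timedelta values into plain day counts in a column called mean) is one possible way to match the layout asked for in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "class": ["A", "B", "A", "A", "B", "A"],
    "date": ["2018-01-01", "2018-03-05", "2018-01-03",
             "2018-01-05", "2018-03-10", "2018-01-07"],
})

# diff() needs real datetimes, not strings (assumption: the column may be text).
df["date"] = pd.to_datetime(df["date"])

# Sort so each gap is measured between consecutive dates within a class.
df = df.sort_values(["class", "date"])

# For each class: gaps between consecutive dates, then their mean.
# The first diff in every group is NaT, and mean() skips it by default.
mean_gap = df.groupby("class")["date"].apply(lambda s: s.diff().mean())

# Optional: whole-day counts in a column named "mean", as in the question.
result = mean_gap.dt.days.rename("mean").reset_index()
print(result)
```

As for the attempt in the question: groupby(...).diff() returns an ordinary Series aligned with the original rows, not a grouped object, so the trailing .mean() averages every row together and gives a single Timedelta. Filling the NaT values with zero also pulls that average down, because the first row of each class is then counted as a gap of 0 days.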