I need to group the data by customer_id and get the average of purchase dates intervals. My data looks like this:
date customer_id
1/1/2020 1
1/2/2020 2
1/3/2020 3
1/4/2020 1
1/5/2020 2
1/1/2021 1
1/2/2021 2
1/3/2021 3
So I need to see what is the average date ranges for each customer. The desired output is:
customer_id Average_date_ranges(in months)
1 7
2 5
3 12
CodePudding user response:
I think you can simply use transforming the date column type into a datetime
object then call groupby
to get the average date. You can use code below:
df["date"] = pd.to_datetime(df["date"])
df.groupby("customer_id").mean()
CodePudding user response:
A simple approximation using groupby_apply
could be:
>>> df.groupby('customer_id', as_index=False)['date'] \
.apply(lambda x: round(x.diff().mean().days / 30, 0))
customer_id date
0 1 6.0
1 2 6.0
2 3 12.0