I need to build the new column about compare the previous date and the previous date must follow a special rule. I need to find the repeat purchase in past 3 months. I have no idea how can do this. There has some example and my expected output.
transaction.csv:
code,transaction_datetime
1,2021-12-01
1,2022-01-24
1,2022-05-29
2,2021-11-20
2,2022-04-12
2,2022-06-02
3,2021-04-23
3,2022-04-22
expected output:
code,transaction_datetime,repeat_purchase_P3M
1,2021-12-01,no
1,2022-01-24,2021-12-01
1,2022-05-29,no
2,2021-11-20,no
2,2022-04-12,no
2,2022-06-02,2022-04-12
3,2021-04-23,no
3,2022-04-22,no
Thanks in advance
CodePudding user response:
df = pd.read_csv('file.csv')
df.transaction_datetime = pd.to_datetime(df.transaction_datetime)
grouped = df.groupby('code')['transaction_datetime']
df['repeated_purchase_P3M'] = grouped.shift().dt.date.where(grouped.diff().dt.days < 90, 'no')
df
code transaction_datetime repeated_purchase_P3M
0 1 2021-12-01 no
1 1 2022-01-24 2021-12-01
2 1 2022-05-29 no
3 2 2021-11-20 no
4 2 2022-04-12 no
5 2 2022-06-02 2022-04-12
6 3 2021-04-23 no
7 3 2022-04-22 no