Home > other >  How do i identify an entity (merchant) and perform operations on the entries for that entity in pand
How do i identify an entity (merchant) and perform operations on the entries for that entity in pand

Time:03-04

I am trying to perform operations on the time column for each unique merchant (calculate time between transactions). How do I access the individual merchants in an iteration? is there a way to do that in python?

Thank you.

dataframe

CodePudding user response:

Assuming, time is already a datetime64. Use groupby_diff:

df['delta'] = df.groupby('merchant')['time'].diff()
print(df)

# Output
  merchant                time           delta
0        A 2022-01-01 16:00:00             NaT
1        A 2022-01-01 16:30:00 0 days 00:30:00
2        A 2022-01-01 17:00:00 0 days 00:30:00
3        B 2022-01-01 10:00:00             NaT
4        B 2022-01-01 11:00:00 0 days 01:00:00
5        B 2022-01-01 12:00:00 0 days 01:00:00

If you want to compute the mean between transactions per merchant, use:

out = df.groupby('merchant', as_index=False)['time'].apply(lambda x: x.diff().mean())
print(out)

# Output
  merchant            time
0        A 0 days 00:30:00
1        B 0 days 01:00:00

Setup:

data = {'merchant': ['A', 'A', 'A', 'B', 'B', 'B'],
 'time': [pd.Timestamp('2022-01-01 16:00:00'),
          pd.Timestamp('2022-01-01 16:30:00'),
          pd.Timestamp('2022-01-01 17:00:00'),
          pd.Timestamp('2022-01-01 10:00:00'),
          pd.Timestamp('2022-01-01 11:00:00'),
          pd.Timestamp('2022-01-01 12:00:00')]}
df = pd.DataFrame(data)
  • Related