Home > front end >  Groupby column, sort by timestamp and calculate diff between timestamps in Pandas Dataframe?
Groupby column, sort by timestamp and calculate diff between timestamps in Pandas Dataframe?

Time:09-17

I have the following dataframe:

{'Timestamp': {0: Timestamp('2019-06-07 13:37:00'),
  1: Timestamp('2019-03-08 09:10:00'),
  2: Timestamp('2019-06-24 11:45:00'),
  3: Timestamp('2019-01-10 05:32:00'),
  4: Timestamp('2019-01-08 12:24:00'),
  5: Timestamp('2019-10-21 13:40:00'),
  6: Timestamp('2019-03-07 15:50:00'),
  7: Timestamp('2019-02-07 14:48:00'),
  8: Timestamp('2019-05-15 09:07:00'),
  9: Timestamp('2019-12-10 20:56:00'),
  10: Timestamp('2019-01-30 13:58:00'),
  11: Timestamp('2019-05-15 08:37:00'),
  12: Timestamp('2019-10-03 18:17:00'),
  13: Timestamp('2019-01-04 13:23:00'),
  14: Timestamp('2019-02-26 13:55:00'),
  15: Timestamp('2019-07-30 11:47:00'),
  16: Timestamp('2019-11-06 17:35:00'),
  17: Timestamp('2019-02-22 08:03:00'),
  18: Timestamp('2019-07-17 11:16:00'),
  19: Timestamp('2019-09-13 09:24:00')},
 'Col1': {0: 'B',
  1: 'A',
  2: 'A',
  3: 'A',
  4: 'B',
  5: 'A',
  6: 'C',
  7: 'C',
  8: 'C',
  9: 'C',
  10: 'C',
  11: 'A',
  12: 'B',
  13: 'A',
  14: 'A',
  15: 'B',
  16: 'B',
  17: 'B',
  18: 'D',
  19: 'D'}}

I want to groupby Col1, sort the timestamps within the group and calculate the diff and fill with 0's in the first row.

I am trying to do this using the following code but it's not giving me what I need:

df.sort_values(by='Timestamp').groupby('Col1').Timestamp.diff()

CodePudding user response:

If I understand you correctly, you want to sort by ["Col1", "Timestamp"] and then .groupby:

df = df.sort_values(by=["Col1", "Timestamp"])
df["result"] = df.groupby("Col1").diff().fillna(pd.Timedelta(0))
print(df)

Prints:

             Timestamp Col1            result
13 2019-01-04 13:23:00    A   0 days 00:00:00
3  2019-01-10 05:32:00    A   5 days 16:09:00
14 2019-02-26 13:55:00    A  47 days 08:23:00
1  2019-03-08 09:10:00    A   9 days 19:15:00
11 2019-05-15 08:37:00    A  67 days 23:27:00
2  2019-06-24 11:45:00    A  40 days 03:08:00
5  2019-10-21 13:40:00    A 119 days 01:55:00
4  2019-01-08 12:24:00    B   0 days 00:00:00
17 2019-02-22 08:03:00    B  44 days 19:39:00
0  2019-06-07 13:37:00    B 105 days 05:34:00
15 2019-07-30 11:47:00    B  52 days 22:10:00
12 2019-10-03 18:17:00    B  65 days 06:30:00
16 2019-11-06 17:35:00    B  33 days 23:18:00
10 2019-01-30 13:58:00    C   0 days 00:00:00
7  2019-02-07 14:48:00    C   8 days 00:50:00
6  2019-03-07 15:50:00    C  28 days 01:02:00
8  2019-05-15 09:07:00    C  68 days 17:17:00
9  2019-12-10 20:56:00    C 209 days 11:49:00
18 2019-07-17 11:16:00    D   0 days 00:00:00
19 2019-09-13 09:24:00    D  57 days 22:08:00
  • Related