Comparing group specific rows only of a data frame grouped based on a specific column

Time:09-17

I have some data in the form given below, grouped by ID using pandas. Column MT checks whether TSMax(n) > TSMin(n+1), where n is the row number. I want this condition checked only within each group, not between the last row of one group and the first row of the next. For example, row 4 gives True, but I want it to be False because ID 23 ends there. The check should then continue for the rows within ID 24, and separately for the other IDs.

Any help is appreciated!

Code:

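import pandas as pd

df = pd.DataFrame({
    'TS': [20, 25, 30, 35, 45, 50, 55, 58, 65, 75, 60, 62, 72, 77],
    'ID': [22, 22, 23, 23, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24],
    'L': [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2],
})
# Min and max of |TS| per (ID, L) group, named TSMin and TSMax
df2 = (df.assign(TS=df['TS'].abs())
         .groupby(['ID', 'L'])['TS'].agg([('Min', 'min'), ('Max', 'max')])
         .add_prefix('TS'))
# WT: spread of TS within each (ID, L) group
df3 = df2.assign(WT=df2['TSMax'] - df2['TSMin'])
# MT: TSMax of this row vs. TSMin of the next row (crosses ID boundaries)
df4 = df3.assign(MT=df3['TSMax'].gt(df3['TSMin'].shift(-1)))
df4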

Output:

       TSMin  TSMax  WT     MT
ID L
22 1      20     20   0  False
   2      25     25   0  False
23 1      30     65  35   True
   2      35     75  40   True
24 1      60     72  12   True
   2      62     77  15  False

Answer:

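Use a groupby.shift, so that the shift happens within each ID. The last row of every group is then compared against NaN, which gives False. Here T and P are the column names used in this answer; they appear to play the roles of TSMax and TSMin from the question: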

df['K'] = df['T'].gt(df.groupby('ID')['P'].shift(-1))

Output:

   ID  L   P   T      K
0  22  1  20  20  False
1  22  2  25  25  False
2  23  1  30  58   True
3  23  2  35  75  False
4  24  1  60  62  False
5  24  2  72  77  False
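
The answer uses its own column names (P, T, K), so the following is a minimal sketch of how the same idea might be applied to the question's frame, where ID is an index level rather than a column. It assumes that TSMax and TSMin take the places of T and P, and it swaps in named aggregation (agg(TSMin='min', TSMax='max')) for the list-of-tuples form used in the question; both choices are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'TS': [20, 25, 30, 35, 45, 50, 55, 58, 65, 75, 60, 62, 72, 77],
    'ID': [22, 22, 23, 23, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24],
    'L': [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2],
})

# Rebuild the aggregated frame from the question.
# Assumption: named aggregation replaces the list-of-tuples + add_prefix form.
df3 = (df.assign(TS=df['TS'].abs())
         .groupby(['ID', 'L'])['TS']
         .agg(TSMin='min', TSMax='max'))
df3['WT'] = df3['TSMax'] - df3['TSMin']

# Shift TSMin up by one row inside each ID only. ID is an index level here,
# so group by level; the last row of each ID gets NaN.
next_min = df3.groupby(level='ID')['TSMin'].shift(-1)

# Any comparison against NaN is False, so the last row of each ID is False.
df4 = df3.assign(MT=df3['TSMax'].gt(next_min))
print(df4)
```

With the question's data, MT should come out as False, False, True, False, True, False, so the row for ID 23, L 2 is no longer compared against the first row of ID 24.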