Python and Pandas newbie here. I have the following dataframe and I would like to merge consecutive rows that share the same route and val, keeping the frm of the first row in each run and the to of the last.
DF:
   route  frm    to  val
0      1    0   100    3
1      1  100   300    2
2      1  300   500    3
3      1  500  9999    3
4      2    0   100    3
5      2  100   300    3
6      2  300   500    3
7      2  500  9999    3
Desired Output:
   route  frm    to  val
0      1    0   100    3
1      1  100   300    2
3      1  300  9999    3
7      2    0  9999    3
I have tried the following using shift(), which gets me part of the way, but I am not sure of the best way to achieve the desired output.
Any suggestions would be appreciated.
df['f'] = np.where((df.route.eq(df.route.shift())) & (df.val == df.val.shift()),df.frm.shift(),df.frm)
df['t'] = np.where((df.route.eq(df.route.shift())) & (df.val == df.val.shift()),df.to.shift(),df.to)
Output:
   route  frm    to  val    f    t
0      1    0   100    3    0  100
1      1  100   300    2  100  300
2      1  300   500    3  300  500
3      1  500  9999    3  300  500
4      2    0   100    3    0  100
5      2  100   300    3    0  100
6      2  300   500    3  100  300
7      2  500  9999    3  300  500
CodePudding user response:
Use a custom groupby.agg:
(df.groupby(['route', df['val'].ne(df['val'].shift()).cumsum()],
            as_index=False)
   .agg({'route': 'first',
         'frm': 'first',
         'to': 'last',
         'val': 'last'})
)
Output:
   route  frm    to  val
0      1    0   100    3
1      1  100   300    2
2      1  300  9999    3
3      2    0  9999    3
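For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same run-merging idea. It is a variant, not the exact code above: instead of grouping by route plus a val run counter, it builds a single run key that increments whenever either route or val changes, and it uses named aggregation. The frame is rebuilt from the data in the question, and the names `changed`, `block` and `merged` are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "route": [1, 1, 1, 1, 2, 2, 2, 2],
    "frm":   [0, 100, 300, 500, 0, 100, 300, 500],
    "to":    [100, 300, 500, 9999, 100, 300, 500, 9999],
    "val":   [3, 2, 3, 3, 3, 3, 3, 3],
})

# True on every row where route or val differs from the previous row
# (the first row always counts as a change because shift() yields NaN).
changed = df["route"].ne(df["route"].shift()) | df["val"].ne(df["val"].shift())

# Running count of changes: consecutive rows in the same run share one id.
block = changed.cumsum()

# Collapse each run: keep the first frm and the last to.
merged = (
    df.groupby(block)
      .agg(route=("route", "first"),
           frm=("frm", "first"),
           to=("to", "last"),
           val=("val", "first"))
      .reset_index(drop=True)
)

print(merged)
```

Note that this resets the index to 0..3, like the output above, rather than keeping the original labels 3 and 7 shown in the desired output.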