I have this DataFrame:
import pandas as pd
d={}
d['id']=['1','1','1','1','1','1','1','1','2','2','2','2','2','2','2','2']
d['qty']=[5,5,5,5,5,6,5,5,1,1,2,2,2,3,5,8]
df = pd.DataFrame(d)
I would like to create a column that holds the next value of column qty that differs from the current one. That is, if qty is equal to 5 and the next row is also 5, I want to skip it and keep looking until I find the next value not equal to 5; in my case it is 6. All of this should be done per group of id.
Here is the desired dataframe.
d['id']=['1','1','1','1','1','1','1','1','2','2','2','2','2','2','2','2']
d['qty']=[5,5,5,5,5,6,5,5,1,1,2,2,2,3,5,8]
d['qty2']=[6,6,6,6,6,5,'NAN','NAN',2,2,3,3,3,5,8,'NAN']
Any help is very much appreciated.
CodePudding user response:
You can use groupby.shift, mask the identical values, and then groupby.bfill:
# shift up per group
s = df.groupby('id')['qty'].shift(-1)
# keep only the different values and bfill per group
df['qty2'] = s.where(df['qty'].ne(s)).groupby(df['id']).bfill()
output:
   id  qty  qty2
0   1    5   6.0
1   1    5   6.0
2   1    5   6.0
3   1    5   6.0
4   1    5   6.0
5   1    6   5.0
6   1    5   NaN
7   1    5   NaN
8   2    1   2.0
9   2    1   2.0
10  2    2   3.0
11  2    2   3.0
12  2    2   3.0
13  2    3   5.0
14  2    5   8.0
15  2    8   NaN
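
To see why this works, here is a minimal self-contained sketch of the same three steps with the intermediate results kept as columns. The column names shifted and changed are only illustrative and are not part of the answer above; the data is the sample from the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "id": ["1"] * 8 + ["2"] * 8,
    "qty": [5, 5, 5, 5, 5, 6, 5, 5, 1, 1, 2, 2, 2, 3, 5, 8],
})

# Step 1: the next row's qty within each id
# (the last row of each group has no next row, so it becomes NaN)
df["shifted"] = df.groupby("id")["qty"].shift(-1)

# Step 2: keep the shifted value only where it differs from the current qty;
# rows whose next value is identical become NaN
df["changed"] = df["shifted"].where(df["qty"].ne(df["shifted"]))

# Step 3: back-fill within each id, so every row in a run of equal values
# picks up the first different value that follows the run
df["qty2"] = df["changed"].groupby(df["id"]).bfill()

print(df)
```

Rows that have no different value after them inside their group stay NaN, which is also why qty2 comes out as float rather than int. If an integer column is preferred, converting with astype('Int64') (the nullable integer dtype) should keep the missing values.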