I have a dataframe:
import pandas as pd
df = pd.DataFrame({'Position': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'Decimal': [3, 1, 5, 1, 5, 2, 3, 3, 7, 2]})
df
I need to create a new column "nextPosition" with these values:
nextPosition = [5,1,1,-1,-1,3,0,-1,-1,-1]
Each value is the number of rows between that row's 'Decimal' value and the next occurrence of the same value further down the column, or -1 if the value never appears again. For example:
5 - the 1st row's 'Decimal' value 3 next appears after 5 other values
1 - the 2nd row's 'Decimal' value 1 next appears after 1 other value
1 - the 3rd row's 'Decimal' value 5 next appears after 1 other value
-1 - the 4th row's 'Decimal' value 1 doesn't appear again, so -1
and so on.
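To pin the rule down, here is a slow plain-Python sketch of what I mean. The function name `next_position` is made up for illustration, and the nested loop is only meant to define the expected output, not to be the solution I am after.

```python
def next_position(values):
    """For each item, count the elements strictly between it and the next
    occurrence of the same value; use -1 if the value never appears again."""
    result = []
    for i, v in enumerate(values):
        gap = -1  # default: no later occurrence found
        for j in range(i + 1, len(values)):
            if values[j] == v:
                gap = j - i - 1  # elements between index i and index j
                break
        result.append(gap)
    return result


decimal = [3, 1, 5, 1, 5, 2, 3, 3, 7, 2]
print(next_position(decimal))  # [5, 1, 1, -1, -1, 3, 0, -1, -1, -1]
```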
CodePudding user response:
What about:
df['nextPosition'] = df.groupby('Decimal', sort=False)['Position'].diff(-1).abs().sub(1).fillna(-1)
print(df)
   Position  Decimal  nextPosition
0         1        3           5.0
1         2        1           1.0
2         3        5           1.0
3         4        1          -1.0
4         5        5          -1.0
5         6        2           3.0
6         7        3           0.0
7         8        3          -1.0
8         9        7          -1.0
9        10        2          -1.0
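The column comes out as floats because `diff` produces NaN for the last occurrence of each value. If integers are wanted, as in the question, one possible variant is to split the chain into steps and cast at the end. This is only a sketch of the same approach; the intermediate names `step` and `gap` are made up for readability.

```python
import pandas as pd

df = pd.DataFrame({'Position': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'Decimal': [3, 1, 5, 1, 5, 2, 3, 3, 7, 2]})

# Within each 'Decimal' group: this row's Position minus the next Position
# in the same group (negative, or NaN for the group's last occurrence).
step = df.groupby('Decimal', sort=False)['Position'].diff(-1)

# abs() turns that into the distance to the next occurrence, sub(1) keeps
# only the rows in between, and fillna(-1) marks values that never recur.
gap = step.abs().sub(1).fillna(-1)

# The NaNs are gone after fillna, so the cast to int is safe.
df['nextPosition'] = gap.astype(int)
print(df['nextPosition'].tolist())  # [5, 1, 1, -1, -1, 3, 0, -1, -1, -1]
```

Note that this relies on 'Position' holding consecutive row numbers; if it does not, the differences would have to be taken on a positional counter instead.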