I have the following dataframe (created for an example):
In [2]: df
Out[2]:
A B
0 1 2
1 1 3
2 4 6
Assume this is a long dataframe. Whenever there is a value of 2 in column B, I need to replace it with the value two rows below it in the same column.
I tried:
for val in df['B']:
    if val == 2:
        df['B'] = df.B.shift(-2)
But I get:
A B
0 1 6.0
1 1 NaN
2 4 NaN
And I need:
Out[2]:
A B
0 1 6
1 1 3
2 4 6
Does anyone have suggestions for leaving the values that are not equal to 2 unchanged?
CodePudding user response:
You can use where, which keeps the values where the condition is True and takes the replacement everywhere else:
df['B'] = df['B'].where(df['B'] != 2, df['B'].shift(-2))
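For completeness, here is a self-contained sketch of the same idea on the example data. Note that shift(-2) leaves NaN in the last two rows, so the result can come back as float. The fillna/astype step at the end is my own assumption about what you want (a 2 that has no row two below it keeps its original value, and the column stays integer); it is not part of the one-liner above.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 4], "B": [2, 3, 6]})

# Value two rows further down; the last two rows have no such value and become NaN.
two_below = df["B"].shift(-2)

# Keep B where it is not 2, otherwise take the value two rows below.
replaced = df["B"].where(df["B"] != 2, two_below)

# Assumption: a 2 with no row two below keeps its original value,
# and the column is cast back to its original integer dtype.
df["B"] = replaced.fillna(df["B"]).astype(df["B"].dtype)
print(df)
```

If you would rather see NaN for a 2 near the end of the frame, drop the fillna and astype calls.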
CodePudding user response:
Select the rows where B equals 2 with loc and assign the shifted column to them:
df.loc[df['B'].eq(2),'B'] = df.B.shift(-2)
df
Out[458]:
A B
0 1 6.0
1 1 3.0
2 4 6.0
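To see why this leaves the other rows alone: the right-hand side is a full-length Series, and the loc assignment aligns it on the index, so only the rows selected by the mask receive a value. Here is a small sketch with more than one match. The longer frame is made up for illustration, and it is chosen so that every 2 has a row two below it.

```python
import pandas as pd

# Hypothetical data with two matches (rows 0 and 3).
df = pd.DataFrame({"A": [1, 1, 4, 7, 8, 9], "B": [2, 3, 6, 2, 5, 9]})

# Boolean mask of the rows to replace.
mask = df["B"].eq(2)

# The shifted Series is aligned on the index, so only masked rows are written.
df.loc[mask, "B"] = df["B"].shift(-2)
print(df)
```

Rows 0 and 3 pick up the values from rows 2 and 5; everything else is untouched.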
CodePudding user response:
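You can also loop over the dataframe, which is slow on a long dataframe but works:
import pandas as pd
df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])
# assumes the default 0..n-1 index, so ix is also the row position
for ix, row in df.iterrows():
    if row['B'] == 2 and ix < len(df) - 2:
        # write through df.at; assigning to row['B'] would only change a copy
        df.at[ix, 'B'] = df['B'].iloc[ix + 2]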
There are probably better/more elegant ways to do this.