I have a df like this:
import pandas as pd
df = pd.DataFrame([[1, 184], [1, 3], [4, 6], [2, 183], [7, 9], [0, 7]], columns=['A', 'B'])
df
   A    B
0  1  184
1  1    3
2  4    6
3  2  183
4  7    9
5  0    7
I need to iterate through column 'B' and, for every cell with a value between 182 and 186, store the value from two cells below it in a variable 'marker'.
I tried:
for val in df['B']:
    if 182 < int(val) < 186:
        print(val)
        marker = df['B'].shift(-2).values[0]
        print(marker)
And I get:
184
6.0
183
6.0
But I need:
184
6.0
183
7.0
I would love to hear suggestions for fixing this.
CodePudding user response:
We could use Series.between and Series.shift:
s = df['B'].between(182, 186, inclusive="neither")
df.loc[s | s.shift(2, fill_value=False), 'B']
Output
0    184
2      6
3    183
5      7
Name: B, dtype: int64
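If the matching value and its marker are needed side by side rather than in one interleaved Series, the same mask can be reused. This is only a sketch built on the question's sample data; the names mask and pairs are illustrative, and inclusive="neither" assumes pandas 1.3 or later.

```python
import pandas as pd

df = pd.DataFrame([[1, 184], [1, 3], [4, 6], [2, 183], [7, 9], [0, 7]], columns=['A', 'B'])

# Rows where B is strictly between 182 and 186
mask = df['B'].between(182, 186, inclusive="neither")

# Pair each matching value with the value two rows below it.
# shift(-2) moves every value up by two rows, so the marker lines up with its match.
pairs = pd.DataFrame({
    'val': df.loc[mask, 'B'],
    'marker': df['B'].shift(-2)[mask],
})
print(pairs)
```

The marker column comes out as float because shift fills the last two rows with NaN.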
CodePudding user response:
The problem is that marker = df['B'].shift(-2).values[0] always takes the first value of the shifted column, regardless of which row the loop is currently on.
If you would like to keep your looping approach, you can zip the values with the shifted column and iterate over both at the same time:
for val, marker in zip(df['B'], df['B'].shift(-2)):
    if 182 < int(val) < 186:
        print(val)
        print(marker)
184
6.0
183
7.0
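To see why the original loop printed 6.0 both times, it may help to look at the shifted column directly. The sketch below uses the question's data; the names shifted, first and markers are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 184], [1, 3], [4, 6], [2, 183], [7, 9], [0, 7]], columns=['A', 'B'])

shifted = df['B'].shift(-2)
print(shifted.tolist())  # [6.0, 183.0, 9.0, 7.0, nan, nan]

# .values[0] is always position 0 of the shifted column, whatever row the loop is on
first = shifted.values[0]

# Looking up the shifted value at the current position gives the intended marker
markers = []
for i, val in enumerate(df['B']):
    if 182 < val < 186:
        markers.append(shifted.iloc[i])
print(markers)
```

Here first stays at 6.0, while markers collects 6.0 and 7.0, which matches the zip-based loop.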
CodePudding user response:
vals = df.loc[df.B.isin(range(183, 186)), 'B'].rename('val')
markers = df.loc[[i + 2 for i in vals.index], 'B'].rename('marker')
out = pd.concat([vals.reset_index(drop=True), markers.reset_index(drop=True)], axis=1)
OUTPUT
   val  marker
0  184       6
1  183       7
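One caveat with looking up index + 2 through .loc: if a match sits in the last two rows, the label does not exist and .loc raises a KeyError. A possible workaround, sketched here on hypothetical data (df2 is not from the question), is to use reindex, which returns NaN for missing labels. It assumes the default integer index.

```python
import pandas as pd

# Hypothetical data: the last row also matches, so there is no row two below it
df2 = pd.DataFrame({'B': [184, 3, 6, 185]})

vals = df2.loc[df2['B'].between(182, 186, inclusive="neither"), 'B'].rename('val')

# reindex yields NaN for labels that do not exist, where .loc would raise KeyError
markers = df2['B'].reindex(vals.index + 2).rename('marker')

out = pd.concat([vals.reset_index(drop=True), markers.reset_index(drop=True)], axis=1)
print(out)
```

The second row of out then has a NaN marker instead of stopping with an error.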