I'm familiar with the JavaScript reduce method, and I'm trying to accomplish something similar with a pandas DataFrame.
I believe the method shown below violates this guidance from the pandas documentation:
You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.
import pandas as pd

def use_reducer():
    """reducer"""
    df = pd.DataFrame([
        {'Thresh': 'SOME', 'A': 1},
        {'Thresh': None, 'A': 20},
        {'Thresh': None, 'A': 12},
        {'Thresh': None, 'A': 12},
        {'Thresh': None, 'A': 80},
    ])

    def reducer(index):
        this = df.loc[index]
        # first row by default has a thresh
        if this['Thresh'] == 'SOME':
            return df.loc[index, :]
        # last row with some threshold crossed
        some = df.loc[df['Thresh'] == 'SOME'].iloc[-1]
        # if a threshold is crossed update thresh, this row becomes the next `some`
        if some.A < this.A:
            df.loc[index, 'Thresh'] = 'SOME'
        return df.loc[index, :]

    [reducer(index) for index in df.index]
    print(df)

use_reducer()
Output:
   Thresh   A
0    SOME   1
1    SOME  20
2    None  12
3    None  12
4    SOME  80
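For comparison with a JavaScript-style reduce, here is a minimal sketch that folds over column A with functools.reduce and writes the labels only after the fold has finished, so the DataFrame is never modified while it is being iterated. It assumes, as in the example above, that the first row always carries the threshold. The names step and crossed_flags are illustrative, not pandas API.

```python
from functools import reduce

import pandas as pd

df = pd.DataFrame({
    "Thresh": ["SOME", None, None, None, None],
    "A": [1, 20, 12, 12, 80],
})


def step(state, a):
    """Fold one value of A into (current threshold, flags collected so far)."""
    threshold, flags = state
    crossed = bool(a > threshold)  # strict comparison, as in the loop above
    return (a if crossed else threshold), flags + [crossed]


# Assumption: the first row already has a threshold, so it seeds the accumulator.
first = df["A"].iloc[0]
_, crossed_flags = reduce(step, df["A"].iloc[1:], (first, [True]))

# Build the labelled column in a single assignment once iteration is done.
result = df.assign(Thresh=["SOME" if c else None for c in crossed_flags])
print(result)
```

This is still a Python-level loop, so it will probably be slower than a vectorized approach on large frames. It only shows that a reducer can carry the running threshold in its accumulator instead of reading it back from the DataFrame.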
CodePudding user response:
A more efficient (and more pandas-esque :) solution would be to use cummax and ffill:
df.loc[df['A'].ge(df['A'].cummax()), 'Thresh'] = df['Thresh'].ffill()
Output:
>>> df
   Thresh   A
0    SOME   1
1    SOME  20
2    None  12
3    None  12
4    SOME  80
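To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below rebuilds the sample frame and names each piece. The variable names (running_max, at_max, filled, ties, strict) are illustrative only. It also shows one difference worth checking against your data: ge marks a row that merely ties the running maximum, whereas the loop in the question uses a strict comparison.

```python
import pandas as pd

df = pd.DataFrame({
    "Thresh": ["SOME", None, None, None, None],
    "A": [1, 20, 12, 12, 80],
})

running_max = df["A"].cummax()     # highest A seen so far, row by row
at_max = df["A"].ge(running_max)   # True where a row reaches the running max
filled = df["Thresh"].ffill()      # carry the last non-null label forward

# Only the rows selected by the mask receive the forward-filled label.
df.loc[at_max, "Thresh"] = filled
print(df)

# Tie handling: a repeated maximum passes ge ...
ties = pd.Series([1, 20, 20])
print(ties.ge(ties.cummax()).tolist())  # [True, True, True]

# ... but not a strict comparison against the running max of the rows above.
prev_max = ties.cummax().shift()   # NaN for the first row
strict = ties.gt(prev_max)         # comparisons with NaN evaluate to False
strict.iloc[0] = True              # assumption: the first row always has a threshold
print(strict.tolist())             # [True, True, False]
```

If ties should not count as crossing the threshold, the strict mask could be used in place of the ge mask. For the sample data in the question both give the same result.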