I am finding the indexes of values above certain cutoffs in a pandas DataFrame. So far I have done this with a series of lambda functions:
data.apply([lambda v:v[v>=0.25].idxmin(),
lambda v:v[v>=0.25].idxmin(),
lambda v:v[v>=0.50].idxmin(),
lambda v:v[v>=0.75].idxmin(),
lambda v:v[v>=0.90].idxmin()])
I have tried to parametrize the lambda function over an arbitrary list of cutoff values. However, with the following the results are not correct: all the lambda functions have the same name, and essentially only the last one is present in the DataFrame returned by apply. How can I parametrize these lambdas correctly?
cutoff_values=[25,50,100]
agg_list=[lambda v,c:v[v>=(float(c)/100.0)].idxmin() for c in cutoff_values]
data.apply(agg_list)
What would be a better, more Pythonic and pandas-idiomatic approach?
CodePudding user response:
Nested lambda functions work for me, like this:
q = lambda c: lambda x: x[x>=c].idxmin()
cutoff_values=[25,50,90]
print (data.apply([q((float(c)/100.0)) for c in cutoff_values]))
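A likely reason the factory helps: a lambda created in a loop or comprehension looks up the loop variable when it is called, not when it is defined (late binding), so every lambda ends up seeing the last cutoff. The sketch below shows this in plain Python, without pandas. The names `late`, `make`, `bound` and `default` are only illustrative, and each function simply returns the cutoff it would use instead of doing the real selection.

```python
cutoff_values = [25, 50, 90]

# Late binding: each lambda reads `c` when called, after the
# comprehension has finished, so all of them see the last value.
late = [lambda v: c / 100.0 for c in cutoff_values]

# Factory: calling make(c) creates a separate scope per cutoff,
# which is what the nested lambda `q` above does.
make = lambda c: lambda v: c / 100.0
bound = [make(c) for c in cutoff_values]

# Default argument: `c=c` is evaluated when the lambda is defined.
default = [lambda v, c=c: c / 100.0 for c in cutoff_values]

print([f(None) for f in late])     # [0.9, 0.9, 0.9]
print([f(None) for f in bound])    # [0.25, 0.5, 0.9]
print([f(None) for f in default])  # [0.25, 0.5, 0.9]
```

Note also that a lambda written as `lambda v, c: ...` expects two arguments, while apply passes only one, so the cutoff has to be bound by one of the two techniques above.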
CodePudding user response:
You can use this:
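import pandas as pd
cutoff_values = [25, 50, 90]  # cutoffs in percent (example values assumed here)
df = pd.DataFrame(data={'col':[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]})
df = df[['col']].apply(lambda x: [x[x >= (float(c) / 100.0)].idxmin() for c in cutoff_values])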
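If you want the result labelled by cutoff, one possible alternative is to skip apply and build the table with a dict comprehension. This is only a sketch: the sample `data` below is made up, since the question's DataFrame is not shown, and `result` is an illustrative name.

```python
import pandas as pd

# Hypothetical sample data; the question's `data` is not shown.
data = pd.DataFrame({"a": [0.1, 0.3, 0.6, 0.95],
                     "b": [0.2, 0.5, 0.8, 1.0]})
cutoff_values = [25, 50, 90]  # cutoffs in percent

# data[data >= c] masks values below the cutoff as NaN, and idxmin
# skips NaN, so each entry is the index of the smallest value that
# is still at or above the cutoff.
result = pd.DataFrame(
    {c: data[data >= c / 100.0].idxmin() for c in cutoff_values}
).T  # transpose so that each row corresponds to one cutoff

print(result)
```

Each row of `result` is a cutoff and each column is a column of `data`. A column with no value at or above a cutoff would need separate handling.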