I'm trying to use a lambda expression to extract data that meets two conditions. When I write this, it works:
newdf.loc[lambda newdf: newdf['yearly_premium'] > 2784, :]
But when I add another condition, it doesn't:
newdf.loc[lambda newdf: newdf['yearly_premium'] > 2784 and newdf['Total_Claim_Amount'] < 1021.654, :]
Can someone tell me how to adjust it? Thank you.
CodePudding user response:
When passing a lambda function to DataFrame.loc to filter rows, the expression must evaluate to a boolean array. Multiple conditions can be combined in a single expression with the element-wise boolean operators (& for AND, | for OR, ~ for NOT), so that the entire expression also evaluates to a boolean array, e.g.:
lambda newdf: (newdf['yearly_premium'] > 2784) & (newdf['Total_Claim_Amount'] < 1021.654)
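Note that each condition is wrapped in parentheses. This is required because & binds more tightly than comparison operators such as > and <, so without the parentheses Python would evaluate 2784 & newdf['Total_Claim_Amount'] first. The original attempt fails for a related reason: the and keyword asks for the truth value of an entire Series, which pandas rejects with a ValueError ("The truth value of a Series is ambiguous").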
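To make the two failure modes concrete, here is a minimal sketch on a small made-up frame (the columns x and y are placeholders, not the asker's data). It assumes a reasonably recent pandas and only records whether each form raises, rather than relying on exact error text.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})

def raises_value_error(selector):
    """Return True if using `selector` with df.loc raises ValueError."""
    try:
        df.loc[selector]
    except ValueError:
        return True
    return False

# `and` needs a single True/False from a whole Series, so pandas raises.
with_and = raises_value_error(lambda d: d['x'] > 1 and d['y'] < 5)

# Without parentheses this parses as d['x'] > (1 & d['y']) < 5,
# a chained comparison that hits the same ambiguity.
without_parens = raises_value_error(lambda d: d['x'] > 1 & d['y'] < 5)

# Parenthesized conditions joined with & work element-wise.
result = df.loc[lambda d: (d['x'] > 1) & (d['y'] < 5)]
print(with_and, without_parens)
print(result)
```

Both of the first two forms should fail, and only the parenthesized version returns the filtered rows.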
Example
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})
newdf = df.loc[lambda df: (df['x'] > 1) & (df['y'] < 5)]
print(df)
print(newdf)
Output
   x  y
0  1  3
1  2  4
2  3  5
   x  y
1  2  4
Note about using lambda
It's worth noting that a lambda function isn't required in this particular use case. In the previous example, the same result can be achieved by passing the boolean mask directly, e.g.:
newdf = df.loc[(df['x'] > 1) & (df['y'] < 5)]
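As a rough check, the sketch below compares the two forms on the same toy frame. It also shows one situation where the callable form is convenient: a method chain where the intermediate frame has no name yet. The total column is a hypothetical addition for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})

# Both forms build the same boolean mask, so they select the same rows.
with_lambda = df.loc[lambda d: (d['x'] > 1) & (d['y'] < 5)]
with_mask = df.loc[(df['x'] > 1) & (df['y'] < 5)]
same = with_lambda.equals(with_mask)

# In a chain, the lambda receives the intermediate frame, which already
# contains the hypothetical 'total' column created by assign.
chained = (
    df.assign(total=lambda d: d['x'] + d['y'])
      .loc[lambda d: d['total'] > 5]
)
print(same)
print(chained)
```

With a named frame and no chaining, the plain mask is usually the simpler choice.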
See Indexing and selecting data for additional documentation.