I have a question about filtering a pandas DataFrame by optional conditions, as below:
Request: if brand == 'All' or lo == 'All', that condition should be excluded from the filtering. Issue: I tried the code below, but it doesn't work. Please help me fix it.
def filter_for_company(df_source, season, brand, lo):
    mask = (
        (df_source['Production Priority'] == 'Primary')
        & (df_source['Season'] == season)
        & (if brand =='All':
               pass
           else:
               df_source['Brand'] == brand
          )
        & (if lo ='All':
               pass
           else:
               df_source['Liaison Office Code'] == lo
          )
    )
    company_df = df_source.loc[mask,:]
    return company_df
CodePudding user response:
The if-else statement inside your mask definition does not give you the boolean condition you need. You can modify the code to set the boolean conditions properly, as follows:
def filter_for_company(df_source, season, brand, lo):
    # Conditions that always apply
    mask = (
        (df_source['Production Priority'] == 'Primary')
        & (df_source['Season'] == season)
    )
    # Add each optional condition only when a specific value is requested
    if brand != 'All':
        mask &= df_source['Brand'] == brand
    if lo != 'All':
        mask &= df_source['Liaison Office Code'] == lo
    company_df = df_source.loc[mask, :]
    return company_df
Note that this may not be the most optimized code for your purpose. However, without seeing the overall structure of the DataFrame and some sample data, it is hard to optimize it further.
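If more optional filters are likely to be added later, one possible variation is to keep them in a dictionary and loop over it. This is only a sketch: the sample data, the default argument values, and the `optional` dictionary are assumptions made for illustration and are not taken from the question.

```python
import pandas as pd

def filter_for_company(df_source, season, brand="All", lo="All"):
    # Conditions that always apply.
    mask = (
        (df_source["Production Priority"] == "Primary")
        & (df_source["Season"] == season)
    )
    # Optional filters: column name -> requested value ("All" means no filter).
    optional = {"Brand": brand, "Liaison Office Code": lo}
    for column, value in optional.items():
        if value != "All":
            mask &= df_source[column] == value
    return df_source.loc[mask, :]

# Hypothetical sample data, invented for this example.
sample = pd.DataFrame({
    "Production Priority": ["Primary", "Primary", "Secondary", "Primary"],
    "Season": ["SS23", "SS23", "SS23", "FW23"],
    "Brand": ["A", "B", "A", "A"],
    "Liaison Office Code": ["HK", "VN", "HK", "HK"],
})

all_brands = filter_for_company(sample, "SS23", "All", "All")  # no optional filter
only_a = filter_for_company(sample, "SS23", "A", "All")        # filter on Brand only
```

With this layout, adding another optional column should only require one more entry in the dictionary.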
CodePudding user response:
Conditional slicing of pandas DataFrames with loc or iloc is designed for filtering based on values actually present in the DataFrame. What you want to achieve should be handled separately by the code that calls your filtering function.
However, if you do not have control over that, you can modify your function as below:
def filter_for_company(df_source, season, brand, lo):
    # Apply the optional filters first, skipping any that are set to "All"
    if brand != "All":
        df_source = df_source[df_source["Brand"] == brand]
    if lo != "All":
        df_source = df_source[df_source["Liaison Office Code"] == lo]
    # Build the mask on the already-filtered frame so that the indexes match
    mask = (
        (df_source["Production Priority"] == "Primary")
        & (df_source["Season"] == season)
    )
    company_df = df_source.loc[mask, :]
    return company_df
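If you would rather keep everything in a single mask expression, closer to the shape of the code in the question, one option is to OR each optional comparison with a plain boolean that is True when the argument is "All". This is a sketch under assumptions: the function name `filter_inline` and the sample data are made up for illustration.

```python
import pandas as pd

def filter_inline(df_source, season, brand, lo):
    # Each optional term becomes True for every row when its argument is "All",
    # so that term no longer restricts the result.
    mask = (
        (df_source["Production Priority"] == "Primary")
        & (df_source["Season"] == season)
        & ((df_source["Brand"] == brand) | (brand == "All"))
        & ((df_source["Liaison Office Code"] == lo) | (lo == "All"))
    )
    return df_source.loc[mask, :]

# Hypothetical sample data, invented for this example.
orders = pd.DataFrame({
    "Production Priority": ["Primary", "Primary", "Secondary", "Primary"],
    "Season": ["SS23", "SS23", "SS23", "FW23"],
    "Brand": ["A", "B", "A", "A"],
    "Liaison Office Code": ["HK", "VN", "HK", "HK"],
})

everything = filter_inline(orders, "SS23", "All", "All")
hk_only = filter_inline(orders, "SS23", "All", "HK")
```

The trade-off is that every comparison is evaluated even when it is not needed, which is usually fine for small frames but does some extra work on large ones.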