Dynamic for loop function for filtering df based on provided parameters

If I wanted to create a new df that only had rows from the original df that fall into specified categories, what would be the most efficient way to do that?

import seaborn as sns

df = sns.load_dataset('diamonds')

def makenewdf(cuts=['Ideal','Premium'], df=df):
    [some kind of loop to dynamically filter df based on the values of cuts]

What would be the best way to write this function so that I can pass in the categories I want to keep?

For example, makenewdf(cuts=['Good']) would return a df containing only rows where the cut is 'Good', and makenewdf(cuts=['Good','Ideal','Premium']) would return a df containing only rows whose cut is one of those three values.

CodePudding user response：

You're looking for the isin() method, which tests each value in a column for membership in a list, so no explicit loop is needed. You can use something like this:

def makenewdf(cuts, df):
    return df[df.cut.isin(cuts)]

# Example: only rows where cut is 'Good'
print(makenewdf(['Good'], df))

# Example: rows where cut is any of the three listed values
print(makenewdf(['Good','Ideal','Premium'], df))
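
If you want to keep the default categories from the question, here is a minimal sketch of one way to do it. It uses None as the default instead of a list, which avoids sharing one mutable list between calls, and it takes df as a required argument rather than binding it when the function is defined. The small diamonds frame is made-up stand-in data so the example runs without downloading anything, and the column parameter is a hypothetical addition, not part of the original function.

```python
import pandas as pd

# Toy stand-in for the diamonds dataset (values made up for illustration).
diamonds = pd.DataFrame({
    "cut": ["Ideal", "Premium", "Good", "Very Good", "Good", "Fair"],
    "price": [326, 326, 327, 336, 335, 337],
})

def makenewdf(df, cuts=None, column="cut"):
    """Return the rows of df whose `column` value is one of `cuts`."""
    # Fall back to the defaults from the question when no list is given.
    if cuts is None:
        cuts = ["Ideal", "Premium"]
    # isin() builds a boolean mask; indexing with it keeps the True rows.
    return df[df[column].isin(cuts)]

good_only = makenewdf(diamonds, cuts=["Good"])
default_cuts = makenewdf(diamonds)
print(good_only)
print(default_cuts)
```

The original row index is preserved in the result; call .reset_index(drop=True) on it if you want the rows renumbered from 0.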

CodePudding user response：

Like this: filtered_df = df[df['cut'].isin(['Ideal', 'Premium'])]
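
To see why that one-liner works, it may help to look at the intermediate mask. This is a small sketch on a made-up Series rather than the real dataset: isin() returns a boolean Series, indexing with it keeps the matching rows, and ~ inverts it to keep everything else.

```python
import pandas as pd

# Made-up values standing in for the 'cut' column.
cut = pd.Series(["Ideal", "Good", "Premium", "Fair"], name="cut")

# True where the value is one of the requested categories.
mask = cut.isin(["Ideal", "Premium"])

print(mask.tolist())        # [True, False, True, False]
print(cut[mask].tolist())   # ['Ideal', 'Premium']
print(cut[~mask].tolist())  # ['Good', 'Fair']
```

Values in the list that never occur in the column simply match nothing, so a misspelled category yields an empty result rather than an error.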