How can I filter rows into one subset per category with a for loop

Time:10-15

How can I replace these lines of code with a for loop? I've been trying to condense them but couldn't get it to work. I think it could be done with a groupby and a dictionary, but I couldn't work out how.

df_belgium = df_sales[df_sales["Country"]=="Belgium"]
df_norway = df_sales[df_sales["Country"]=="Norway"]
df_portugal = df_sales[df_sales["Country"]=="portugal"]

CodePudding user response:

The most straightforward way would be to loop through ["Belgium","Norway","portugal"], but creating objects with dynamically generated variable names like df_{country_name} is highly discouraged, so I would recommend creating a dictionary that stores your subset dataframes with the country names as keys.

You can use a dict comprehension:

df_sales_by_country = {country_name: df_sales[df_sales["Country"]==country_name] for country_name in ["Belgium","Norway","portugal"]}
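As a minimal sketch of how that dictionary behaves, here is the same comprehension run on made-up data. The "Amount" column and the sample rows are assumptions for illustration; only the "Country" column comes from the question. Note that the == comparison is case-sensitive, so a key such as "portugal" only matches rows spelled in lowercase.

```python
import pandas as pd

# Hypothetical sample data; only the "Country" column comes from the question.
df_sales = pd.DataFrame({
    "Country": ["Belgium", "Norway", "Portugal", "Belgium", "France"],
    "Amount": [10, 20, 30, 40, 50],
})

countries = ["Belgium", "Norway", "Portugal"]

# One boolean-mask filter per country, stored under the country name.
df_sales_by_country = {
    country_name: df_sales[df_sales["Country"] == country_name]
    for country_name in countries
}

# Each value is an ordinary DataFrame, so the usual operations apply.
belgium_total = df_sales_by_country["Belgium"]["Amount"].sum()
```

Looking a subset up by key, e.g. df_sales_by_country["Norway"], replaces the separate df_norway variable from the question.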

CodePudding user response:

Ideally, use groupby and store the sub-DataFrames in a dictionary:

# list() materializes the (key, sub-DataFrame) pairs before building the dict
d = dict(list(df.groupby('Country')))

Then access d['Belgium'] for example.

If you need to filter a subset of the countries:

# use a set for efficiency
keep = {'Belgium', 'Norway', 'Portugal'}

d = {key: g for key, g in df.groupby('Country') if key in keep}

or:

keep = ['Belgium', 'Norway', 'Portugal']

d = dict(list(df[df['Country'].isin(keep)].groupby('Country')))
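To make the groupby variants concrete, here is a small self-contained sketch on made-up data (the "Amount" column and the rows are assumptions, not from the question). It also shows one difference from filtering with == per country: groupby only produces keys for countries that actually occur in the data, so a country listed in keep but absent from the frame gets no entry at all rather than an empty DataFrame.

```python
import pandas as pd

# Hypothetical sample data; "Portugal" is deliberately absent.
df = pd.DataFrame({
    "Country": ["Belgium", "Norway", "Belgium", "France"],
    "Amount": [10, 20, 40, 50],
})

keep = {"Belgium", "Norway", "Portugal"}

# Iterating a groupby yields (key, sub-DataFrame) pairs;
# keep only the groups whose key is wanted.
d = {key: g for key, g in df.groupby("Country") if key in keep}

# Alternative: drop unwanted rows first with isin, then group what is left.
filtered = df[df["Country"].isin(sorted(keep))]
d2 = {key: g for key, g in filtered.groupby("Country")}

# Use .get() when a requested country may be missing from the data.
portugal = d.get("Portugal")  # None here, since no row has that country
```

If an empty DataFrame per missing country is preferable, the per-country == filter from the first answer gives that behaviour instead.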