How can I subset these lines of code with a for loop?
I'm trying to replace these repeated lines with a loop, but I couldn't. I think it could be done with a groupby and a dictionary, but I couldn't get it to work.
df_belgium = df_sales[df_sales["Country"]=="Belgium"]
df_norway = df_sales[df_sales["Country"]=="Norway"]
df_portugal = df_sales[df_sales["Country"]=="portugal"]
CodePudding user response:
The most straightforward way would be to loop through ["Belgium","Norway","portugal"], but creating variables with dynamically generated names such as df_{country_name} is highly discouraged, so I would recommend creating a dictionary to store your subset DataFrames, with the country names as keys.
You can use a dict comprehension:
df_sales_by_country = {country_name: df_sales[df_sales["Country"]==country_name] for country_name in ["Belgium","Norway","portugal"]}
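To make the dict comprehension above concrete, here is a minimal sketch. The contents of df_sales are not shown in the question, so the sample rows and the "Amount" column below are made up for illustration. Note that the comparison is case-sensitive, so each name in the list has to match the values in the "Country" column exactly.

```python
import pandas as pd

# Hypothetical sample data; the real df_sales is not shown in the question.
df_sales = pd.DataFrame({
    "Country": ["Belgium", "Norway", "Portugal", "Belgium", "France"],
    "Amount": [10, 20, 30, 40, 50],
})

countries = ["Belgium", "Norway", "Portugal"]

# One filtered DataFrame per country, keyed by the country name.
df_sales_by_country = {
    country_name: df_sales[df_sales["Country"] == country_name]
    for country_name in countries
}

# Look up a subset by key instead of by a separate variable name.
belgium_total = df_sales_by_country["Belgium"]["Amount"].sum()
```

Each subset is then available as df_sales_by_country["Belgium"], df_sales_by_country["Norway"], and so on, which replaces the separate df_belgium, df_norway, ... variables.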
CodePudding user response:
The ideal approach is to use groupby and store the sub-DataFrames in a dictionary:
d = dict(list(df.groupby('Country')))
Then access d['Belgium'], for example.
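As a rough sketch of the groupby approach, assuming a small made-up DataFrame (the sample rows and the "Amount" column are hypothetical): iterating over a groupby object yields (key, sub-DataFrame) pairs, which can be collected into a dictionary. If only one group is needed, get_group returns it directly.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's DataFrame.
df = pd.DataFrame({
    "Country": ["Belgium", "Norway", "Portugal", "Belgium", "France"],
    "Amount": [10, 20, 30, 40, 50],
})

grouped = df.groupby("Country")

# Iterating a groupby yields (key, sub-DataFrame) pairs.
d = {key: group for key, group in grouped}

belgium = d["Belgium"]

# Alternative when a single group is enough: no dictionary required.
belgium_direct = grouped.get_group("Belgium")
```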
If you need to filter a subset of the countries:
# use a set for efficiency
keep = {'Belgium', 'Norway', 'Portugal'}
d = {key: g for key, g in df.groupby('Country') if key in keep}
or:
keep = ['Belgium', 'Norway', 'Portugal']
d = dict(list(df[df['Country'].isin(keep)].groupby('Country')))
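The two filtering variants above should give the same result. The sketch below checks that on made-up data (the sample rows and the "Amount" column are assumptions, not from the question). The first variant groups every country and discards unwanted groups afterwards; the second drops the unwanted rows before grouping.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's DataFrame.
df = pd.DataFrame({
    "Country": ["Belgium", "Norway", "Portugal", "Belgium", "France"],
    "Amount": [10, 20, 30, 40, 50],
})

keep = ["Belgium", "Norway", "Portugal"]
keep_set = set(keep)  # set membership tests are O(1)

# Variant 1: group everything, keep only the wanted keys.
d_filtered = {key: g for key, g in df.groupby("Country") if key in keep_set}

# Variant 2: filter the rows first, then group what is left.
d_isin = dict(list(df[df["Country"].isin(keep)].groupby("Country")))

# Both dictionaries hold the same sub-DataFrames.
same = all(d_filtered[k].equals(d_isin[k]) for k in keep)
```

Filtering first with isin may be preferable when only a few countries are needed from a large DataFrame, since the groupby then has fewer rows to process.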