I have a sample dataframe of sales:
product_category | state | total_revenue |
---|---|---|
macbook | New York | 2799 |
macbook | California | 3200 |
macbook | Florida | 5400 |
iphone | California | 700 |
iphone | Texas | 1500 |
I would like to loop over the states in my data frame and create a separate data frame for each one:
state_lst = ['California', 'New York', 'Texas', 'Florida']
I know the long way would be to write out a separate filtering step for each state:
california_df = df[df['state'] == 'California']
But I am looking for an efficient way to create a separate dataframe for each state:
for state in state_lst:
    state_df = df[df['state'] == state]
    state_df.groupby(['product_category'])[['total_revenue']].sum().reset_index()
My desired output is a separate dataframe for each state, with total revenue summed by product category within that state.
Any suggestions?
CodePudding user response:
This will give you a list of dataframes, each containing the rows for a single state:
df_list = [df[df.state == unique_state] for unique_state in df.state.unique()]
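If you also want to look each result up by state name and apply the per-category sum from the question, a dictionary may be more convenient than a list. This is only a sketch: it rebuilds the sample data from the question, and the names `sample_df` and `state_frames` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
sample_df = pd.DataFrame({
    "product_category": ["macbook", "macbook", "macbook", "iphone", "iphone"],
    "state": ["New York", "California", "Florida", "California", "Texas"],
    "total_revenue": [2799, 3200, 5400, 700, 1500],
})

# One aggregated dataframe per state, keyed by state name
state_frames = {
    state: (
        sample_df[sample_df["state"] == state]
        .groupby("product_category", as_index=False)["total_revenue"]
        .sum()
    )
    for state in sample_df["state"].unique()
}

print(state_frames["California"])
```

Each value is a small dataframe with `product_category` and `total_revenue` columns, so `state_frames["Texas"]` replaces a hand-written `texas_df`.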
CodePudding user response:
Maybe this: aggregate by state and product category first, then split the result by state:
df = df.groupby(['state', 'product_category'])[['total_revenue']].sum().reset_index()
[y for x, y in df.groupby('state', as_index=False)]
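The same idea can keep the state names attached by collecting the groups into a dictionary instead of a list. A rough sketch, assuming the sample data from the question (`sample_df`, `totals` and `frames_by_state` are illustrative names, not from the original post):

```python
import pandas as pd

# Sample data copied from the question
sample_df = pd.DataFrame({
    "product_category": ["macbook", "macbook", "macbook", "iphone", "iphone"],
    "state": ["New York", "California", "Florida", "California", "Texas"],
    "total_revenue": [2799, 3200, 5400, 700, 1500],
})

# Aggregate once over both keys
totals = sample_df.groupby(
    ["state", "product_category"], as_index=False
)["total_revenue"].sum()

# Iterating a groupby yields (key, sub-dataframe) pairs;
# reset_index(drop=True) renumbers each sub-dataframe from 0
frames_by_state = {
    state: group.reset_index(drop=True)
    for state, group in totals.groupby("state")
}

print(frames_by_state["California"])
```

This aggregates the whole frame once and only then splits it, so no per-state filtering is needed.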