I'm trying to create a dataframe of sales data from 45 different stores. I saved the data as dataframes in a list: So I have a list of length 45, each element beeing an dataframe of 143 numbers.
What I want: Create a dataframe with an index-column(date) and a seperate column for each store.
What's the easiest way to achieve my goal?
store1 = df_sales_shops[df_sales_shops['Store'] == 1]
index = pd.DataFrame(store1.index)
index = index.drop_duplicates()
stores_revenue = []
for store in range(45):
temp = df_sales_shops[df_sales_shops['Store'] == store 1]
stores_revenue.append(pd.DataFrame(temp.groupby(['Date']).Weekly_Sales.sum()))
stores_revenue[store].reset_index(drop=True, inplace=True)
print(stores_revenue)
print(index)
Output stores revenue[0]:
[ Weekly_Sales
0 1643690.90
1 1641957.44
2 1611968.17
3 1409727.59
4 1554806.68
.. ...
138 1437059.26
139 1670785.97
140 1573072.81
141 1508068.77
142 1493659.74
[143 rows x 1 columns],
...
CodePudding user response:
IIUC, the code below should be equivalent of your loop and what you expect:
out = df[df['Store'].between(1, 45)].groupby(['Store', 'Date'])['Weekly_Sales'].sum() \
.unstack(level='Store').reset_index(drop=True) \
.rename_axis(columns=None).add_prefix('Store')