Sum up data based on multiple factors in a dataframe

Time:10-26

I have a data frame of different stores, each with multiple departments (numbered 1-99, but the set varies from store to store). I want to sum up the revenue of all departments for each store for each week. Is there a more elegant way than using for loops and if statements? I'm using Python with pandas.

Here's a photo of the table: [image]

import pandas as pd

# Join the store metadata and the weekly features onto the sales records
merged = walmart.merge(stores, how='left').merge(features, how='left')
testing_merged = testing.merge(stores, how='left').merge(features, how='left')
# Collect the columns of interest into a single data frame
df = pd.DataFrame(data={"Store": merged.Store, "Dept": merged.Dept, "Date": merged.Date, "Weekly_Sales": merged.Weekly_Sales, "IsHoliday": merged.IsHoliday,
                    "Type": merged.Type, "Size": merged.Size, "Temperatur": merged.Temperature, "Fuel_Price": merged.Fuel_Price,
                    "MarkDown1": merged.MarkDown1, "MarkDown2": merged.MarkDown2, "MarkDown3": merged.MarkDown3, "MarkDown4": merged.MarkDown4,
                    "MarkDown5": merged.MarkDown5, "CPI": merged.CPI, "Unemployment": merged.Unemployment})

CodePudding user response:

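Use groupby, like the following. Grouping by store and week while leaving the department out of the keys, then summing Weekly_Sales, gives the total over all departments (assuming each Date value corresponds to one week):

df.groupby(['Store', 'Date'])['Weekly_Sales'].sum()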

Documentation : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html
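
To see the grouping on something small, here is a minimal sketch. It assumes the column names from the question (Store, Dept, Date, Weekly_Sales) and that each Date value marks one week; the sample values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical toy data using the question's column names (values are invented).
df = pd.DataFrame({
    "Store": [1, 1, 1, 1, 2, 2],
    "Dept": [1, 2, 1, 2, 1, 3],
    "Date": ["2010-02-05", "2010-02-05", "2010-02-12", "2010-02-12",
             "2010-02-05", "2010-02-05"],
    "Weekly_Sales": [100.0, 50.0, 80.0, 20.0, 10.0, 5.0],
})

# Group by store and week (Dept is deliberately left out of the keys),
# then sum Weekly_Sales across all departments in each group.
# as_index=False keeps Store and Date as ordinary columns in the result.
weekly_totals = (
    df.groupby(["Store", "Date"], as_index=False)["Weekly_Sales"]
      .sum()
)

print(weekly_totals)
```

The result has one row per store and week, so no for loops or if statements are needed. If Dept were included in the groupby keys, each department would stay in its own row and nothing would be summed across departments.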
