Home > Back-end >  Sales per month per product id
Sales per month per product id

Time:12-29

I want to have a list of all product sales from 2020-01-01 until 2020-12-01 for sales of each id_main_product (1-100). Would it be better to first create some product groups (e.g. id_main_product from 1-10,11-20 etc.) as I think that the output would be a big table ?

I have already tried out some things with groupby etc. but I can´t find the solution.

This is the dataset:

casesalesdata = pd.read_csv("https://raw.githubusercontent.com/fredzett/Notebooks/master/01_case_sales_data.csv")

CodePudding user response:

First, you should change the data type of date column to datetime and format as %Y-%m to use later in pivot. Then, you can use pd.pivot.

casesalesdata.date=pd.to_datetime(casesalesdata.date)
casesalesdata.date=casesalesdata.date.dt.strftime('%Y-%m')
pd.pivot_table(casesalesdata,index=["date","id_main_product"],
               values=["items"],
               aggfunc='sum')

You may also refer to Group by Year and Month Panda Pivot Table

CodePudding user response:

  • it's not clear the analysis you wish to complete
  • first step is filter to rows you want
  • you can then generate summaries with groupby() and agg()
casesalesdata = pd.read_csv(
    "https://raw.githubusercontent.com/fredzett/Notebooks/master/01_case_sales_data.csv"
)

casesalesdata["date"] = pd.to_datetime(casesalesdata["date"])
# just dates and wanted products
df = casesalesdata.loc[
    casesalesdata["date"].between("2020-01-01", "2020-12-01")
    & casesalesdata["id_main_product"].between(1, 100)
]

df.groupby([df["date"].dt.month, "id_main_product"]).agg(
    {
        "items": "sum",
        "margin_perc": "mean",
        "avg_price_per_item": "mean",
        "rating": "mean",
    }
)


  • Related