Home > Software engineering >  Pivot table and group by month
Pivot table and group by month

Time:03-30

I have a DataFrame df as follows:

df = pd.DataFrame([["12", "10-01-2022", 'boot', "shoe", 100, 50],
                   ["211", "10-01-2022", 'sandal', "shoe", 210, 20],
                   ["321", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["413", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["15", "10-02-2022", 'dress', "cloth", 155, 95],
                   ["633", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["247", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["8787", "10-04-2022", 'boot', "shoe", 120, 45],
                   ["9232", "10-05-2022", 'shirt', "cloth", 75, 30],
                   ["12340", "10-05-2022", 'dress', "cloth", 175, 95 ]],
                  columns=["count", "date", "name", "category", "price", "revenue"])

I need to aggregate by month to see the sums of count, price and revenue as in:

|name  | category |Count                        | price                       | revenue            |      
|      |          | Jan | Feb | Mar | Apr | Mai | Jan | Feb | Mar | Apr | Mai |Jan | Feb | Mar | Apr | Mai |
|boot  | shoe     | 12  | 734 | 880 | 8787|  -  | 100 | 100 | 75  | 120 | -   | 50 | 45  | 30  | 45 |-|
|sandal| shoe     | 211 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
|dress | cloth    | -   | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
|shirt | cloth    | -   | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

How can I do that?

CodePudding user response:

Try this:

df = pd.DataFrame([["12", "10-01-2022", 'boot', "shoe", 100, 50],
                   ["211", "10-01-2022", 'sandal', "shoe", 210, 20],
                   ["321", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["413", "10-02-2022", 'boot', "shoe", 100, 45],
                   ["15", "10-02-2022", 'dress', "cloth", 155, 95],
                   ["633", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["247", "10-03-2022", 'boot', "shoe", 75, 30],
                   ["8787", "10-04-2022", 'boot', "shoe", 120, 45],
                   ["9232", "10-05-2022", 'shirt', "cloth", 75, 30],
                   ["12340", "10-05-2022", 'dress', "cloth", 175, 95 ]],
                  columns=["count", "date", "name", "category", "price", "revenue"])

['count'] = df['count'].astype(int)
df['month'] = pd.to_datetime(df['date']).dt.strftime('%b')
df.groupby(['category', 'name', 'month'])[['count', 'revenue', 'price']].sum().unstack(fill_value=0)

Output:

                 count revenue price
month              Oct     Oct   Oct
category name                       
cloth    dress   12355     190   330
         shirt    9232      30    75
shoe     boot    10413     245   570
         sandal    211      20   210
  • Related