I have a dataframe df having sales by month:
customer product month revenue
sam A 2021-11 221
tim A 2021-12 220
mi. B 2021-10 213
harry A 2011-11 210
eric. A. 2021-10 213
Need the output:
For Product A:
customer product 2021-10. 2021-11. 2021-12
sam A 0 221 0
tim A 0 0 220
harry A 0 210 0
erIc A. 213 0 0
the format of month is period(M).Any way to do this in python using pivot_table function.There are lot of products so i need to specify the product name in the code.
CodePudding user response:
Are you looking for something like this :
import numpy as np
import pandas as pd
data = pd.DataFrame(columns=["customer","product","month","revenue"])
data=data.append({"customer":"eric.","product":20,"month":"2011-10","revenue":213},ignore_index=True)
data=data.append({"customer":"harry","product":20,"month":"2011-11","revenue":210},ignore_index=True)
data=data.append({"customer":"tim","product":20,"month":"2011-12","revenue":220},ignore_index=True)
data=data.append({"customer":"sam","product":20,"month":"2021-11","revenue":221},ignore_index=True)
data=data.append({"customer":"mi","product":30,"month":"2021-10","revenue":213},ignore_index=True)
pivot_df = pd.pivot_table(data,values=["revenue"],index=["customer","product"],columns=["month"],fill_value=0,aggfunc=np.sum)
pivot_df_filtered = pivot_df.query("product == 20")
CodePudding user response:
maybe this could help:
g_data = data.groupby(['product','month','customer'])['revenue'].sum().unstack('month').fillna(0)
>>> g_data
'''
month 2021-10 2021-11 2021-12
product customer
A eric. 213.0 0.0 0.0
harry 0.0 210.0 0.0
sam 0.0 221.0 0.0
tim 0.0 0.0 220.0
B mi. 213.0 0.0 0.0
'''
# filtering by product
prod_A = g_data.loc[('A'),]
>>> prod_A
'''
month 2021-10 2021-11 2021-12
customer
eric. 213.0 0.0 0.0
harry 0.0 210.0 0.0
sam 0.0 221.0 0.0
tim 0.0 0.0 220.0