I want to have a month wise distribution of total revenue and month on month cumulative revenue sorted according to the months of the year. A sample of the source is table given below
Bill_Date Sale_Net_Value
2021-01-01 220
2021-01-28 300
2021-02-03 250
Expected Output:
Month Total_Revnue cumilitive
Jan 520 520
Feb 250 770
CodePudding user response:
Using your sample table as the df variable below, here is my attempt:
df.index = pd.to_datetime(df['Bill_Date'],format='%Y-%m-%d')
df = df.groupby(pd.Grouper(freq='M')).sum().reset_index()
df['Bill_Date'] = pd.to_datetime(df['Bill_Date'],format='%Y-%m-%d').dt.strftime('%b')
df['Cumulative'] = df['Sale_Net_Value'].cumsum()
df=df.rename(columns={"Bill_Date":"Month", "Sale_Net_Value":"Total_Revenue"})
print(df)
Output:
Month Total_Revenue Cumulative
0 Jan 520 520
1 Feb 250 770
CodePudding user response:
You could group on your months and sum your 'Sale_Net_Value' column, and then create a new column on that using assign
and cumsum()
:
df['Bill_Date'] = pd.to_datetime(df.Bill_Date)
df.groupby(df.Bill_Date.dt.month_name().str[:3]).agg(
Total_revenue=('Sale_Net_Value','sum')
).assign(cumulative= lambda col: col['Total_revenue'].cumsum())
prints:
Total_revenue cumulative
Bill_Date
Jan 520 520
Mar 250 770