I have many columns in a table, but only one column that needs to be pivoted with its values. It looks like this:
OrderNumber Item YearMonth Total
1 1 2019_01 20
1 2 2019_01 40
1 1 2019_02 30
2 1 2019_02 50
The resulting output should be:
OrderNumber Item 2019_01 2019_02 Total
1 1 60 30 20
1 2 60 30 40
1 1 60 30 30
2 1 0 50 50
Basically, sum all the total for each month's order number while retaining all columns. Is there a way to do this? I'm using Pandas
CodePudding user response:
IIUC, you need a pivot_table
merge
:
out = (df
.merge(df.pivot_table(index='OrderNumber', columns='YearMonth',
values='Total', aggfunc='sum', fill_value=0),
on='OrderNumber')
#.drop(columns='YearMonth') # uncomment to drop unused 'YearMonth'
)
Output:
OrderNumber Item YearMonth Total 2019_01 2019_02
0 1 1 2019_01 20 60 30
1 1 2 2019_01 40 60 30
2 1 1 2019_02 30 60 30
3 2 1 2019_02 50 0 50