I am trying to take an excel data frame with the following format:
DOW Location 7/30/2022 8/6/2022 8/13/2022 8/20/2022
Volumes Saturday North 33 32 29 24
Volumes Saturday South 34 17 30 28
Volumes Sunday North 40 57 25 28
Volumes Sunday South 47 38 32 45
ACT Saturday North 750 1060 1066 1082
ACT Saturday South 545 509 1306 1121
ACT Sunday North 801 860 572 795
ACT Sunday South 622 526 711 491
and have it output in this format:
DOW_Location Location Date Volumes ACT
Volumes Saturday North North 7/30/2022 33 750
Volumes Sunday North North 7/31/2022 40 801
Volumes Saturday South South 7/30/2022 34 545
Volumes Sunday South South 7/31/2022 47 622
I figured I could do a df.pivot()
after creating a unique index by combining df['DOW'] df['Location']
and strip out the Weekdays from df['DOW']
to create a df['Category']
column. My problem is that I have no idea how to fill the values in a pivot when the values span across multiple columns.
So I'm not sure what would fill the 'values' argument here:
df = df.pivot(index='DOW_Location', columns='Category', values=?)
If a pivot is not the ideal solution here, what is another way?
CodePudding user response:
# split the DOW into columns
df[['col','day']]=df['DOW'].str.split(' ',expand=True)
# melt to bring the dates as rows
df2=df.melt(id_vars=['DOW','Location','col','day'], var_name='date')
#pivot and sum
(df2.pivot_table(index=['day','Location','date'],
columns=['col'],aggfunc=sum)
.reset_index())
day Location date value
col ACT Volumes
0 Saturday North 7/30/2022 750 33
1 Saturday North 8/13/2022 1066 29
2 Saturday North 8/20/2022 1082 24
3 Saturday North 8/6/2022 1060 32
4 Saturday South 7/30/2022 545 34
5 Saturday South 8/13/2022 1306 30
6 Saturday South 8/20/2022 1121 28
7 Saturday South 8/6/2022 509 17
8 Sunday North 7/30/2022 801 40
9 Sunday North 8/13/2022 572 25
10 Sunday North 8/20/2022 795 28
11 Sunday North 8/6/2022 860 57
12 Sunday South 7/30/2022 622 47
13 Sunday South 8/13/2022 711 32
14 Sunday South 8/20/2022 491 45
15 Sunday South 8/6/2022 526 38