I have a data set that looks like this:
Name Volume Period
1 oil 29000 Jun 21
2 gold 800 Mar 22
3 oil 21000 Jul 21
4 gold 1100 Sep 21
5 gold 3000 Feb 21
6 depower 3 Q1 21
7 oil 23000 Apr 22
8 czpower 26 Q1 23
9 oil 17000 Q1 21
10 gold 2400 May 21
11 oil 12000 Q2 21
12 gold 1800 Jan 22
13 czpower 21 Oct 21
14 api2coal 6000 Q1 22
15 api2coal 11000 Q1 21
16 depower 11 Jan 22
17 api2coal 16000 Jul 21
18 gold 1300 Mar 21
19 depower 3 Q1 22
20 oil 17000 Cal 21
I would like to reshape the data set so as to obtain a data frame with the following characteristics:
- The values in
Name
will become the new variables (columns); - The values in
Period
will become the indexes (should be unique); - The values in
Volume
are the sums of the values of each combination ofName
andPeriod
.
The resulting data frame should look similar to this:
Period api2coal czpower depower gold oil
1: Apr 21 241000 273 213 42500 299000
2: Apr 22 236000 360 234 55700 404000
3: Aug 21 210000 270 305 70000 283000
4: Aug 22 191000 307 236 63200 307000
5: Cal 21 605000 648 852 0 765000
6: Cal 22 710000 718 714 0 542000
7: Cal 23 813000 875 872 0 844000
8: Cal 24 642000 723 734 0 763000
9: Cal 25 109000 164 274 0 195000
10: Dec 21 229000 322 241 47000 254000
11: Dec 22 135000 232 243 52900 240000
12: Feb 21 474000 238 140 42500 292000
13: Feb 22 509000 272 141 45700 270000
14: Jan 21 210000 301 332 70300 122000
15: Jan 22 206000 236 229 54200 180000
16: Jan 23 350000 246 359 56700 208000
17: Jul 21 281000 430 371 43500 287000
18: Jul 22 355000 241 255 71200 160000
19: Jun 21 341000 182 156 39900 278000
20: Jun 22 180000 165 175 52100 348000
Can someone please give me a hint as to how to achieve this? Thank you in advance.
CodePudding user response:
Is this what you're looking for?
newDF = df.pivot_table("Volume", ["Period"], "Name")
CodePudding user response:
I found the solution after trying few things.
pd.pivot_table(df, values="Volume", index=["Period"], columns=["Name"], aggfunc=np.sum)