I currently have a multi index dataframe where two columns are acting as the indexes, column 'Date', and 'Product'. And two additional columns acting as values, columns 'Sales', 'Cost'. . However, I would like to convert one of the index columns as the header identifier, while the other index column acts as the row identifier.
df2 = pd.DataFrame({'Date':['10-31','10-31','10-31','12-31','12-31','12-31'],'Product':['Apple','Pear','Banana','Apple','Pear','Banana'], 'Sales':[1, 0.8, 1.2, 2, 0.9, 1.7], 'Cost':[0.5, 0.3, 0.6, 0.4, 0.4, 0.7]})
Setting 'Date' and 'Product' as index columns:
` df2.set_index(['Date', 'Product'], inplace=True)
df2 `
The multi index dataframe looks like this:
Date | Product | Sales | Cost |
---|---|---|---|
10-31 | Apple | 1 | 0.5 |
Pear | 0.8 | 0.3 | |
Banana | 1.2 | 0.6 | |
12-31 | Apple | 2 | 0.4 |
Pear | 0.9 | 0.4 | |
Banana | 1.7 | 0.7 |
But I want to transpose the 'Product' index column as the header identifier, whilst still keeping the 'Data' index column as the row identifier. Something like this:
| Apple | Pear | Banana |
Date | Sales | Cost | Sales | Cost | Sales | Cost |
---|---|---|---|---|---|---|
10-31 | 1 | 0.5 | 0.8 | 0.3 | 1.2 | 0.6 |
12-31 | 2 | 0.4 | 0.9 | 0.4 | 1.7 | 0.7 |
CodePudding user response:
Example
data = [[1.0, 0.5], [0.8, 0.3], [1.2, 0.6], [2.0, 0.4], [0.9, 0.4], [1.7, 0.7]]
idx = pd.MultiIndex.from_product([['10-31', '12-31'], ['Apple', 'Pear', 'Banana']])
df2 = pd.DataFrame(data, columns=['Sales', 'Cost'], index=idx)
df2
Sales Cost
10-31 Apple 1.0 0.5
Pear 0.8 0.3
Banana 1.2 0.6
12-31 Apple 2.0 0.4
Pear 0.9 0.4
Banana 1.7 0.7
Code
(df2.unstack().swaplevel(0, 1, axis=1).sort_index(axis=1)
.reindex(columns=['Sales', 'Cost'], level=1))
result
Product Apple Banana Pear
Sales Cost Sales Cost Sales Cost
Date
10-31 1.0 0.5 1.2 0.6 0.8 0.3
12-31 2.0 0.4 1.7 0.7 0.9 0.4
CodePudding user response:
Use DataFrame.set_index
with DataFrame.unstack
, then swap levels of MultiIndex
by DataFrame.swaplevel
and sort MultiIndex
by DataFrame.sort_index
with first level:
df2 = pd.DataFrame({'Date':['10-31','10-31','10-31','12-31','12-31','12-31'],'Product':['Apple','Pear','Banana','Apple','Pear','Banana'], 'Sales':[1, 0.8, 1.2, 2, 0.9, 1.7], 'Cost':[0.5, 0.3, 0.6, 0.4, 0.4, 0.7]})
df = (df2.set_index(['Date', 'Product'])
.unstack()
.swaplevel(0,1, axis=1)
.sort_index(axis=1, level=0, sort_remaining=False))
print (df)
Product Apple Banana Pear
Sales Cost Sales Cost Sales Cost
Date
10-31 1.0 0.5 1.2 0.6 0.8 0.3
12-31 2.0 0.4 1.7 0.7 0.9 0.4