Home > Net >  Converting or transposing Multi Index dataframe index column to header identifier
Converting or transposing Multi Index dataframe index column to header identifier

Time:12-12

I currently have a multi index dataframe where two columns are acting as the indexes, column 'Date', and 'Product'. And two additional columns acting as values, columns 'Sales', 'Cost'. . However, I would like to convert one of the index columns as the header identifier, while the other index column acts as the row identifier.

df2 = pd.DataFrame({'Date':['10-31','10-31','10-31','12-31','12-31','12-31'],'Product':['Apple','Pear','Banana','Apple','Pear','Banana'], 'Sales':[1, 0.8, 1.2, 2, 0.9, 1.7], 'Cost':[0.5, 0.3, 0.6, 0.4, 0.4, 0.7]})

Setting 'Date' and 'Product' as index columns:

` df2.set_index(['Date', 'Product'], inplace=True)

df2 `

The multi index dataframe looks like this:

Date Product Sales Cost
10-31 Apple 1 0.5
Pear 0.8 0.3
Banana 1.2 0.6
12-31 Apple 2 0.4
Pear 0.9 0.4
Banana 1.7 0.7

But I want to transpose the 'Product' index column as the header identifier, whilst still keeping the 'Data' index column as the row identifier. Something like this:

         | Apple              | Pear               | Banana   |
Date Sales Cost Sales Cost Sales Cost
10-31 1 0.5 0.8 0.3 1.2 0.6
12-31 2 0.4 0.9 0.4 1.7 0.7

CodePudding user response:

Example

data = [[1.0, 0.5], [0.8, 0.3], [1.2, 0.6], [2.0, 0.4], [0.9, 0.4], [1.7, 0.7]]
idx = pd.MultiIndex.from_product([['10-31', '12-31'], ['Apple', 'Pear', 'Banana']])
df2 = pd.DataFrame(data, columns=['Sales', 'Cost'], index=idx)

df2

                Sales   Cost
10-31   Apple   1.0     0.5
        Pear    0.8     0.3
        Banana  1.2     0.6
12-31   Apple   2.0     0.4
        Pear    0.9     0.4
        Banana  1.7     0.7

Code

(df2.unstack().swaplevel(0, 1, axis=1).sort_index(axis=1)
.reindex(columns=['Sales', 'Cost'], level=1))

result

Product Apple           Banana          Pear
        Sales   Cost    Sales   Cost    Sales   Cost
Date                        
10-31   1.0     0.5     1.2     0.6     0.8     0.3
12-31   2.0     0.4     1.7     0.7     0.9     0.4

CodePudding user response:

Use DataFrame.set_index with DataFrame.unstack, then swap levels of MultiIndex by DataFrame.swaplevel and sort MultiIndex by DataFrame.sort_index with first level:

df2 = pd.DataFrame({'Date':['10-31','10-31','10-31','12-31','12-31','12-31'],'Product':['Apple','Pear','Banana','Apple','Pear','Banana'], 'Sales':[1, 0.8, 1.2, 2, 0.9, 1.7], 'Cost':[0.5, 0.3, 0.6, 0.4, 0.4, 0.7]})

df = (df2.set_index(['Date', 'Product'])
      .unstack()
      .swaplevel(0,1, axis=1)
      .sort_index(axis=1, level=0, sort_remaining=False))
print (df)
Product Apple      Banana       Pear     
        Sales Cost  Sales Cost Sales Cost
Date                                     
10-31     1.0  0.5    1.2  0.6   0.8  0.3
12-31     2.0  0.4    1.7  0.7   0.9  0.4
  • Related