Home > Enterprise >  Split elements of a dataframe into sub-columns
Split elements of a dataframe into sub-columns

Time:09-02

I have a dataframe like so :


| -1W             | ME            | QE              | YE               |
|-----------------|---------------|-----------------|------------------|
| [119.06, 1.20]  | [120.27, 0.0] | [125.85, -5.58] | [80.53, 39.73]   |
| [207.00, 3.70]  | [210.71, 0.0] | [215.54, -4.83] | [119.74, 90.97]  |
| [313.32, 8.28]  | [321.61, 0.0] | [325.20, -3.59] | [203.13, 118.47] |
| [517.90, 10.87] | [528.77, 0.0] | [510.14, 18.63] | [395.14, 133.63] |

How could I separate the elements within this dataframe into two sub-column headers - one header for each element per column (kind of like a multi-indexed Dataframe)?

CodePudding user response:

Try this:

import pandas as pd
import numpy as np
   
d =[[ [119.06, 1.20]  , [120.27, 0.0] , [125.85, -5.58] , [80.53, 39.73] ],  
 [[207.00, 3.70]  , [210.71, 0.0] , [215.54, -4.83] , [119.74, 90.97]  ],
 [[313.32, 8.28]  , [321.61, 0.0] , [325.20, -3.59] , [203.13, 118.47] ],
 [[517.90, 10.87] , [528.77, 0.0] , [510.14, 18.63] , [395.14, 133.63]]]
  
df = pd.DataFrame(d, columns=['-1W','ME', 'QE', 'YE'])
dfe = df.explode(df.columns.tolist())
    
df_out = dfe.set_index(dfe.groupby(level=0).cumcount(), append=True).unstack()
print(df_out)

Output:

      -1W             ME           QE             YE        
        0      1       0    1       0      1       0       1
0  119.06    1.2  120.27  0.0  125.85  -5.58   80.53   39.73
1   207.0    3.7  210.71  0.0  215.54  -4.83  119.74   90.97
2  313.32   8.28  321.61  0.0   325.2  -3.59  203.13  118.47
3   517.9  10.87  528.77  0.0  510.14  18.63  395.14  133.63

CodePudding user response:

Lets do stack and unstack to reshape dataframe:

s = df.stack()
pd.DataFrame([*s], s.index).stack().unstack([1, 2])

      -1W             ME           QE             YE        
        0      1       0    1       0      1       0       1
0  119.06   1.20  120.27  0.0  125.85  -5.58   80.53   39.73
1  207.00   3.70  210.71  0.0  215.54  -4.83  119.74   90.97
2  313.32   8.28  321.61  0.0  325.20  -3.59  203.13  118.47
3  517.90  10.87  528.77  0.0  510.14  18.63  395.14  133.63

CodePudding user response:

Easily set names of sub-levels:

def split(col):
    return pd.DataFrame(col.tolist(), columns=pd.MultiIndex.from_tuples(
        [(col.name, 'sub1_' col.name), (col.name, 'sub2_' col.name)]
        ))

pd.concat([split(df[col]) for col in df], axis=1)
       -1W               ME              QE              YE        
  sub1_-1W sub2_-1W sub1_ME sub2_ME sub1_QE sub2_QE sub1_YE sub2_YE
0   119.06     1.20  120.27     0.0  125.85   -5.58   80.53   39.73
1   207.00     3.70  210.71     0.0  215.54   -4.83  119.74   90.97
2   313.32     8.28  321.61     0.0  325.20   -3.59  203.13  118.47
3   517.90    10.87  528.77     0.0  510.14   18.63  395.14  133.63
  • Related