How to split multiindex columns without creating 'nan' column name

Time:06-29

I have a data frame whose column labels are slash-delimited paths of varying depth, like the ones below (the data frame was flattened from a nested dictionary):

Index(['A/service1/service2/200',
       ....
       'D/service1/service2/500/std'],)

When I try to split the columns into a MultiIndex with this line of code:

df.columns = df.columns.str.split('/', expand=True)

it creates NaN column labels wherever a path is shorter than the longest one, as shown below. I can't rename or drop these NaN labels.

Index(['A','service1','service2','200', nan,
       ....
       'D','service1', 'service2', '500', 'std'],)
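
For reference, a minimal sketch that reproduces the padding, assuming pandas is imported as pd and using only the two column labels shown above (the elided ones are left out). With expand=True every label is padded to the depth of the longest path, so the four-part label gets NaN in the fifth level:

```python
import pandas as pd

# Only the two labels visible in the question; the rest are elided there.
cols = pd.Index(['A/service1/service2/200',
                 'D/service1/service2/500/std'])

# expand=True returns a MultiIndex whose depth matches the longest path
mi = cols.str.split('/', expand=True)

print(mi.nlevels)   # number of levels in the resulting MultiIndex
print(mi.tolist())  # the shorter path is padded with NaN at the end
```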

I intend to convert the data frame to a nested dictionary. Can anyone help?

CodePudding user response:

You can use a nested dictionary comprehension that splits the flattened keys into tuples, which avoids the NaN padding:

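import pandas as pd

# Column labels are flat, slash-delimited paths of differing depth
c = ['A/service1/service2/200',
     'D/service1/service2/500/std']

df = pd.DataFrame([[3296, 1000]], columns=c, index=['ts'])
print(df)

# For each row, split every column path into a tuple key (no NaN padding)
out = {k: {tuple(k1.split('/')): v1 for k1, v1 in v.items()}
       for k, v in df.to_dict('index').items()}
print(out)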
{'ts': {('A', 'service1', 'service2', '200'): 3296, 
        ('D', 'service1', 'service2', '500', 'std'): 1000}}
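
The comprehension above keys each value by a tuple of path parts. If a truly nested dictionary is the goal, one possible sketch is to walk each path and create the intermediate levels on the way. The helper name `nest` is hypothetical (it is not a pandas function), and the sketch assumes no column label is a prefix of another, for example both '.../500' and '.../500/std', since a key cannot hold a value and a sub-dictionary at the same time:

```python
import pandas as pd

def nest(flat, sep='/'):
    """Hypothetical helper: turn {'a/b/c': v} into {'a': {'b': {'c': v}}}."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(sep)
        node = nested
        for key in parents:
            # Create the intermediate level if it does not exist yet
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested

c = ['A/service1/service2/200',
     'D/service1/service2/500/std']
df = pd.DataFrame([[3296, 1000]], columns=c, index=['ts'])

# One nested dictionary per row of the data frame
out = {row: nest(values) for row, values in df.to_dict('index').items()}
print(out)
```

Because the paths are never expanded into a MultiIndex, no NaN labels are created along the way.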