Home > Back-end >  How to create hiearchical multi-index dataframe from multiple dataframes?
How to create hiearchical multi-index dataframe from multiple dataframes?

Time:05-11

I have multiple dataframes with identical shape, say:

d1 = {'time': [1,2,3,4], 'A': [55.5,55.5,55.5,55.5], 'B':[55.5,55.5,55.5,55.5]}
d2 = {'time': [1,2,3,4], 'A': [7,6,5,4], 'B':[9,8,7,6]}
d3 = {'time': [1,2,3,4], 'A': [1,2,3,4], 'B':[2,3,4,5]}

df1 = pd.DataFrame(data = d1)
df2 = pd.DataFrame(data = d2)
df3 = pd.DataFrame(data = d3)

They will always have the same column names and the entries in the "time" column will always be identical. I want to merge them in a way, that I get a hierarchical index, i.e.

index  frame     'time'    'A'       'B'
0      1         1         55.5      55.5
       2         1         7         9
       3         1         1         2
1      1         2         55.5      55.5
       2         2         6         8
       3         2         2         3
...

I have already experimented with .merge() and .join() and .concatenate() with various options, but I seem unable to wrap my head around it. What is the most elegant way to achieve this?

CodePudding user response:

IIUC, you can concat and sort the index:

dfs = [df1, df2, df3]

(pd
 .concat(dict(enumerate(dfs, start=1)), names=['frame', 'index'])
 .sort_index(level='index')
 .swaplevel()
)

output:

             time     A     B
index frame                  
0     1         1  55.5  55.5
      2         1   7.0   9.0
      3         1   1.0   2.0
1     1         2  55.5  55.5
      2         2   6.0   8.0
      3         2   2.0   3.0
2     1         3  55.5  55.5
      2         3   5.0   7.0
      3         3   3.0   4.0
3     1         4  55.5  55.5
      2         4   4.0   6.0
      3         4   4.0   5.0
  • Related