I have multiple dataframes with identical shape, say:
d1 = {'time': [1,2,3,4], 'A': [55.5,55.5,55.5,55.5], 'B':[55.5,55.5,55.5,55.5]}
d2 = {'time': [1,2,3,4], 'A': [7,6,5,4], 'B':[9,8,7,6]}
d3 = {'time': [1,2,3,4], 'A': [1,2,3,4], 'B':[2,3,4,5]}
df1 = pd.DataFrame(data = d1)
df2 = pd.DataFrame(data = d2)
df3 = pd.DataFrame(data = d3)
They will always have the same column names and the entries in the "time" column will always be identical. I want to merge them in a way, that I get a hierarchical index, i.e.
index frame 'time' 'A' 'B'
0 1 1 55.5 55.5
2 1 7 9
3 1 1 2
1 1 2 55.5 55.5
2 2 6 8
3 2 2 3
...
I have already experimented with .merge() and .join() and .concatenate() with various options, but I seem unable to wrap my head around it. What is the most elegant way to achieve this?
CodePudding user response:
IIUC, you can concat and sort the index:
dfs = [df1, df2, df3]
(pd
.concat(dict(enumerate(dfs, start=1)), names=['frame', 'index'])
.sort_index(level='index')
.swaplevel()
)
output:
time A B
index frame
0 1 1 55.5 55.5
2 1 7.0 9.0
3 1 1.0 2.0
1 1 2 55.5 55.5
2 2 6.0 8.0
3 2 2.0 3.0
2 1 3 55.5 55.5
2 3 5.0 7.0
3 3 3.0 4.0
3 1 4 55.5 55.5
2 4 4.0 6.0
3 4 4.0 5.0