I have 2 dataframes like:
df1
(index)
start_time user class
10/20/1998 0.13 worker
10/21/1998 1341 worker
10/22/1998 1303 worker
df2
(index)
start_time user class
10/20/1998 192 empl
10/21/1998 12 empl
10/23/1998 19 empl
And I want to merge them, completing the indexes that are missing/different as multiindex and keeping one column for both:
(multindex)
class start_time user
worker 10/20/1998 0.13
empl 10/20/1998 192
worker 10/21/1998 1341
empl 10/21/1998 12
worker 10/22/1998 1303
empl 10/23/1998 19
CodePudding user response:
Use concat
with DataFrame.set_index
and append=True
for MultiIndex
, for swapping levels is used DataFrame.swaplevel
:
df = pd.concat([df1.set_index('class', append=True),
df2.set_index('class', append=True)]).swaplevel(1, 0)
print (df)
user
class start_time
worker 10/20/1998 0.13
10/21/1998 1341.00
10/22/1998 1303.00
empl 10/20/1998 192.00
10/21/1998 12.00
10/23/1998 19.00
If ordering is important create DatetimeIndex
and sorting by second level:
df1.index = pd.to_datetime(df1.index)
df2.index = pd.to_datetime(df2.index)
df = (pd.concat([df1.set_index('class', append=True),
df2.set_index('class', append=True)])
.swaplevel(1, 0)
.sort_index(level=1, sort_remaining=False))
print (df)
user
class start_time
worker 1998-10-20 0.13
empl 1998-10-20 192.00
worker 1998-10-21 1341.00
empl 1998-10-21 12.00
worker 1998-10-22 1303.00
empl 1998-10-23 19.00
CodePudding user response:
You can try reset the start_time
index and then set the class
and start_time
columns as index.
df = pd.concat([df1.reset_index().set_index(['class', 'start_time']),
df2.reset_index().set_index(['class', 'start_time'])])
print(df)
user
class start_time
worker 10/20/1998 0.13
10/21/1998 1341.00
10/22/1998 1303.00
empl 10/20/1998 192.00
10/21/1998 12.00
10/23/1998 19.00