I have 2 dataframes to do with Covid-19
df_infect
Dates Australia Bahamas .......
1/22/20 0 0 .......
1/23/20 0 1 .......
and
df_death
Dates Australia Bahamas .......
1/22/20 0 0 .......
1/23/20 0 0 .......
I want to end up with a dataframe that is the combination of both like this,
df_combined
Australia Bahamas ......
Dates Infected Dead Infected Dead
1/22/20 0 0 0 0
1/23/20 0 0 1 0
I'm assuming there is some fancy merging you can do to the dataframes but I can't workout how you'd do it.
CodePudding user response:
You can merge on Dates
with appropriate suffixes; then split the column names to create MultiIndex columns:
out = pd.merge(df_infect, df_death, on='Dates', suffixes=('_infected','_dead')).set_index('Dates')
out.columns = out.columns.str.split('_', expand=True)
out = out.sort_index(level=[0,1], axis=1, ascending=[True, False])
Output:
Australia Bahamas
infected dead infected dead
Dates
1/22/20 0 0 0 0
1/23/20 0 0 1 0
CodePudding user response:
You can add a temporary column to each dataframe describing its type, then concatentate them, and pivot:
new_df = pd.concat([df_deaths.assign(type='Death'), df_infect.assign(type='Infected')]).pivot(index='Dates', columns='type')
Output:
>>> new_df
Australia Bahamas
type Death Infected Death Infected
Dates
1/22/20 0 0 0 0
1/23/20 0 0 0 1