I have following dataframes
df1
ind | stay | res |
---|---|---|
AB10 | yes | yes |
AB11 | no | yes |
df2
ind | date | c |
---|---|---|
AB10 | 22-01 | 1 |
AB10 | 24-01 | 2 |
AB11 | 20-01 | 1 |
AB11 | 21-01 | 2 |
AB11 | 22-01 | 3 |
I want following dataframe as a final one
ind | date | c | stay | res |
---|---|---|---|---|
AB10 | yes | yes | ||
AB10 | 22-01 | 1 | ||
AB10 | 24-01 | 2 | ||
AB11 | no | yes | ||
AB11 | 20-01 | 1 | ||
AB11 | 21-01 | 2 | ||
AB11 | 22-01 | 3 |
How to get this using pandas?
CodePudding user response:
This will do what your question asks:
df1 = pd.concat([
df1.reindex(columns=list(df2.columns) list(df1.columns)[1:]),
df2.reindex(columns=list(df2.columns) list(df1.columns)[1:])]).sort_values('ind')
Output:
ind date c stay res
0 AB10 NaN NaN yes yes
0 AB10 22-01 1.0 NaN NaN
1 AB10 24-01 2.0 NaN NaN
1 AB11 NaN NaN no yes
2 AB11 20-01 1.0 NaN NaN
3 AB11 21-01 2.0 NaN NaN
4 AB11 22-01 3.0 NaN NaN
If you want empty strings instead of NaN
values, append .fillna('')
:
ind date c stay res
0 AB10 yes yes
0 AB10 22-01 1.0
1 AB10 24-01 2.0
1 AB11 no yes
2 AB11 20-01 1.0
3 AB11 21-01 2.0
4 AB11 22-01 3.0
A slightly shorter alternative would be to use concat()
first and then order the columns:
df1 = pd.concat([df1, df2])[list(df2.columns) list(df1.columns)[1:]].sort_values('ind').fillna('')
CodePudding user response:
I believe what you are looking for is a pd.concat()
for your dataframes
df_con = pd.concat([df1, df2])
df_con.sort_values('ind').fillna('')