I have N
no of df
. Every df
has two columns, the first column is always constant with a date
and another column name is dynamic. So for example
df1
date A
2017-12-31 5
2020-12-31 7
2021-12-31 8
df2
date B
2012-12-31 9
2020-12-31 6
and so on. So I want to have a resultant df
which will have all the date
along with the columns side-by-side like this
result_df
date A B ...
2012-12-31 - 9 ...
2017-12-31 5 - ...
2020-12-31 7 6 ...
2021-12-31 8 - ...
Special case - I have few empty dfs too. I would like that to merge into result_df
, for example an empty df ['date', 'C']
will be added like this
result_df
date A B C ...
2012-12-31 - 9 - ...
2017-12-31 5 - - ...
2020-12-31 7 6 - ...
2021-12-31 8 - - ...
CodePudding user response:
We could merge
assign
(for columns "C"):
out = df1.merge(df2, on='date', how='outer').sort_values(by='date').assign(C=float('nan'))
or concat
groupby
first
assign
:
out = pd.concat((df1, df2)).groupby('date', as_index=False).first().assign(C=float('nan'))
Output:
date A B C
0 2012-12-31 NaN 9.0 NaN
1 2017-12-31 5.0 NaN NaN
2 2020-12-31 7.0 6.0 NaN
3 2021-12-31 8.0 NaN NaN
CodePudding user response:
You might want to look at pd.merge: https://pandas.pydata.org/docs/reference/api/pandas.merge.html in your case the following code should work
df1.merge(df2, on='date', how='outer')