This is my Data frame df1:
ID Date Value
0 9712 12/15/2021 15
1 9920 4/11/2021 5
2 9712 08/30/2021 10
3 9560 07/3/2021 25
4 9560 03/03/2021 20
My another data frame df2:
ID
0 9501
1 9502
2 9665
3 9500
4 9431
Now I need to extend the ID column of df1 with ID column of df2. Expected Output:
ID Date Value
0 9712 12/15/2021 15
1 9920 4/11/2021 5
2 9712 08/30/2021 10
3 9560 07/3/2021 25
4 9560 03/03/2021 20
5 9501
6 9502
7 9665
8 9500
9 9431
How can I achieve it?
CodePudding user response:
You can use concat
:
>>> pd.concat([df1,df2])
ID Date Value
0 9712 12/15/2021 15.0
1 9920 04/11/2021 5.0
2 9712 08/30/2021 10.0
3 9560 07/03/2021 25.0
4 9560 03/03/2021 20.0
0 9501 NaN NaN
1 9502 NaN NaN
2 9665 NaN NaN
3 9500 NaN NaN
4 9431 NaN NaN
Which will combine the dataframes with overlapping columns and return everything. Columns outside the intersection will be filled with NaN
values.
Edit:
Assume your df2 has also a date column:
ID Date
0 9501 1
1 9502 2
2 9665 3
3 9500 4
4 9431 5
You can very simply just select []
the Date column to concat:
>>> pd.concat([df,df2[['ID']]])
ID Date Value
0 9712 12/15/2021 15.0
1 9920 04/11/2021 5.0
2 9712 08/30/2021 10.0
3 9560 07/03/2021 25.0
4 9560 03/03/2021 20.0
0 9501 NaN NaN
1 9502 NaN NaN
2 9665 NaN NaN
3 9500 NaN NaN
4 9431 NaN NaN
So irrespectively of how many 'extra' columns you have, you can always just select the 'Date' column
CodePudding user response:
You can simply merge the two dataframes on the ID column with an outer join
df = df1.merge(df2, on='ID', how='outer')
if df2 contains more than one column but you wish to only use one, in this case the ID column you can select only that column like so
df = df1.merge(df2[['ID']], on='ID', how='outer')
this method of selecting just the ID column using
df2[['ID']]
will also work for the concat
method