Extend a column with another column values in python Data frame-CodePudding

This is my Data frame df1:

     ID      Date      Value
0   9712    12/15/2021  15
1   9920    4/11/2021   5
2   9712    08/30/2021  10
3   9560    07/3/2021   25
4   9560    03/03/2021  20

My another data frame df2:

Now I need to extend the ID column of df1 with ID column of df2. Expected Output:

     ID      Date      Value
0   9712    12/15/2021  15
1   9920    4/11/2021   5
2   9712    08/30/2021  10
3   9560    07/3/2021   25
4   9560    03/03/2021  20
5   9501    
6   9502    
7   9665    
8   9500    
9   9431

How can I achieve it?

CodePudding user response：

You can use concat:

>>> pd.concat([df1,df2])

    ID        Date  Value
0  9712  12/15/2021   15.0
1  9920  04/11/2021    5.0
2  9712  08/30/2021   10.0
3  9560  07/03/2021   25.0
4  9560  03/03/2021   20.0
0  9501         NaN    NaN
1  9502         NaN    NaN
2  9665         NaN    NaN
3  9500         NaN    NaN
4  9431         NaN    NaN

Which will combine the dataframes with overlapping columns and return everything. Columns outside the intersection will be filled with NaN values.

Edit:

Assume your df2 has also a date column:

     ID  Date
0  9501     1
1  9502     2
2  9665     3
3  9500     4
4  9431     5

You can very simply just select [] the Date column to concat:

>>> pd.concat([df,df2[['ID']]])

     ID        Date  Value
0  9712  12/15/2021   15.0
1  9920  04/11/2021    5.0
2  9712  08/30/2021   10.0
3  9560  07/03/2021   25.0
4  9560  03/03/2021   20.0
0  9501         NaN    NaN
1  9502         NaN    NaN
2  9665         NaN    NaN
3  9500         NaN    NaN
4  9431         NaN    NaN

So irrespectively of how many 'extra' columns you have, you can always just select the 'Date' column

CodePudding user response：

You can simply merge the two dataframes on the ID column with an outer join

df = df1.merge(df2, on='ID', how='outer')

if df2 contains more than one column but you wish to only use one, in this case the ID column you can select only that column like so

df = df1.merge(df2[['ID']], on='ID', how='outer')

this method of selecting just the ID column using

df2[['ID']]

will also work for the concat method