am trying to compare two dataframes based on different columns and assign a value to a dataframe based on it.
df1 :
date value1 value2
4/1/2021 A 1
4/2/2021 B 2
4/6/2021 C 3
4/4/2021 D 4
4/5/2021 E 5
4/6/2021 F 6
4/2/2021 G 7
df2:
Date percent
4/1/2021 0.1
4/2/2021 0.2
4/6/2021 0.6
output:
date value1 value2 per
4/1/2021 A 1 0.1
4/2/2021 B 2 0.2
4/6/2021 C 3 0.6
4/4/2021 D 4 0
4/5/2021 E 5 0
4/6/2021 F 6 0
4/2/2021 G 7 0.2
Code1:
df1['per'] = np.where(df1['date']==df2['Date'], df2['per'], 0)
error:
ValueError: Can only compare identically-labeled Series objects
Note: changed the column value of df2['Date] to df2['date] and then tried merging code2:
new = pd.merge(df1, df2, on=['date'], how='inner')
error:
ValueError: You are trying to merge on object and datetime64[ns] columns. If you wish to proceed you should use pd.concat
CodePudding user response:
df1['per']=df1['date'].map(dict(zip(df2['Date'], df2['percent']))).fillna(0)
date value1 value2 per
0 4/1/2021 A 1 0.1
1 4/2/2021 B 2 0.2
2 4/6/2021 C 3 0.6
3 4/4/2021 D 4 0.0
4 4/5/2021 E 5 0.0
5 4/6/2021 F 6 0.6
6 4/2/2021 G 7 0.2
CodePudding user response:
You could use pd.merge
and perform a left
join to keep all the rows from df1
and bring over all the date matching rows from df2
:
pd.merge(df1,df2,left_on='date',right_on='Date', how='left').fillna(0).drop('Date',axis=1)
Prints:
date value1 value2 percent
0 04/01/2021 A 1 0.1
1 04/02/2021 B 2 0.2
2 04/06/2021 C 3 0.6
3 04/04/2021 D 4 0.0
4 04/05/2021 E 5 0.0
5 04/06/2021 F 6 0.6
6 04/02/2021 G 7 0.2
*I think there's a typo on your penultimate row. percent
should be 0.6 IIUC.