I have two data frames with similar data, and I would like to substract matching values. Example :
df1:
Letter FREQ Diff
0 A 20 NaN
1 B 12 NaN
2 C 5 NaN
3 D 4 NaN
df2:
Letter FREQ
0 A 19
1 B 11
3 D 2
If we can find the same letter in the column "Letter", I would like to create a new column with the subtraction of the two frequency columns.
Expected output :
df1:
Letter FREQ Diff
0 A 20 1
1 B 12 1
2 C 5 5
3 D 4 2
I have tried to begin like this, but obviously it doesn't work
for i in df1.Letter:
for j in df2.Letter:
if i == j:
df1.Difference[j] == (df1.Frequency[i] - df2.Frequency[j])
else:
pass
Thank you for your help!
CodePudding user response:
Use df.merge
with fillna
:
In [1101]: res = df1.merge(df2, on='Letter', how='outer')
In [1108]: res['difference'] = (res.Frequency_x - res.Frequency_y).fillna(res.Frequency_x)
In [1110]: res = res.drop('Frequency_y', 1).rename(columns={'Frequency_x': 'Frequency'})
In [1111]: res
Out[1111]:
Letter Frequency difference
0 A 20 1.0
1 B 12 1.0
2 C 5 5.0
3 D 4 2.0
CodePudding user response:
Thank you very much for your help!