Home > Software engineering >  Comparing two data frames columns and addition of matching values
Comparing two data frames columns and addition of matching values

Time:11-28

I have two data frames with similar data, and I would like to substract matching values. Example :

df1:

    Letter  FREQ    Diff
0   A       20      NaN
1   B       12      NaN
2   C       5       NaN 
3   D       4       NaN

df2:

    Letter  FREQ
0   A       19
1   B       11
3   D       2

If we can find the same letter in the column "Letter", I would like to create a new column with the subtraction of the two frequency columns.

Expected output :

df1:

    Letter  FREQ    Diff
0   A       20      1
1   B       12      1
2   C       5       5   
3   D       4       2

I have tried to begin like this, but obviously it doesn't work

for i in df1.Letter:
    for j in df2.Letter:
        if i == j:
            df1.Difference[j] == (df1.Frequency[i] - df2.Frequency[j])
        else:
            pass

Thank you for your help!

CodePudding user response:

Use df.merge with fillna:

In [1101]: res = df1.merge(df2, on='Letter', how='outer')
In [1108]: res['difference'] = (res.Frequency_x - res.Frequency_y).fillna(res.Frequency_x)

In [1110]: res = res.drop('Frequency_y', 1).rename(columns={'Frequency_x': 'Frequency'})

In [1111]: res
Out[1111]: 
  Letter  Frequency  difference
0      A         20         1.0
1      B         12         1.0
2      C          5         5.0
3      D          4         2.0

CodePudding user response:

Thank you very much for your help!

  • Related