How to match the unique ids that I created in df1 to df2 based on two column values?

Time:05-07

I have two dataframes, and I am struggling to carry the unique IDs that I created in df1 over to df2 based on the 'Name' and 'Version' values. I need to add a column to df2, let's call it 'ID', whose values are the matching unique IDs from df1. A row in df2 should get an ID only when both its 'Name' and its 'Version' equal the 'Name' and 'Version' of a row in df1. The name/version combinations in df2 all come from df1, but they can be repeated.

import pandas as pd

df1 = pd.DataFrame(
    {
        'Unique ID': ['111', '222', '333', '444'],
        'Name': ['A', 'A', 'B', 'C'],
        'Version': ['1.1', '1.2', '1.0', '1.1'],
        'x': ['...', '...', '...', '...']
    }
)

DF1

   | UNIQUE ID | NAME | VERSION |  X  |
  1|     111   |   A  |     1.1 | ... |
  2|     222   |   A  |     1.2 | ... |
  3|     333   |   B  |     1.0 | ... |
  4|     444   |   C  |     1.1 | ... |

df2 = pd.DataFrame(
    {
        'Name': ['A', 'A', 'A', 'A', 'B'],
        'Version': ['1.1', '1.1', '1.1', '1.2', '1.0'],
        'x': ['...', '...', '...', '...', '...'],
        # Second placeholder column, renamed here because dict keys must be unique
        'x2': ['...', '...', '...', '...', '...']
    }
)

DF2

    | NAME | VERSION |  X  | X  |
  1 |  A   |  1.1    | ... |... |
  2 |  A   |  1.1    | ... |... | 
  3 |  A   |  1.1    | ... |... |  
  4 |  A   |  1.2    | ... |... |
  5 |  B   |  1.0    | ... |... |

Desired Output for DF2:

DF2

    | NAME | VERSION |   ID    | X  |  X |
  1 |  A   |  1.1    |   111   |... | ...|
  2 |  A   |  1.1    |   111   |... | ...|
  3 |  A   |  1.1    |   111   |... | ...|
  4 |  A   |  1.2    |   222   |... | ...|
  5 |  B   |  1.0    |   333   |... | ...|

Attempted code:


df2['ID'] = df1[df1['name' + '_' + 'version'].isin(df2['name' + '_' + 'version'])]['Unique ID'].values

CodePudding user response:

One way, a little bit dirty but it works, is to merge df2 with the ID, name and version columns of df1:

# Left-merge so every row of df2 is kept; the matching 'Unique ID' is added as 'ID'.
df2 = df2.merge(df1[['Unique ID', 'Name', 'Version']].rename(columns={'Unique ID': 'ID'}),
                on=['Name', 'Version'], how='left')
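
For reference, here is a minimal self-contained sketch of the merge approach on the sample data from the question. The placeholder 'x' columns are left out, and the names `lookup` and `result` are only illustrative. It assumes each (Name, Version) pair appears at most once in df1; the `validate='many_to_one'` argument makes pandas raise an error if that assumption does not hold.

```python
import pandas as pd

# Sample data from the question, without the placeholder 'x' columns.
df1 = pd.DataFrame(
    {
        'Unique ID': ['111', '222', '333', '444'],
        'Name': ['A', 'A', 'B', 'C'],
        'Version': ['1.1', '1.2', '1.0', '1.1'],
    }
)
df2 = pd.DataFrame(
    {
        'Name': ['A', 'A', 'A', 'A', 'B'],
        'Version': ['1.1', '1.1', '1.1', '1.2', '1.0'],
    }
)

# Lookup table (hypothetical name): one row per (Name, Version) pair,
# with the id column renamed to the desired 'ID'.
lookup = df1[['Name', 'Version', 'Unique ID']].rename(columns={'Unique ID': 'ID'})

# how='left' keeps every row of df2 in its original order.
# validate='many_to_one' raises a MergeError if a (Name, Version) pair
# occurs more than once in the lookup table.
result = df2.merge(lookup, on=['Name', 'Version'], how='left', validate='many_to_one')

print(result)
```

With a left merge, any row of df2 whose (Name, Version) pair is missing from df1 would get NaN in 'ID' instead of being dropped, which makes unmatched rows easy to spot.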