Home > front end >  Add column from another dataframe if two column matches
Add column from another dataframe if two column matches

Time:09-13

I am working with huge volume of data and trying to map values from two dataframe. Looking forward for better Time complexity.

Here I am trying to match Code from df2 which are in df1 and take MLC Code from df1 if values match.

df1

Code MLC Code
1 8
2 66
8 62
4 66

df2

Code
1
2
3
4
4
8

Result df

Code MLC Code
1 8
2 66
3 NA
4 62
4 NA
8 66

Here is the code I am using to perform this task but it take lot of time to compute.

for i, j in enumerate(df2["Code"]):
    for x, y in enumerate(df1["Code"]):
         if j == y:
            df2["MLC Code"][i] == df1["MLC Code"][x]

CodePudding user response:

Try this

df2.merge(df1[['Code', 'MLC Code']], how='left', on='Code')

CodePudding user response:

We can use cumcount with groupby create he sub-merge key

out = df2.assign(key = df2.groupby('Code').cumcount()).\
           merge(df1.assign(key = df1.groupby('Code').cumcount()),how='left')
Out[106]: 
   Code  key  MLC Code
0     1    0       8.0
1     2    0      66.0
2     3    0       NaN
3     4    0      66.0
4     4    1       NaN
5     8    0      62.0

CodePudding user response:

I will try to reproduce the process...

First import module and data

import pandas as pd

# Your sample data
data_1 = {'Code': [1,2,8,4], 'MLC Code': [8,66,62,66]}
data_2 = {'Code': [1,2,3,4,4,8]}

# Create Dataframes from your data
df1 = pd.DataFrame(data_1)
df2 = pd.DataFrame(data_2)

Use merge

df_out = pd.merge(df1, df2, how='right', left_on='Code', right_on='Code')

You will get this output:

    Code    MLC Code
0   1        8.0
1   2       66.0
2   3        NaN
3   4       66.0
4   4       66.0
5   8       62.0

If you want no Index you can do this:

df_out = pd.merge(df1, df2, how='right', left_on='Code', right_on='Code').set_index('Code')
    MLC Code
Code    
1   8.0
2   66.0
3   NaN
4   66.0
4   66.0
8   62.0

Also... The solution given by @alex does the job!!!!

  • Related