I am working with huge volume of data and trying to map values from two dataframe. Looking forward for better Time complexity.
Here I am trying to match Code from df2 which are in df1 and take MLC Code from df1 if values match.
df1
Code | MLC Code |
---|---|
1 | 8 |
2 | 66 |
8 | 62 |
4 | 66 |
df2
Code |
---|
1 |
2 |
3 |
4 |
4 |
8 |
Result df
Code | MLC Code |
---|---|
1 | 8 |
2 | 66 |
3 | NA |
4 | 62 |
4 | NA |
8 | 66 |
Here is the code I am using to perform this task but it take lot of time to compute.
for i, j in enumerate(df2["Code"]):
for x, y in enumerate(df1["Code"]):
if j == y:
df2["MLC Code"][i] == df1["MLC Code"][x]
CodePudding user response:
Try this
df2.merge(df1[['Code', 'MLC Code']], how='left', on='Code')
CodePudding user response:
We can use cumcount
with groupby
create he sub-merge key
out = df2.assign(key = df2.groupby('Code').cumcount()).\
merge(df1.assign(key = df1.groupby('Code').cumcount()),how='left')
Out[106]:
Code key MLC Code
0 1 0 8.0
1 2 0 66.0
2 3 0 NaN
3 4 0 66.0
4 4 1 NaN
5 8 0 62.0
CodePudding user response:
I will try to reproduce the process...
First import module and data
import pandas as pd
# Your sample data
data_1 = {'Code': [1,2,8,4], 'MLC Code': [8,66,62,66]}
data_2 = {'Code': [1,2,3,4,4,8]}
# Create Dataframes from your data
df1 = pd.DataFrame(data_1)
df2 = pd.DataFrame(data_2)
Use merge
df_out = pd.merge(df1, df2, how='right', left_on='Code', right_on='Code')
You will get this output:
Code MLC Code
0 1 8.0
1 2 66.0
2 3 NaN
3 4 66.0
4 4 66.0
5 8 62.0
If you want no Index you can do this:
df_out = pd.merge(df1, df2, how='right', left_on='Code', right_on='Code').set_index('Code')
MLC Code
Code
1 8.0
2 66.0
3 NaN
4 66.0
4 66.0
8 62.0
Also... The solution given by @alex does the job!!!!