I have two dataframes. df1 looks like this:
df2 looks like this:
I want to compare three columns in both dataframes: Application_ID, Task Type and Task Category. If a row matches on all three columns (as it does in the screenshots above), I want to create a column called Task_ID in df1 and set it to the value of Task_ID in df2.
In other words, if there is a match, Task_ID for df1 = 1234 (since the Task_ID for df2 is 1234). How do I do this? Any help is most welcome. Thanks in advance.
CodePudding user response:
Try something like this:
import pandas as pd

df1 = pd.DataFrame({
    'Overal PIA Status': ['In Progress'],
    'Task Type': ['Privacy Monitoring'],
    'Task Category': ['PIA Monitoring'],
    'Due Date': ['9/30/2022'],
    'Custodian': ['asdfghjkl'],
    'Application_ID': [1234]
})
df2 = pd.DataFrame({
    'Task Type': ['Privacy Monitoring'],
    'Task Category': ['PIA Monitoring'],
    'Task Title': ['Application PIA Not Started'],
    'Due Date': ['9/24/2022'],
    'Task Owner': ['asdfghjkl'],
    'Application_ID': [1234],
    'Task_ID': [5678]
})
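# Columns that must match between the two dataframes
key_cols = ['Application_ID', 'Task Type', 'Task Category']
# Compare df1 against df2 row by row (by position), so both are assumed to share the same row order
df1['Task_ID'] = [
    df2['Task_ID'].iloc[i]
    if i < len(df2) and tuple(df1[key_cols].iloc[i]) == tuple(df2[key_cols].iloc[i])
    else None
    for i in range(len(df1))
]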
print(df1)
Output:
   Overal PIA Status           Task Type   Task Category   Due Date  Custodian  Application_ID  Task_ID
0        In Progress  Privacy Monitoring  PIA Monitoring  9/30/2022  asdfghjkl            1234     5678
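A row-by-row comparison like this only finds a match when the matching rows sit at the same position in both dataframes. If that is not guaranteed, one option is to look the key up instead. Below is a minimal sketch of that idea, not a tested solution for your data: the extra rows and the names key_cols and lookup are made up for illustration, and it assumes each key combination appears at most once in df2.

```python
import pandas as pd

# Columns that must match (taken from the question)
key_cols = ['Application_ID', 'Task Type', 'Task Category']

# Hypothetical sample data: the matching row is at a different position in df2
df1 = pd.DataFrame({
    'Task Type': ['Privacy Monitoring', 'Audit'],
    'Task Category': ['PIA Monitoring', 'Annual Review'],
    'Application_ID': [1234, 9999],
})
df2 = pd.DataFrame({
    'Task Type': ['Other', 'Privacy Monitoring'],
    'Task Category': ['Other', 'PIA Monitoring'],
    'Application_ID': [1111, 1234],
    'Task_ID': [42, 5678],
})

# Map each (Application_ID, Task Type, Task Category) tuple in df2 to its Task_ID
lookup = dict(zip(
    df2[key_cols].itertuples(index=False, name=None),
    df2['Task_ID'],
))

# Look up every df1 row by its key; rows without a match get None (shown as NaN)
df1['Task_ID'] = [
    lookup.get(key)
    for key in df1[key_cols].itertuples(index=False, name=None)
]
print(df1)
```

If a key combination can occur more than once in df2, the dictionary keeps only the last Task_ID for it, so you would need to decide which one you want first.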
CodePudding user response:
I have not tested this, since the question does not include a sample dataset, but here is my solution using pd.merge. Assign the result back to df1 so that it gets the new Task_ID column:
df1 = pd.merge(df1, df2[['Application_ID', 'Task Type', 'Task Category', 'Task_ID']],
               on=['Application_ID', 'Task Type', 'Task Category'], how='left')
Hope it works!
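For anyone who wants to try the merge approach without the original data, here is a small self-contained sketch. The sample rows are invented (only the column names come from the question), and key_cols and merged are just illustrative names. It also passes validate='many_to_one', which makes pandas raise an error if a key combination is duplicated in df2, since duplicates there would otherwise silently add rows to the result.

```python
import pandas as pd

# Columns that must match (taken from the question)
key_cols = ['Application_ID', 'Task Type', 'Task Category']

# Hypothetical sample data: the second df1 row has no counterpart in df2
df1 = pd.DataFrame({
    'Task Type': ['Privacy Monitoring', 'Audit'],
    'Task Category': ['PIA Monitoring', 'Annual Review'],
    'Due Date': ['9/30/2022', '10/15/2022'],
    'Application_ID': [1234, 9999],
})
df2 = pd.DataFrame({
    'Task Type': ['Privacy Monitoring'],
    'Task Category': ['PIA Monitoring'],
    'Due Date': ['9/24/2022'],
    'Application_ID': [1234],
    'Task_ID': [5678],
})

# Keep only the key columns plus Task_ID from df2, so that other shared
# columns (such as 'Due Date') are not duplicated with _x/_y suffixes
merged = df1.merge(
    df2[key_cols + ['Task_ID']],
    on=key_cols,
    how='left',              # keep every df1 row, matched or not
    validate='many_to_one',  # fail loudly if df2 has duplicate keys
)
print(merged)
```

Rows of df1 without a match keep all their original values and get NaN in Task_ID, which is what how='left' is for.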