I have two dataframe(s): df1:
| from id | from group | to id | to group |
| 1 | A | 3 | B |
| 4 | B | 4 | X |
| 5 | F | 5 | J |
| 2 | B | 3 | A |
df2:
| From | To |
| A | B |
| F | J |
I want to filter out the values in df2, if present in the 'from group' and 'to group' columns of df1
Expected output:
| from id | from group | to id | to group |
| 4 | B | 4 | X |
| 2 | B | 3 | A |
I am looking for a flexible solution. A solution that would not change if there are changes to the values in the df2 columns.
CodePudding user response:
You can use .merge
with indicator=True
, then filter the df1
:
x = df1[
df1.merge(
df2,
left_on=["from group", "to group"],
right_on=["From", "To"],
indicator=True,
how="left",
)._merge.eq("left_only")
]
print(x)
Prints:
from id from group to id to group
1 4 B 4 X
3 2 B 3 A
CodePudding user response:
df_1 = pd.DataFrame(data={'from id': [1,4,5,2],
'from group': ['A', 'B', 'F', 'B'],
'to id': [3,4,5,3],
'to group': ['B', 'X', 'J', 'A']})
cols = list(df_1)
df_2 = pd.DataFrame(data={'From': ['A', 'F'], 'To': ['B', 'J']})
print(df_1)
print('-----')
print(df_2)
df_1 = df_1.merge(df_2, how='left', left_on=['from group', 'to group'], right_on=['From', 'To'])
df_1 = df_1[df_1['From'].isnull()][cols]
print(df_1)
CodePudding user response:
Quick edit to the answer I provided to your previous iteration of this question.
l1 = df2.values.tolist()
l2 = list(itertools.chain.from_iterable([itertools.permutations(i, r=2) for i in l1]))
df[[j not in l2 for j in list(zip(df['from group'], df['to group']))]]