I want to subset a DataFrame by two columns in different dataframes if the values in the columns are the same. Here is an example of df1 and df2:
df1
A
0 apple
1 pear
2 orange
3 apple
df2
B
0 apple
1 orange
2 orange
3 pear
I would like the output to be a subsetted df1 based upon the df2 column:
A
0 apple
2 orange
I tried df1 = df1[df1.A == df2.B] but get the following error:
ValueError: Can only compare identically-labeled Series objects
I do not want to rename the column in either.
What is the best way to do this? Thanks
CodePudding user response:
If need compare index values with both columns create Multiindex
and use Index.isin
:
df = df1[df1.set_index('A', append=True).index.isin(df2.set_index('B', append=True).index)]
print (df)
A
0 apple
2 orange