I have a reference dataframe (df_reference) with two columns of unique numbers. From another dataframe (df_data), I want to select the rows whose value in one column also appears in the reference dataframe. I tried:
df_new = df_data[df_data['ID'].isin(df_reference)]
However, this returns no rows. What am I doing wrong here?
CodePudding user response:
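From what I see, you are passing the whole dataframe to the .isin() method. Iterating over a dataframe yields its column labels, so the IDs are compared against the column names rather than the reference values. Pass the column instead: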
df_new = df_data[df_data['ID'].isin(df_reference['ID'])]
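To make the difference concrete, here is a minimal sketch on made-up data. The sample values and the extra column names ('other', 'value') are assumptions for illustration only; the question does not show the real dataframes.

```python
import pandas as pd

# Hypothetical sample data (not from the question).
df_reference = pd.DataFrame({"ID": [1, 2, 3], "other": [10, 20, 30]})
df_data = pd.DataFrame({"ID": [2, 3, 4, 5], "value": ["a", "b", "c", "d"]})

# Whole dataframe passed: the IDs are checked against the column labels
# ('ID', 'other'), so nothing matches.
mask_wrong = df_data["ID"].isin(df_reference)

# Single column passed: the IDs are checked against the reference IDs.
mask_right = df_data["ID"].isin(df_reference["ID"])
df_new = df_data[mask_right]

print(mask_wrong.any())
print(df_new)
```

With this sample data, only the rows with ID 2 and 3 are kept.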
CodePudding user response:
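Set the ID column as the index of the df_data dataframe, then keep only the rows whose index appears in the reference IDs. Passing df_reference['ID'] directly to .loc raises a KeyError as soon as one reference ID is missing from df_data, so build a boolean mask first:
df_data = df_data.set_index('ID')
matching_index = df_data.index.isin(df_reference['ID'])
df_new = df_data.loc[matching_index, :]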
This should solve the issue.
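A short sketch of the index-based approach, again on made-up data (the values and the 'value' column are assumptions). It also shows why label-based .loc lookup can fail when a reference ID is absent from df_data.

```python
import pandas as pd

# Hypothetical sample data (not from the question).
df_reference = pd.DataFrame({"ID": [1, 2, 3]})
df_data = pd.DataFrame({"ID": [2, 3, 4], "value": ["a", "b", "c"]})

indexed = df_data.set_index("ID")

# Label lookup fails here because ID 1 is not in the index.
try:
    indexed.loc[df_reference["ID"], :]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# A boolean mask on the index ignores missing IDs.
keep = indexed.index.isin(df_reference["ID"])
df_new = indexed.loc[keep, :]
print(df_new)
```

The mask version gives the same rows as the .isin() answer above, only with ID as the index instead of a regular column.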