Home > Blockchain >  Drop all rows id pandas df except ones mentioned in another df
Drop all rows id pandas df except ones mentioned in another df

Time:05-12

I have two dataframes, one containing lot of columns

df1:
id    age   topic     date   text
1     23    Student   1.1.   Lorem
2     19    Student   1.2.   Cupcake
20    19    Student   1.2.   Lorem Ipsum
190   21    Student   11.1.  Cupcake Ipsum

And one with two columns

df2: 
id   count
1    105
20   4843
31   361

What I'm trying to accomplish here is to drop all the rows from df1 that are not mentioned in df2 (their id is not there).

result_df:
id    age   topic     date   text
1     23    Student   1.1.   Lorem
20    19    Student   1.2.   Lorem Ipsum

I've tried this but it's not working:

result_df = df1.drop(df1[df1['id'] != df2['id']].index)

Could you please help me out?

CodePudding user response:

result_df = df1.merge(df2['id'])

Given:

df1:
    id  age    topic
0    1   23  Student
1    2   19  Student
2   20   19  Student
3  190   21  Student

df2:
   id  count
0   1    105
1  20   4843
2  31    361

Doing:

result_df = df1.merge(df2['id'])
print(result_df)

Output:

   id  age    topic
0   1   23  Student
1  20   19  Student

CodePudding user response:

You can do:

df1 = df1.set_index('id')
df2 = df2.set_index('id')
result_df = df1.loc[df1.index.intersection(df2.index)].reset_index()

Output: result_df

   id   age topic   date    text
0   1   23  Student 1.1.    Lorem
1   20  19  Student 1.2.    Lorem Ipsum
  • Related