Filter Pandas Dataframe using list, but making sure count of elements matches count in list

Time:12-07

So I have a list:

my_list = [1,1,2,3,4,4]

I have a dataframe that looks like this

col_1    col_2
a        1
b        1
c        2
d        3
e        3
f        4
g        4
h        4

I basically want a final dataframe like

col_1    col_2
a        1
b        1
c        2
d        3
f        4
g        4

Basically I can't use

my_df[my_df['col_2'].isin(my_list)]

since this keeps every row whose col_2 value appears in the list (here, all of them). I want the first rows that match each item in the list, with each value kept only as many times as it occurs in the list.
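A minimal reproduction of the problem may help here. The `pd.DataFrame` constructor call is an assumption, reconstructed from the sample table above, and `kept` and `wanted` are illustrative names:

```python
import pandas as pd
from collections import Counter

my_list = [1, 1, 2, 3, 4, 4]

# Assumed construction of the sample frame shown in the question
my_df = pd.DataFrame({
    "col_1": list("abcdefgh"),
    "col_2": [1, 1, 2, 3, 3, 4, 4, 4],
})

# isin only tests membership, so every row whose value is in the list survives
kept = my_df[my_df["col_2"].isin(my_list)]
print(len(kept))  # 8

# The number of rows wanted per value is how often that value occurs in the list
wanted = Counter(my_list)
print(wanted)
```

The filter returns all 8 rows, while the list asks for only 6 (two 1s, one 2, one 3, two 4s).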

CodePudding user response:

Use GroupBy.cumcount to add an occurrence counter to both the original DataFrame and a helper DataFrame built from the list, then filter with an inner join via DataFrame.merge:

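import pandas as pd

my_list = [1,1,2,3,4,4]

# helper DataFrame: number each list value's occurrences 0, 1, ...
df1 = pd.DataFrame({'col_2':my_list})
df1['g'] = df1.groupby('col_2').cumcount()
# same counter on the original rows (this adds a helper column 'g' to my_df)
my_df['g'] = my_df.groupby('col_2').cumcount()

# inner join on the shared columns col_2 and g, then drop the counter
df = my_df.merge(df1).drop('g', axis=1)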

print (df)
  col_1  col_2
0     a      1
1     b      1
2     c      2
3     d      3
4     f      4
5     g      4
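
The same idea can be wrapped in a reusable function that leaves the input frame untouched. This is only a sketch: the function name `filter_by_list_counts` and the temporary column `_occ` are hypothetical names chosen for illustration, and the sample frame is reconstructed from the question.

```python
import pandas as pd

def filter_by_list_counts(df, values, col):
    """Keep, per value, at most as many rows as that value occurs in `values`.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Helper frame: one row per list item, numbered within each value (0, 1, ...)
    helper = pd.DataFrame({col: values})
    helper["_occ"] = helper.groupby(col).cumcount()
    # Same numbering on a copy, so the caller's frame is not modified
    left = df.assign(_occ=df.groupby(col).cumcount())
    # The inner join keeps only (value, occurrence) pairs present in both frames
    return left.merge(helper, on=[col, "_occ"]).drop(columns="_occ")

# Sample data reconstructed from the question
my_df = pd.DataFrame({
    "col_1": list("abcdefgh"),
    "col_2": [1, 1, 2, 3, 3, 4, 4, 4],
})

result = filter_by_list_counts(my_df, [1, 1, 2, 3, 4, 4], "col_2")
print(result)
```

Because the join is an inner join, a value that occurs more often in the list than in the DataFrame simply yields fewer rows rather than raising an error.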