Pythonic way to drop rows where the length of a list in a column is X

I would like to drop the rows where a certain column contains a list of length X. What is the most Pythonic or efficient way to do this, instead of looping?

Code example:

import pandas as pd

data = {'column_1': ['1', '2', '3'] ,
    'column_2': [['A','B'], ['A','B','C'], ['A']], 
    "column_3": ['a', 'b', 'c']}

df = pd.DataFrame.from_dict(data)

Drop the rows where the length of the list is 3. In this case, the second row (index 1) should be deleted, since its list has length 3.

CodePudding user response:

Use Series.str.len to get the length of each list, then filter the rows with boolean indexing:

new_df = df[df["column_2"].str.len().ne(3)]


  column_1 column_2 column_3
0        1   [A, B]        a
2        3      [A]        c

Or, if you want to remove rows where the list length is greater than or equal to 3, keep only the rows whose list length is at most 2:

new_df = df[df["column_2"].str.len().le(2)]

# The boolean mask used by the ne(3) filter:
print(df["column_2"].str.len().ne(3))
#0     True
#1    False
#2     True
#Name: column_2, dtype: bool
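
One detail worth checking with this approach is what happens when a cell holds no list at all. The sketch below is a minimal illustration, assuming an extra fourth row with a missing value (that row is not in the question and is added only for demonstration). Series.str.len returns NaN for such a cell, and NaN compared with ne(3) gives True, so the row is kept unless it is excluded explicitly.

```python
import numpy as np
import pandas as pd

# Data from the question, plus one hypothetical row whose list is missing.
df = pd.DataFrame({
    "column_1": ["1", "2", "3", "4"],
    "column_2": [["A", "B"], ["A", "B", "C"], ["A"], np.nan],
    "column_3": ["a", "b", "c", "d"],
})

# Length of each list; NaN where the cell is missing.
lengths = df["column_2"].str.len()

# NaN != 3 evaluates to True, so the row with the missing list is kept.
kept = df[lengths.ne(3)]

# To drop rows with a missing list as well, also require a non-null length.
kept_strict = df[lengths.notna() & lengths.ne(3)]

print(kept.index.tolist())
print(kept_strict.index.tolist())
```

Whether missing cells should be kept or dropped depends on the data, so the second mask is only one possible choice.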

CodePudding user response:

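Use Series.apply with len to compute the length of each list. Note that le(2) keeps every row whose list has at most 2 items; use ne(3) instead to drop only the rows whose list has exactly 3 items: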

res = df[df["column_2"].apply(len).le(2)]
print(res)

Output

  column_1 column_2 column_3
0        1   [A, B]        a
2        3      [A]        c
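
If the same filter is needed for different columns or lengths, it could be wrapped in a small helper. The following is only a sketch: the function name drop_rows_with_list_length and its parameters are hypothetical and not part of pandas, and it assumes every cell in the column supports len().

```python
import pandas as pd


def drop_rows_with_list_length(frame, column, length):
    """Return a copy of `frame` without rows whose `column` value has `length` items."""
    # Assumes every cell in `column` supports len(); a missing value would raise.
    lengths = frame[column].apply(len)
    return frame[lengths.ne(length)].copy()


df = pd.DataFrame({
    "column_1": ["1", "2", "3"],
    "column_2": [["A", "B"], ["A", "B", "C"], ["A"]],
    "column_3": ["a", "b", "c"],
})

# Drop only the rows whose list has exactly 3 items; df itself is unchanged.
res = drop_rows_with_list_length(df, "column_2", 3)
print(res)
```

Returning a copy keeps later assignments on the result from triggering chained-assignment warnings against the original frame.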