Remove list of words from a string Series

Time:02-04

I have a list of words to remove:

words_list_to_remove = ['abc', 'def', 'ghi', 'jkl']

I want to remove these words from the string Series (df):

My_strings
first
abc
second
third
def
forth
ghi
jkl

My goal new_df:

My_new_strings
first
second
third
forth

I want to keep each element as a string and also keep the index of each element. I tried converting both of them to sets, but that did not work for me.

Any help would be appreciated!

CodePudding user response:

You can use .isin() and pass your words_list_to_remove to it:

import pandas as pd

# Define Pandas Series that holds your data
df = pd.Series(["first","abc","second","third","def","forth","ghi","jkl"])

print("before dropping:\n", df)

# Define list of strings to drop
words_list_to_remove = ['abc', 'def', 'ghi', 'jkl']

# Only keep rows that are not in list
df = df[~df.isin(words_list_to_remove)]

print("\nafter dropping:\n", df)

As you can see in the output, the index is preserved as well:

before dropping:
0     first
1       abc
2    second
3     third
4       def
5     forth
6       ghi
7       jkl
dtype: object

after dropping:
0     first
2    second
3     third
5     forth
dtype: object
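
To see why the index survives, the filtering step can be split into its two parts. This is a minimal sketch that reuses the data from above; the names `my_strings`, `mask` and `kept` are illustrative and do not appear in the original answer.

```python
import pandas as pd

my_strings = pd.Series(["first", "abc", "second", "third", "def", "forth", "ghi", "jkl"])
words_list_to_remove = ['abc', 'def', 'ghi', 'jkl']

# isin() returns a boolean Series aligned to the original index:
# True where the element is one of the words to remove
mask = my_strings.isin(words_list_to_remove)

# ~ inverts the mask; boolean indexing keeps the rows where the inverted
# mask is True and leaves their original index labels untouched
kept = my_strings[~mask]

print(mask.tolist())
print(kept.index.tolist())
```

Because boolean indexing only selects rows, the surviving elements keep their original labels and stay plain strings.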

Note: df is the conventional name for a DataFrame, so it would be clearer to give your Series a different name to avoid confusion.
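
As a sketch of what that could look like, the same filter can be wrapped in a small helper with more descriptive names. The function `drop_words` and the variable names are hypothetical; the Series names `My_strings` and `My_new_strings` are taken from the question.

```python
import pandas as pd


def drop_words(strings, words):
    """Return the entries of `strings` that are not in `words`, keeping the index.

    `drop_words` is a hypothetical helper, not part of pandas.
    """
    return strings[~strings.isin(words)]


my_strings = pd.Series(
    ["first", "abc", "second", "third", "def", "forth", "ghi", "jkl"],
    name="My_strings",
)
words_list_to_remove = ['abc', 'def', 'ghi', 'jkl']

# rename() with a scalar changes only the Series name, not the index
my_new_strings = drop_words(my_strings, words_list_to_remove).rename("My_new_strings")

print(my_new_strings)
```

If a fresh 0..n-1 index is wanted instead, `my_new_strings.reset_index(drop=True)` would renumber the rows, but that discards the original labels the question asks to keep.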
