Remove pattern within a column if present in a list in pandas

Time:12-19

I have a dataframe such as:

COL1             COL2
Element1_VAL1    A
Element2_VAL2    B
Something_lima3  C
Something_logit5 D

and a list such as:

the_list=['_VAL1','_VAL2','_lima3']

I would like to remove from COL1 every pattern that appears in the_list, and get:

COL1             COL2 
Element1         A
Element2         B
Something        C 
Something_logit5 D

Here is the dataframe in dict format:

{'COL1': {0: 'Element1_VAL1', 1: 'Element2_VAL2', 2: 'Something_lima3', 3: 'Something_logit5'}, 'COL2 ': {0: 'A', 1: 'B', 2: 'C', 3: 'D'}}

CodePudding user response:

Try str.replace(), with the list joined into a single regex pattern:

df['new'] = df['COL1'].str.replace('|'.join(the_list), '', regex=True)

print(df)

               COL1 COL2                new
0     Element1_VAL1     A          Element1
1     Element2_VAL2     B          Element2
2   Something_lima3     C         Something
3  Something_logit5     D  Something_logit5

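'|'.join(the_list) joins the elements of your list with |, which str.replace (with regex=True) reads as "or". So if any of those substrings is found, it is replaced with ''. If the list entries may contain regex metacharacters such as . or +, escape them first with re.escape so they are matched literally.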
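
As a minimal sketch of the joined-pattern approach, the snippet below rebuilds the example dataframe and escapes each list entry before joining. The variable names pattern and result are illustrative only, and the column key is written as 'COL2' without the trailing space seen in the dict above.

```python
import re
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "COL1": ["Element1_VAL1", "Element2_VAL2", "Something_lima3", "Something_logit5"],
    "COL2": ["A", "B", "C", "D"],
})
the_list = ["_VAL1", "_VAL2", "_lima3"]

# Escape each entry so regex metacharacters are matched literally,
# then join with '|' to build a single alternation pattern.
pattern = "|".join(re.escape(s) for s in the_list)

# Remove every occurrence of any listed substring from COL1.
df["new"] = df["COL1"].str.replace(pattern, "", regex=True)

result = df["new"].tolist()
print(result)
```

If one list entry is a prefix of another, it may be safer to sort the entries longest first before joining, since regex alternation tries the alternatives from left to right.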

CodePudding user response:

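You can use pandas replace(), which accepts a list of patterns to be replaced with a single value (an empty string in this case), so you avoid multiple calls to .str.replace(). Pass regex=True so that the entries are matched as substrings; without it, replace() only matches cells whose entire value equals a list entry, and COL1 would be left unchanged. Try:

df['COL1'] = df['COL1'].replace(the_list, '', regex=True)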
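
To illustrate how the regex flag changes the behaviour of Series.replace() on this data, here is a small sketch. The names s, whole_cell, and substring are hypothetical and chosen only for this example.

```python
import pandas as pd

s = pd.Series(["Element1_VAL1", "Element2_VAL2", "Something_lima3", "Something_logit5"])
the_list = ["_VAL1", "_VAL2", "_lima3"]

# Without regex=True, replace() compares each list entry against the whole cell value,
# so no cell matches and the series is returned unchanged.
whole_cell = s.replace(the_list, "").tolist()

# With regex=True, each entry is treated as a regular expression and
# every matching substring is replaced.
substring = s.replace(the_list, "", regex=True).tolist()

print(whole_cell)
print(substring)
```

As with the joined pattern, entries containing regex metacharacters would need escaping (for example with re.escape) before being passed with regex=True.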