Pandas: remove characters based on conditions in a DataFrame

Time:11-12

I have a DF that looks like this:

ids
-----------
cat-1,paws
dog-2,paws
bird-1,feathers,fish
cows-2,bird_3 
.
.
.

I need to remove every id that contains a - or _ from the DataFrame. So the final DataFrame should be:

ids
-----------
paws
paws
feathers,fish
.
.
.

I've tried using lambda like this:

df['ids'] = df['ids'].apply(lambda x: x.replace('cat-1', '').replace('dog-2', '' )...)

But this does not scale, because I would have to list every id that contains a dash or an underscore. What would be a more scalable and efficient solution?

CodePudding user response:

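You can use a regex pattern. Pass regex=True, because since pandas 2.0 str.replace treats the pattern as a literal string by default:

df.ids.str.replace(r'\w*[-_]\w*,?', '', regex=True)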

Output:

0             paws
1             paws
2    feathers,fish
3                 
Name: ids, dtype: object
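
One caveat with the pattern above: it only consumes a comma that follows the removed id, so a value such as paws,cat-1 would become paws, with a trailing comma. If that can occur in your data, a possible alternative is to split each cell on commas and keep only the tokens without - or _. This is a minimal sketch, not a definitive implementation: the helper name drop_marked_ids is hypothetical, and the fifth row (paws,cat-1) is an assumption added to show the trailing-comma case.

```python
import pandas as pd

# Sample data from the question, plus one assumed row ("paws,cat-1")
# where the id to remove comes last.
df = pd.DataFrame(
    {"ids": ["cat-1,paws", "dog-2,paws", "bird-1,feathers,fish",
             "cows-2,bird_3", "paws,cat-1"]}
)


def drop_marked_ids(cell: str) -> str:
    """Keep only the comma-separated tokens that contain neither '-' nor '_'."""
    tokens = [t.strip() for t in cell.split(",")]
    return ",".join(t for t in tokens if t and "-" not in t and "_" not in t)


df["ids"] = df["ids"].apply(drop_marked_ids)
print(df["ids"].tolist())
# ['paws', 'paws', 'feathers,fish', '', 'paws']
```

This is slower than the vectorized str.replace on very large frames, but it does not depend on where the comma sits relative to the removed id.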