I have a DF that looks like this:
ids
-----------
cat-1,paws
dog-2,paws
bird-1,feathers,fish
cows-2,bird_3
.
.
.
I need to remove every id that contains a - or _ from the dataframe. The final dataframe should be:
ids
-----------
paws
paws
feathers,fish
.
.
.
I've tried using a lambda like this:
df['ids'] = df['ids'].apply(lambda x: x.replace('cat-1', '').replace('dog-2', '' )...)
But this is not scalable, because I would need to add every id that contains a dash or underscore to the chain above. What would be a more scalable/efficient solution?
CodePudding user response:
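You can use a regex pattern that matches any token containing a - or _, plus an optional trailing comma. Pass regex=True explicitly, since pandas 2.0 and later treat the pattern as a literal string by default, and assign the result back to df['ids'] if you want to update the column:
df['ids'].str.replace(r'\w*[-_]\w*,?', '', regex=True)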
Output:
0 paws
1 paws
2 feathers,fish
3
Name: ids, dtype: object
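
For reference, here is a minimal, self-contained sketch of the regex approach on the sample data, next to a split-and-filter alternative. The extra row paws,cat-1 and the helper name drop_marked_ids are illustrative assumptions, not part of the original question. The extra row shows that the regex leaves a dangling comma when the removed id comes last in the string, which the token-filtering version avoids.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row ("paws,cat-1")
# where the id to remove is the last token.
df = pd.DataFrame(
    {"ids": ["cat-1,paws", "dog-2,paws", "bird-1,feathers,fish",
             "cows-2,bird_3", "paws,cat-1"]}
)


def drop_marked_ids(cell: str) -> str:
    """Keep only the comma-separated tokens that contain neither '-' nor '_'."""
    kept = [tok for tok in cell.split(",") if "-" not in tok and "_" not in tok]
    return ",".join(kept)


# Regex approach: remove each token containing '-' or '_' and an optional
# comma that follows it. regex=True is required on pandas >= 2.0.
regex_result = df["ids"].str.replace(r"\w*[-_]\w*,?", "", regex=True).tolist()

# Token approach: split on commas, filter, and join again.
token_result = df["ids"].map(drop_marked_ids).tolist()

print(regex_result)  # ['paws', 'paws', 'feathers,fish', '', 'paws,']
print(token_result)  # ['paws', 'paws', 'feathers,fish', '', 'paws']
```

The regex version is vectorized and usually the faster of the two; the token version is a plain Python function applied per row, but it is easier to extend if the rule for what counts as an id to drop changes.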