Removing all text within double quotes

Time:01-11

I am preprocessing some text in Python and would like to remove all text that appears inside double quotes. I am unsure how to do that and would appreciate your help. A minimal reproducible example is below for your reference. Thank you in advance.

x='The frog said "All this needs to get removed" something'

In other words, I want to get 'The frog said something' by removing the double-quoted text (quotes included) from x above, and I am not sure how to do that. Thanks once again.

CodePudding user response:

Use regex substitution:

import re

x='The frog said "All this needs to get removed" something'
# Match a quoted span plus any surrounding whitespace and replace it with one space
res = re.sub(r'\s*"[^"]+"\s*', ' ', x)
print(res)

The frog said something
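
If the text can contain several quoted spans, or a quoted span at the very start or end, the same substitution can be wrapped in a small helper. This is only a sketch: the name remove_quoted is made up for illustration, and it assumes quotes are always balanced and never escaped or nested.

```python
import re

def remove_quoted(text):
    # Hypothetical helper: replace each "..." span, together with the
    # whitespace around it, with a single space.
    cleaned = re.sub(r'\s*"[^"]*"\s*', ' ', text)
    # Trim the space left behind when a quoted span starts or ends the string.
    return cleaned.strip()

print(remove_quoted('The frog said "All this needs to get removed" something'))
print(remove_quoted('"Hi" said the frog, "bye" said the toad'))
```

Using [^"]* instead of [^"]+ here also removes empty quotes (""), which may or may not be what you want.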

CodePudding user response:

If you want to use index and slicing:

s='The frog said "All this needs to get removed" something'

# To get the index of both the quotes
[i for i, x in enumerate(s) if x == '"']
#[14, 44]

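# Keep everything before the space that precedes the first quote, and everything after the second quote
s[:13] + s[45:]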
#'The frog said something'
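
The slice bounds above are hardcoded for this one string. A rough sketch of the same idea with the positions looked up via str.find might look like the following; remove_first_quoted is a made-up name, and it assumes you only need to drop the first quoted span.

```python
def remove_first_quoted(s):
    # Hypothetical helper: locate the opening quote and the next quote after it.
    start = s.find('"')
    end = s.find('"', start + 1) if start != -1 else -1
    if end == -1:
        # No complete pair of quotes, so leave the string unchanged.
        return s
    # Drop the quoted span and the whitespace just before it.
    return s[:start].rstrip() + s[end + 1:]

s = 'The frog said "All this needs to get removed" something'
print(remove_first_quoted(s))
```

For more than one quoted span you would have to loop over the quote positions in pairs, at which point the regex answer is probably simpler.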