I have a string as below:
Financial strain: No\n?Food insecurity:\nWorry: No\nInability: No\n?Transportation needs:\nMedical: No\nNon-medical: No\nTobacco Use\n?Smoking status: Never Smoker\n?
I want to first match the substring/sentence of interest (i.e. the sentence beginning with "Food insecurity" and ending with "\n?"), then remove all the newlines in this sentence apart from the last one, i.e. the one before the question mark.
I have been able to match the sentence without its last newline and question mark using the regex (Food insecurity:).*?(?=\\n\?)
but I am struggling to remove the first 2 newlines of the matched sentence and return the whole preprocessed string. Any advice?
CodePudding user response:
You could use re.sub with a callback function:
import re

inp = "Financial strain: No\n?Food insecurity:\nWorry: No\nInability: No\n?Transportation needs:\nMedical: No\nNon-medical: No\nTobacco Use\n?Smoking status: Never Smoker\n?"
# The lookahead (?=\n\?) keeps the final newline and "?" out of the match, so only the inner newlines are removed.
output = re.sub(r'Food insecurity:\nWorry: No\nInability: No(?=\n\?)', lambda m: m.group().replace('\n', ''), inp)
print(output)
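The pattern above spells out the whole sentence, so it only works when the answers are exactly "No". As a rough sketch of a more general variant, the lazy pattern from the question can be combined with re.DOTALL so that "." also matches newlines. This assumes the string contains real newline characters rather than a literal backslash followed by "n". The helper name flatten_section and its joiner parameter are made up for illustration, and the joiner defaults to a space, which differs from the snippet above (pass joiner="" to get the same behaviour).

```python
import re

inp = "Financial strain: No\n?Food insecurity:\nWorry: No\nInability: No\n?Transportation needs:\nMedical: No\nNon-medical: No\nTobacco Use\n?Smoking status: Never Smoker\n?"

def flatten_section(text, heading, joiner=" "):
    # Hypothetical helper: match lazily from the heading up to (but not
    # including) the first "\n?" that follows it. DOTALL lets "." cross newlines.
    pattern = re.compile(re.escape(heading) + r".*?(?=\n\?)", flags=re.DOTALL)
    # Replace the newlines inside the match only; the trailing "\n?" sits in
    # the lookahead, so it is left untouched.
    return pattern.sub(lambda m: m.group().replace("\n", joiner), text, count=1)

output = flatten_section(inp, "Food insecurity:")
print(output)
```

The same call should work for other sections, for example flatten_section(inp, "Transportation needs:"), since the heading is escaped and the rest of the pattern does not depend on the content.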