Regex to Remove Block Comments but Keep Empty Lines?

Time:12-26

Is it possible to remove a block comment without removing the line breaks with a regex?

Let's say I have this text:

text = """Keep this /* this has to go
this should go too but leave empty line */
This stays on line number 3"""

I came up with this regex:

text = re.sub(r'/\*.*?\*/', '', text, flags=re.DOTALL)

But this gives me:

Keep this 
This stays on line number 3

What I want is:

Keep this

This stays on line number 3

Can it be done?

CodePudding user response:

We can make a slight change to your current logic and use a lambda callback as the replacement for re.sub:

import re

text = """Keep this /* this has to go
this should go too but leave empty line */
This stays on line number 3"""

text = re.sub(r'/\*.*?\*/', lambda m: re.sub(r'[^\n]+', '', m.group()), text, flags=re.S)
print(text)

This prints:

Keep this 

This stays on line number 3

The lambda receives each matched /* ... */ block and removes every character in it except newlines. The comment text disappears, but the line breaks it spanned stay in place, so the lines that follow keep their original line numbers.
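The same idea can be sketched without the nested re.sub: count the newlines inside each matched comment and emit that many in its place. This is only a minimal sketch; the helper name strip_block_comments and the precompiled BLOCK_COMMENT pattern are hypothetical names introduced here, not part of the original answer.

```python
import re

# Hypothetical names for illustration; not from the original answer.
# re.DOTALL lets '.' match newlines so a comment can span several lines.
BLOCK_COMMENT = re.compile(r'/\*.*?\*/', flags=re.DOTALL)


def strip_block_comments(source: str) -> str:
    """Remove /* ... */ comments but keep the line breaks they contained."""
    # Replace each comment with as many newlines as it spanned.
    return BLOCK_COMMENT.sub(lambda m: '\n' * m.group().count('\n'), source)


text = """Keep this /* this has to go
this should go too but leave empty line */
This stays on line number 3"""

result = strip_block_comments(text)
print(result)

# The number of line breaks is unchanged, so line numbers still line up.
assert result.count('\n') == text.count('\n')
```

Note that a plain regex like this does not know about string literals, so a /* that appears inside a quoted string would also be treated as the start of a comment.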
