removing block comments but keeping linebreaks

Time:04-16

I'm removing block comments from Python scripts with this regex: re.sub("'''.*?'''", "", string, flags=re.DOTALL). It removes the complete block comment, including its line breaks (\n). However, I would like to keep the line breaks for further processing of the files. Is there a way to do this with a regex?

CodePudding user response:

What you're trying to do is match each triple-quoted block and keep only its line breaks, rather than deleting the whole block. re.sub can take a function (or lambda) as its second argument, and that is what you should use here. Here is the description and an example from Python's documentation:

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

>>> import re
>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
...
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'

Using that concept, you match each triple-quoted block and everything inside it, then pass that match to a function that does its own processing and returns only the line breaks. Note that the text of each line should be replaced with an empty string, not with "\n": the original line breaks are still in the matched text, so adding another "\n" per line would double them. If the triple quotes should be kept, leave them outside the group that the function rewrites. The function's return value is then used as the replacement, and this happens for each match independently.
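The idea above can be sketched as follows. This is only a minimal example, assuming the same '''...''' pattern as in the question; the names BLOCK_COMMENT, keep_newlines and strip_block_comments are made up for illustration. Instead of running a second regex inside the callback, it simply counts the newlines in the match and returns that many.

```python
import re

# Same pattern as in the question; DOTALL lets '.' match line breaks too.
BLOCK_COMMENT = re.compile(r"'''.*?'''", flags=re.DOTALL)

def keep_newlines(match):
    # Hypothetical helper: drop the matched block (quotes included)
    # but return one "\n" for every line break it contained.
    return "\n" * match.group(0).count("\n")

def strip_block_comments(source):
    # re.sub calls keep_newlines once per non-overlapping match.
    return BLOCK_COMMENT.sub(keep_newlines, source)

source = "a = 1\n'''first\nsecond\n'''\nb = 2\n"
print(repr(strip_block_comments(source)))
# prints 'a = 1\n\n\n\nb = 2\n'
```

The line count stays the same, so "b = 2" is still on line 5 after the substitution. Be aware that this pattern, like the one in the question, also matches ordinary triple-quoted string literals and does not cover """...""" blocks.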
