The input text file is as follows:
int 1; //integer
//float 1; //floating point number
int m; //integer
/*if a==b
begin*/
print 23 /* 1, 2, 3*/
end
float/* ty;
int yu;*/
The expected output is as follows:
int 1; //integer
int m; //integer
print 23
end
float
CodePudding user response:
Here is a two-step replacement which seems to work:
import re

inp = """int 1; //integer
//float 1; //floating point number
int m; //integer
/*if a==b
begin*/
print 23 /* 1, 2, 3*/
end
float/* ty;
int yu;*/"""
output = re.sub(r'^\s*//.*?\n', '', inp, flags=re.M)
output = re.sub(r'\n?/\*.*?\*/(\n?)', r'\1', output, flags=re.M|re.S)
print(output)
This prints:
int 1; //integer
int m; //integer
print 23
end
float
The first call to re.sub removes all lines which start with a // comment. The second call to re.sub removes the C-style /* */ comments. It works by trying to match an optional newline both before and after the comment itself. It then replaces the whole match with at most a single newline, namely the one that followed the comment, if there was one.
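To see what each substitution contributes, the two steps can be run separately on a smaller input. This is only a sketch: the `sample` text and the names `step1` and `step2` are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical sample: one whole-line // comment, one multi-line block
# comment, one inline block comment, and one trailing // comment.
sample = (
    "a = 1; // keep\n"
    "// drop this line\n"
    "/* block\n"
    "comment */\n"
    "b = 2; /* inline */\n"
    "c = 3;"
)

# Step 1: remove lines that consist only of a // comment.
step1 = re.sub(r'^\s*//.*?\n', '', sample, flags=re.M)

# Step 2: remove /* */ comments. The optional newline before the comment
# is dropped; the optional newline after it is captured and put back.
step2 = re.sub(r'\n?/\*.*?\*/(\n?)', r'\1', step1, flags=re.M | re.S)

print(repr(step1))
print(repr(step2))
```

Note that the trailing `// keep` comment survives both steps, and that removing an inline block comment leaves behind the space that preceded it.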
CodePudding user response:
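You can replace matches of the following regular expression with empty strings.
(?m)^\/\/.*\r?\n|\/\/.*|^\/\*[\s\S]*?\*\/\r?\n|\/\*[\s\S]*?\*\/
Note that the second alternation element must follow the first, and the fourth must follow the third, because alternatives are tried from left to right and the first and third also consume the line terminator.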
The regular expression can be broken down as follows.
(?m)         # set the multiline flag
^\/\/        # match '//' at the beginning of a line
.*\r?\n      # match zero or more characters other than line
             # terminators, then match the line terminator
|            # or
\/\/         # match '//'
.*           # match the remainder of the line
|            # or
^\/\*        # match '/*' at the beginning of a line
[\s\S]*?     # match zero or more characters, including line
             # terminators, lazily
\*\/         # match '*/'
\r?\n        # match the line terminator
|            # or
\/\*         # match '/*'
[\s\S]*?     # match zero or more characters, including line
             # terminators, lazily
\*\/         # match '*/'
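As a rough sketch, the alternation could be applied in Python as shown below. The function name `strip_comments` and the variable names are hypothetical, and the sketch assumes the multiline flag is set so that `^` matches at the start of every line. Forward slashes do not need escaping in Python, so they are written unescaped here.

```python
import re

# Assumption: same four alternatives as in the breakdown, in the same order,
# written in verbose mode with the multiline flag set.
COMMENT_RE = re.compile(r"""
    ^//.*\r?\n              # whole-line '//' comment and its line terminator
  | //.*                    # trailing '//' comment, line terminator kept
  | ^/\*[\s\S]*?\*/\r?\n    # block comment at line start, plus line terminator
  | /\*[\s\S]*?\*/          # any other block comment
""", re.MULTILINE | re.VERBOSE)


def strip_comments(text):
    """Replace every comment match with the empty string."""
    return COMMENT_RE.sub("", text)


sample = """int 1; //integer
//float 1; //floating point number
int m; //integer
/*if a==b
begin*/
print 23 /* 1, 2, 3*/
end
float/* ty;
int yu;*/"""

result = strip_comments(sample)
print(result)
```

Unlike the expected output in the question, this pattern also removes trailing comments such as `//integer`, and it leaves behind the whitespace that preceded each removed comment. If trailing `//` comments should be kept, the second alternative can be dropped.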