I want to remove comments, i.e. everything from a # to the end of the line, with one special case: a # inside curly braces is not a comment. A # that appears before the curly braces still starts a comment, and that comment runs to the end of the line, braces included.
Input:
This is 1. # Comment 1.
This is 2. # Comment 2
#Comment 3 {#Commit4}
This is 3.
# Commit5
This is 4.{#This is 5} # Commit6
#Commit7
This is 6.
# Commit8 #Commit9
#Commit10
# Commit11
#Commit12 #Commit13
{ # This is 7}; { # This is 8 } # Commit14
{# This is 9}; { # This is 10 }={# This is 11} {# This is 12}={# This is 13}# Commit15
# Commit16 {# Commit17 }
Output:
This is 1.
This is 2.
This is 3.
This is 4.{#This is 5}
This is 6.
{ # This is 7}; { # This is 8 }
{# This is 9}; { # This is 10 }={# This is 11} {# This is 12}={# This is 13}
I am trying to do this with the sub function from Python 3's built-in re module. My sample code is below, but it does not remove every comment (by my definition):
import re

reStr = str(input)
reStr = re.sub(r"((^|\n)(.*[^{])?)(#[^\n]+)", r'\1', reStr)
print(reStr)
If you have a better solution, please let me know. Thanks.
CodePudding user response:
You can use the regex pattern (#[^{}\n]+)($|{) and replace with \2.
See https://regex101.com/r/grDMqH/1
CodePudding user response:
Using the fact that overlapping matches aren't allowed, here is one way to do so:
({.*?})|#.*
Replace with: \1
See the online demo here.
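For reference, here is a minimal Python sketch of the alternation approach above, assuming Python 3.5 or later (where re.sub replaces an unmatched group with an empty string). The first alternative consumes each {...} group and puts it back via \1, so a # inside braces is never reached by the second alternative. The helper name strip_comments and the cleanup step (trimming trailing whitespace and dropping lines that end up empty) are my own additions to match the expected output in the question, not part of the answer. It also assumes braces are not nested and are closed on the same line.

```python
import re

# Two alternatives: a brace group (captured) or a comment running to end of line.
# The braces are escaped here for readability; the pattern is otherwise the same.
PATTERN = re.compile(r"(\{.*?\})|#.*")


def strip_comments(text: str) -> str:
    """Remove '#' comments, keeping any '#' that sits inside {...}."""
    # A brace group is restored via \1; a comment match leaves group 1 unset,
    # which re.sub substitutes as an empty string (Python 3.5+).
    stripped = PATTERN.sub(r"\1", text)
    # Cleanup (an assumption about the desired output): trim trailing
    # whitespace and drop lines that became empty.
    lines = (line.rstrip() for line in stripped.splitlines())
    return "\n".join(line for line in lines if line)


sample = """This is 1. # Comment 1.
#Comment 3 {#Commit4}
This is 4.{#This is 5} # Commit6
{ # This is 7}; { # This is 8 } # Commit14"""

print(strip_comments(sample))
# This is 1.
# This is 4.{#This is 5}
# { # This is 7}; { # This is 8 }
```

If a line contains an opening brace with no closing brace, the first alternative cannot match, so a # after that brace would be treated as a comment.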