Match and replace hash comments

Time:11-03

What I'm looking for

I would like to match hash comments, but not inline comments, in PowerShell. I can match them using #(?<!<#)(?![<>])(.*)$ (see regex101). However, I want to achieve this without using lookbehind. Is there any way to get the same (or the closest possible) behavior as the regex with lookbehind?

A regex-only solution would be nice, but I would also appreciate JavaScript-only solutions.

What I tried

My best attempt was #(?!.*>)(.*)$ (see regex101), but I'm unhappy that it misses any comment that has a > anywhere after the #.
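To make the limitation concrete, here is a small check of that attempt. The sample lines and the variable names are made up for illustration; only the pattern itself comes from the attempt above.

```javascript
// The lookbehind-free attempt from above; it rejects any "#" that is
// followed by a ">" somewhere later on the same line.
const attempt = /#(?!.*>)(.*)$/;

// Hypothetical sample lines, chosen only to illustrate the behavior.
const samples = [
  'Write-Host "Hello world" # Here is a hash comment',
  'Write-Host <# Here is an inline comment #> "Hello world"',
  'Write-Host "x" # only if a > b',
];

for (const line of samples) {
  const match = attempt.exec(line);
  console.log(match ? `comment: ${match[0]}` : "no match", "|", line);
}
```

The first line is matched and the inline comment is correctly skipped, but the third line is skipped as well, even though it ends in a genuine hash comment.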

I also tried writing a JavaScript-only parser, but it got very complex (more than 30 lines) and was very slow.

Why I need it

I use this to inline PowerShell used in an open-source project (GitHub code). However, the project is a website that runs in the browser, and I realized that it does not work in Safari because Safari does not yet support lookbehind (caniuse.com).

Syntax: Hash comment vs inline comment

Rule: Match all hash comments but exclude inline comments

Hash comments in PowerShell, just like in bash and other languages, start with a hash (#) and continue until the end of the line.

Write-Host "Hello world" # Here is a hash comment

Inline comments in PowerShell are enclosed between <# and #>

Write-Host <# Here is an inline comment #> "Hello world"

CodePudding user response:

Regular expressions describe plain-text patterns, and with them you can not only match pieces of text but also capture parts of each match. Once you have captured a substring, you can apply any logic you want to it, whether that means replacing it or keeping it.

To replace the comments, you can use

text = text.replace(/(<#.*?#>)|#.*/g, (m, g) => g ? g : "REPLACED")

To remove the comments, you can use

text = text.replace(/(<#.*?#>)|#.*/g, "$1")

In both cases, the pattern is (<#.*?#>)|#.*:

  • (<#.*?#>) - Group 1: a <# text, then any zero or more chars other than line break chars, as few as possible, and then #> text
  • | - or
  • #.* - # char and the rest of the line.

In the first case, (m, g) => g ? g : "REPLACED" replacement keeps group 1 value if Group 1 matched (m stands for the whole match and g stands for the Group 1 value). In the second case, the Group 1 value is put back if it was matched with the help of the $1 backreference. It represents an empty string if Group 1 did not match.
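Here is a minimal runnable sketch of both calls, applied to the two example lines from the question. The function names and the sample variable are hypothetical and exist only for this illustration.

```javascript
// Group 1 captures an inline comment; the second alternative matches a hash comment.
const pattern = /(<#.*?#>)|#.*/g;

// Hypothetical helper: keep inline comments, swap hash comments for `replacement`.
function replaceHashComments(text, replacement) {
  return text.replace(pattern, (match, inline) => (inline ? inline : replacement));
}

// Hypothetical helper: "$1" puts the inline comment back, or inserts an
// empty string when Group 1 did not take part in the match.
function removeHashComments(text) {
  return text.replace(pattern, "$1");
}

const sample = [
  'Write-Host "Hello world" # Here is a hash comment',
  'Write-Host <# Here is an inline comment #> "Hello world"',
].join("\n");

console.log(replaceHashComments(sample, "REPLACED"));
console.log(removeHashComments(sample));
```

Note that removing a trailing hash comment this way leaves the whitespace that preceded the # in place.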

CodePudding user response:

You could match the following regular expression.

^(?:<#.*?#>|[^#])*(#.*)?

The capture group, if not empty, contains the comment to be removed.
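One possible way to use this expression from JavaScript is sketched below. Applying it one line at a time is an assumption made here for simplicity, and the helper name stripHashComment is hypothetical.

```javascript
// The expression from above: skip inline comments and non-"#" characters,
// then optionally capture a trailing hash comment in Group 1.
const linePattern = /^(?:<#.*?#>|[^#])*(#.*)?/;

// Hypothetical helper: returns the line without its trailing hash comment.
function stripHashComment(line) {
  const match = linePattern.exec(line);
  const comment = match && match[1];
  if (!comment) return line; // Group 1 did not participate: nothing to remove
  // Group 1 runs to the end of the line, so cutting off its length drops the comment.
  return line.slice(0, line.length - comment.length);
}

// Assumption for this sketch: the text is processed line by line.
const cleaned = [
  'Write-Host "Hello world" # Here is a hash comment',
  'Write-Host <# Here is an inline comment #> "Hello world"',
].map(stripHashComment);

console.log(cleaned.join("\n"));
```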

