Regular Expression for inline Comments Powershell

Time:12-17

I am using a regular expression to filter out commented code:

"(/\*([^*]|(\*+[^*/]))*\*+/)|(//.*)"

The problem with this expression is that it also filters out lines of code that end with an inline comment, such as:

public const string UpdateFileProcessRequestWithModifiedBy = "usp_Update_File_Request_Status_Modified_By"; // this one updates file_process_request and file_details table with the current status and Modified By

How can I ignore inline comments (comments at the end of a line of code) using a regular expression?

I am doing this in a PowerShell script that reads a file containing paths, then reads the content of each of those paths and runs Select-String on it:

$isConstantFile = [bool]($fileContent |
        Select-String -Pattern "\bConstants.cs\b" |
        Select-String -Pattern $dirPattern |
        ForEach-Object{
            Get-Content $_ | 
            Select-String -Pattern $filePattern | 
            Select-String -Pattern "(/\*([^*]|(\*+[^*/]))*\*+/)|(//.*)" -NotMatch -Quiet #Regex for comments
        })

CodePudding user response:

Try the following:

@'
/* 
 block comment, multi-line
*/

int i = 1;

/* block comment, single-line*/

// Single-line, stand-alone
   // Ditto with indentation

int j = 1; // Only keep this one

/**/
'@ -replace '(?sm)/\*.*?\*/[ \t]*(?:\r?\n)?|^[ \t]*//[^\r\n]*(?:\r?\n)?'

Note:

  • The key part is ^[ \t]*// in combination with the multi-line regex option ((?m), in its inline form), which makes ^ match at the start of each line: this ensures that // comments are removed only if they start a line, optionally preceded by in-line whitespace. The single-line option ((?s)) makes . match newlines too, so that /* ... */ block comments spanning several lines are matched as a whole.

  • The regex is complicated by also trying to remove the subsequent newline along with the comment; if that isn't needed, it can be simplified to:

    • (?sm)/\*.*?\*/|^[ \t]*//[^\r\n]*

Output:


int i = 1;


int j = 1; // Only keep this one


For an explanation of the regex and the ability to experiment with it, see this regex101.com page.
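
As a quick local check of the simplified variant, the following sketch applies it to a small sample. The names $sample, $simplified and $result are illustrative only and are not part of the original answer. Because this variant does not consume the newline after a comment, removed comments leave empty lines behind, which the last command filters out for display.

```powershell
# Illustrative sample input; the variable names are hypothetical.
$sample = @'
/* block comment */
int i = 1;
   // stand-alone comment
int j = 2; // inline comment, kept
'@

# Simplified regex from the note above: it removes block comments and
# stand-alone // comments, but not the newline that follows them.
$simplified = '(?sm)/\*.*?\*/|^[ \t]*//[^\r\n]*'

# -replace with no replacement operand removes every match.
$result = $sample -replace $simplified

# Show only the surviving non-blank lines.
$result -split '\r?\n' | Where-Object { $_.Trim() }
```

Only the two lines of code should remain, and the second one keeps its inline comment.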

Note:

  • To incorporate this regex into your command, you cannot use line-by-line processing the way you do now (Get-Content $_ | Select-String ...), because the regex needs to match across lines.

  • A simple way to remove the comments of interest from a given file would be
    (Get-Content -Raw $_) -replace '...' (where '...' represents the regex above).

  • Some general notes on PowerShell's regex support:

    • PowerShell's regex features, which build on .NET's regex APIs, are case-insensitive by default.
    • .NET regexes don't have the concept of a global flag (g).
    • Whether PowerShell's behavior is in effect global (i.e. acting on all matches of a given regex) depends on the feature used; e.g.:
      • -replace is implicitly global, -match is not.
      • Select-String is only fully global if you use the -AllMatches switch.
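
Putting the notes above together, a minimal sketch of per-file processing might look like the following. The function name Remove-StandaloneComment, the temporary file, and the usp_ search patterns are assumptions made for illustration; they are not from the original script and would need adapting to it.

```powershell
# Hypothetical helper: returns a file's content with block comments and
# stand-alone // comments removed; inline // comments are kept.
function Remove-StandaloneComment {
    param([Parameter(Mandatory)][string] $Path)

    $regex = '(?sm)/\*.*?\*/[ \t]*(?:\r?\n)?|^[ \t]*//[^\r\n]*(?:\r?\n)?'
    # -Raw reads the whole file as a single string, so the regex can
    # match across line boundaries.
    (Get-Content -Raw -LiteralPath $Path) -replace $regex
}

# Demo with a temporary file holding illustrative content.
$tmp = New-TemporaryFile
Set-Content -LiteralPath $tmp.FullName -Value @'
// public const string Old = "usp_Old";
public const string Current = "usp_Current"; // inline comment
'@

$code = Remove-StandaloneComment -Path $tmp.FullName

# -match on a single string returns a Boolean and stops at the first match.
$foundCurrent = $code -match 'usp_Current'   # active code is still present
$foundOld     = $code -match 'usp_Old'       # commented-out code was removed

# Select-String reports every match in an input string only with -AllMatches.
$names = ($code | Select-String -Pattern 'usp_\w+' -AllMatches).Matches.Value

Remove-Item -LiteralPath $tmp.FullName
```

Testing the cleaned-up string with -match (or Select-String) replaces the original -NotMatch filter, since commented-out code is no longer in the text being searched.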