Parsing an input-string with different quotes via RegEx

Time:10-28

I need to convert an input string containing multiple words into a string array in PowerShell. Words can be separated by multiple spaces and/or line breaks. Each word can be enclosed in single quotes or double quotes. Some words may start with a hash sign (#) - in that case any quoting appears after the hash sign.

Here is a code sample of a possible input and the expected result:

$inputString = @"
  test1
  #custom1
  #"custom2"           #'custom3'
  #"custom ""four"""   #'custom ''five'''
  test2 "test3" 'test4'
"@

$result = @(
    'test1'
    '#custom1'
    '#"custom2"'
    "#'custom3'"
    '#"custom ""four"""'   
    "#'custom ''five'''"
    'test2' 
    '"test3"' 
    "'test4'"
)

Is there a way to do this with a clever regular expression? Or does someone have a parser snippet or function to start with?

CodePudding user response:

Assuming you fully control or implicitly trust the input string, you can use the following approach. It relies on Invoke-Expression, which should normally be avoided because it executes whatever code its input contains:

Assumptions made:

  • # only appears at the start of embedded strings.
  • No embedded string itself contains newlines.

$inputString = @"
  test1
  #custom1
  #"custom2"           #'custom3'
  #"custom ""four"""   #'custom ''five'''
  test2 "test3" 'test4'
"@

$embeddedStrings = Invoke-Expression @"
Write-Output $($inputString -replace '\r?\n', ' ' -replace '#', '`#')
"@

Caveat: The outer quoting around the individual strings is lost in the process, and the embedded, escaped quotes are unescaped; outputting $embeddedStrings yields:

test1
#custom1
#custom2
#custom3
#custom "four"
#custom 'five'
test2
test3
test4
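
If the outer quoting has to survive, as in the expected result in the question, one alternative is to tokenize the string with a regular expression instead of evaluating it. The following is only a minimal sketch and not part of the approach above. The variable names ($doubleQuoted, $singleQuoted, $bareWord, $tokens) are made up for illustration, and the sketch assumes that quote characters only occur as delimiters or as doubled escapes inside a quoted string. Each token is returned exactly as it appears in the input, so a leading # stays outside the quotes.

```powershell
# Sample input from the question; a single-quoted here-string keeps it verbatim.
$inputString = @'
  test1
  #custom1
  #"custom2"           #'custom3'
  #"custom ""four"""   #'custom ''five'''
  test2 "test3" 'test4'
'@

# Building blocks of the pattern (hypothetical names, assumptions as stated above).
$doubleQuoted = '"(?:[^"]|"")*"'    # "..." where "" is an escaped double quote
$singleQuoted = "'(?:[^']|'')*'"    # '...' where '' is an escaped single quote
$bareWord     = '[^\s"\x27]+'       # unquoted run without whitespace or quotes (\x27 is ')

# Optional leading #, followed by one of the three token forms.
$pattern = "#?(?:$doubleQuoted|$singleQuoted|$bareWord)"

# Collect the matched text of every token, in input order.
$tokens = @([regex]::Matches($inputString, $pattern) | ForEach-Object { $_.Value })

$tokens
```

Because nothing is evaluated, this variant also works on untrusted input, but it does not unescape the doubled quotes; that would be a separate step if the unquoted values are needed.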

The approach relies on the fact that your embedded strings use PowerShell's own quoting and quote-escaping rules. The only problem is the leading # characters, which PowerShell would otherwise treat as the start of a comment; they are therefore escaped as `# above. Once the embedded newlines (\r?\n) have been replaced with spaces, the result can be passed as a list of positional arguments to Write-Output, inside a string that is then evaluated with Invoke-Expression.
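
To see what is actually handed to Invoke-Expression, the command text can be built and displayed without executing it. This is a small illustrative sketch; the variable name $commandText is made up here, and the two -replace operations are the same ones used above.

```powershell
# Sample input from the question; a single-quoted here-string keeps it verbatim.
$inputString = @'
  test1
  #custom1
  #"custom2"           #'custom3'
  #"custom ""four"""   #'custom ''five'''
  test2 "test3" 'test4'
'@

# Join all lines into one line and escape every # with a backtick,
# then prepend the command name. Nothing is executed at this point.
$commandText = 'Write-Output ' + ($inputString -replace '\r?\n', ' ' -replace '#', '`#')

# Display the single-line command string for inspection.
$commandText
```

Inspecting the string this way before evaluating it is also a cheap sanity check that the input contains nothing beyond the expected word list.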
