I am trying to write a Markdown to BBCode parser. In this example the markdown input is:
This is **bold** but this is also **bold**.
This too is **bold** but and so is this**bold**.
And the BBCode output I am trying to program is:
This is [b]bold[/b] but this is also [b]bold[/b].
This too is [b]bold[/b] but and so is this[b]bold[/b].
My PowerShell code:
$MarkdownString ={This is **bold** but this is also **bold**.
This too is **bold** but and so is this**bold**.}
I removed the ** and replaced it with [b]:
$MarkdownString -Replace '\*\*', '[b]' | New-Variable - BBCodeOutput1
Then I tried to fix the missing \ in the BBCode closing tag [\b]:
$BBCodeOutput1 -replace '\[b\].*?\[b\]', '\[b\].*?\[\\b\]' | New-Variable -BBCodeOutput2
But the second operand of the -replace operator is interpreted as literal text and not as a regular expression.
I am so confused now: the docs say -Replace is fully regex capable while Replace() is not.
PS: Any ideas on how to handle this task would be very welcome!
Answer:
This can be done using a single -replace operation with a capture group:
$MarkdownString = @'
This is **bold** but this is also **bold**.
This too is **bold** but and so is this**bold**.
'@
$MarkdownString -replace '\*\*(.+?)\*\*', '[b]$1[/b]'
Output:
This is [b]bold[/b] but this is also [b]bold[/b].
This too is [b]bold[/b] but and so is this[b]bold[/b].
Explanation:
(...) defines a capture group. Its captured value (the substring between two occurrences of **) is then referred to in the replacement string using the placeholder $1, which stands for the first capture group.
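The replacement operand of -replace is not a regex pattern, but it does understand substitution placeholders such as $1. A minimal sketch of how that behaves (the variable names and the group name "inner" are made up for illustration):

```powershell
$text = 'This is **bold**.'

# Single quotes: $1 is passed to the regex engine unchanged and is
# replaced by the text captured by the first group.
$single = $text -replace '\*\*(.+?)\*\*', '[b]$1[/b]'

# Double quotes: PowerShell expands $1 as a variable before the regex
# engine sees it. That variable is normally undefined, so the captured
# text is lost.
$double = $text -replace '\*\*(.+?)\*\*', "[b]$1[/b]"

# A named group can be referenced with ${name} instead of a number.
$named = $text -replace '\*\*(?<inner>.+?)\*\*', '[b]${inner}[/b]'

$single   # This is [b]bold[/b].
$double   # This is [b][/b].
$named    # This is [b]bold[/b].
```

This is why the replacement string is best kept in single quotes whenever it contains a placeholder.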
See the regex101 demo for a detailed explanation and the ability to experiment with the regex.
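The ? after the quantifier matters when a line contains more than one bold span. A small comparison, assuming the lazy pattern .+? against its greedy counterpart .+ (the variable names are illustrative only):

```powershell
$line = 'This is **bold** but this is also **bold**.'

# Greedy: .+ grabs as much as possible, so the match runs from the
# first ** on the line to the last one.
$greedy = $line -replace '\*\*(.+)\*\*', '[b]$1[/b]'

# Lazy: .+? stops at the nearest closing **, so each span is
# converted separately.
$lazy = $line -replace '\*\*(.+?)\*\*', '[b]$1[/b]'

$greedy   # This is [b]bold** but this is also **bold[/b].
$lazy     # This is [b]bold[/b] but this is also [b]bold[/b].
```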
Notes:
- PowerShell 7 contains a ConvertFrom-Markdown command which could be the basis for a more robust Markdown-to-BBCode converter.
- If you need something even more powerful, there are also Markdown parsers for .NET which could be used from PowerShell as well, e.g. Markdig.
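As a rough sketch of the ConvertFrom-Markdown idea: the command can render the Markdown to HTML, and the HTML tags can then be mapped to BBCode. This assumes PowerShell 6.1 or later, handles only paragraphs and bold text, and the two tag-mapping rules are my own illustration rather than a complete converter:

```powershell
# Requires PowerShell 6.1+ (ConvertFrom-Markdown is not available in Windows PowerShell 5.1).
$markdown = 'This is **bold** text.'

# The Html property holds the rendered HTML, with bold text wrapped in <strong> tags.
$html = ($markdown | ConvertFrom-Markdown).Html

# Illustrative mapping: drop paragraph tags, turn <strong> into [b].
$bbcode = ($html -replace '</?p>', '' -replace '<strong>(.+?)</strong>', '[b]$1[/b]').Trim()

$bbcode
```

The advantage over a pure regex approach is that the Markdown parsing itself (nesting, escapes, code spans) is left to a real parser, and only the simpler HTML-to-BBCode mapping is done by hand.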