I am trying to match curly quotes that are inside shortcodes and replace them with normal quotes but leave the ones outside.
Here is some example content:
“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]
Should output the following:
“foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]
It can be a PCRE or JavaScript regex. Any suggestions are appreciated.
CodePudding user response:
For replacements on substrings that match some pattern, it is often more efficient and convenient to use a callback, if one is available. With PHP, preg_replace_callback can be used, e.g.:
$res = preg_replace_callback('~\[[^\]\[]*\]~', function($m) {
    return str_replace(['“','”'], '"', $m[0]);
}, $str);
This pattern matches an opening square bracket, followed by any number of characters that are not square brackets, followed by a closing square bracket. The callback function then replaces the curly quotes inside each match.
Here is a PHP demo at tio.run. This can easily be translated to JS with the replace function (demo).
let res = str.replace(/\[[^\]\[]*\]/g, m => { return m.replace(/[“”]/g,'"'); });
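To try the JS version on the sample from the question, here is a minimal runnable sketch. The function name straightenShortcodeQuotes and the variable sample are made up for illustration; the regex and the callback are the ones shown above.

```javascript
// Hypothetical helper name; the pattern and callback come from the answer above.
function straightenShortcodeQuotes(text) {
  // Match each [...] run that contains no nested square brackets,
  // then replace curly quotes only inside that match.
  return text.replace(/\[[^\]\[]*\]/g, (shortcode) =>
    shortcode.replace(/[“”]/g, '"')
  );
}

const sample =
  '“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]';

// prints: “foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]
console.log(straightenShortcodeQuotes(sample));
```

Because the pattern requires a closing ], a [ that is never closed leaves the quotes after it untouched.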
Without a callback, in PCRE/PHP the \G anchor can also be used to continue where the previous match ended. This chains matches to an opening square bracket (without checking for a closing one).
$res = preg_replace('~(?:\G(?!^)|\[)[^“”\]\[]*\K[“”]~u', '"', $str);
See this demo at regex101 or another PHP demo at tio.run
(?!^) prevents \G from matching at the start of the string (its default behavior). \K resets the beginning of the reported match.
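JavaScript regexes have neither \G nor \K, so this pattern cannot be ported directly. As a rough sketch of the same chaining idea (not a regex, and not part of the original answer), a single pass with a flag behaves the same way: replacement starts at an opening [ and stops at the next ], without requiring that the bracket is ever closed. The function name below is hypothetical.

```javascript
// Hypothetical helper; a plain loop standing in for the \G-based pattern.
function replaceQuotesAfterOpeningBracket(text) {
  let inside = false; // true once a "[" has been seen and no "]" since
  let out = "";
  for (const ch of text) {
    if (ch === "[") inside = true;
    else if (ch === "]") inside = false;
    // Curly quotes are replaced only while the chain from "[" is active.
    out += inside && (ch === "“" || ch === "”") ? '"' : ch;
  }
  return out;
}

const chained = replaceQuotesAfterOpeningBracket(
  '“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]'
);
// prints: “foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]
console.log(chained);
```

Like the \G pattern, this also replaces quotes after a [ that is never closed, which is the trade-off of not checking for the closing bracket.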
For completeness, another method is to use a lookahead at each “ or ” to check that a closing ] lies ahead without any other square brackets in between: [“”](?=[^\]\[]*\])
This does not check for an opening [ and works in all regex flavors that support lookaheads.
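Since JavaScript supports lookaheads, the pattern above can be used there as-is. A small sketch, with made-up names (quoteBeforeClosingBracket, replaceWithLookahead), that also shows the caveat about the missing check for an opening bracket:

```javascript
// The lookahead pattern from the answer above, with the global flag added.
const quoteBeforeClosingBracket = /[“”](?=[^\]\[]*\])/g;

// Hypothetical helper name.
function replaceWithLookahead(text) {
  return text.replace(quoteBeforeClosingBracket, '"');
}

// prints: “foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]
console.log(
  replaceWithLookahead(
    '“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]'
  )
);

// No opening [ is required, so a stray ] is enough to trigger a replacement.
// prints: a "b"] c
console.log(replaceWithLookahead('a “b”] c'));
```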
CodePudding user response:
Since this is a little tricky, here is my contribution.
We can match strings of the form =“some_chars”.
Since you have the additional constraint of matching only inside square brackets, we use a positive lookahead (?=...) to match the above only if it is followed by a closing square bracket (since the string is uniformly formed, there will always be an opening square bracket, which we won't bother checking).
Snippet:
<?php
$str = "“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]";
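// The u modifier makes PCRE treat the pattern and subject as UTF-8, so “ and ” are single characters.
echo preg_replace('/(\=“([^”]*)”)(?=.*\])/u', '="${2}"', $str);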
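For comparison, here is a hedged JavaScript sketch of the same idea. It relies on the assumption stated above that the input is uniformly formed: the lookahead .*\] only requires a ] somewhere later on the line, so an attribute-like =“...” outside a shortcode is also replaced when a shortcode follows it. The stricter variant borrows the lookahead from the first answer; all variable names are made up for illustration.

```javascript
// Same pattern as the PHP snippet: any "]" later on the line satisfies the lookahead.
const looseLookahead = /=“([^”]*)”(?=.*\])/g;
// Variant using the first answer's lookahead: no other brackets before the "]".
const strictLookahead = /=“([^”]*)”(?=[^\]\[]*\])/g;

const mixedInput = 'x=“y” [tagname id=“1”]';

const looseResult = mixedInput.replace(looseLookahead, '="$1"');
const strictResult = mixedInput.replace(strictLookahead, '="$1"');

// prints: x="y" [tagname id="1"]
console.log(looseResult);
// prints: x=“y” [tagname id="1"]
console.log(strictResult);
```

On the sample string from the question both variants give the same result, because every =“...” there sits inside a shortcode.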