I need to do replacements in unknown third-party input strings that sometimes contain quotes.
I want to replace a whole word wherever it occurs, unless it is inside double or single quotes. An escaped quote (\" or \') does not count as a quote.
Example: replacing FOO with BAR
Input:
FOO "FOO" 'FOO' " 1 FOO 2 " ABCFOOXYZ " str1\"FOO\"str3'FOO'\'\'" ' str1\'FOOstr3"FOO"\"\"' \"FOO\"
Expected output:
BAR "FOO" 'FOO' " 1 FOO 2 " ABCFOOXYZ " str1\"FOO\"str3'FOO'\'\'" ' str1\'FOOstr3"FOO"\"\"' \"BAR\"
More tests:
name: FOO
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[FOO]
statement = FOO " is the owner of the house."
Expected output:
name: BAR
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[BAR]
statement = BAR " is the owner of the house."
I saw this question: Match and replace a word not in quotes (string contains escaped quotes), which I thought was similar and could be a good starting point, but the accepted answer does not work at all:
https://regex101.com/r/Lfan64/5
If anyone could help me get the expected result from my regex that would be great, thanks.
CodePudding user response:
You can use
const text = String.raw`name: FOO
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[FOO]
statement = FOO " is the owner of the house."
FOO "FOO" 'FOO' " 1 FOO 2 " ABCFOOXYZ " str1\"FOO\"str3'FOO'\'\'" ' str1\'FOOstr3"FOO"\"\"' \"FOO\"`
console.log( text.replace(
/((?:[^\\]|^)(?:\\{2})*(?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*'))|FOO/g,
(match, group) => group || "BAR"
))
Details:
- ((?:[^\\]|^)(?:\\{2})*(?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*')) - Group 1:
  - (?:[^\\]|^) - a char other than \ or start of string
  - (?:\\{2})* - zero or more sequences of double backslash
  - (?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*') - either of a double or single quoted string literal pattern with escape sequence support
- | - or
- FOO - a FOO string in any other context.
The (match, group) => group || "BAR" replacement means that if Group 1 matches, the replacement is the Group 1 value; otherwise, the replacement is BAR.
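As posted, the FOO alternative has no word boundaries, so it would also rewrite the FOO inside ABCFOOXYZ. A minimal sketch of one way to wrap the same pattern in a reusable function is shown here. The names QUOTED, escapeRegExp and replaceOutsideQuotes are hypothetical and not part of the answer. It assumes the word starts and ends with a word character, since \b only acts as a whole-word test in that case.

```javascript
// Quoted-string alternative copied from the answer above (capture group 1).
const QUOTED = /((?:[^\\]|^)(?:\\{2})*(?:"[^"\\]*(?:\\[^][^"\\]*)*"|'[^'\\]*(?:\\[^][^'\\]*)*'))/;

// Hypothetical helper: escape regex metacharacters so the word is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical helper: replace whole-word occurrences of `word` outside quoted strings.
function replaceOutsideQuotes(text, word, replacement) {
  // Assumption: `word` begins and ends with a word character, so \b marks its edges.
  const re = new RegExp(`${QUOTED.source}|\\b${escapeRegExp(word)}\\b`, "g");
  // If group 1 (a quoted string) matched, write it back unchanged; otherwise substitute.
  return text.replace(re, (match, quoted) => quoted || replacement);
}

const sample = String.raw`FOO "FOO" ABCFOOXYZ ' it\'s FOO ' \"FOO\"`;
console.log(replaceOutsideQuotes(sample, "FOO", "BAR"));
// BAR "FOO" ABCFOOXYZ ' it\'s FOO ' \"BAR\"
```

Using a callback rather than a replacement string also means that a $ in the replacement text is inserted literally.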
CodePudding user response:
If I am understanding your requirements correctly, you may try this regex for your cases:
/((['"])(?:\\.|(?!\2).)*(?<!\\)\2)|\bFOO\b/g
This regex uses alternation: the left-hand side of | matches the quoted strings we need to keep, which are captured and written back unchanged,
whereas the right-hand side matches whatever we want to replace in the result.
Code:
const str = String.raw`name: FOO
favoriteQuote: "I am my own FOO."
children: 'FOO\'s children'
cars: ownersList[FOO]
statement = FOO " is the owner of the house."
FOO "FOO" 'FOO' " 1 FOO 2 " ABCFOOXYZ " str1\"FOO\"str3'FOO'\'\'" ' str1\'FOOstr3"FOO"\"\"' \"FOO\"`;
var repl = str.replace(/((['"])(?:\\.|(?!\2).)*(?<!\\)\2)|\bFOO\b/g,
(_, g) => g || "BAR");
console.log(repl);
RegEx Details:
( : Start capture group #1
(['"]) : Match ' or " in capture group #2
(?:\\.|(?!\2).)* : Match an escaped character or any character except the quote we matched in capture group #2
(?<!\\)\2 : Match whatever quote we matched in capture group #2 as long as it is not preceded by a \
) : End capture group #1
| : OR
\bFOO\b : Match complete word FOO
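The pattern above relies on a lookbehind, which needs an ES2018-capable engine. If that is a concern, the same rules can be written as a single left-to-right scan. This is not the answer's method, only a rough sketch of an alternative. The function name replaceOutsideQuotesScan is hypothetical, and the sketch assumes that a backslash outside a string escapes the next character, so \" there never opens a string.

```javascript
// Hypothetical helper: one pass over the text, no regex lookbehind needed.
function replaceOutsideQuotesScan(text, word, replacement) {
  if (!word) return text; // nothing to search for
  // Same notion of "word character" as \b without the u flag.
  const isWordChar = (c) => c !== undefined && /\w/.test(c);
  let out = "";
  let i = 0;
  while (i < text.length) {
    const c = text[i];
    if (c === "\\") {
      // Backslash outside a string: copy it and the escaped character untouched.
      out += text.slice(i, i + 2);
      i += 2;
    } else if (c === '"' || c === "'") {
      // Opening quote: advance to the matching closing quote, skipping escaped characters.
      let j = i + 1;
      while (j < text.length && text[j] !== c) {
        j += text[j] === "\\" ? 2 : 1;
      }
      // Copy the whole string verbatim (or the rest of the text if it is unterminated).
      out += text.slice(i, j + 1);
      i = j + 1;
    } else if (
      text.startsWith(word, i) &&
      !isWordChar(text[i - 1]) &&
      !isWordChar(text[i + word.length])
    ) {
      // Whole-word occurrence outside any quoted string.
      out += replacement;
      i += word.length;
    } else {
      out += c;
      i += 1;
    }
  }
  return out;
}

console.log(replaceOutsideQuotesScan(String.raw`\"FOO\" "FOO" ABCFOOXYZ`, "FOO", "BAR"));
// \"BAR\" "FOO" ABCFOOXYZ
```

The scan is longer than the regex, but each rule (escape, quoted string, whole word) sits in its own branch, which may make it easier to adjust for other input conventions.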