I am trying to pick out all values from an object I have in string form. I have created the regular expression, but I am still having issues with not being able to remove the quotes and have hit a wall...
Here is the code I have with results I get compared to desired results:
const regex = /(?:"([^"]+)\")|([^=",{}.]+)/g
const string = 'obj{a="0",b="1",domain="a-ss.test.io:666",f="g",range="3.594e-04...4.084e-04"}'
const matches = string.match(regex)
console.log(matches)
Here is the resulting array:
[
"obj",
"a",
"\"0\"",
"b",
"\"1\"",
"domain",
"\"a-ss.test.io:666\"",
"f",
"\"g\"",
"range",
"\"3.594e-04...4.084e-04\""
]
Though the desired result I would like is:
[
"obj",
"a",
"0",
"b",
"1",
"domain",
"a-ss.test.io:666",
"f",
"g",
"range",
"3.594e-04...4.084e-04"
]
Does anyone know how to also remove the quotes from each array value that is returned?
CodePudding user response:
You get only whole match values in the result because String#match with a regular expression containing the g flag discards all captures. See what String#match returns:
An Array whose contents depend on the presence or absence of the global (g) flag, or null if no matches are found.
- If the g flag is used, all results matching the complete regular expression will be returned, but capturing groups are not included.
- If the g flag is not used, only the first complete match and its related capturing groups are returned. In this case, match() will return the same result as RegExp.prototype.exec() (an array with some extra properties).
You want to get Group 1 or Group 2 values, so ask the engine to return just them:
const regex = /"([^"]+)"|([^=",{}.]+)/g
const string = 'obj{a="0",b="1",domain="a-ss.test.io:666",f="g",range="3.594e-04...4.084e-04"}'
const matches = Array.from(string.matchAll(regex), x => x[1] || x[2])
console.log(matches)
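To make the quoted behavior concrete, here is a minimal sketch that runs the same pattern three ways on a shortened input. The names demoRegex, demoRegexGlobal and demoInput are made up for this illustration and are not part of the original answer.

```javascript
// Same pattern as in the answer above, once without and once with the g flag.
const demoRegex = /"([^"]+)"|([^=",{}.]+)/;
const demoRegexGlobal = /"([^"]+)"|([^=",{}.]+)/g;
const demoInput = 'a="0",b="1"'; // shortened sample, for illustration only

// Without g: only the first match, but its capturing groups are available.
const firstOnly = demoInput.match(demoRegex);
console.log(firstOnly[0], firstOnly[1], firstOnly[2]); // a undefined a

// With g: every whole match is returned, the groups are dropped,
// which is why the quotes are still present.
const wholeMatches = demoInput.match(demoRegexGlobal);
console.log(wholeMatches); // [ 'a', '"0"', 'b', '"1"' ]

// matchAll (which requires the g flag): one match object per hit,
// each with its groups, so the unquoted group value can be picked.
const groupValues = Array.from(
  demoInput.matchAll(demoRegexGlobal),
  m => m[1] || m[2]
);
console.log(groupValues); // [ 'a', '0', 'b', '1' ]
```

Only one of the two groups participates in any given match, so m[1] || m[2] picks whichever one is defined.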
CodePudding user response:
The approach below uses the regex /(?<key>\p{L}+)="(?<value>[^"]*)"|(?<id>\p{L}+)\{/gu, which makes use of named capturing groups and Unicode property escapes (the latter, \p{L}, for matching any letter sequence in any language).
The regex first tries to match and capture key-value pairs of the form <key>="<value>" with its first alternative, (?<key>\p{L}+)="(?<value>[^"]*)", which
- matches and captures any letter sequence of any language ... (?<key>\p{L}+)
- followed by the two characters ... ="
- followed by a captured character sequence that does not contain a double quote ... (?<value>[^"]*)
- followed by a double quote ... "
Otherwise it tries to match and capture the object-related part of the form <obj>{ ... } with its second alternative, (?<id>\p{L}+)\{, which
- matches and captures any letter sequence of any language ... (?<id>\p{L}+)
- followed by an opening brace ... \{
The pattern is then passed to matchAll, whose return value (an iterator) has to be converted into an array in order to be mappable.
The method used is flatMap, since the callback returns either the single captured id value or an array of the two other captured values, key and value.
const sampleData = `obj{a="0",b="1",domain="a-ss.test.io:666",f="g",range="3.594e-04...4.084e-04"}`;
// see ... [https://regex101.com/r/PjNhlx/1]
const regXTokens = /(?<key>\p{L}+)="(?<value>[^"]*)"|(?<id>\p{L}+)\{/gu;
console.log(
[...sampleData.matchAll(regXTokens)]
.flatMap(({ groups: { id, key, value } }) => id || [key, value])
);
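To show why flatMap rather than map is used here, the following sketch runs the same callback through both methods on a shortened input. The names tokenPattern, shortSample, perMatch and flattened are invented for this illustration.

```javascript
// Same pattern as above; shortSample is a shortened, made-up input.
const tokenPattern = /(?<key>\p{L}+)="(?<value>[^"]*)"|(?<id>\p{L}+)\{/gu;
const shortSample = 'obj{a="0",f="g"}';

// Every match carries a `groups` object. Groups belonging to the
// alternative that did not participate in the match are undefined.
const perMatch = [...shortSample.matchAll(tokenPattern)].map(
  ({ groups: { id, key, value } }) => id || [key, value]
);
console.log(perMatch); // [ 'obj', [ 'a', '0' ], [ 'f', 'g' ] ]

// flatMap flattens exactly one level, so each [key, value] pair is
// merged into the surrounding list while the plain id string stays as is.
const flattened = [...shortSample.matchAll(tokenPattern)].flatMap(
  ({ groups: { id, key, value } }) => id || [key, value]
);
console.log(flattened); // [ 'obj', 'a', '0', 'f', 'g' ]
```

Because the value group uses [^"]*, an empty quoted value such as x="" would still be captured as an empty string instead of being skipped.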
CodePudding user response:
Assuming none of the characters { } = " , appear in your quoted values, you can simply call .split() on those characters:
const string = 'obj{a="0",b="1",domain="a-ss.test.io:666",f="g",range="3.594e-04...4.084e-04"}';
let matches = string.split(/[{}=",]+/);
console.log(matches);
Output:
[
"obj",
"a",
"0",
"b",
"1",
"domain",
"a-ss.test.io:666",
"f",
"g",
"range",
"3.594e-04...4.084e-04",
""
]
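The trailing empty string appears because the closing "} sits at the very end of the input, so split() emits an empty piece after it. One possible way to get exactly the desired array is to filter empty strings out afterwards. This is a sketch, not part of the original answer, and the names splitInput and splitTokens are made up here.

```javascript
const splitInput = 'obj{a="0",b="1",domain="a-ss.test.io:666",f="g",range="3.594e-04...4.084e-04"}';

// Split on runs of delimiter characters, then drop empty strings
// (here only the one produced by the delimiters at the end of the input).
const splitTokens = splitInput.split(/[{}=",]+/).filter(Boolean);
console.log(splitTokens);
```

Note that because the delimiter class is repeated with +, an empty quoted value such as x="" is swallowed by the split itself, which would shift the key/value alignment. This approach therefore also assumes that no value is empty.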