There are many posts on converting JSON into arrays in JavaScript, but none of the search results I reviewed produce the specific result I am after (described below).
I'm looking for a RegEx that can convert JSON into an array of all characters (commas, curly brackets for objects, brackets for arrays, colons, escape characters, null, etc.) but ignores whitespace.
Pseudo-code for RegEx:
Match any curly bracket (open or closed), any word wrapped in double quotes, colon, comma, square bracket (open or closed), number, and null.
Sample Input String (JSON will change)
{
  "hello":"world",
  "\"foo\"":"bar",
  "my_object":{
    "my_array": [
      1,
      null,
      { "my_key":"my_value" }
    ]
  }
}
Desired Output From RegEx
[
"{", "hello", ":", "world", ",", "\"foo\"", ":", "bar", ",", "my_object",":", "{", "my_array", ":", "[", 1, ",", null, ",", "{", "my_key", ":", "my_value", "}", "]", "}", "}"
]
Thank you!
CodePudding user response:
RegExp is good at quite a number of tasks, but I still strongly recommend not using it with data structures like JSON (or, god forbid, XML/HTML).
Since this is literally a data structure in JavaScript, use a recursive function to get the output you want:
const jsdoc = {
  "hello": "world",
  "\"foo\"": "bar",
  "my_object": {
    "my_array": [
      1,
      null,
      {
        "my_key": "my_value"
      }
    ]
  }
};
const parseToArray = obj => {
  // Arrays: "[", then each element's tokens separated by ",", then "]".
  if (Array.isArray(obj)) {
    const res = ['['];
    obj.forEach((entry, i) => {
      if (i > 0) res.push(',');
      res.push(...parseToArray(entry));
    });
    res.push(']');
    return res;
  }
  // Objects (null excluded): "{", then key, ":", value tokens per entry, then "}".
  if (obj && typeof obj === 'object') {
    const res = ['{'];
    Object.entries(obj).forEach(([key, value], i) => {
      if (i > 0) res.push(',');
      res.push(key, ':', ...parseToArray(value));
    });
    res.push('}');
    return res;
  }
  // Primitives (string, number, boolean, null) become a single token.
  return [obj];
};
console.log(parseToArray(jsdoc));
NOTE: Commas are pushed only between sibling entries, never after the last one, so no stray separators appear and empty arrays and objects come out as "[", "]" and "{", "}".
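parseToArray above expects an already-parsed value, while the sample input in the question is a string. A minimal sketch of bridging that gap, assuming the input is valid JSON: parse it with JSON.parse first, then walk the result. The names walk, jsonToTokens and input are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: flattens an already-parsed JSON value into a token array.
const walk = value => {
  if (Array.isArray(value)) {
    // "[", the elements separated by ",", then "]".
    return ['[', ...value.flatMap((v, i) => (i ? [',', ...walk(v)] : walk(v))), ']'];
  }
  if (value !== null && typeof value === 'object') {
    // "{", then key, ":", value tokens for each entry, separated by ",", then "}".
    return ['{', ...Object.entries(value).flatMap(([k, v], i) =>
      [...(i ? [','] : []), k, ':', ...walk(v)]), '}'];
  }
  // Strings, numbers, booleans and null are emitted as a single token.
  return [value];
};

// Hypothetical entry point: takes JSON text rather than a JavaScript object.
// JSON.parse throws a SyntaxError on malformed input.
const jsonToTokens = text => walk(JSON.parse(text));

const input = '{ "hello": "world", "\\"foo\\"": "bar", "list": [1, null] }';
console.log(jsonToTokens(input));
```

One caveat of going through JSON.parse: the tokens reflect the parsed object, not the original text, so duplicate keys collapse into one and integer-like keys are listed before other keys.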
ANOTHER NOTE: This is proof-of-concept code at best. It is recursive and builds many intermediate arrays, so it will not scale well to large or deeply nested documents without substantial tweaking. However, I think it demonstrates that flattening your JSON this way is much less of a headache than trying to account for every weird edge case you might encounter using RegExp.
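For comparison, here is a rough sketch of the RegExp route the question asks about, assuming the input is already valid JSON. The names TOKEN, PUNCTUATION, regexTokens and sample are hypothetical. The pattern tries string literals first so that brackets, commas and colons inside a quoted string are not split out, and each non-punctuation match is decoded with JSON.parse.

```javascript
// One alternative per token type: string literal, number, true/false/null, punctuation.
const TOKEN = /"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null|[{}\[\]:,]/g;
const PUNCTUATION = new Set(['{', '}', '[', ']', ':', ',']);

// Punctuation is kept as-is; everything else is decoded with JSON.parse,
// so "\"foo\"" loses its outer quotes and 1 / null keep their types.
const regexTokens = text =>
  (text.match(TOKEN) || []).map(tok => (PUNCTUATION.has(tok) ? tok : JSON.parse(tok)));

// String.raw keeps the backslashes exactly as they appear in the JSON text.
const sample = String.raw`{
  "hello":"world",
  "\"foo\"":"bar",
  "my_object":{ "my_array": [1, null, { "my_key":"my_value" }] }
}`;
console.log(regexTokens(sample));
```

This does no validation: text that matches none of the alternatives is silently skipped, and a malformed number such as 01 makes JSON.parse throw. Those are the kinds of edge cases that the recursive approach avoids.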