I need to parse a string that contains both text content and specific tags.
The expected result is an array of items in which text parts and tags are separated.
An example of a string to parse:
There is user [[user-foo]][[/user-foo]] and user [[user-bar]]label[[/user-bar]].
Some information:
- The user- part of the tag is static.
- The following part (foo or bar) is dynamic and can be any string.
- The same goes for the text parts.
- Tags can receive some text as a child.
Expected result
[
'There is user ',
'[[user-foo]][[/user-foo]]',
' and user ',
'[[user-bar]]label[[/user-bar]]',
'.'
]
What I tried
Here is a regex I created:
/\[\[user-[^\]]+\]\][A-Za-z]*\[\[\/user-[^\]]+\]\]/g
It's visible/editable here: https://regex101.com/r/ufwVV1/1
It identifies all the tag parts and returns two matches, one for each of the two tags. However, the text content between them is not included. I don't know whether this first approach is correct.
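For reference, here is a minimal sketch of what this currently returns. It assumes the quantifier after each character class is +; the names sample, tagOnly and found are only illustrative.

```javascript
const sample = 'There is user [[user-foo]][[/user-foo]] and user [[user-bar]]label[[/user-bar]].';

// Assumed form of the pattern above, with "+" after each [^\]] class.
const tagOnly = /\[\[user-[^\]]+\]\][A-Za-z]*\[\[\/user-[^\]]+\]\]/g;

// With the g flag, match() returns only the matched substrings,
// so the text between the tags is lost.
const found = sample.match(tagOnly);
console.log(found);
// [ '[[user-foo]][[/user-foo]]', '[[user-bar]]label[[/user-bar]]' ]
```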
CodePudding user response:
There may be a more efficient solution, but at least this one works.
- Get the tags using regex
- Get the tags position (start/end) within the string
- Use those positions against the string
const string = "There is user [[user-foo]][[/user-foo]] and user [[user-bar]]label[[/user-bar]]."

// Get the tags using regex (each opening and each closing tag is a separate match)
const matches = [...string.matchAll(/\[\[[a-z-\/]+\]\]/g)]
console.log(matches.map((match) => match[0]))

// Get the tags position (start/end) within the string
// match.index is used rather than string.indexOf(), which would always
// return the first occurrence when the same tag appears more than once
const matchPositions = matches.map((match) => ({start: match.index, end: match.index + match[0].length}))
console.log(matchPositions)

// Use those positions against the string
// (assumes the matches alternate: opening tag, closing tag, opening tag, ...)
let currentPos = 0
const result = []
for (let i = 0; i < matchPositions.length; i += 2) {
  const position = matchPositions[i]
  const secondPosition = matchPositions[i + 1]

  // Get the substring in front of the current tag (if any)
  if (position.start !== currentPos) {
    const firstSubString = string.slice(currentPos, position.start)
    if (firstSubString !== "") {
      result.push(firstSubString)
    }
  }

  // Get the substring from the opening tag start to the closing tag end
  result.push(string.slice(position.start, secondPosition.end))
  currentPos = secondPosition.end

  // Get the substring at the end of the string (if any)
  if (i === matchPositions.length - 2) {
    const lastSubString = string.slice(secondPosition.end)
    if (lastSubString !== "") {
      result.push(lastSubString)
    }
  }
}
console.log(result)
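The loop above assumes that the matches strictly alternate between an opening and a closing tag. As a hedged variant, the sketch below matches each pair in one go and uses a backreference so that the closing tag must repeat the opening tag's name. The helper name splitTags is hypothetical, and the label between the tags is assumed not to contain a [ character.

```javascript
// Hypothetical helper, not part of the answer above.
function splitTags(text) {
  // Group 1 captures the tag name; \1 requires the closing tag to repeat it.
  // Assumption: the label between the two tags contains no "[".
  const pairPattern = /\[\[(user-[^\]]+)\]\][^\[]*\[\[\/\1\]\]/g;
  const parts = [];
  let cursor = 0;

  for (const match of text.matchAll(pairPattern)) {
    // Text in front of the current tag pair (if any)
    if (match.index > cursor) {
      parts.push(text.slice(cursor, match.index));
    }
    // The whole pair, from opening tag start to closing tag end
    parts.push(match[0]);
    cursor = match.index + match[0].length;
  }

  // Text after the last tag pair (if any)
  if (cursor < text.length) {
    parts.push(text.slice(cursor));
  }
  return parts;
}

console.log(splitTags('There is user [[user-foo]][[/user-foo]] and user [[user-bar]]label[[/user-bar]].'));
```

With this pattern, a pair whose closing name differs from its opening name (for example [[user-foo]]y[[/user-bar]]) is not matched and simply stays inside the surrounding text item.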
CodePudding user response:
Here is my solution, inspired by @louys-patrice-bessette's answer.
const string = 'There is user [[user-foo]][[/user-foo]] and user [[user-bar]]label[[/user-bar]].';
const regex = /\[\[user-[^\]]+\]\][A-Za-z0-9_ ]*\[\[\/user-[^\]]+\]\]/g;

const { index, items } = [...string.matchAll(regex)].reduce(
  (result, regExpMatchArray) => {
    const [match] = regExpMatchArray;
    const { index: currentIndex } = regExpMatchArray;
    if (currentIndex === undefined) {
      return result;
    }
    return {
      items: [
        ...result.items,
        string.substring(result.index, currentIndex),
        match,
      ],
      index: currentIndex + match.length,
    };
  },
  {
    index: 0,
    items: [],
  }
);

if (index !== string.length) {
  items.push(string.substring(index, string.length));
}
console.log(items);
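A shorter alternative, sketched under a few assumptions: when String.prototype.split is given a regex with a capturing group, the captured text is kept in the resulting array, so wrapping the whole tag pair in one group yields text and tags in order. The names input, tagPattern and parts are illustrative, and the label is assumed not to contain a [ character.

```javascript
const input = 'There is user [[user-foo]][[/user-foo]] and user [[user-bar]]label[[/user-bar]].';

// One capturing group around the whole tag pair, so split() keeps it in the output.
// Assumption: the label between the tags contains no "[".
const tagPattern = /(\[\[user-[^\]]+\]\][^\[]*\[\[\/user-[^\]]+\]\])/;

// split() yields empty strings when the input starts or ends with a tag,
// or when two tags are adjacent, so those are filtered out.
const parts = input.split(tagPattern).filter((part) => part !== '');
console.log(parts);
// [ 'There is user ', '[[user-foo]][[/user-foo]]', ' and user ', '[[user-bar]]label[[/user-bar]]', '.' ]
```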