I have this input text with a pattern * [*]*[/*] *:
Input: [tag]random text EXCLUDE[/tag] text here
Output: text here
Input: [tag_1]another random text EXCLUDE[/tag_1] another text here
Output: another text here
Input: [tag_1]another random text EXCLUDE[/tag_1] another text here [tag_1] another random text EXCLUDE[/tag_1] text [tag_3]another random text EXCLUDE[/tag_3]
Output: another text here text
What I want is to remove the tags [*] and [/*] together with the text between them, i.e. replace the whole thing with ''.
The problem is that the symbols between [] are random, but whenever there is an opening [*] there is also a closing [/*], with no nesting.
CodePudding user response:
This should do the trick:
let OUTPUT = INPUT
    .split('[')
    .filter(s => s.indexOf(']') == -1 || s.startsWith('/'))
    .map(s => s
        .split(']')
        .reverse()[0])
    .join('')
The main point is that the text inside [] doesn't actually matter; all we need are the square brackets to act as "anchors".
I tried and failed to write a concise step-by-step explanation... My suggestion is to copy-paste the code into a console, feed it some data, and watch what comes out of each step; it's self-evident.
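For anyone who would rather read the intermediate values than run it by hand, here is a sketch of the same chain broken into named steps. The variable names (input, pieces, kept, tails, output) are only for illustration and are not part of the answer above; it also assumes, as the question states, that tags are never nested.

```javascript
// Same chain as above, split into named steps (names are illustrative).
const input = '[tag]random text EXCLUDE[/tag] text here';

// 1. Cut at every '['. Each piece after the first starts right after a '['.
const pieces = input.split('[');
// ['', 'tag]random text EXCLUDE', '/tag] text here']

// 2. Keep pieces with no ']' (plain text before the first tag) and pieces
//    that start with '/' (a closing tag followed by text we want to keep).
const kept = pieces.filter(s => s.indexOf(']') === -1 || s.startsWith('/'));
// ['', '/tag] text here']

// 3. From each kept piece, take only what follows the last ']'.
const tails = kept.map(s => s.split(']').reverse()[0]);
// ['', ' text here']

// 4. Glue the remaining text back together.
const output = tails.join('');
console.log(JSON.stringify(output)); // " text here"
```

Note that the leading space survives; add a .trim() at the end if that matters.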
CodePudding user response:
This looks like a nice job for regular expressions:
const theStrings = [
    '[tag]random text EXCLUDE[/tag] text here',
    '[tag_1]another random text EXCLUDE[/tag_1] another text here',
    '[random]another random text EXCLUDE[/random] another text here',
    '[this]is not to be [/filtered] as they don\'t match',
    '[tag_1]another random text EXCLUDE[/tag_1] another text here [tag_1] another random text EXCLUDE[/tag_1] text [tag_3]another random text EXCLUDE[/tag_3]'
];

theStrings.forEach(str => {
    let replaced = str;
    let prev_len;
    do {
        prev_len = replaced.length;
        // Removes one [name]...[/name] pair per pass; \1 forces the closing name to equal the opening one.
        replaced = replaced.replace(/\[(.*)\].*?\[\/\1\](.*)/, '$2');
    } while (replaced.length < prev_len); // stop once a pass removes nothing
    console.log("before:", str, "\nafter:", replaced);
});
https://regex101.com/r/cE6NZm/1
Basically, it captures anything between [this] and [/this] (notice the names are the same, but the second one has to be preceded by a /) and leaves that portion out of the string.
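As a possible variation on the same idea, the loop can be swapped for a single pass with the g flag. This is only a sketch: stripTags is a hypothetical helper name, and the whitespace clean-up at the end is an assumption based on the expected outputs in the question, not something either answer does.

```javascript
// Hypothetical helper; stripTags is not from either answer above.
function stripTags(text) {
  return text
    // \[([^\[\]]+)\]  opening tag; its name is captured as group 1
    // [\s\S]*?        as little text as possible, line breaks included
    // \[\/\1\]        closing tag with the same name as group 1
    .replace(/\[([^\[\]]+)\][\s\S]*?\[\/\1\]/g, '')
    // Assumption: collapse leftover runs of whitespace and trim the ends.
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(stripTags('[tag]random text EXCLUDE[/tag] text here'));
// text here
console.log(stripTags("[this]is not to be [/filtered] as they don't match"));
// [this]is not to be [/filtered] as they don't match
```

Mismatched pairs such as [this]...[/filtered] are left alone because the backreference \1 never finds a closing tag with the same name.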