Remove text between random symbols - JavaScript

I have input text that follows the pattern * [*]*[/*] *:

Input: [tag]random text EXCLUDE[/tag] text here

Output: text here

Input: [tag_1]another random text EXCLUDE[/tag_1] another text here

Output: another text here

Input: [tag_1]another random text EXCLUDE[/tag_1] another text here [tag_1] another random text EXCLUDE[/tag_1] text [tag_3]another random text EXCLUDE[/tag_3]

Output: another text here text

What I want is to remove the text between [*] and [/*], tags included, i.e. replace it with ''.

The problem is that the symbols between [] are random, but whenever there is an opening [*], there is also a closing [/*], and the tags are never nested.

CodePudding user response：

This should do the trick:

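// INPUT is the string to clean, e.g. '[tag]random text EXCLUDE[/tag] text here'
let OUTPUT = INPUT
    // cut at every '[': each chunk after the first starts with a tag name
    .split('[')
    // keep plain text (no ']') and chunks that start with a closing tag
    .filter(s => !s.includes(']') || s.startsWith('/'))
    // keep only what follows the last ']' in each remaining chunk
    .map(s => s
        .split(']')
        .pop())
    .join('')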

The main point is that the text inside [] doesn't actually matter; all we need are the square brackets to act as "anchors".

I tried and failed to write a concise step-by-step explanation... My suggestion is to copy-paste the code into a console, feed it some data and watch what comes out of each step; it's self-evident.
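For readers who would rather not open a console, here is a rough trace of the pipeline on the first sample input from the question. The intermediate variable names (chunks, kept, tails) are only for illustration and are not part of the answer's code.

```javascript
// Trace each stage of the split/filter/map/join pipeline on one sample input.
const input = '[tag]random text EXCLUDE[/tag] text here';

// 1. Split on '[': every chunk after the first begins with a tag name.
const chunks = input.split('[');
// chunks: ['', 'tag]random text EXCLUDE', '/tag] text here']

// 2. Keep chunks with no ']' (plain text before any tag)
//    or chunks that start with '/' (they begin with a closing tag).
const kept = chunks.filter(s => s.indexOf(']') === -1 || s.startsWith('/'));
// kept: ['', '/tag] text here']

// 3. In each kept chunk, keep only what follows the last ']'.
const tails = kept.map(s => s.split(']').reverse()[0]);
// tails: ['', ' text here']

// 4. Glue the surviving pieces back together.
const output = tails.join('');
console.log(JSON.stringify(output)); // " text here" (the leading space is preserved)
```

Note that this approach never compares tag names, so it relies on the question's guarantee that every opening tag has a closing one and that the kept text contains no stray square brackets.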

CodePudding user response：

This looks like a nice job for regular expressions:

const theStrings = [ '[tag]random text EXCLUDE[/tag] text here',
  '[tag_1]another random text EXCLUDE[/tag_1] another text here',
  '[random]another random text EXCLUDE[/random] another text here',
  '[this]is not to be [/filtered] as they don\'t match',
  '[tag_1]another random text EXCLUDE[/tag_1] another text here [tag_1] another random text EXCLUDE[/tag_1] text [tag_3]another random text EXCLUDE[/tag_3]'
 ]

theStrings.forEach(str => {
  let replaced = str;
  let prev_len;
  do {
    prev_len = replaced.length;
    // remove the first [x]...[/x] pair; \1 forces the closing tag to repeat the opening tag's name
    replaced = replaced.replace(/\[(.*)\].*?\[\/\1\](.*)/, '$2');
  } while (replaced.length < prev_len); // stop once a pass removes nothing
  console.log("before:", str, "\nafter:", replaced);
});

https://regex101.com/r/cE6NZm/1

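Basically, it matches everything from [this] to [/this] (notice the names are the same, but the second one has to be preceded by a /) and leaves that portion out of the string. Each replace call removes only the first matching pair, so the loop repeats until the string stops getting shorter.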
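As a possible variation, the loop can probably be avoided by using the g flag and restricting the tag name to characters other than ]. This is only a sketch, not the code from the answer above; stripTagged is a hypothetical helper name, and [\s\S] is used so the removed text may span line breaks, which the question does not ask for.

```javascript
// Hypothetical helper: remove every [x]...[/x] pair in a single pass.
function stripTagged(str) {
  // \[([^\]]+)\]  opening tag, name captured in group 1
  // [\s\S]*?      shortest possible body, line breaks included
  // \[\/\1\]      closing tag that repeats the captured name
  return str.replace(/\[([^\]]+)\][\s\S]*?\[\/\1\]/g, '');
}

const multi = '[tag_1]another random text EXCLUDE[/tag_1] another text here ' +
  '[tag_1] another random text EXCLUDE[/tag_1] text ' +
  '[tag_3]another random text EXCLUDE[/tag_3]';
const mismatched = "[this]is not to be [/filtered] as they don't match";

console.log(JSON.stringify(stripTagged(multi)));      // " another text here  text "
console.log(stripTagged(mismatched) === mismatched);  // true: mismatched tags are left alone
```

Like the loop version, this keeps the whitespace around the removed spans, so a trim() or a whitespace-collapsing replace may still be wanted to get exactly the output shown in the question.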