I would like to create an Array out of this string:
// string
'a b[text="Fly to San Fran",text2="More"] c foo[text=Fly to San Fran,text2=More] bar d'
// resulting array:
[
'a',
'b[text="Fly to San Fran",text2="More"]',
'c',
'foo[text=Fly to San Fran,text2=More]',
'bar',
'd'
]
What would a regex to split the string look like, or is this the wrong approach?
So far I have tried the following, which results in far too many null values:
/([a-zA-Z]*\[[a-z]*\])|([\w]*)/g
=>
[
'a',
null,
'b[text="Fly to San Fran",text2="More"]',
null,
'c',
null,
'foo',
null,
'[text=Fly to San Fran,text2=More]',
null,
'bar',
null,
'd'
]
CodePudding user response:
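\[[a-z]*\] matches only lowercase letters within [...]. But the brackets in your string also contain double quotes, equals signs, commas, digits and spaces.
It is better to use a negated character class here: \[[^\]\[]*\] matches any run of characters inside the brackets that are neither [ nor ].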
const s = `a b[text="Fly to San Fran",text2="More"] ` +
  `c foo[text=Fly to San Fran,text2=More] bar d`;
// Either an optional name followed by a [...] block, or a plain run of word characters
let res = s.match(/[a-zA-Z]*\[[^\]\[]*\]|\w+/g);
res.forEach(element => console.log(element));
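To make the pattern above easier to reuse, here is a minimal sketch that wraps it in a helper. The name splitOutsideBrackets is made up for illustration, and the sketch assumes that brackets are never nested and that names before a bracket consist of letters only.

```javascript
// Hypothetical helper (not part of any library); assumes no nested brackets.
function splitOutsideBrackets(input) {
  // Either an optional name followed by a [...] block, or a run of word characters.
  const pattern = /[a-zA-Z]*\[[^\]\[]*\]|\w+/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return input.match(pattern) || [];
}

const parts = splitOutsideBrackets(
  'a b[text="Fly to San Fran",text2="More"] c foo[text=Fly to San Fran,text2=More] bar d'
);
console.log(parts);
```

Because match() with the g flag returns only the full matches, there are no capture groups that could produce empty or null entries.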
CodePudding user response:
Using a regex in this case would be extremely tedious.
I would recommend using a CSS selector parsing library.
In the following example, we use the parsel library to tokenize the selector, then reduce
over the tokens, merging adjacent tokens that are not separated by a space.
const str = `a b[text="Fly to San Fran",text2="More"] c foo[text=Fly to San Fran,text2=More] bar d`
const tokens = parsel.tokenize(str).map(e => e.content)
const res = tokens.slice(1).reduce((acc, curr) => {
  const prev = acc[acc.length - 1]
  // Start a new entry at a space (or right after one); otherwise append to the previous entry
  return curr == " " || prev == " " ? acc.push(curr) : acc[acc.length - 1] += curr, acc
}, [tokens[0]]).filter(e => e != " ")
console.log(res)
<script src="https://projects.verou.me/parsel/dist/nomodule/parsel.js"></script>
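If pulling in a library is not an option, the same idea (only split on spaces that are outside brackets) can be sketched with a plain loop. This is an illustration under stated assumptions, not what parsel does internally: the name splitTopLevel is hypothetical, and the sketch does not handle a ] that appears inside a quoted value.

```javascript
// Hypothetical helper: split on spaces that are outside [...] blocks.
// Assumption: "]" never appears inside a quoted attribute value.
function splitTopLevel(input) {
  const parts = [];
  let current = "";
  let depth = 0; // number of "[" that have not been closed yet
  for (const ch of input) {
    if (ch === "[") depth++;
    else if (ch === "]" && depth > 0) depth--;
    if (ch === " " && depth === 0) {
      // A space outside brackets ends the current piece (empty pieces are skipped).
      if (current !== "") parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") parts.push(current);
  return parts;
}

console.log(splitTopLevel('a b[text="Fly to San Fran",text2="More"] c'));
```

Tracking the bracket depth keeps the spaces inside [...] attached to their piece, which is the part that is awkward to express with a single split regex.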