My input will always look something like this, with an arbitrary number of key/value pairs:
"irrelevant part[identifier]key1=val1;key2=val2;key3=val3"
and the desired end result is [key1, val1, key2, val2, key3, val3]
I'm able to match and get key1=val1;key2=val2;key3=val3
via \[identifier](.*)
but am lost on how to grab the pairs delimited by a semi-colon.
CodePudding user response:
Complex regexes are powerful, but they are notoriously difficult to comprehend and maintain, and they can also get quite slow. You don't seem to need one for your case at all. I would go with something like this (note that it yields a Map of key -> value rather than a flat list):
input
  .dropWhile(_ != ']')
  .tail
  .split(";")
  .map { _.split("=") }
  .collect { case Array(k, v) => k -> v }
  .toMap
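If the flat [key1, val1, key2, val2, ...] shape from the question is what you need, the same split-based idea can be sketched as below. This is only a sketch under assumptions: the object name SplitPairs, the helper flatPairs, and its marker parameter are made up for illustration, and it assumes the literal text [identifier] marks where the pairs start.

```scala
object SplitPairs {
  // Hypothetical helper (not from the answer above): returns keys and values
  // interleaved in input order, or Nil when the marker is absent.
  def flatPairs(input: String, marker: String = "[identifier]"): List[String] = {
    val idx = input.indexOf(marker)
    if (idx < 0) Nil
    else {
      // Everything after the marker is assumed to be ';'-separated pairs.
      val tail = input.substring(idx + marker.length)
      tail.split(";").toList.flatMap { pair =>
        // Limit 2 keeps any further '=' inside the value and keeps empty values.
        pair.split("=", 2) match {
          case Array(k, v) => List(k, v)
          case _           => Nil // skip fragments without '='
        }
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val text = "irrelevant part[identifier]key1=val1;key2=val2;key3=val3"
    println(flatPairs(text)) // List(key1, val1, key2, val2, key3, val3)
  }
}
```

Searching for the marker with indexOf, instead of dropWhile plus tail, avoids an exception when the input contains no ']' at all.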
CodePudding user response:
You can use a pattern with the \G anchor, which matches at the position where the previous match ended:
val text = "irrelevant part[identifier]key1=val1;key2=val2;key3=val3"
val regex = """(?:\G(?!^);|\[identifier])([\w-]+)=([^;]*)""".r
val results = (regex findAllIn text).matchData.map(x => (x.group(1), x.group(2))).toMap
println(results) // => Map(key1 -> val1, key2 -> val2, key3 -> val3)
Details:
(?:\G(?!^);|\[identifier]) - either the end of the previous match followed by a ; char, or the [identifier] char sequence
([\w-]+) - Group 1: one or more word or - chars
= - a = char
([^;]*) - Group 2: any zero or more chars other than a semi-colon.
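To get the flat list the question asks for instead of a Map, the two captured groups of each match can be flattened in order. The sketch below assumes the pattern with the + quantifier after [\w-], as the Group 1 description says (one or more chars); the names RegexPairs and flatPairs are made up for illustration.

```scala
object RegexPairs {
  // First match must start at the literal "[identifier]"; every later match
  // must start with ';' exactly where the previous match ended (\G).
  private val PairRegex = """(?:\G(?!^);|\[identifier])([\w-]+)=([^;]*)""".r

  // Hypothetical helper: keys and values interleaved in match order.
  def flatPairs(text: String): List[String] =
    PairRegex.findAllMatchIn(text).flatMap(m => List(m.group(1), m.group(2))).toList

  def main(args: Array[String]): Unit = {
    println(flatPairs("irrelevant part[identifier]key1=val1;key2=val2;key3=val3"))
    // List(key1, val1, key2, val2, key3, val3)

    // Pairs that appear before the marker are not picked up, because
    // \G(?!^) rules out a match that begins at the start of the input.
    println(flatPairs("k=0;[identifier]a=1;b=2"))
    // List(a, 1, b, 2)
  }
}
```

Since each subsequent match is anchored to the end of the previous one, key/value-looking text elsewhere in the string is ignored unless it directly continues the chain that begins at [identifier].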