Regex for obtaining all key value pairs with delimiter

Time:07-09

My input will always look something like this with an arbitrary amount of key/value pairs:

"irrelevant part[identifier]key1=val1;key2=val2;key3=val3"

and the desired end result is [key1, val1, key2, val2, key3, val3]

I'm able to match and get key1=val1;key2=val2;key3=val3 via \[identifier](.*) but am lost on how to grab the pairs delimited by a semi-colon.

CodePudding user response:

Complex regexes are powerful but notoriously difficult to comprehend and maintain, and they can also get quite slow. You don't seem to need one for your case at all; I would go with something like this:

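val input = "irrelevant part[identifier]key1=val1;key2=val2;key3=val3"

input
 .dropWhile(_ != ']')  // skip everything before the closing bracket
 .drop(1)              // drop the ']' itself; unlike .tail, this is safe if no ']' is found
 .split(";")
 .map { _.split("=") }
 .collect { case Array(k, v) => k -> v }  // keep only well-formed key=value pairs
 .toMap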
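
If you want the flat sequence from the question rather than a Map, the same split-based idea could be sketched roughly as follows. The helper name flatPairs is made up for illustration, and the sketch assumes that keys never contain = or ; (a value may contain =, since the split is limited to two parts):

```scala
// Hypothetical helper, not part of the answer above: returns key1, val1, key2, val2, ...
def flatPairs(input: String): List[String] =
  input
    .dropWhile(_ != ']')  // skip everything before the closing bracket
    .drop(1)              // drop the ']' itself (safe on an empty string)
    .split(";")
    .toList
    .flatMap { pair =>
      pair.split("=", 2) match {   // limit 2: only the first '=' separates key and value
        case Array(k, v) => List(k, v)
        case _           => Nil    // ignore fragments without '='
      }
    }

val flat = flatPairs("irrelevant part[identifier]key1=val1;key2=val2;key3=val3")
println(flat) // List(key1, val1, key2, val2, key3, val3)
```

Unlike the Map version, this keeps the pairs in input order and does not collapse duplicate keys.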

CodePudding user response:

You can use a pattern with a \G operator:

val text = "irrelevant part[identifier]key1=val1;key2=val2;key3=val3"
val regex = """(?:\G(?!^);|\[identifier])([\w-]+)=([^;]*)""".r
val results = (regex findAllIn text).matchData.map(x => (x.group(1), x.group(2))).toMap
println(results) // => Map(key1 -> val1, key2 -> val2, key3 -> val3)

Details:

  • (?:\G(?!^);|\[identifier]) - either the end of the previous match followed by a ; char, or the [identifier] char sequence; the (?!^) lookahead stops \G from matching at the start of the string
  • ([\w-]+) - Group 1: one or more word or - chars
  • = - a = char
  • ([^;]*) - Group 2: any zero or more chars other than a semi-colon.
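
To get the flat list from the question instead of a Map, the two groups of each match can be flattened. The sketch below uses the pattern with the + quantifier in Group 1; the names sample, pairRegex, flatFromRegex, and ignoredPrefix are only illustrative. The second input is an assumed example showing that pairs appearing before [identifier] are skipped, because \G only continues from a previous match:

```scala
val sample = "irrelevant part[identifier]key1=val1;key2=val2;key3=val3"
val pairRegex = """(?:\G(?!^);|\[identifier])([\w-]+)=([^;]*)""".r

// Each match contributes its key (group 1) followed by its value (group 2).
val flatFromRegex: List[String] =
  pairRegex.findAllMatchIn(sample).flatMap(m => List(m.group(1), m.group(2))).toList

println(flatFromRegex) // List(key1, val1, key2, val2, key3, val3)

// Assumed extra input: a=b comes before [identifier], so it is not part of the chain.
val ignoredPrefix: List[String] =
  pairRegex.findAllMatchIn("a=b;[identifier]k=v").flatMap(m => List(m.group(1), m.group(2))).toList

println(ignoredPrefix) // List(k, v)
```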