Collecting Values from a String between two different characters including the last value

Time:12-20

Example string:

x=0, y=0, width=1920, height=1080, width_mm=531, height_mm=299, name='\\\\\\\\.\\\\DISPLAY4', is_primary=True

I want to get every value after the "=" sign.

With

print(re.findall(r"=(.*?),", "x=0, y=0, width=1920, height=1080, width_mm=531, height_mm=299, name='\\\\\\\\.\\\\DISPLAY4', is_primary=True"))

I get:

['0', '0', '1920', '1080', '531', '299', "'\\\\\\\\.\\\\DISPLAY4'"]

But I want the "True" from "is_primary" too

With

=(.*?)(,|$)

I can split each match into two groups and fetch the values from group 1 with a for loop, but I think there is a more elegant way, and I would really like to see it.

And is it maybe even possible to get the

"DISPLAY4"

out of:

"'\\\\\\\\.\\\\DISPLAY4'"

in the same expression?

CodePudding user response:

You can use re.findall with a single capture group: match the key with a negated character class that excludes whitespace, a comma and =, then match the = sign, and capture the value with a similar class.

If the values themselves cannot contain ', you can also exclude it from the captured value:

[^=\s,]+=[\\.']*([^=\s,']+)


import re

pattern = r"[^=\s,]+=[\\.']*([^=\s,']+)"
s = "x=0, y=0, width=1920, height=1080, width_mm=531, height_mm=299, name='\\\\\\\\\\\\\\\\.\\\\\\\\DISPLAY4', is_primary=True"

print(re.findall(pattern, s))

A bit more precise match with 2 capture groups:

[^=\s,]+=(?:'(?:\\+\.\\+)?([^\s,=']+)'|([^\s,=]+))

The pattern matches:

  • [^=\s,]+= Match 1+ chars other than a whitespace char, a comma or =, and then match =
  • (?: Non-capture group for the alternatives
    • ' Match the opening '
    • (?:\\+\.\\+)? Optionally match 1+ backslashes, a dot . and again 1+ backslashes
    • ([^\s,=']+) Capture group 1, match 1+ chars other than a whitespace char, a comma, = or '
    • ' Match the closing '
    • | Or
    • ([^\s,=]+) Capture group 2, match 1+ chars other than a whitespace char, a comma or =
  • ) Close the non-capture group
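
The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows comments. This is only a sketch: the shortened sample string and the names PATTERN, sample and values are made up for illustration, and the sample is a raw string so the backslashes appear exactly as typed.

```python
import re

# The two-group pattern written with re.VERBOSE so each part carries a comment.
PATTERN = re.compile(r"""
    [^=\s,]+ =              # key (no whitespace, comma or equals sign), then the equals sign
    (?:
        '                   # opening quote
        (?: \\+ \. \\+ )?   # optional prefix: backslashes, a dot, backslashes
        ([^\s,=']+)         # group 1: quoted value without that prefix
        '                   # closing quote
      |
        ([^\s,=]+)          # group 2: unquoted value
    )
""", re.VERBOSE)

# Hypothetical, shortened sample (raw string: two backslashes, a dot, one backslash).
sample = r"x=0, width=1920, name='\\.\DISPLAY4', is_primary=True"

# Exactly one of the two groups takes part in each match; take whichever is set.
values = [m.group(1) or m.group(2) for m in PATTERN.finditer(sample)]
print(values)  # ['0', '1920', 'DISPLAY4', 'True']
```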


import re

pattern = r"[^=\s,]+=(?:'(?:\\+\.\\+)?([^\s,=']+)'|([^\s,=]+))"

s = "x=0, y=0, width=1920, height=1080, width_mm=531, height_mm=299, name='\\\\\\\\\\\\\\\\.\\\\\\\\DISPLAY4', is_primary=True"

# Only one of the two groups matches per key=value pair; take whichever is set.
res = [m.group(1) or m.group(2) for m in re.finditer(pattern, s)]
print(res)

Both will output:

['0', '0', '1920', '1080', '531', '299', 'DISPLAY4', 'True']
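
As a side note on the =(.*?)(,|$) attempt from the question: re.findall returns tuples as soon as a pattern has more than one capturing group, which is why a loop was needed there. A possible alternative, assuming no value ever contains a comma, is to make the delimiter group non-capturing and strip the quotes and the backslash prefix in a second step. The shortened sample string and the names values and cleaned are illustrative only.

```python
import re

# Hypothetical, shortened sample (raw string, so backslashes are literal).
s = r"x=0, y=0, name='\\.\DISPLAY4', is_primary=True"

# (?:,|$) is non-capturing, so findall returns a flat list of group 1 only.
values = re.findall(r"=(.*?)(?:,|$)", s)

# Second pass: drop a leading quote (plus optional backslash/dot prefix)
# and a trailing quote; unquoted values pass through unchanged.
cleaned = [re.sub(r"^'(?:\\+\.\\+)?|'$", "", v) for v in values]
print(cleaned)  # ['0', '0', 'DISPLAY4', 'True']
```

This takes two passes over the data instead of one, but each expression stays short and the last value is picked up by the $ alternative.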