I have a string:
s = r'"url" : "a", "meta": "b", "url" : "c"'
What I want is to capture each substring from "url" : ... up to the next comma, so the expected output is a list:
[r'"url" : "a"', r'"url" : "c"']
I am using:
re.findall(r'("url"):(.*),', s)
but all it does is return the entire string. Is there something I am doing wrong?
CodePudding user response:
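Your last "," was being matched because (.*) is greedy: it consumes as much as it can and then backtracks only as far as the final comma. (.*?) is non-greedy and stops at the first possible match. Also, the last entry has no trailing comma, so the comma has to be optional (,?).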
import re
s = r'"url":"a","meta":"b","url":"c"'
print(re.findall(r'("url"):"(.*?)",?', s))
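To see the difference on the original string from the question, which has spaces around the colon, here is a minimal sketch. The \s* whitespace handling and the variable names greedy and lazy are my own additions for illustration, not part of the answer above.

```python
import re

s = r'"url" : "a", "meta": "b", "url" : "c"'

# Greedy: .* runs to the end of the string, then backtracks to the LAST comma,
# so the "meta" entry is swallowed into the match.
greedy = re.findall(r'"url"\s*:.*,', s)

# Non-greedy: .*? expands one character at a time and stops at the first
# closing quote, so each "url" entry is matched separately.
lazy = re.findall(r'"url"\s*:\s*".*?"', s)

print(greedy)
print(lazy)
```

The non-greedy version assumes every value is a double-quoted string with no escaped quotes inside it.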
CodePudding user response:
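You must keep the comma out of the match, which a negated character class such as [^,] does. The capturing groups are not needed either: with groups in the pattern, findall returns tuples instead of plain strings. Try this:
re.findall(r'"url" :[^,]*', s)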
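One detail of re.findall that is easy to trip over: when the pattern contains capturing groups, the result holds the groups rather than the whole match. The sketch below shows this on the question's string; the variable names are illustrative only, and the \s* is an assumption to tolerate optional spaces before the colon.

```python
import re

s = r'"url" : "a", "meta": "b", "url" : "c"'

# Two capturing groups: findall returns one tuple per match,
# with one element per group.
with_groups = re.findall(r'(("url" :[^,]*),*)', s)

# No capturing groups: findall returns the matched text itself.
without_groups = re.findall(r'"url"\s*:[^,]*', s)

print(with_groups)
print(without_groups)
```

Note that [^,]* stops at the first comma, so this approach would cut a value short if the value itself contained a comma.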