Using regex to extract substrings

I have a string:

s = r'"url" : "a", "meta": "b", "url" : "c"'

What I want is to capture each substring "url" : ... up to the following comma, so the expected output is a list:

[r'"url" : "a"', r'"url" : "c"']

I am using:

re.findall(r'("url"):(.*),', s)

but all it does is return the entire string. Is there something I am doing wrong?

CodePudding user response：

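Your pattern ran on to the last "," because .* is greedy: it matches as much as it can. (.*?) is non-greedy and stops at the earliest possible point. Also, the trailing comma is optional (the last pair has none), so the pattern has to allow it to be absent with ,?. Note that the string here has no spaces around the colons: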

import re

s = r'"url":"a","meta":"b","url":"c"'

print(re.findall(r'("url"):"(.*?)",?', s))
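To see the greedy versus non-greedy difference on the string from the question (which has spaces around the colon), here is a minimal sketch. The \s* allowances and the variable names greedy and lazy are my own additions for illustration, not part of the answer above.

```python
import re

s = r'"url" : "a", "meta": "b", "url" : "c"'

# Greedy: .* first runs to the end of the string, then backtracks
# only as far as the LAST comma, so one oversized match comes back.
greedy = re.findall(r'"url"\s*:(.*),', s)

# Non-greedy: .*? expands one character at a time and stops at the
# first closing quote after each "url" key.
lazy = re.findall(r'"url"\s*:\s*"(.*?)"', s)

print(greedy)  # [' "a", "meta": "b"']
print(lazy)    # ['a', 'c']
```

The non-greedy version returns only the values; the quotes act as the delimiters, so no trailing comma is needed at all.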

CodePudding user response：

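You need to keep the comma out of the group. Rather than .*, use the negated character class [^,], which matches any character except a comma, so the match stops just before it. Try this: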

re.findall(r'(("url" :[^,]*),*)', s)
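One thing to watch: with two capturing groups, re.findall returns a tuple per match rather than a plain string. If you want the flat list from the question, a sketch without any capturing groups should do it; the \s* before the colon and the name matches are assumptions of mine.

```python
import re

s = r'"url" : "a", "meta": "b", "url" : "c"'

# No capturing groups, so findall returns each whole match as a string.
# [^,]* consumes everything up to, but not including, the next comma
# (or the end of the string for the last pair).
matches = re.findall(r'"url"\s*:[^,]*', s)

print(matches)  # ['"url" : "a"', '"url" : "c"']
```

This relies on the values never containing a comma themselves; if they can, matching on the closing quote is the safer route.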