I have a string like this:
[KEY1=ABC, KEY2=ABC, ... , KEY3={KEY123=ABC, KEY456=ABC}, KEY111=ABC, KEY4=KEY=VALUE&KEY=VALUE&KEY=VALUE]
I want to parse it into a Python dict like this:
dict = {
    'KEY1': 'ABC',
    'KEY2': 'ABC',
    'KEY3': '{KEY123=ABC, KEY456=ABC}',
    'KEY111': 'ABC',
    'KEY4': 'KEY=VALUE&KEY=VALUE&KEY=VALUE',
}
Note that the values can contain special characters such as /\-@'"
The value of 'KEY4' can look like a URL GET query string. Example:
KEY4=id=1&username=xx&url=http://xxx...&ref=1
The following code partially works. It does not parse values that contain nested key=value pairs, like 'KEY3' above:
dict = builtins.dict(re.findall(r'(\S+)=(".*?"|\S+)', input))
CodePudding user response:
I put your (slightly edited) regex in a function that you can also call on the dictionary values, so that nested strings are turned into dictionaries as well:
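import re
import json

input_str = "[KEY1=ABC, KEY2=ABC, KEY3={KEY123=ABC, KEY456=ABC}, KEY111=ABC]"

def dict_from_str(s):
    # A value is either brace-free text or a single {...} group,
    # and it ends at the next comma or at the end of the string.
    return dict(re.findall(r'(\w+)=([^{]*?|\{.*?})(?:,|$)', s.strip('[]{}')))

dict_out = dict_from_str(input_str)
for k, v in dict_out.items():
    if '=' in v:
        # The value still holds key=value pairs, so parse it one level deeper.
        dict_out[k] = dict_from_str(v)
print(json.dumps(dict_out, indent=4))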
Output:
{
    "KEY1": "ABC",
    "KEY2": "ABC",
    "KEY3": {
        "KEY123": "ABC",
        "KEY456": "ABC"
    },
    "KEY111": "ABC"
}
If you have more levels, you could consider writing a recursive function.
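As a rough sketch of what such a recursive function could look like: the non-greedy `\{.*?}` pattern stops at the first closing brace, so it cannot handle braces nested more than one level deep. The version below swaps the regex for a small splitter that tracks brace depth. The names `split_top_level` and `parse_nested` are made up for illustration, and the sketch assumes that braces are balanced and that plain values contain no commas.

```python
def split_top_level(s, sep=","):
    """Split s on sep, ignoring separators nested inside {...}."""
    parts, depth, current = [], 0, []
    for ch in s:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_nested(s):
    """Recursively parse '[K=V, K={K=V, ...}]' into nested dicts."""
    s = s.strip()
    # Drop exactly one pair of enclosing brackets or braces, if present.
    if s[:1] in "[{" and s[-1:] in "]}":
        s = s[1:-1]
    result = {}
    for item in split_top_level(s):
        # Split on the first '=' only, so 'K=a=1&b=2' keeps 'a=1&b=2' as the value.
        key, sep, value = item.strip().partition("=")
        if not sep:
            continue  # skip fragments without '='
        value = value.strip()
        # Only brace-wrapped values are treated as nested mappings.
        if value.startswith("{") and value.endswith("}"):
            result[key.strip()] = parse_nested(value)
        else:
            result[key.strip()] = value
    return result


nested = parse_nested("[A=1, B={C=2, D={E=3, F=4}}, G=x=1&y=2]")
print(nested)
```

Because only brace-wrapped values are recursed into, a query-string value such as `x=1&y=2` is left as a plain string here and can be post-processed separately.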
Edit: you can handle URLs separately (note that the keys must be unique, unlike your first example):
import re
import json

input_str = "[KEY1=ABC, KEY2=ABC, KEY3={KEY123=ABC, KEY456=ABC}, KEY111=ABC, KEY4=KEY=VALUE&KEY2=VALUE&KEY3=VALUE]"

def dict_from_str(s):
    # Comma-separated pairs; a value is brace-free text or one {...} group.
    return dict(re.findall(r'(\w+)=([^{]*?|\{.*?})(?:,|$)', s.strip('[]{}')))

def dict_from_url(s):
    # '&'-separated pairs, as in a URL query string.
    return dict(re.findall(r'(\w+)=([^=]*?)(?:&|$)', s.strip('[]{}')))

dict_out = dict_from_str(input_str)
for k, v in dict_out.items():
    if '&' in v:
        dict_out[k] = dict_from_url(v)
    elif '=' in v:
        dict_out[k] = dict_from_str(v)
print(json.dumps(dict_out, indent=4))
Output:
{
    "KEY1": "ABC",
    "KEY2": "ABC",
    "KEY3": {
        "KEY123": "ABC",
        "KEY456": "ABC"
    },
    "KEY111": "ABC",
    "KEY4": {
        "KEY": "VALUE",
        "KEY2": "VALUE",
        "KEY3": "VALUE"
    }
}
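If the query-string keys can repeat after all, one possible alternative to `dict_from_url` is the standard library's `urllib.parse.parse_qs`, which collects every value for a key into a list instead of keeping only the last one. This is a minimal sketch: `dict_from_query` is a hypothetical helper name, and the sample string (including the `example.com` URL) is made up for illustration. Note that `parse_qs` also percent-decodes values and turns `+` into a space, which may or may not be what you want.

```python
from urllib.parse import parse_qs


def dict_from_query(s):
    """Parse 'a=1&b=2&a=3' into a dict that maps each key to a list of values."""
    # keep_blank_values=True keeps keys whose value is empty, e.g. 'a=&b=1'.
    return parse_qs(s, keep_blank_values=True)


# Hypothetical sample with a repeated key ('ref') and a URL as a value.
sample = "id=1&username=xx&url=http://example.com/p&ref=1&ref=2"
parsed = dict_from_query(sample)
print(parsed)
```

Each pair is split on its first `=` only, so a value that is itself a URL stays intact as long as it contains no unescaped `&`.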