I need to convert a line of SQL into a dictionary. Given the following partial line:
'field_1="value_1", field_2=1234'
I want to create the following dictionary:
{"field_1": "value_1", "field_2": 1234}
Using:
text = text.replace(",", "=")
text = text.split("=")
text = [i.strip() for i in text]
col_names = text[::2]
values = text[1::2]
dict_ = dict(zip(col_names, values))
I get as close as:
{'field_1': '"value_1"', 'field_2': '1234'}
I'm pretty sure I'll be able to sort out the extra quotation marks. However, I'm struggling with lines like the following, where a comma inside the quotation marks wrecks my split():
'field_1="value_1a, value_1b", field_2=1234'
I have a feeling I might be able to use regex here to solve both my quotation mark and extra comma issues, but I can't work out the exact syntax. The values can sometimes be strings and sometimes integers/floats. The field names can vary considerably, and there aren't always spaces after commas. There also isn't a comma at the end of the string.
Thanks in advance!
CodePudding user response:
Here is a very simple approach that should do the job for you:
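def parse(s):
    buf = []           # alternating field names and values
    token = ""         # characters collected for the current name or value
    quotes = False     # True while we are inside a double-quoted value
    punct = {",", "="}
    for c in s:
        if c == '"':
            quotes = not quotes
        elif c in punct and not quotes:
            # a separator outside quotes ends the current token
            buf.append(token)
            token = ""
        elif c != " " or quotes:
            # keep the character; spaces are only kept inside quotes
            token += c
    buf.append(token)  # the last value has no trailing comma
    return dict(zip(buf[::2], buf[1::2]))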
print(parse('field_1="value_1a, value_1b", field_2=1234'))
# {'field_1': 'value_1a, value_1b', 'field_2': '1234'}
It should handle all the simple cases, including commas inside quoted values. Note that every value comes back as a string, and you'll need to improve it to work with different types of quotes, quotes inside the quotes, etc.
I hope this points you in the right direction.
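As one possible way to take that further, here is a rough sketch that accepts both single and double quotes and converts unquoted values to numbers. The name parse_typed and the choice of ast.literal_eval for the conversion are my own assumptions, not part of the parser above, and escaped quotes inside a value are still not handled.

```python
import ast

def parse_typed(s):
    # Hypothetical extension of the character-by-character idea:
    # remembers which quote character is open and whether a token was quoted.
    tokens = []          # (text, was_quoted) pairs; names and values alternate
    token = ""
    quote = None         # the quote character currently open, or None
    was_quoted = False
    for c in s:
        if quote:                        # inside a quoted value
            if c == quote:
                quote = None             # matching closing quote
            else:
                token += c               # keep everything, commas included
        elif c in "\"'":
            quote = c                    # opening quote
            was_quoted = True
        elif c in ",=":
            tokens.append((token, was_quoted))
            token, was_quoted = "", False
        elif c != " ":
            token += c
    tokens.append((token, was_quoted))   # last value has no trailing comma

    result = {}
    for (key, _), (value, quoted) in zip(tokens[::2], tokens[1::2]):
        if not quoted:
            try:
                value = ast.literal_eval(value)   # '1234' -> 1234, '1.5' -> 1.5
            except (ValueError, SyntaxError):
                pass                              # leave anything else as a string
        result[key] = value
    return result

print(parse_typed('field_1="value_1a, value_1b", field_2=1234'))
# {'field_1': 'value_1a, value_1b', 'field_2': 1234}
```

Quoted values are deliberately never passed to ast.literal_eval, so a quoted "1234" stays a string.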
CodePudding user response:
As Grzegorz Oledzki said, you'll want to split on your commas and then "deal with" (strip) your quotation marks afterwards:
x = 'field_1="value_1", field_2=1234'
x = {i.split("=")[0].strip(): i.split("=")[1].strip().strip('"') for i in x.split(",")}
{'field_1': 'value_1', 'field_2': '1234'}
Note that this doesn't evaluate integers (and, because it splits on every comma, it won't cope with commas inside quoted values). We can expand on it so that it converts numbers:
import ast
x = 'field_1="value_1", field_2=1234'
pairs = (i.split("=", 1) for i in x.split(","))
x = {k.strip(): v.strip().strip('"') if '"' in v else ast.literal_eval(v.strip()) for k, v in pairs}
{'field_1': 'value_1', 'field_2': 1234}
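Since the question asks about regex and about commas inside quoted values, here is a minimal sketch of that route. The names PAIR_RE and parse_pairs are made up for illustration, and the pattern assumes that field names consist only of letters, digits and underscores and that quoted values use double quotes with no escaped quotes inside.

```python
import ast
import re

# Assumed shape of each pair:  name = "quoted text"   or   name = bare_value
# A bare value runs up to the next comma (or the end of the string).
PAIR_RE = re.compile(r'\s*(\w+)\s*=\s*(?:"([^"]*)"|([^,]*))\s*(?:,|$)')

def parse_pairs(text):
    result = {}
    for match in PAIR_RE.finditer(text):
        key, quoted, bare = match.groups()
        if quoted is not None:
            result[key] = quoted                   # quoted values stay strings
            continue
        bare = bare.strip()
        try:
            result[key] = ast.literal_eval(bare)   # '1234' -> 1234, '2.5' -> 2.5
        except (ValueError, SyntaxError):
            result[key] = bare                     # not a Python literal: keep the text
    return result

print(parse_pairs('field_1="value_1a, value_1b", field_2=1234'))
# {'field_1': 'value_1a, value_1b', 'field_2': 1234}
```

Because the quoted alternative is tried first and consumes everything up to the closing quote, a comma inside the quotes never ends the value. It also works when there is no space after the commas.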