I have a large script that parses JS with a dataframe entry, but to keep the question short I have put the part I need in a separate variable. The variable contains the following value:
value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"
I apply the following script and get data like this:
import json
import re

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def parse_json(value):
    # split into one chunk per object, then restore the "}" removed by split()
    arr = value.split("},")
    arr = [x + "}" for x in arr]
    arr[-1] = arr[-1][:-1]  # the last chunk already had its closing brace
    return json.dumps({str(i): add_quotation_marks(x) for i, x in enumerate(arr)})

def add_quotation_marks(value):
    # wrap every bare key (the word before a colon) in double quotes
    words = re.findall(r'(\w+:)', value)
    for word in words:
        value = value.replace(word[:-1], f'"{word[:-1]}"')
    return json.loads(value)

print(parse_json(value))
{"0": {"from": [3, 4], "to": [7, 4], "color": 2}, "1": {"from": [3, 6], "to": [10, 6], "color": 3}}
The script runs correctly, but I need a slightly different result. This is the result I want:
{
  "0": {
    "from": {
      "0": "3",
      "1": "4"
    },
    "to": {
      "0": "7",
      "1": "4"
    },
    "color": "2"
  },
  "1": {
    "from": {
      "0": "3",
      "1": "6"
    },
    "to": {
      "0": "10",
      "1": "6"
    },
    "color": "3"
  }
}
This is valid JSON and valid YAML. Please tell me how I can do this.
CodePudding user response:
I'd suggest a regex approach in this case:
import re

res = []
# iterates over each "{from:...,to:...,color:...}" group separately
for obj in re.findall(r'\{([^}]+)}', value):
    item = {}
    # iterates over each "...:..." key-value separately
    for k, v in re.findall(r'(\w+):(\[[^]]+]|\d+)', obj):
        if v.startswith('['):
            v = v.strip('[]').split(',')
        item[k] = v
    res.append(item)
This produces the following output in res:
[{'from': ['3', '4'], 'to': ['7', '4'], 'color': '2'}, {'from': ['3', '6'], 'to': ['10', '6'], 'color': '3'}]
Since your values can contain commas, trying to split on commas or other markers is fairly tricky, and using these regexes to match your desired values instead is more stable.
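The res list above still holds plain lists, while the question asks for dictionaries keyed by string indices. A minimal sketch of one way to get there with the same two regexes follows; the function name to_indexed is made up for illustration, and the code assumes the input only ever contains flat integer lists and integer scalars, as in the sample value.

```python
import json
import re

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def to_indexed(text):
    """Hypothetical helper: build the nested, string-keyed structure directly."""
    out = {}
    # one "{...}" group per outer entry, keyed by its position as a string
    for i, obj in enumerate(re.findall(r'\{([^}]+)}', text)):
        item = {}
        # one "key:value" pair at a time; value is either "[...]" or digits
        for k, v in re.findall(r'(\w+):(\[[^]]+]|\d+)', obj):
            if v.startswith('['):
                # turn "[3,4]" into {"0": "3", "1": "4"}
                v = {str(j): x for j, x in enumerate(v.strip('[]').split(','))}
            item[k] = v
        out[str(i)] = item
    return out

result = to_indexed(value)
print(json.dumps(result, indent=2))
```

Because the regexes capture text, every leaf stays a string, which matches the quoted numbers in the desired output.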
CodePudding user response:
Here's the code that converts the value to your desired output.
import json5  # pip install json5

value = "{from:[3,4],to:[7,4],color:2},{from:[3,6],to:[10,6],color:3}"

def convert(str_value):
    str_value = f"[{str_value}]"  # added [] to make it a valid json
    parsed_value = json5.loads(str_value)  # convert to python object
    output = {}  # create empty dict
    # Loop through the list of dicts. For each item, create a new dict
    # with the index as the key and the value as the value. If the value
    # is a list, convert it to a dict with the index as the key and the
    # value as the value. If the value is not a list, just add it to the dict.
    for i, d in enumerate(parsed_value):
        output[i] = {}
        for k, v in d.items():
            output[i][k] = {j: v[j] for j in range(len(v))} if isinstance(v, list) else v
    return output

print(json5.dumps(convert(value)))
Output
{
  "0": {
    "from": {
      "0": 3,
      "1": 4
    },
    "to": {
      "0": 7,
      "1": 4
    },
    "color": 2
  },
  "1": {
    "from": {
      "0": 3,
      "1": 6
    },
    "to": {
      "0": 10,
      "1": 6
    },
    "color": 3
  }
}
- The json5 package can parse a JavaScript-style object literal (with unquoted keys) into Python objects, so you don't have to do split("},{").
- Then I added [ and ] around the string to make it a valid array.
- Then load the string using json5.loads().
- Now you can loop through the parsed list of dictionaries and convert it to the desired output format.
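The output above keeps the numbers as integers, whereas the question shows them as strings. As a rough sketch, the final conversion step could be written recursively so that it also stringifies every leaf. The name index_lists is hypothetical, and the parsed literal below is only a stand-in for what json5.loads(f"[{value}]") is assumed to return, so the example runs without the third-party package.

```python
import json

def index_lists(node):
    """Hypothetical helper: lists become dicts keyed by string index,
    dict keys become strings, and scalar leaves are converted with str()."""
    if isinstance(node, list):
        return {str(i): index_lists(x) for i, x in enumerate(node)}
    if isinstance(node, dict):
        return {str(k): index_lists(v) for k, v in node.items()}
    return str(node)

# Stand-in for the parsed input (assumed shape: a list of dicts)
parsed = [
    {"from": [3, 4], "to": [7, 4], "color": 2},
    {"from": [3, 6], "to": [10, 6], "color": 3},
]

result = index_lists(parsed)
print(json.dumps(result, indent=2))
```

The recursion also handles lists nested inside lists, which the single-level loop in convert() would leave untouched.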