A web platform lets me build a string that resembles JSON. The problem is that the array elements are inserted without double quotes. I think I could fix this with Python and then convert the string to JSON with:
jsonObj = json.loads(string_toConvert)
This is the String that I have:
import json
to_json = '''
{
"Element1": {
"ID": "ID321",
"Data a": [elm1, elem2, 1, 1, , 2, , 3, 354, , NCA, x]
},
"Element2": {
"ID": "ID421",
"Data a": [elm1, elem2, 1, 1, , 2, , 3, 354, , NCA, x],
"Data b": [elm1, elem2, 3, 4, , 5, , 3, 354, , CAA, x, y, z, , ]
},
"Element3": {
"ID": "ID512",
"Data a": [elm1, elem2, elm3, 2, 2, , 2, , 3, 54, , ABC, x, y, z, w, , ]
}
}'''
As you know, to parse the string as JSON, the elements of each array must be enclosed in double quotes.
I need to convert it to the following structure:
to_json = '''
{
"Element1": {
"ID": "ID321",
"Data a": ["elm1", "elem2", "1", "1", " ", "2", " ", "3", "354", " ", "NCA", "x"]
},
"Element2": {
"ID": "ID421",
"Data a": ["elm1", "elem2", "1", "1", " ", "2", " ", "3", "354", " ", "NCA", "x"],
"Data b": ["elm1", "elem2", "3", "4", " ", "5", " ", "3", "354", " ", "CAA", "x", "y", "z", " ", " "]
},
"Element3": {
"ID": "ID512",
"Data a": ["elm1", "elem2", "elm3", "2", "2", " ", "2", " ", "3", "54", " ", "ABC", "x", "y", "z", "w", " ", " "]
}
}
'''
Remark: The triple quotes (''') are only there to store the whole string in a variable.
CodePudding user response:
You should really fix this at the source, or if you don't control the source, contact the owner about this. Fiddling with strings that are like JSON, but not really, and trying to make them valid JSON, is not good practice.
With that disclaimer out of the way, here is a way to fix it. Note that it will fail when other parts of the input string look like lists but really aren't.
import re
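# The "[" inside the character class is escaped to avoid Python's
# "possible nested set" FutureWarning; the match itself is unchanged.
to_json = re.sub(r"([\[,] ?)(.*?)(?=[,\]])", r'\1"\2"', to_json)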
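A variation on the substitution above is to match only the bracketed spans and quote their items in a callback, which reduces the chance of touching commas outside the arrays. This is a minimal sketch, not a definitive fix: the helper name quote_array_items and the sample string are made up for illustration, and it assumes the arrays are flat and that no string value contains square brackets. Note that empty items become "" and an empty array [] becomes [""].

```python
import json
import re


def quote_array_items(text):
    """Wrap every comma-separated item inside [...] in double quotes."""

    def fix(match):
        # Split the text between the brackets and trim each item
        items = [item.strip() for item in match.group(1).split(",")]
        # json.dumps adds the quotes and escapes any special characters
        return "[" + ", ".join(json.dumps(item) for item in items) + "]"

    # [^\[\]]* keeps each match inside one flat (non-nested) array
    return re.sub(r"\[([^\[\]]*)\]", fix, text)


sample = '{"ID": "ID321", "Data a": [elm1, 1, , NCA, x, ]}'
data = json.loads(quote_array_items(sample))
print(data["Data a"])
```

Because the pattern only ever matches text between a "[" and the next "]", the comma after "ID321" is left alone even when the whole document sits on a single line.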
CodePudding user response:
trincot's solution is the most elegant and should be preferred, but I wanted to provide a different approach for people who are not necessarily familiar with regex. It assumes that the input is well behaved and that the arrays are always provided in the same form.
to_json: str = """
{
"Element1": {
"ID": "ID321",
"Data a": [elm1, elem2, 1, 1, , 2, , 3, 354, , NCA, x]
},
"Element2": {
"ID": "ID421",
"Data a": [elm1, elem2, 1, 1, , 2, , 3, 354, , NCA, x],
"Data b": [elm1, elem2, 3, 4, , 5, , 3, 354, , CAA, x, y, z, , ]
},
"Element3": {
"ID": "ID512",
"Data a": [elm1, elem2, elm3, 2, 2, , 2, , 3, 54, , ABC, x, y, z, w, , ]
}
}
"""
in_array = False
out_string = ""
temp_string = ""
for character in to_json:
    if character == "[":
        # Beginning of an array
        in_array = True
        continue
    elif character == "]":
        # End of the array
        quoted_values = []
        # Put each word between quotes in a list
        for word in temp_string.split(","):
            quoted_values.append(f'"{word.strip()}"')
        # Put it all back together in a string
        temp_string = ", ".join(quoted_values)
        # Add the array to the output
        out_string = f"{out_string}[{temp_string}]"
        # Empty the temp string
        temp_string = ""
        # Exit array
        in_array = False
        continue
    if in_array:
        # Inside an array: buffer the character until the closing bracket
        temp_string += character
    else:
        # Outside an array: copy the character to the output unchanged
        out_string += character
print(out_string)
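To get from the rewritten string to the Python object the question asks for, the same character loop can be wrapped in a function and its result passed to json.loads. This is a sketch under the same assumptions (flat arrays, no square brackets inside string values). The function name quote_arrays and the empty parameter are hypothetical; empty items are mapped to " " here to match the desired output shown in the question.

```python
import json


def quote_arrays(text, empty=" "):
    """Quote the items of every flat [...] array; blank items become `empty`."""
    out, buffer, in_array = [], [], False
    for ch in text:
        if ch == "[":
            # Start buffering the array contents
            in_array = True
        elif ch == "]" and in_array:
            # Split, trim, and quote each buffered item
            items = [w.strip() or empty for w in "".join(buffer).split(",")]
            out.append("[" + ", ".join(json.dumps(w) for w in items) + "]")
            buffer, in_array = [], False
        elif in_array:
            buffer.append(ch)
        else:
            out.append(ch)
    return "".join(out)


raw = '''
{
    "Element1": {
        "ID": "ID321",
        "Data a": [elm1, elem2, 1, 1, , 2, , 3, 354, , NCA, x]
    }
}'''
parsed = json.loads(quote_arrays(raw))
print(parsed["Element1"]["Data a"])
```

If json.loads raises json.JSONDecodeError on real input, that is a sign the input breaks one of the assumptions above and the string needs a closer look.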