python 3 - how to clean json string with double backslashes and u00

Time:12-02

I have several ugly json strings like the following:

test_string = '{\\"test_key\\": \\"Testing tilde \\u00E1\\u00F3\\u00ED\\"}'

I need to turn each one into a more readable dictionary and then save it to a file:

{'test_key': 'Testing tilde áóí'}

So for that I am doing:

import json

test_string = test_string.replace("\\\"", "\"")  # I suppose there is a safer way to do this
print(test_string)
#{"test_key": "Testing tilde \u00E1\u00F3\u00ED"}

test_dict = json.loads(test_string, strict=False)
print(test_dict)
#{'test_key': 'Testing tilde áóí'}

At this point test_dict seems correct. Then I save it to a file:

with open('test.json', "w") as json_w_file:
    json.dump(test_dict, json_w_file)

At this point the content of test.json is the ugly version of the json:

{"test_key": "Testing tilde \u00e1\u00f3\u00ed"}

Is there a safer way to transform my ugly json to a dictionary? Then how could I save the visually friendly version of my dictionary to a file?

Python 3

CodePudding user response:

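The string looks like double-encoded JSON to me: the body of a JSON string literal that itself contains a JSON document. Wrapping it in double quotes makes it a valid string literal, so the first json.loads unescapes it and the second parses the result. This decodes it and writes a UTF-8 JSON file.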

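import json

test_string = '{\\"test_key\\": \\"Testing tilde \\u00E1\\u00F3\\u00ED\\"}'

# Inner loads() unescapes the quoted string, outer loads() parses the JSON document
test_dict = json.loads(json.loads(f'"{test_string}"'))

# encoding="utf-8" avoids depending on the platform's default encoding
with open('test.json', "w", encoding="utf-8") as json_w_file:
    json.dump(test_dict, json_w_file, ensure_ascii=False)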
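
To check the result, here is a minimal sketch that wraps the double decode in a helper, compares the two `ensure_ascii` settings, and reads the file back. The helper name `decode_double_encoded` and the temp-file path are made up for illustration; it also assumes every inner double quote in the input is backslash-escaped, as in the example string, otherwise the first `json.loads` will raise.

```python
import json
import os
import tempfile


def decode_double_encoded(raw: str) -> dict:
    """Decode a JSON document that was escaped as the body of a JSON string."""
    # Adding the surrounding quotes yields a valid JSON string literal.
    inner = json.loads(f'"{raw}"')   # first pass: unescape \" and \uXXXX
    return json.loads(inner)         # second pass: parse the actual document


test_string = '{\\"test_key\\": \\"Testing tilde \\u00E1\\u00F3\\u00ED\\"}'
test_dict = decode_double_encoded(test_string)

# Default ensure_ascii=True escapes non-ASCII characters; False keeps them as-is.
escaped = json.dumps(test_dict)
readable = json.dumps(test_dict, ensure_ascii=False)

# Hypothetical output path, used only for this round-trip check.
path = os.path.join(tempfile.gettempdir(), "test_readable.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(test_dict, fh, ensure_ascii=False)

# Reading the file back should give the same dictionary.
with open(path, encoding="utf-8") as fh:
    round_trip = json.load(fh)

print(escaped)
print(readable)
```

Both forms are valid JSON and load to the same dictionary; `ensure_ascii=False` only changes how the file looks in an editor.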