I am trying to create JSON file using python code. file is created successfully with English language but not properly working with Marathi Language.
Please check out code:
import os
import json
jsonFilePath = "E:/file/"
captchaImgLocation = "E:/file/captchaimg/"
path_to_tesseract = r"C:/Program Files/Tesseract-OCR/tesseract.exe"
image_path = r"E:/file/captchaimg/captcha.png"
x = {
"FName": "प्रवीण",
}
# convert into JSON:
y = json.dumps(x, ensure_ascii=False).encode('utf8')
# the result is a JSON string:
print(y.decode())
completeName = os.path.join(jsonFilePath, "searchResult_Unicode.json")
print(str(completeName))
file1 = open(completeName, "w")
file1.write(str(y))
file1.close()
O/P on console:
{"FName": "प्रवीण"}
<br>
File created inside folder like this:
b'{"FName": "\xe0\xa4\xaa\xe0\xa5\x8d\xe0\xa4\xb0\xe0\xa4\xb5\xe0\xa5\x80\xe0\xa4\xa3"}'
There is no run time or compile time error but JSON is created with with above format. Please suggest me any solution.
CodePudding user response:
You have encoded the JSON string, so you must either open the file in binary mode or decode the JSON before writing to file, so:
file1 = open(completeName, "wb")
file1.write(y)
or
file1 = open(completeName, "w")
file1.write(y.decode('utf-8'))
Doing
file1 = open(completeName, "w")
file1.write(str(y))
writes the string representation of the bytes to the file, which always the wrong thing to do.
CodePudding user response:
Do you want your json to be human readable? It's usually bad practice since you would never know what encoding to use.
You can write/read your json files with the json module without worrying about encoding:
import json
json_path = "test.json"
x = {"FName": "प्रवीण"}
with open(json_path, "w") as outfile:
json.dump(x, outfile, indent=4)
with open(json_path, "r") as infile:
print(json.load(infile))