In Postgres I have a column of json type with UTF-8 encoding. I am inserting JSON from Python with psycopg2:
cursor.execute("INSERT INTO my_table (json_data) VALUES (%s)", (Json(data),))
The result in the database is
{"item": "r\u00f6b"}
Meaning ö appears as \u00f6. When I insert the same item manually through pgAdmin, it appears normally as ö. The documentation says that psycopg2.extras.Json uses json.dumps internally, which is what produces this escaping. Could you suggest a solution that allows storing {"item": "röb"} without the \u escapes?
CodePudding user response:
This is due to json.dumps defaulting to ensure_ascii=True. Per the json module documentation:
If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is. ...
As permitted, though not required, by the RFC, this module’s serializer sets ensure_ascii=True by default, thus escaping the output so that the resulting strings only contain ASCII characters.
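It is worth noting that both outputs are valid JSON encoding the same value; the escaped form is only a presentation difference. A minimal demonstration with the standard json module:

```python
import json

escaped = json.dumps({"item": "röb"})                  # default: ensure_ascii=True
raw = json.dumps({"item": "röb"}, ensure_ascii=False)  # non-ASCII emitted as-is

print(escaped)  # {"item": "r\u00f6b"}
print(raw)      # {"item": "röb"}

# Both forms parse back to the identical object:
print(json.loads(escaped) == json.loads(raw))  # True
```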
The solution is to override the default by passing a function that calls json.dumps(ensure_ascii=False) via the dumps argument of psycopg2.extras.Json. Derived from the examples in the psycopg2 JSON adaptation documentation:
import json
from psycopg2.extras import Json

# Default behavior escapes non-ASCII characters:
json.dumps('ö')                      # '"\\u00f6"'
# Disabling ensure_ascii keeps them as-is:
json.dumps('ö', ensure_ascii=False)  # '"ö"'

# Pass the override to Json via its dumps argument:
dumps_lambda = lambda x: json.dumps(x, ensure_ascii=False)
j = Json('ö', dumps=dumps_lambda)
j.adapted  # 'ö'
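As an aside, assigning a lambda to a name is something linters flag (PEP 8 recommends def or functools.partial); functools.partial expresses the same override. A minimal sketch (the name dumps_utf8 is chosen here for illustration, not part of psycopg2):

```python
import json
from functools import partial

# Equivalent to the lambda above: json.dumps with ensure_ascii pre-bound.
dumps_utf8 = partial(json.dumps, ensure_ascii=False)

print(dumps_utf8({"item": "röb"}))  # {"item": "röb"}
```

This can then be passed to psycopg2 in the original INSERT as Json(data, dumps=dumps_utf8), so the value reaches the database without \u escapes.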