I am adding a JSON string to a Postgres jsonb column in Python using the SQLAlchemy library.
The code that creates the JSON string is:
survey_data_series = scto_df.loc[int(i)]  # one survey row from the DataFrame
survey_data_json_string = json.dumps(survey_data_series.to_dict())
The code that adds it to the PostgreSQL database is:
def add_record(self, table_name, record_data):
    # Look up the mapped class by name and instantiate it with the record's fields.
    record = getattr(self, table_name)(**record_data)
    self.s.add(record)
    self.s.flush()
    return record.id
The JSON prints correctly in the logs:
{"completiondate": "Feb 8, 2022 10:49:29 AM", "submissiondate": "Feb 8, 2022 10:49:29 AM", "starttime": "Feb 8, 2022 10:37:17 AM", "endtime": "Feb 8, 2022 10:49:21 AM", "deviceid": "ac29d854b49a49be", "subscriberid": NaN}
But when it reaches the database, the JSON is wrapped in a string and an escape character is added in front of every double quote:
"{\"completiondate\": \"Feb 8, 2022 10:49:29 AM\", \"submissiondate\": \"Feb 8, 2022 10:49:29 AM\", \"starttime\": \"Feb 8, 2022 10:37:17 AM\", \"endtime\": \"Feb 8, 2022 10:49:21 AM\", \"deviceid\": \"ac29d854b49a49be\", \"subscriberid\": NaN}"
As a result, none of the queries below work:
SELECT JSON_EXTRACT_PATH_TEXT(survey_data_json::json,'starttime') AS starttime
FROM "Suvita".survey_data_json;
SELECT survey_data_json -> 'starttime' AS starttime
FROM "Suvita".survey_data_json;
SELECT (survey_data_json -> 0)--> '\"starttime\"' AS starttime
FROM "Suvita".survey_data_json; --Gives the whole json string
SELECT * FROM "Suvita".survey_data_json where survey_data_json->>'deviceid'='ac29d854b49a49be';
My requirement is to be able to query the JSON column by matching on a key's value. Please help.
SOLVED: I edited the code to pass the dictionary directly to the SQLAlchemy table object when adding the row; SQLAlchemy converted the dictionary to JSON automatically.
record = self.survey_data_json(surveyid=surveyid, survey_data_json=dict_object)
self.s.add(record)
Thanks @snakecharmerb
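A related caveat: the logged output above contains a bare NaN, which json.dumps emits by default but which is not valid JSON, so PostgreSQL's jsonb type will reject any row containing it. A minimal cleanup sketch, assuming pandas and the variable names from the question:

import pandas as pd

# Replace pandas NaN values with None so they serialise to JSON null,
# which jsonb accepts.
dict_object = {
    k: (None if pd.isna(v) else v)
    for k, v in survey_data_series.to_dict().items()
}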
Answer:
By default, SQLAlchemy will automatically call json.dumps on values assigned to a JSON or JSONB column, so it isn't necessary to call it yourself; in fact, doing so leads to the double-encoded values seen in the question.

The functions used to encode and decode JSON can be controlled by passing the json_serializer and json_deserializer arguments to create_engine. If you don't want SQLAlchemy to serialise automatically, pass an identity function like lambda x: x.
import json

import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB

# Default engine: SQLAlchemy serialises JSON/JSONB values with json.dumps.
engine = sa.create_engine('postgresql+psycopg2:///test', future=True)
# Second engine: the identity serializer leaves pre-serialised strings alone.
engine2 = sa.create_engine(
    'postgresql+psycopg2:///test', future=True, json_serializer=lambda x: x
)

tbl = sa.Table(
    't73387703',
    sa.MetaData(),
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('data', JSONB),
)
tbl.drop(engine, checkfirst=True)
tbl.create(engine)

d = {'a': 1, 'b': 2}
ins = tbl.insert()

with engine.begin() as conn:
    # Double-encoded: the json.dumps output is serialised again by SQLAlchemy.
    conn.execute(ins.values(data=json.dumps(d)))
    # Correct: pass the dict and let SQLAlchemy serialise it.
    conn.execute(ins.values(data=d))

with engine2.begin() as conn:
    # Stored as-is thanks to the identity json_serializer.
    conn.execute(ins.values(data=json.dumps(d)))
test# select * from t73387703 order by id;
id │ data
════╪════════════════════════
1 │ "{\"a\": 1, \"b\": 2}"
2 │ {"a": 1, "b": 2}
3 │ {"a": 1, "b": 2}
(3 rows)
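Rows 2 and 3 hold real JSON objects, so key lookups work against them; row 1 holds a jsonb string and won't match. For completeness, a short sketch of the same key lookup done through SQLAlchemy (the filter value is illustrative):

with engine.connect() as conn:
    # Renders as: data ->> 'a' = '1' -- only the rows storing real objects match.
    rows = conn.execute(
        sa.select(tbl).where(tbl.c.data['a'].astext == '1')
    ).all()
    print(rows)  # [(2, {'a': 1, 'b': 2}), (3, {'a': 1, 'b': 2})]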