Home > OS >  Import json with missing keys into postgres with Flask-Sqlalchemy
Import json with missing keys into postgres with Flask-Sqlalchemy

Time:12-14

I have a question you guys might be able to answer.

I have a json file that looks something like this:

[
{
        "address": "some address",
        "full_time_school": false,
        "name": "some name",
        "official_id": "722154",
        "school_type": "Grundschule",
        "school_type_entity": "Grundschule",
        "state": "BW"
    },
    {
        "address": "some other address",
        "name": "some other name",
        "official_id": "722190",
        "state": "BW"
    }
]

The point is that not every entry has all keys.

I have a flask-sqlalchemy model that looks like this:

class School(db.Model):

    __tablename__ = "school"  # pragma: no cover

    address = db.Column(db.String)
    full_time_school = db.Column(db.Boolean)
    name = db.Column(db.String)
    official_id = db.Column(db.Integer)
    school_type = db.Column(db.String)
    school_type_entity = db.Column(db.String)
    state = db.Column(db.String)

    def __repr__(self):
        return f"<name {self.name}"

And I have a python script to add the json entries into my postgresql database that looks like this:

from my_project import db
from my_project.models import School

import json
import os

# insert data
for filename in os.listdir("datamining"):
    if filename.endswith(".json"):

        file = open(os.path.join("datamining", filename))

        print(f"Add schools from {filename.strip('.json')}")

        data = json.load(file)

        cleaned_data = {school["official_id"]: school for school in data}.values()
        print(f"Adding {len(data)} schools to the database.")

        for school in cleaned_data:
            entry = School(
                    id=school["official_id"]
                )
            for key, value in school.items():
                entry.key = value
            
            db.session.add(entry)
            db.session.commit()
        file.close()

print("Added all schools!!!")

I don't know why but somehow every cell is NULL except the official_id field. How so and how can I fix that? I'm at the end of my wits right now. Every pointer or help is much appreciated.

EDIT: What I found out so far is, that entry.key is not interpreted as entry.state for example, but actually creates a reference entry.key = "BW" for example. Why is that?

CodePudding user response:

Your problem is

entry.key = value

You are just writing your values over and over into the attribute 'key' within your School model. I'm actually surprised SQLAlchemy doesn't raise some kind of error here... Just pass all your values into the constructor and you should be fine:

school["id"] = school.pop("official_id")
entry = School(**school)

EDIT: It's "BW" because this happens to be the last value that is written into the attribute.

CodePudding user response:

You can do this much easier and faster by executing this native parameterized query:

insert into schools 
 select (jsonb_populate_record(null::schools, j)).*
 from jsonb_array_elements(:jsontext::jsonb) j;
 
  • Related