Home > Enterprise >  SQL Alchemy insert into table, with column being selected from another table
SQL Alchemy insert into table, with column being selected from another table

Time:03-09

I have the following tables, and I wanted to see if I could perform an insert, where for each inserted row, the value of the linked column gets selected by the other one using SQL alchemy and the insert function. The tables are:

from sqlalchemy import MetaData, Table, Column, Integer, VARCHAR, ForeignKey, create_engine

SQL_ENGINE = create_engine(...)

metadata_obj = MetaData(bind=SQL_ENGINE, schema="collections")
flows = Table(
    "flows", metadata_obj,
    Column("id", Integer, autoincrement=True, primary_key=True),
    Column("voyage_name", VARCHAR(100), nullable=False, unique=True),
    Column("vessel_name", VARCHAR(length=100), nullable=True, unique=False),
    Column("origin_country_id", Integer, ForeignKey("metadata.tbl_country.id"), nullable=True,
           unique=False),
    Column("destination_country_id", Integer, ForeignKey("metadata.tbl_country.id"),
           nullable=True, unique=False))
country = Table(
    "country", metadata_obj,
    Column("id", Integer, autoincrement=True, primary_key=True),
    Column("country_name", VARCHAR(100), nullable=False, unique=False),
    Column("country_iso", VARCHAR(2), nullable=False, unique=False)
)    

When I extract the values, I get the following data as a pandas dataframe:

voyage_name vessel_name origin_country_iso destination_country_iso
voyagename1 vesselname1 US GB
voyagename2 vesselname2 BR FR
voyagename3 vesselname3 NL CY

Which I want to insert into my flows table. I then convert the above dataframe to a list of dictionaries that looks like this:

insert_data = [
    {
        "voyage_name": "voyagename1",
        "vessel_name": "vesselname1",
        "origin_country_iso": "US",
        "destination_country_iso": "GB"
    },
    {
        "voyage_name": "voyagename2",
        "vessel_name": "vesselname2",
        "origin_country_iso": "BR",
        "destination_country_iso": "FR"
    },
    {
        "voyage_name": "voyagename3",
        "vessel_name": "vesselname3",
        "origin_country_iso": "NL",
        "destination_country_iso": "CY"
    }
]

If written in SQL I could do the following to upload the data into my flows table:

INSERT INTO flows (
    voyage_name,
    vessel_name,
    origin_country_id,
    destination_country_id
)
VALUES
    ('voyagename1', 'vesselname1', 
     (SELECT id FROM country WHERE country_iso = 'US'), (SELECT id FROM country WHERE country_iso = 'GB')
    ),
    ('voyagename2', 'vesselname2', (SELECT id FROM country WHERE country_iso = 'BR'), 
     (SELECT id FROM country WHERE country_iso = 'FR')
    ),
    ('voyagename1', 'vesselname3', 
     (SELECT id FROM country WHERE country_iso = 'NL'), (SELECT id FROM country WHERE country_iso = 'CY')
    )

How would I do the equivalent insert statement using SQLalchemy and the insert statement, with values fed by my dictionary?

from sqlalchemy.dialects.postgresql import insert
flows_insert_stmt = insert(flows).values(insert_data)

NOTE: Database is PostgreSQL.

CodePudding user response:

Assuming that you already have all iso values in that country table and no misses, one way to construct such a query statement is like this

from sqlalchemy import insert, select

for row in insert_data:
    stmt = insert(flows).values(
        voyage_name=row['voyage_name'],
        vessel_name=row['vessel_name'],
        origin_country_id=select(country.c.id).where(country.c.country_iso==row['origin_country_iso']).scalar_subquery(),
        destination_country_id=select(country.c.id).where(country.c.country_iso==row['destination_country_iso']).scalar_subquery(),
    )
    print(stmt)

A statement produced by this script is:

INSERT INTO collections.flows (voyage_name, vessel_name, origin_country_id, destination_country_id) VALUES (%(voyage_name)s, %(vessel_name)s, (SELECT collections.country.id 
FROM collections.country 
WHERE collections.country.country_iso = %(country_iso_1)s), (SELECT collections.country.id 
FROM collections.country 
WHERE collections.country.country_iso = %(country_iso_2)s)) RETURNING collections.flows.id

The use of scalar_subquery is to force one value result

  • Related