Assume, I have a Python program, and I have an Offer
object Offer(title='title1', category='cat1', regions=['reg1'])
.
I want to add this Offer
into psql db, with minimal number of queries (performance). Inserts of new regions and categories are rare (number of regions and categories is limited (and are unique), but number of offers is unlimited).
Basically Regions
and Categories
can be inserted by query:
INSERT INTO Categories(name)
SELECT 'cat1'
WHERE NOT EXISTS(
SELECT 1 FROM Categories WHERE name = 'cat1'
)
RETURNING id;
, but I need to execute another query to get id of region/category (when region/category already exists). I need this id to execute query inserting data to Offers
table:
INSERT INTO Offers(title, category)
SELECT 'title1', (SELECT id FROM Regions WHERE name = 'reg1')
WHERE NOT EXISTS(
SELECT 1 FROM Offers WHERE title = 'title1' AND category = (SELECT id FROM Regions WHERE name = 'reg1')
);
Currently my code looks like:
INSERT INTO Categories(name)
SELECT 'cat1'
WHERE NOT EXISTS(
SELECT 1 FROM Categories WHERE name = 'cat1'
);
INSERT INTO Regions(name)
SELECT 'reg1'
WHERE NOT EXISTS(
SELECT 1 FROM Regions WHERE name = 'reg1'
);
INSERT INTO Offers(title, category)
SELECT 'title1', (SELECT id FROM Regions WHERE name = 'reg1')
WHERE NOT EXISTS(
SELECT 1 FROM Offers WHERE title = 'title1' AND category = (SELECT id FROM Regions WHERE name = 'reg1')
);
INSERT INTO OfferRegions(offer, region)
SELECT (SELECT id FROM Offers WHERE title = 'title1'), (SELECT id FROM Regions WHERE name = 'reg1')
WHERE NOT EXISTS(
SELECT 1 FROM OfferRegions WHERE offer = (SELECT id FROM Offers WHERE title = 'title1') AND region = (SELECT id FROM Regions WHERE name = 'reg1')
);
I don't know how to do it robust and efficient (without unnecessary SELECTs). I work with Python/psycopg2.
Tables:
CREATE TABLE IF NOT EXISTS Regions
(
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
UNIQUE (name)
);
CREATE TABLE IF NOT EXISTS Categories
(
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
UNIQUE (name)
);
CREATE TABLE IF NOT EXISTS Offers
(
id SERIAL PRIMARY KEY,
title TEXT,
category SERIAL REFERENCES Categories (id) ON UPDATE CASCADE ON DELETE CASCADE,
UNIQUE (id)
);
CREATE TABLE IF NOT EXISTS OfferRegions
(
offer SERIAL REFERENCES Offers (id) ON UPDATE CASCADE ON DELETE CASCADE,
region SERIAL REFERENCES Regions (id) ON UPDATE CASCADE ON DELETE CASCADE,
UNIQUE (offer, region)
);
CodePudding user response:
You are looking for INSERT ... ON CONFLICT DO NOTHING
.
For that, you need a unique constraint on the column that identifies the object.
That allows you to retrieve the generated id
, for example:
INSERT INTO categories (name) VALUES ('cat1')
ON CONFLICT ON (name) DO NOTHING
RETURNING id;
Dependent tables could be filled like this, using a variable cat_id
that was set with the result from the above query:
INSERT INTO offers (category, title)
SELECT CASE WHEN cat_id IS NULL
THEN (SELECT id FROM categories WHERE name = 'cat1')
ELSE cat_id
END,
/* similar for "title */
There is of course a race condition: someone could delete the categories
row before your second INSERT
. But perhaps that is good enough.