I have a postgres table with unique columns. I upload dataframes to the table using copy expert and psycopg2 in python. There are duplicate columns in the uploading dataframe. How can I remove the duplicate columns on the uploading dataframe and upload the dataframe?
Postgres table:
email (unique) | verified | id |
---|---|---|
[email protected] | True | 1 |
[email protected] | None | 2 |
[email protected] | True | 3 |
[email protected] | None | 4 |
List to be uploaded
verified | |
---|---|
[email protected] | None |
[email protected] | None |
[email protected] | None |
[email protected] | None |
Expected table result:
email (unique) | verified | id |
---|---|---|
[email protected] | True | 1 |
[email protected] | None | 2 |
[email protected] | True | 3 |
[email protected] | None | 4 |
[email protected] | None | 5 |
[email protected] | None | 6 |
CodePudding user response:
If you want to make sure that the email field in your DB is unique, you should make it unique in Postgres:
ALTER TABLE your_table ADD UNIQUE (email);
When inserting values you can ignore duplicates (like in your example):
INSERT INTO your_table (
email,
verified
)
VALUES (
%s,
%s
)
ON CONFLICT (
email
)
DO NOTHING;
Or you can update the values:
INSERT INTO your_table (
email,
verified
)
VALUES (
%s,
%s
)
ON CONFLICT (
email
)
DO UPDATE
SET
verified = %s
;
CodePudding user response:
demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=05db3742877c60fc42c4d4ee520d9bfb
many inserts expected the serial don't have gap/hole is unrealistic. If you really want no gap serial no, you can have a insert_time_stamp then use row_number to construct the serial no.
CREATE OR REPLACE FUNCTION func_u_info_ins ()
RETURNS TRIGGER
AS $$
DECLARE
seq bigint;
BEGIN
RAISE NOTICE 'new email: %, new.vertified: %', NEW.email, NEW.verified;
IF NEW.email IN (
SELECT
email
FROM
u_infos) THEN
RAISE NOTICE E'already existed\n';
RAISE NOTICE 'currval: %\n', currval('u_infos_id_seq'::regclass);
seq := (
SELECT
currval('u_infos_id_seq'::regclass));
RAISE info E'seq: %\n', seq;
PERFORM
pg_catalog.setval('u_infos_id_seq'::regclass, seq, FALSE);
RETURN NULL;
ELSE
RAISE NOTICE 'new email, new record';
RETURN new;
END IF;
END
$$
LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER trig_u_info_ins BEFORE INSERT ON u_infos FOR EACH ROW EXECUTE PROCEDURE func_u_info_ins ();