What could allow several entries into the database with identical composite keys?


SELECT * FROM contacts WHERE phone = '18888576309';

    user_id   |     phone     |         created_at         
--------------+---------------+----------------------------
 db0db85f2331 |  18888576309  | 2022-03-22 23:28:04.308612 
 db0db85f2331 |  18888576309  | 2022-03-22 23:28:04.310283 
 db0db85f2331 |  18888576309  | 2022-03-22 23:28:04.311891 

This doesn't look very weird, until you realize that the model is defined like this:

import datetime

from sqlalchemy import VARCHAR, DateTime

# "db" is the Flask-SQLAlchemy instance defined elsewhere in the app
class Contact(db.Model):
    __tablename__ = 'contacts'
    user_id = db.Column(VARCHAR(16), primary_key=True)
    phone = db.Column(VARCHAR(64), primary_key=True)
    created_at = db.Column(DateTime)

    def __init__(self, user_id, phone):
        self.user_id = user_id
        self.phone = phone
        self.created_at = datetime.datetime.utcnow()

Note: Two columns are flagged primary_key=True here, which SQLAlchemy maps (afaik) to a single composite primary key on (user_id, phone).
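For reference, this mapping should produce roughly the following DDL on Postgres (a sketch, not the actual migration output; column types abbreviated to their defaults):

```sql
CREATE TABLE contacts (
    user_id    VARCHAR(16) NOT NULL,
    phone      VARCHAR(64) NOT NULL,
    created_at TIMESTAMP WITHOUT TIME ZONE,
    PRIMARY KEY (user_id, phone)
);
```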

I've been trying for longer than I care to admit to find the source of this, but the internet seems devoid of any information on it even being possible (several posts flatly state that it is impossible!). The only unusual thing, and I suspect this is the root of the problem (though that's purely an intuition), is that the method which saves these objects uses Session.bulk_save_objects(). Thus far I have:

  • Checked all relevant migrations; anything altering the contacts table was well before 2022-03-22
  • Checked the file-history of the files which both define the Contact and the file which handles the creation of contacts---there hasn't been a change since well before 2022-03-22
  • Attempted to reproduce this issue locally a plethora of ways (including using multiprocessing to try to call Session.bulk_save_objects() on the same list of objects simultaneously)
  • Read all I could in SQLAlchemy's docs regarding Session.bulk_save_objects()

How might this happen at all? I was always under the impression that rows with identical primary keys simply can't exist, by virtue of fundamental database constraints, and that the only conceivable way would be two different processes adding identical rows to the DB at exactly the same time.

CodePudding user response:

There are two primary keys defined here (making, afaik, a composite key).

If there's indeed a PK on (user_id, phone), then your output doesn't make any sense. UNIQUE indexes (backing PRIMARY KEY and UNIQUE constraints) are enforced at all times.
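To illustrate what "enforced at all times" means, here is a minimal sketch of the same composite-key shape, using Python's stdlib SQLite driver rather than Postgres (the table and values mirror the question; the enforcement behavior is the same in any sane database):

```python
import sqlite3

# In-memory table with the same composite primary key as the
# contacts table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contacts (
        user_id    TEXT,
        phone      TEXT,
        created_at TEXT,
        PRIMARY KEY (user_id, phone)
    )
""")
conn.execute(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    ("db0db85f2331", "18888576309", "2022-03-22 23:28:04.308612"),
)

# A second row with the same (user_id, phone) pair violates the
# composite primary key, no matter how it is inserted.
try:
    conn.execute(
        "INSERT INTO contacts VALUES (?, ?, ?)",
        ("db0db85f2331", "18888576309", "2022-03-22 23:28:04.310283"),
    )
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

With a healthy index there is no code path, bulk_save_objects() included, that gets a duplicate pair past this check.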

First make sure you are not barking up the wrong tree. Are there multiple tables named "contacts" or "Contacts"? In multiple schemas?

Check with:

SELECT oid, relnamespace::regnamespace AS schema, relname, relkind
FROM   pg_class
WHERE  relname ILIKE 'contacts';

Then check the actual table definition in psql with \d contacts (or whatever the actual, correct, fully-qualified table name is).

If that doesn't clear things up (though it probably will), you are most likely looking at a broken index. Fix it with REINDEX asap:

REINDEX TABLE contacts;

Or, at the very least, just the one index:

REINDEX INDEX contacts_pkey;  -- use actual name
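If you are unsure of the index name (contacts_pkey is only the default Postgres generates for a table named contacts), you can look it up via the pg_indexes view:

```sql
SELECT schemaname, indexname, indexdef
FROM   pg_indexes
WHERE  tablename = 'contacts';
```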

Or, more radically, with VACUUM FULL (rewriting table and all indexes from scratch):

VACUUM FULL contacts;

Both block concurrent access. Also be aware that if duplicate rows have actually made it into the table, rebuilding the unique index will fail with a unique-violation error; you have to delete the duplicates first.
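The following sketch shows one way to clean up such duplicates before reindexing, keeping the earliest physical row per key. It simulates the corrupted state in stdlib SQLite (a table where the unique constraint is simply absent); the Postgres equivalent is in the comment:

```python
import sqlite3

# Simulate the broken state: duplicate (user_id, phone) rows that a
# working unique index would have rejected.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (user_id TEXT, phone TEXT, created_at TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", [
    ("db0db85f2331", "18888576309", "2022-03-22 23:28:04.308612"),
    ("db0db85f2331", "18888576309", "2022-03-22 23:28:04.310283"),
    ("db0db85f2331", "18888576309", "2022-03-22 23:28:04.311891"),
])

# Keep only the first physical row per key. On Postgres the analogous
# trick uses the hidden ctid column instead of SQLite's rowid:
#   DELETE FROM contacts a USING contacts b
#   WHERE a.ctid > b.ctid AND a.user_id = b.user_id AND a.phone = b.phone;
conn.execute("""
    DELETE FROM contacts
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM contacts GROUP BY user_id, phone
    )
""")
print(conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0])  # prints 1
```

Only after the table contains one row per key can REINDEX rebuild the unique index successfully.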

I have never seen a broken index myself, but it happens - especially on older versions and/or faulty hardware. It would be advisable to pin down the cause ...
