Does the order of keys in a primary key array matter?-CodePudding

We've got a distributed/sharded database, so we use a combination of id and partner_id (which is our partition ID) as the primary key on any of our distributed tables.

In my schema.rb file, this declaration would look like this:

create_table "back_in_stock_subscriptions", primary_key: ["id", "partner_id"], force: :cascade do |t|

The issue that's leading to my question is that when one of my co-workers runs migrations, the schema is modified to look like this:

create_table "back_in_stock_subscriptions", primary_key: ["partner_id", "id"], force: :cascade do |t|

So any time they run migrations, they have to stage for committing just the piece that they're actually adding to the database, then discard the remaining changes to the schema.rb file. Or if they're running migrations someone else added, they just discard the schema.rb changes altogether. Granted, it's a very minor nuisance, and I have no idea why it's happening to begin with, but it is slightly concerning.

So my question is does it even matter? If he accidentally lets one of the "reversals" slip into a commit, would it be a big deal (other than that the next time I run migrations, it would reverse it back)?

CodePudding user response：

A test case:

create table pk_test1 (id integer, id2 integer, primary key(id, id2));
create table pk_test1 (id integer, id2 integer, primary key(id2, id));

\d pk_test
              Table "public.pk_test"
 Column |  Type   | Collation | Nullable | Default 
-------- --------- ----------- ---------- ---------
 id     | integer |           | not null | 
 id2    | integer |           | not null | 
Indexes:
    "pk_test_pkey" PRIMARY KEY, btree (id, id2)

\d pk_test2
              Table "public.pk_test2"
 Column |  Type   | Collation | Nullable | Default 
-------- --------- ----------- ---------- ---------
 id     | integer |           | not null | 
 id2    | integer |           | not null | 
Indexes:
    "pk_test2_pkey" PRIMARY KEY, btree (id2, id)


insert into pk_test values (1, 2), (2,1);
insert into pk_test2 values (1, 2), (2,1);

select * from pk_test;
 id | id2 
---- -----
  1 |   2
  2 |   1
(2 rows)

test(5432)=# select * from pk_test2;
 id | id2 
---- -----
  1 |   2
  2 |   1

insert into pk_test values (1, 2);
ERROR:  duplicate key value violates unique constraint "pk_test_pkey"
DETAIL:  Key (id, id2)=(1, 2) already exists.

insert into pk_test2 values (1, 2);
ERROR:  duplicate key value violates unique constraint "pk_test2_pkey"
DETAIL:  Key (id2, id)=(2, 1) already exists.

--THE PROBLEM

create table pk_child_tbl (child_id integer, parent_id integer, parent_id2 integer, foreign key(parent_id, parent_id2) references pk_test2);

insert into pk_test2 values (3,2);

 select * from pk_test2;
 id | id2 
---- -----
  1 |   2
  2 |   1
  3 |   2

 \d pk_test2
              Table "public.pk_test2"
 Column |  Type   | Collation | Nullable | Default 
-------- --------- ----------- ---------- ---------
 id     | integer |           | not null | 
 id2    | integer |           | not null | 
Indexes:
    "pk_test2_pkey" PRIMARY KEY, btree (id2, id)
Referenced by:
    TABLE "pk_child_tbl" CONSTRAINT "pk_child_tbl_parent_id_parent_id2_fkey" FOREIGN KEY (parent_id, parent_id2) REFERENCES pk_test2(id2, id)


insert into pk_child_tbl values (2, 3, 2);
ERROR:  insert or update on table "pk_child_tbl" violates foreign key constraint "pk_child_tbl_parent_id_parent_id2_fkey"
DETAIL:  Key (parent_id, parent_id2)=(3, 2) is not present in table "pk_test2".

If you have foreign key references that just point at the parent table with out specifying the parent table column order that will pick up the primary key order and the child parent_id/parent_id2 fields will no longer match the parent table id/id2 order.

CodePudding user response：

As far as uniqueness is concerned, the order of the columns in the primary key definition does not matter.

But a primary key is always implemented with a unique index, and that index can be used to speed up queries. For that, the order of columns can matter a lot. For example, this query:

SELECT * FROM tab WHERE col1 = 42;

can use the index for a primary key on (col1, col2), but not the index for a primary key on (col2, col1).