Home > Net >  Postgres: best way to implement single favourite flag
Postgres: best way to implement single favourite flag

Time:08-09

I have this table:

CREATE TABLE public.data_source__instrument (
  instrument_id int4 NOT NULL,
  data_source_id int4 NOT NULL,
  CONSTRAINT data_source__instrument__pk PRIMARY KEY (data_source_id, instrument_id)
);

For clarity, here's an example of the data I might have in this table:

instrument_id data_source_id
1 1
1 2
1 3
2 2
2 3

I would like to be able to set a favourite data source for each instrument. I'd also like each instrument to have 1 and only 1 favourite data source.

The solution I came up with is the below:

CREATE TABLE public.data_source__instrument (
  instrument_id int4 NOT NULL,
  data_source_id int4 NOT NULL,
  fav_data_source boolean NULL, -- ### new column ###
  CONSTRAINT data_source__instrument__pk PRIMARY KEY (data_source_id, instrument_id),
  CONSTRAINT fav_data_source UNIQUE (instrument_id,fav_data_source) -- ### new constraint ###
);

I chose to mark favourites with the value true and set non-favourite tuples to null (because the unique constraint doesn't affect NULLs).

This solution will allow at most one true value per instrument in the fav_data_source column.

Example:

instrument_id data_source_id fav_data_source
1 1 true
1 2 null
1 3 null
2 2 null
2 3 true

However, I'm not completely satisfied with this solution. For starters, it allows instruments without a favourite data source. Moreover, the value of the fav_data_source could be set to false which is confusing and not very useful (since it doesn't play well with the unique constraint).

Is there a better way of doing this? Am I overlooking something?

EDIT: Ideally, I'd prefer a simple solution to this problem and avoid using features like database triggers.

CodePudding user response:

I chose to mark favourites with the value true and set non-favourite tuples to null (because the unique constraint doesn't affect NULLs), [however] the value of the fav_data_source could be set to false which is confusing and not very useful

There are two ways to improve this:

  • put a check constraint on the column to prevent false values:

    fav_data_source boolean NULL check (fav_data_source IS TRUE or fav_data_source IS NULL)
    
  • use a partial index to check only rows with true values (see also here):

    fav_data_source boolean NOT NULL
    …
    CREATE UNIQUE INDEX unique_fav_data_source ON data_source__instrument (instrument_id) WHERE (fav_data_source);
    

It allows instruments without a favourite data source.

I don't think it's possible to express such a "out of x, at least one y must exist" constraint in a relational database without triggers. (Assuming you still want to allow instruments without any data sources at all).

I'd also like each instrument to have 1 and only 1 favourite data source.

You might consider storing the favourite data source in the instruments table itself, as a non-nullable foreign reference column.

To avoid duplication, put only the non-favourite (additional/alternate) data source in the relation table, then provide data_source__instrument as a view:

CREATE VIEW data_source__instrument(instrument_id, data_source_id, is_favourite) AS
SELECT id, fav_data_source_id, true FROM instrument
UNION ALL
SELECT instrument_id, data_source_id, false FROM alternate_data_source__instrument;

The downside of this approach is that you cannot create a foreign key reference to the view (should you need to do so), and that updating the data sources gets quite complicated. Also this only works for M:N relations, not 1:N ones, where you would need a way to assert that the favourite source of an instrument actually belongs to that instrument (technically possible with an extra foreign key, but really ugly) and where you would introduce circular references which are just a mess.

CodePudding user response:

What you need is to think about DB normalisation and Entities and their Relations. Specifically

  • having that boolean flag in your joint table is violating normal form (functional dependency on other rows)
  • "favourite data source" is an entity on its own, the flag you have is describing a relation when it should describe an entity

I assume you have separate tables for instruments and data sources, and omitted those just for the sake of simplifying your question.

Here's what you need:

create table instrument
(
  id serial primary key
);

create table data_source
(
  id serial primary key
);

create table data_source__instrument
(
  data_source_id integer references data_source on delete cascade on update cascade,
  instrument_id integer references instrument on delete cascade on update cascade,
  constraint data_source__instrument_pk primary key (data_source_id, instrument_id)
);

create table favourite_data_source
(
  data_source_id integer not null,
  instrument_id integer not null unique,
  constraint favourite_data_source_pk foreign key (data_source_id, instrument_id) references data_source__instrument (data_source_id, instrument_id) on update cascade on delete cascade
);

And now to reproduce your example:


insert into instrument
values (1),
  (2);

insert into data_source
values (1),
  (2),
  (3);

insert into data_source__instrument (instrument_id, data_source_id)
values (1, 1),
  (1, 2),
  (1, 3),
  (2, 2),
  (2, 3);

insert into favourite_data_source (instrument_id, data_source_id)
values (1, 1),
  (2, 3);

With this scheme

  • you can have 0 to 1 favourite data source per instrument
  • you have to insert favourite data source data separately
  • FKs will handle integrity thanks to cascade and checks
  • Related