What is the best way to construct database tables for many to many relationship (with additional con-CodePudding

Business problem: Suppose that we have a few medical centers and doctors, who work in these centers. Obviously, many doctors can work in one center. But also one doctor can work in many centers at the same time. And we have to store information about who is the head doctor of each medical center (each medical center can have only one head doctor and one doctor can be the head doctor in multiple centers).

Question: What is the best way to construct database tables to serve these business requirements?

I see two variants (described below) but if you see more, please, let me know.

Variant 1
In this variant, we store information about the head doctor in the join table jobs. I see two disadvantages here:

the column jobs.is_head will contain false in most cases and it looks strange (and looks like we store unnecessary information).
we need somehow to restrict adding two head doctors into one center.

create table doctors
(
    id   bigint  not null
        constraint doctors_pk
            primary key,
    name varchar not null
);

create table medical_centers
(
    id      bigint  not null
        constraint medical_centers_pk
            primary key,
    address varchar not null
);

create table jobs
(
    medical_center_id bigint  not null
        constraint centers_fk
            references medical_centers,
    doctor_id         bigint  not null
        constraint doctors_fk
            references doctors,
    is_head           boolean not null,
    constraint jobs_pk
        primary key (doctor_id, medical_center_id)
);

Variant 2
In this variant, we store information about the head doctor in medical_centers table. Two disadvantages again:

we have two types of relationships between tables now: many to many and one to many (because one doctor can be the head doctor in multiple centers), which is a bit complicated, especially considering that I want to use this schema through ORM framework (JPA implementation).
we have to somehow restrict setting doctor as a head doctor if this doctor is not working in this center.

create table doctors
(
    id   bigint  not null
        constraint doctors_pk
            primary key,
    name varchar not null
);

create table medical_centers
(
    id             bigint  not null
        constraint medical_centers_pk
            primary key,
    address        varchar not null,
    head_doctor_id bigint
        head_doctor_id_fk
            references doctors
);

create table jobs
(
    medical_center_id bigint not null
        constraint centers_fk
            references medical_centers,
    doctor_id         bigint not null
        constraint doctors_fk
            references doctors,
    constraint jobs_pk
        primary key (doctor_id, medical_center_id)
);

CodePudding user response：

It is a trade - off, but operational complexity or some(?) storage. I think variation 1 looks good. If you want to change a little bit you can add another table called "med_cen_heads" with unique constraint on med_center_id column. Thus, we prevent adding a second doctor to same medical center. The hard part is checking if the head doctor works in the medical center or not before insert.

INSERT INTO med_cen_heads
SELECT medical_center_id, doctor_id         
FROM jobs
WHERE EXISTS (SELECT 1 FROM jobs WHERE medical_center_id = 'medical_center_id_to_insert' and doctor_id 'doctor_id_to_insert');

Also, you can create "before insert trigger" to check if values exist in jobs table.

It could look like this:

create table doctors
(
    id   bigint  not null
        constraint doctors_pk
            primary key,
    name varchar not null
);

create table medical_centers
(
    id      bigint  not null
        constraint medical_centers_pk
            primary key,
    address varchar not null
);

create table jobs
(
    job_id serial,
    medical_center_id bigint  not null
        constraint centers_fk
            references medical_centers,
    doctor_id         bigint  not null
        constraint doctors_fk
            references doctors
);

create table med_cen_heads
(
    medical_center_id bigint unique not null,
    doctor_id         bigint not null
);

This will save you from unnecessary storage but, BOOLEAN data type is just 1 byte. Let' s assume you have 1 billion med_center - doctor pairs(I don' t think you will have). In this way you only store 0.93132 GB more for your extra column. Of course, becuse there is only one head doctor in a medical center this column will be skewed. Yet when you query normal doctors this column will not be your concern, you should use "doctor_id" or any other columns.

In short, from my point of view stick with variation 1 with this small change:

create unique index unique_row on jobs(medical_center_id) where is_head;

Check you cannot add a second head doctor to a medical center.