I have a question about a fundamental aspect of PostgreSQL.
Suppose I have two tables along the lines of the following:
create table source_data_property (
source_data_property_id integer primary key generated by default as identity,
property_name text not null
);
create table source_data_value (
source_data_value_id integer primary key generated by default as identity,
source_data_property_id integer not null references source_data_property,
data_value numeric not null
);
Suppose I write a very simple query that just performs a basic join:
select
sdp.source_data_property_id,
sdp.property_name,
sdv.source_data_value_id,
sdv.data_value
from source_data_property as sdp
join source_data_value as sdv using (source_data_property_id)
;
For optimal query performance, is it necessary to add an index on the source_data_property_id
column in the source_data_value
table? My original thought was no, because the source_data_property_id
is already indexed in the source_data_property
table, but after thinking about it a bit I'm not so sure.
CodePudding user response:
For optimal query performance, is it necessary to add an index on the source_data_property_id column in the source_data_value table?
In general yes, make indexes for your foreign keys. However...
A very small table won't get any advantage from indexes and Postgres will do a seq scan instead.
Similarly it depends on what sort of queries you're doing. In your example you're fetching every row in source_data_property which will also fetch every row in source_data_value. Using an index is slower and Postgres will do a seq scan instead.