I have a system with a large number of tables that contain historical data. Each table has a ts_from
and ts_to
column which are of type timestamptz
. These represent the time period in which the data for a particular row was valid.
These columns are indexed.
If I want to query all rows that were valid at a particular timestamp, it is trivial to write the ts_from <= @at_timestamp AND ts_to >= @at_timestamp
WHERE
clause to utilitise the index.
However, I wanted to create a function called Temporal.at
which would take the @at_timestamp
column and the ts_from
/ ts_to
columns and do this by hiding the complexity of the comparison from the query that uses it. You might think this is trivial, but I would also like to extend the concept to create a function called Temporal.between
which would take a @from_timestamp
and @to_timestamp
and select all rows that were valid between those two periods. That function would not be trivial, as one would have to check where rows partially overlap the period rather than always being fully enclosed by it.
The issue is this: I have written these functions but they do not cause the index to be used. The query performance is woefully slow on the history tables, some of which have hundreds of millions of rows.
The questions therefore are: a) Is there a way to write these functions so that we can be sure the indexes will be used?
b) Am I going about this completely the wrong way and is there a better way to proceed?
CodePudding user response:
This is complicated if you model ts_from
and ts_to
as two different timestamp columns. Instead, you should use a range type: tstzrange
. Then everything will become simple:
for containment in an interval, use
@at_timestamp <@ from_to
for interval overlap, use
tstzinterval(@from_timestamp, @to_timestamp) && from_to
Both queries can be supported by a GiST index on the range column.