Home > Net >  grouping multiple queries into a single one, with Postgres
grouping multiple queries into a single one, with Postgres

Time:03-17

I have a very simple query:

SELECT * FROM someTable
WHERE instrument = '{instrument}' AND ts >= '{fromTime}' AND ts < '{toTime}'
ORDER BY ts

That query is applied to 3 tables across 2 databases.

I receive a list of rows that have timestamps (ts). I take the last timestamp and it serves as the basis for the 'fromTime' of the next iteration. toTime is usually equal to 'now'.

This allows me to only get new rows at every iteration.

I have about 30 instrument types and I need an update every 1s.

So that's 30 instruments * 3 queries = 90 queries per second.

How can I rewrite the query so that I could use a function like this:

getData table [(instrument, fromTime) list] toTime

and get back some dictionary, in the form:

Dictionary<instrument, MyDataType list>

To use a list of instruments, I could do something like:

WHERE instrument in '{instruments list}'

but this wouldn't help with the various fromTime as there is one value per instrument.

I could take the min of all fromTime values, get the data for all instruments and then filter the results out, but that's wasteful since I could potentially query a lot of data to throw is right after.

What is the right strategy for this?

CodePudding user response:

So there is a single toTime to test against per query, but a different fromTime per instrument.

One solution to group them in a single query would be to pass a list of (instrument, fromTime) couples as a relation.

The query would look like this:

SELECT [columns] FROM someTable
JOIN (VALUES
   ('{name of instrument1}', '{fromTime for instrument1}'),
   ('{name of instrument2}', '{fromTime for instrument2}'),
   ('{name of instrument3}', '{fromTime for instrument3}'),
   ...
) AS params(instrument, fromTime)
ON someTable.instrument = params.instrument AND someTable.ts >= params.fromTime
WHERE ts < 'toTime';

Depending on your datatypes and what method is used by the client-side driver to pass parameters, you may have to be explicit about the datatype of your parameters by casting the first value of the list, as in, for example:

JOIN (VALUES
   ('name of instrument1', '{fromTime for instrument1}'::timestamptz),

If you had much more than 30 values, a variant of this query with arrays as parameters (instead of the VALUES clause) could be preferred. The difference if that it would take 3 parameters: 2 arrays 1 upper bound, instead of N*2 1 parameters. But it depends on the ability of the client-side driver to support Postgres arrays as a datatype, and the ability to pass them as a single value.

  • Related