SQL Time Series Group with max amount of results

I have time-series data in a table in TimescaleDB.

Data is as follows:

time                      locationid  parameterid  unitid  value
2022-04-18T10:00:00.000Z  "1"         "1"          "2"     2.2
2022-04-18T10:00:00.000Z  "2"         "1"          "2"     3.0
2022-04-18T09:00:00.000Z  "1"         "1"          "2"     1.2
2022-04-18T09:00:00.000Z  "2"         "1"          "2"     4.0
2022-04-18T08:00:00.000Z  "1"         "1"          "2"     2.6
2022-04-18T08:00:00.000Z  "2"         "1"          "2"     3.1
2022-04-18T07:00:00.000Z  "1"         "1"          "2"     2.1
2022-04-18T07:00:00.000Z  "2"         "1"          "2"     2.7

I have thousands of rows of time-series IoT data that I am plotting as graphs with HighCharts.

My question is: is there a way to limit the number of items returned in my results, but not with a classic LIMIT? I'd like to return 256 data groups at all times. So if I had 2,560 rows, my query would group (via GROUP BY / date_trunc / time_bucket) every 10 rows, but if I had 512 rows it would group only every 2 rows, so that I am always returning 256 groups no matter what.

My current query:

SELECT time_bucket('4 hours', time) AS "t",
       locationid,
       avg(timestamp) AS "x",
       avg(value) AS "y"
FROM probe_data
WHERE locationid = '${q.locationid}'
  AND parameterid = '${q.parameterid}'
  AND time > '${q.startDate}'
  AND time < '${q.endDate}'
GROUP BY "t", locationid
ORDER BY "t" DESC;

It seems like I should be able to use my min date and max date to count the number of possible returns and then divide by 256? Not sure if this is the best way to do it. Any help is appreciated. Thank you!
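
The min/max date idea can be sketched as plain arithmetic: divide the requested time range by 256 to get a bucket width, then map each timestamp to a bucket index. This is only an illustration in Python; the helper names `bucket_width` and `bucket_index` are hypothetical and not part of TimescaleDB.

```python
from datetime import datetime, timedelta


def bucket_width(start: datetime, end: datetime, buckets: int = 256) -> timedelta:
    """Width of one bucket when [start, end) is split into `buckets` equal intervals."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    if end <= start:
        raise ValueError("end must be after start")
    return (end - start) / buckets


def bucket_index(ts: datetime, start: datetime, width: timedelta, buckets: int = 256) -> int:
    """Zero-based bucket that `ts` falls into, clamped to the range 0..buckets-1."""
    idx = (ts - start) // width  # timedelta // timedelta yields an int
    return min(max(idx, 0), buckets - 1)
```

A width computed this way could presumably be passed as the interval argument of `time_bucket`. Because `time_bucket` aligns buckets to a fixed origin rather than to the start date, the result may contain one more bucket than requested. Note also that this splits the time range evenly, not the row count, so buckets over sparse periods hold fewer rows.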

CodePudding user response：

No, SQL doesn't support dynamic grouping. To do what you ask, you'd have to fetch the result set, check the number of records returned, and then run the query again with your grouping logic.

Effectively, you have a presentation level issue - not a SQL issue.
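
Handling this at the presentation level could look roughly like the following sketch: fetch the points, then average them client-side into at most 256 groups before charting. The function name `downsample_points` and the `(x, y)` tuple layout are assumptions made for illustration.

```python
from statistics import fmean


def downsample_points(points, max_groups=256):
    """Average (x, y) points, already sorted by x, into at most `max_groups` points."""
    if max_groups <= 0:
        raise ValueError("max_groups must be positive")
    n = len(points)
    if n <= max_groups:
        # Nothing to reduce: fewer points than the cap.
        return list(points)
    out = []
    for g in range(max_groups):
        # Integer arithmetic spreads any remainder evenly across the groups.
        lo = g * n // max_groups
        hi = (g + 1) * n // max_groups
        chunk = points[lo:hi]
        out.append((fmean(x for x, _ in chunk), fmean(y for _, y in chunk)))
    return out
```

With 2,560 input points each group averages 10 points; with 512, each averages 2. The trade-off is that every raw row still travels from the database to the client.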

CodePudding user response：

Probably something with NTILE. I'm not sure this would work, but I'd imagine doing it something like this:

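SELECT avg(sub.timestamp), avg(sub.value)
FROM (
    -- Assign each row to one of 256 roughly equal-sized buckets, in time order.
    SELECT
        timestamp,
        value,
        NTILE(256) OVER (ORDER BY time) AS bucket_no
    FROM probe_data
) sub
-- Collapse each bucket into a single averaged point.
GROUP BY sub.bucket_no;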

The inner query breaks all of your data into 256 groups and tags each row with a bucket_no column; the outer query then groups by bucket_no.
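
To check how the NTILE approach behaves, here is a small self-contained sketch. It uses Python's built-in `sqlite3` module as a stand-in for TimescaleDB only so that it runs without a server (SQLite 3.25 or newer is needed for window functions). The simplified table layout (`ts`, `value`) and the function name `downsample` are assumptions, not the asker's schema.

```python
import sqlite3


def downsample(rows, buckets=256):
    """rows: iterable of (epoch_seconds, value).

    Returns a list of (bucket_no, avg_ts, avg_value, row_count) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE probe_data (ts REAL, value REAL)")
    conn.executemany("INSERT INTO probe_data VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT sub.bucket_no, avg(sub.ts), avg(sub.value), count(*)
        FROM (
            SELECT ts, value,
                   NTILE(?) OVER (ORDER BY ts) AS bucket_no
            FROM probe_data
        ) AS sub
        GROUP BY sub.bucket_no
        ORDER BY sub.bucket_no
        """,
        (buckets,),
    ).fetchall()
    conn.close()
    return result
```

With 2,560 rows this yields 256 buckets of 10 rows each, and with 512 rows it yields 256 buckets of 2 rows each. If there are fewer rows than buckets, NTILE produces only as many buckets as there are rows. In the real query, the WHERE filters on locationid, parameterid, and the time range would go in the inner SELECT.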

EDIT: I just realized the MySQL tag on this question is probably inaccurate, but I'll leave the answer up, as it might point you in the right direction for TimescaleDB.