SQL - Select data for line graph without duplicate data-CodePudding

I have a table that is setup like this:

number_value |        number_date
------------- ----------------------
           4 | 2022-04-05 10:00:00 
           4 | 2022-04-05 10:05:00
           4 | 2022-04-05 10:10:00
           5 | 2022-04-05 10:15:00
           5 | 2022-04-05 10:20:00
           7 | 2022-04-05 10:25:00
           5 | 2022-04-05 10:30:00
           5 | 2022-04-05 10:35:00

This data is going to be used to create a line graph.

To reduce the amount of data that needs to be sent, I want the select query to format the data like this:

number_value |      start_point    |      end_point
------------- --------------------- ----------------------
           4 | 2022-04-05 10:00:00 | 2022-04-05 10:10:00
           5 | 2022-04-05 10:15:00 | 2022-04-05 10:20:00
           7 | 2022-04-05 10:25:00 | 2022-04-05 10:25:00
           5 | 2022-04-05 10:30:00 | 2022-04-05 10:35:00

This was my first attempt at a query which does this:

SELECT number_value, MIN(number_date) as start_point, MAX(number_date) as end_point 
FROM number_table 
GROUP BY number_value;

An obvious flaw with this attempt is the GROUP BY makes so if there is a duplicate value which occurred at a prior time (5 -> 7 -> 5), then those values are grouped together.

The problem I'm having is I can't see a way to do this as an SQL query.

All help appreciated.

CodePudding user response：

This is a gaps and islands problem, and we can use the difference in row numbers method for one approach:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY number_date) rn1,
              ROW_NUMBER() OVER (PARTITION BY number_value ORDER BY number_date) rn2
    FROM number_table
)

SELECT number_value,
       MIN(number_date) AS start_point,
       MAX(number)date) AS end_point
FROM cte
GROUP BY number_value, rn1 - rn2
ORDER BY MIN(number_date);