Home > database >  For each unique record ID, return the most recent record of type Y iff there is a more recent record
For each unique record ID, return the most recent record of type Y iff there is a more recent record

Time:11-25

For each unique record ID, return the most recent record of type Y iff there is a more recent record of type X

To make explaining easier I will put the records sorted by EventDate descending and look only at specific record ID's. (Most recent at the top.)

Case 1

ID EventDate Type
1 Some Dates Otherstuff (multiple records)
1 July 29 X
1 Feb 23 Y
1 Jan 3 Y
1 Some Dates Otherstuff (multiple records)

Return record from Feb 23 of Type Y (Feb 23 is a closer date to the Jan 1 date of record with type X)

Case 2

ID EventDate Type
2 Some Dates Otherstuff (multiple records)
2 Nov 2 X
2 Oct 31 Y
2 Some Dates Otherstuff
2 July 2 X
2 Feb 23 Y
2 Jan 5 Y
2 Some Dates Otherstuff

Return records from Feb 23 of type Y and Oct 31 of Type Y. These are the records that are the closest to the type X records in terms of date respectively. (Feb 23 Type y is closest to July 2 of type X and Oct 31 type Y is closest to Nov 2 type X)

Case 3

ID EventDate Type
3 Some Dates Otherstuff (multiple records)
3 July 2 X
3 Feb 23 Y
3 Some Dates Otherstuff
3 Jan 5 X
3 Some Dates Otherstuff

Return Feb 23 of type Y record

Case 4

ID EventDate Type
4 Some Dates Otherstuff (multiple records)
4 Oct 15 Y
4 July 2 X
5 Feb 23 X
5 Some Dates Otherstuff
5 Jan 5 Y
5 Jan 1 Y
5 Some Dates Otherstuff

Return ONLY the Jan 5th of type Y record. It is the closest to record of type X in terms of dates that has happened before the type X

SELECT
    *
FROM
    (
        SELECT
            *,
            ROW_NUMBER() OVER ( PARTITION BY A.ID ORDER BY EventDate DESC ) AS pc

        FROM
            SOMETABLE AS "A"
            INNER JOIN
            (
                SELECT
                    ID AS 'BID',
                    MIN(EventDate) AS 'OldestDate'
                FROM
                    SOMETABLE
                WHERE
                    TYPE = 'X' 
                GROUP BY
                    ID
            ) AS "B" ON A.ID = B.BID

    WHERE
        EventDate < OldestDate
        AND
        Type = 'Y'

    ) AS "FINAL"

This fails in cases where there are multiple records of type Y that need to be pulled, as it 'filters out' any records newer than the OLDEST instance of type X.

CodePudding user response:

So the question is looking for the same ID with two different types of events happened closely. Not sure if you're looking for X before Y, Y before X or either.

We can divide the data into two groups:

  • X events: (ID, 'X', EventDate)
  • Y events: (ID, 'Y', EventDate) Join both on ID and the compare EventDate, then pick the one based on your criteria.

Step 1. Prep data

create table event (
    id          int,
    event_date  date,
    type        char(1));
    

insert into event
values
(1, '2022-01-01', 'X'),
(1, '2022-01-03', 'X'),
(1, '2022-01-05', 'Y'),
(1, '2022-01-07', 'Y'),
(1, '2022-01-09', 'X'),
(1, '2022-01-11', 'X'),
(1, '2022-01-15', 'Y');

Step 2. Join and compare both events

with 
event_x as (select id, type, event_date from event where type='X'),
event_y as (select id, type, event_date from event where type='Y'),
event_xy as (
select x.id,
       x.event_date as x_event_date,
       y.event_date as y_event_date,
       datediff(x.event_date,y.event_date)      as days_from_x_to_y
  from event_x x
  join event_y y
 using (id))
select id,
       x_event_date,
       y_event_date,
       days_from_x_to_y,
       abs(days_from_x_to_y) as days_between_x_and_y
  from event_xy
 where 1 = 1
   -- and days_from_x_to_y <=0   -- x happened after y
   -- and days_from_x_to_y >=0   -- x happened before y
 order by days_between_x_and_y asc;

We get:

id|x_event_date|y_event_date|days_from_x_to_y|days_between_x_and_y|
-- ------------ ------------ ---------------- -------------------- 
 1|  2022-01-03|  2022-01-05|              -2|                   2|
 1|  2022-01-09|  2022-01-07|               2|                   2|
 1|  2022-01-09|  2022-01-05|               4|                   4|
 1|  2022-01-01|  2022-01-05|              -4|                   4|
 1|  2022-01-11|  2022-01-07|               4|                   4|
 1|  2022-01-03|  2022-01-07|              -4|                   4|
 1|  2022-01-11|  2022-01-15|              -4|                   4|
 1|  2022-01-11|  2022-01-05|               6|                   6|
 1|  2022-01-01|  2022-01-07|              -6|                   6|
 1|  2022-01-09|  2022-01-15|              -6|                   6|
 1|  2022-01-03|  2022-01-15|             -12|                  12|
 1|  2022-01-01|  2022-01-15|             -14|                  14|

With slight changes in where clause, we can pick the (x, y) event we need.

CodePudding user response:

This query takes the most recent Y value for a given X value if it exists.

SELECT 
    * 
    ,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY XDateTime ASC) AS 'Degree'
FROM
    (SELECT 
        *
        ,ROW_NUMBER() OVER (PARTITION BY YDateTime ORDER BY XDateTime ASC) AS 'dc'
    FROM
        (SELECT
            ID
            ,EventDateTime AS 'YDateTime'
            ,B.XDateTime
            ,DATEDIFF(SECOND, EventDateTime, B.XDateTime) AS 'Time'
            ,ROW_NUMBER() OVER (PARTITION BY B.XDateTime ORDER BY EventDateTime DESC) AS 'pc'
        
        FROM vw_A6Productivity AS "A"

        INNER JOIN
            (SELECT
                ID AS 'BID'
                ,EventDateTime AS 'XDateTime'
            
            FROM TABLE
            
            WHERE TYPE = 'X'
            
            GROUP BY 
                ID
                ,EventDateTime
            ) AS "B"

        ON A.ID= B.BID

        WHERE 
            EventDateTime < XDateTime -- Inner join filters for Nulls automatically
            AND STATUS = 'Y'
        
        ) AS "C"

    WHERE
        pc = 1
        
    ) AS "D"

WHERE dc = 1;
  • Related