Home > OS >  Left join event table to incident table where dates are closest for each device id
Left join event table to incident table where dates are closest for each device id

Time:11-18

I have two table as follows:

T1

event_id device_id event_time  var1 var2
1        A         2021-01-01  5    6
2        C         2021-01-02  8    7
3        B         2021-01-05  1    6
4        C         2021-01-07  7    7
5        D         2021-01-12  8    9
6        C         2021-01-18  3    4
7        B         2021-01-21  7    1

T2

device_id incident_time
B         2021-01-06
C         2021-01-17

I would like the Output to be

device_id incident_time event_id event_time var1 var2
B         2021-01-06    3        2021-01-05  1    6
C         2021-01-17    6        2021-01-18  3    4

So, I am trying to find event_id, event_time, var1 and var2 of a single event in T1 where the event time is closest to the incident_time for each device in T2, irrespective of if event_time is bigger or smaller than incident_time. My ugly (and incorrect) code so far:

 select T2.device_id, T2.incident_time, (select distinct on (device_id) event_id 
                                         from T1
                                         where device_id = T2.device_id 
                                         and event_time <= T2.incident_time
                                         order by device_id, event_time desc) as event_id
 from T2;

How can I accomplish this more elegantly using Join statement? Or is the select as statement in brackets correct? How can I include all the required columns? Thanks

CodePudding user response:

Skip the subquery, do a join instead. Something like:

select distinct on (T2.device_id) T2.device_id, T2.incident_time,
                                  T1.event_id, T1.event_time, T1.var1, T1.var2
from T2
left join T1 ON T1.device_id = T2.device_id 
order by T2.device_id, ABS(T2.incident_time - T1.event_time)

CodePudding user response:

Schema and insert statements:

 create table T1(event_id int, device_id varchar(2), event_time date,  var1 int, var2 int);
 insert into T1 values(1        ,'A'         ,'2021-01-01',  5,    6);
 insert into T1 values(2        ,'C'         ,'2021-01-02',  8,    7);
 insert into T1 values(3        ,'B'         ,'2021-01-05',  1,    6);
 insert into T1 values(4        ,'C'         ,'2021-01-07',  7,    7);
 insert into T1 values(5        ,'D'         ,'2021-01-12',  8,    9);
 insert into T1 values(6        ,'C'         ,'2021-01-18',  3,    4);
 insert into T1 values(7        ,'B'         ,'2021-01-21',  7,    1);
 
 create table T2 (device_id varchar(2), incident_time date);
 insert into T2 values('B',         '2021-01-06');
 insert into T2 values('C',         '2021-01-17');

Query 1 (using cte and row_number() window function):

 with cte as
 (
   select T2.device_id, T2.incident_time,T1.event_id,T1.event_time,T1.var1, T1.var2,
   row_number()over(partition by T2.device_id order by ABS(T2.incident_time-T1.event_time))rn
   from T2 left join T1 on T2.device_id=T1.device_id
 )
 select device_id, incident_time, event_id, event_time, var1, var2 from cte
 where rn=1

Output:

device_id incident_time event_id event_time var1 var2
B 2021-01-06 3 2021-01-05 1 6
C 2021-01-17 6 2021-01-18 3 4

Query 2 (using distinct on ):

select distinct on (T2.device_id) T2.device_id, T2.incident_time,
       T1.event_id, T1.event_time, T1.var1, T1.var2
       from T2
       left join T1 ON T1.device_id = T2.device_id 
       order by T2.device_id, ABS(T2.incident_time - T1.event_time)

Output:

device_id incident_time event_id event_time var1 var2
B 2021-01-06 3 2021-01-05 1 6
C 2021-01-17 6 2021-01-18 3 4

db<>fiddle here

CodePudding user response:

I would use "Fetch First 1 Rows With Ties" via "Cross Join Lateral" to get all suitable events for each incident.

Select T1.event_id, T1.device_id, T1.event_time, T2.incident_time, T1.var1, T1.var2
From T2 Cross Join Lateral
(Select T1.event_id, T1.device_id, T1.event_time, T1.var1, T1.var2
 From T1 
 Where T1.device_id=T2.device_id
 Order by Abs(T1.event_time-T2.incident_time)
 Fetch First 1 Rows With Ties) As T1
  • Related