I have two table as follows:
T1
event_id device_id event_time var1 var2
1 A 2021-01-01 5 6
2 C 2021-01-02 8 7
3 B 2021-01-05 1 6
4 C 2021-01-07 7 7
5 D 2021-01-12 8 9
6 C 2021-01-18 3 4
7 B 2021-01-21 7 1
T2
device_id incident_time
B 2021-01-06
C 2021-01-17
I would like the Output to be
device_id incident_time event_id event_time var1 var2
B 2021-01-06 3 2021-01-05 1 6
C 2021-01-17 6 2021-01-18 3 4
So, I am trying to find event_id
, event_time
, var1
and var2
of a single event in T1
where the event time is closest to the incident_time
for each device in T2
, irrespective of if event_time
is bigger or smaller than incident_time
. My ugly (and incorrect) code so far:
select T2.device_id, T2.incident_time, (select distinct on (device_id) event_id
from T1
where device_id = T2.device_id
and event_time <= T2.incident_time
order by device_id, event_time desc) as event_id
from T2;
How can I accomplish this more elegantly using Join statement? Or is the select as statement in brackets correct? How can I include all the required columns? Thanks
CodePudding user response:
Skip the subquery, do a join instead. Something like:
select distinct on (T2.device_id) T2.device_id, T2.incident_time,
T1.event_id, T1.event_time, T1.var1, T1.var2
from T2
left join T1 ON T1.device_id = T2.device_id
order by T2.device_id, ABS(T2.incident_time - T1.event_time)
CodePudding user response:
Schema and insert statements:
create table T1(event_id int, device_id varchar(2), event_time date, var1 int, var2 int);
insert into T1 values(1 ,'A' ,'2021-01-01', 5, 6);
insert into T1 values(2 ,'C' ,'2021-01-02', 8, 7);
insert into T1 values(3 ,'B' ,'2021-01-05', 1, 6);
insert into T1 values(4 ,'C' ,'2021-01-07', 7, 7);
insert into T1 values(5 ,'D' ,'2021-01-12', 8, 9);
insert into T1 values(6 ,'C' ,'2021-01-18', 3, 4);
insert into T1 values(7 ,'B' ,'2021-01-21', 7, 1);
create table T2 (device_id varchar(2), incident_time date);
insert into T2 values('B', '2021-01-06');
insert into T2 values('C', '2021-01-17');
Query 1 (using cte and row_number() window function):
with cte as
(
select T2.device_id, T2.incident_time,T1.event_id,T1.event_time,T1.var1, T1.var2,
row_number()over(partition by T2.device_id order by ABS(T2.incident_time-T1.event_time))rn
from T2 left join T1 on T2.device_id=T1.device_id
)
select device_id, incident_time, event_id, event_time, var1, var2 from cte
where rn=1
Output:
device_id | incident_time | event_id | event_time | var1 | var2 |
---|---|---|---|---|---|
B | 2021-01-06 | 3 | 2021-01-05 | 1 | 6 |
C | 2021-01-17 | 6 | 2021-01-18 | 3 | 4 |
Query 2 (using distinct on ):
select distinct on (T2.device_id) T2.device_id, T2.incident_time,
T1.event_id, T1.event_time, T1.var1, T1.var2
from T2
left join T1 ON T1.device_id = T2.device_id
order by T2.device_id, ABS(T2.incident_time - T1.event_time)
Output:
device_id | incident_time | event_id | event_time | var1 | var2 |
---|---|---|---|---|---|
B | 2021-01-06 | 3 | 2021-01-05 | 1 | 6 |
C | 2021-01-17 | 6 | 2021-01-18 | 3 | 4 |
db<>fiddle here
CodePudding user response:
I would use "Fetch First 1 Rows With Ties" via "Cross Join Lateral" to get all suitable events for each incident.
Select T1.event_id, T1.device_id, T1.event_time, T2.incident_time, T1.var1, T1.var2
From T2 Cross Join Lateral
(Select T1.event_id, T1.device_id, T1.event_time, T1.var1, T1.var2
From T1
Where T1.device_id=T2.device_id
Order by Abs(T1.event_time-T2.incident_time)
Fetch First 1 Rows With Ties) As T1