Home > Enterprise >  How to join two tables based on non-unique id in the first table (SQLite 3.12.2)
How to join two tables based on non-unique id in the first table (SQLite 3.12.2)

Time:09-24

Table A's Serial ID is random generated. Then Table B will according to Table A's serial id and category to generate its sound.

I tried this command, but failed.

SELECT          A.timestamp, B.timestamp, B.SerialID, A.Category, B.Sound
FROM            A 
JOIN            B
ON              B.timestamp > A.timestamp
AND             A.SerialID = B.SerialID
ORDER BY        B.timestamp

Table A

Timestamp SerialID Category
1 3 Cat
2 5 Dog
10 44 Cat
13 5 Cat
15 3 Dog

Table B

Timestamp SerialID Sound
3 3 Meow
4 5 Bark
11 44 Meow
14 5 Meow
16 3 Bark

Desire table: Table B's first serial id that matched Table A and B.Timestamp greater than A.Timestamp

A.Timestamp B.Timestamp Serial ID Category Sound
1 3 3 Cat Meow
2 4 5 Dog Bark
10 11 44 Cat Meow
13 14 5 Cat Meow
15 16 3 Dog Bark

CodePudding user response:

SELECT
   t1.Timestamp AS TimestampA,
   t2.Timestamp AS TimestampB,
   t1.SerialID,
   t1.Category,
   t2.Sound
FROM t1
JOIN t2 ON t1.SerialID = t2.SerialID AND t2.Timestamp = (
  SELECT MIN(t2.Timestamp)
  FROM t2 
  WHERE t2.SerialID = t1.SerialID AND t2.Timestamp >= t1.Timestamp
)

Or the same using the LEAD window function

WITH cte AS (
  SELECT *,
    LEAD(Timestamp) OVER (PARTITION BY SerialID ORDER BY Timestamp) as next_timestamp
  FROM t1
)
SELECT
   t1.Timestamp AS TimestampA,
   t2.Timestamp AS TimestampB,
   t1.SerialID,
   t1.Category,
   t2.Sound
FROM cte t1
JOIN t2 ON t1.SerialID = t2.SerialID 
       AND t2.Timestamp >= t1.Timestamp 
       AND (t2.Timestamp < t1.next_timestamp OR next_timestamp IS NULL)
ORDER BY t1.Timestamp  

db<>fiddle

CodePudding user response:

My favorite solution for this would be a lateral join just picking the desired B row for each A row. But lateral joins are not yet featured in SQLite.

You can get the desired B row with two steps though, by selecting the tmestamp in a subquery in the select clause:

select
  ab.timestamp as a_timestamp, b.timestamp as b_timestamp,
  b.serialid, ab.category, b.sound
from
(
  select a.*,
    (
      select b.timestamp
      from b
      where b.serialid = a.serialid
      and b.timestamp > a.timestamp
      order by b.timestamp
      limit 1
    ) as best_timestamp
  from a
) ab
join b on b.serialid = ab.serialid
      and b.timestamp = ab.best_timestamp
order by ab.timestamp, ab.serialid;

Another approach uses a window function. There you'd join all candidates and then keep the best one. This is your query with the added picking.

select a_timestamp, b_timestamp, serialid, category, sound
from
(
  select 
    a.timestamp as a_timestamp, b.timestamp as b_timestamp,
    b.serialid, a.category, b.sound,
    min(b.timestamp) over (partition by a.serialid) as best_timestamp
  from a
  join b on b.serialid = a.serialid and b.timestamp > a.timestamp
) ab
where b_timestamp = best_timestamp
order by a_timestamp, ab.serialid;
  • Related