Home > Mobile >  Match nearest timestamp in Redshift SQL
Match nearest timestamp in Redshift SQL

Time:12-06

I have two tables, t1 and t2. For each id in t1 I have multiple records in t2. I want to match the closest timestamp of t2 to each record of t1. In t1 there is a flag, if it's 1 I want to match the closest timetamp of t2 that's smaller and if it's 0 I want to match the closest timestamp that is larger than that in t1. So alltogether I have the following table: T1 id, flag, timestamp T2 id, timestamp

Is there an efficient way to do that?

Edit, here is some example:

T1

customer_id timestamp_t1 flag
1 01.01.21 12:00 1
2 01.01.21 13:00 0

T2

customer_id timestamp_t2 additional attributes
1 01.01.21 11:00 attribute1
1 01.01.21 10:00 attribute2
1 01.01.21 13:00 attribute3
2 01.01.21 11:00 attribute4
2 01.01.21 12:00 attribute5
2 01.01.21 14:00 attribute6
2 01.01.21 15:00 attribute7

Result:

customer_id timetsamp_t1 timestamp_t2 flag additional attributes
1 01.01.21 12:00 01.01.21 11:00 1 attribute1
2 01.01.21 13:00 01.01.21 14:00 0 attribute6

I hope this helps. As you can see. In the result, we matched 11:00 of T2 with 12:00 of T1 because the flag was 1 we chose the closest timestamp that was smaller than 12:00. We also matched 14:00 with 13:00, because the flag was 0 (so we matched the closest timestamp with id 2 that is larger than 13:00).

CodePudding user response:

You could use correlated sub-queries to find the rows before/after the timestamp, and then use a CASE expression to pick which to join on...

SELECT
  *
FROM
  t1
INNER JOIN
  t2
    ON t2.id = CASE WHEN t1.flag = 1 THEN
                 (
                   SELECT t2.id
                     FROM t2
                    WHERE t2.customer_id   = t1.customer_id
                      AND t2.timestamp_t2 <= t1.timestamp_t1
                 ORDER BY t2.timestamp DESC
                    LIMIT 1
                 )
               ELSE
                 (
                   SELECT t2.id
                     FROM t2
                    WHERE t2.customer_id   = t1.customer_id
                      AND t2.timestamp_t2 >= t1.timestamp_t1
                 ORDER BY t2.timestamp ASC
                    LIMIT 1
                 )
               END

Oh, you haven't included an id column in your example, this works similarly...

SELECT
  *
FROM
  t1
INNER JOIN
  t2
    ON  t2.customer_id  = t1.customer_id
    AND t2.timestamp_t2
        =
        CASE WHEN t1.flag = 1 THEN
          (
            SELECT MAX(t2.timestamp_t2)
              FROM t2
             WHERE t2.customer_id   = t1.customer_id
               AND t2.timestamp_t2 <= t1.timestamp_t1
          )
        ELSE
          (
            SELECT MIN(t2.timestamp_t2)
              FROM t2
             WHERE t2.customer_id   = t1.customer_id
               AND t2.timestamp_t2 >= t1.timestamp_t1
          )
        END
  • Related