Home > database >  BigQuery join on closest date match
BigQuery join on closest date match

Time:07-20

I'm trying to join two tables in BigQuery based on an id and the closest date match.

Transaction Table:

transactionId dateTime productId
4a50665e 2022-05-13T14:12:55 abc
7d5889cd 2022-05-22T16:10:21 abc

Product Log Table (log of when each productId is updated to a new version):

dateTime productId version
2022-05-19T06:37:24 abc v2
2022-05-12T04:38:23 xyz v1
2022-05-10T09:57:54 abc v1

I want to add a version column to the transaction table by looking up from the product log table, based on productId match and the dateTime match from the product table. To get the active product version at the time of the transaction.

Desired Result:

transactionId dateTime productId version
4a50665e 2022-05-13T14:12:55 abc v1
7d5889cd 2022-05-22T16:10:21 abc v2

Something like this

SELECT
  t.*,
  p.version
FROM
  transaction_table t
LEFT JOIN
  product_log_table p
ON
  t.productId = p.productId AND 
  t.datetime < p.dateTime

But that doesn't work. I've tried searching a lot and tried a number of solutions but can't get anything to work. Should be simple? How do I do this?

Thanks for any help.

CodePudding user response:

Consider below approach

select any_value(t).*, 
  string_agg(version, '' order by p.dateTime desc limit 1) as version
from Transaction_Table t
join Product_Log_Table p
on t.productId = p.productId
and t.dateTime >= p.dateTime 
group by format('%t', t)                  

if applied to sample data in your question - output is

enter image description here

  • Related