I want to join two tables from different datasets. It is possible to INNER JOIN
these two datasets but it does not work with a regular JOIN
.
I want to join a Google Analytics 4 (GA4) item id on the item id of the datawarehouse.
In order to access the GA4 item id I need to UNNEST
the GA4 items array.
When using the code below, I get the following error:
Unrecognized name: dwh_id; Did you mean dwh? at [9:79]
Here's the query I'm using now.
SELECT
event_date as ga4_date,
ga4_items.item_id AS ga4_id,
ga4_items.item_name,
ga4_items.price,
dwh.Product_SKU__Google_Analytics as dwh_id,
FROM `ga4-data` as ga4
JOIN `datawarehouse-data` as dwh ON dwh_id = ga4_id,
UNNEST(ga4.items) as ga4_items
Let me know if you have the answer :)
CodePudding user response:
My best quess of what you're trying to do:
CREATE TEMP TABLE `ga4-data` AS
SELECT '2022-01-01' AS event_date,
[STRUCT('item001' AS item_id, 'name1' AS item_name, 100 AS price),
STRUCT('item002' AS item_id, 'name2' AS item_name, 200 AS price)] AS items
;
CREATE TEMP TABLE `datawarehouse-data` AS
SELECT 'item001' AS Product_SKU__Google_Analytics,
'col1' AS col1;
SELECT event_date as ga4_date,
ga4_items.item_id AS ga4_id,
ga4_items.item_name,
ga4_items.price,
dwh.Product_SKU__Google_Analytics as dwh_id
FROM `ga4-data` as ga4, UNNEST(ga4.items) as ga4_items
JOIN `datawarehouse-data` as dwh
ON dwh.Product_SKU__Google_Analytics = ga4_items.item_id;
CodePudding user response:
Alright, I figured it out. It took a lot of trial and error but I got it:
WITH ga as
(
SELECT event_date as ga4_date,
ga4_items.item_id as id,
ga4_items.item_name,
ga4_items.price
FROM `name-ga4-dataset` as ga4, UNNEST(ga4.items) as ga4_items
),
dwh as
(
SELECT Product_SKU__Google_Analytics as dwh_id
FROM `name-dwh-dataset` as dwh
)
SELECT * FROM ga
JOIN dwh
ON ga.id = dwh_id