Home > Blockchain >  Snowflake Replace to joins with group by clause with single join
Snowflake Replace to joins with group by clause with single join

Time:02-15

I have to take the LATEST CLICK as the latest date for that ID where the Event_Type is click. And do the similar thing for Event_Type = 'open'. I have the data like :

Id Event_Type SRC_CREATED
1 click 2021-04-10 01:50:08.490
1 click 2021-03-10 02:221:05.115
1 open 2021-06-10 01:12:08.288
1 click 2021-08-10 01:50:08.490

Outcome required :

Id Event_Type SRC_CREATED
1 click 2021-08-10 01:50:08.490
1 open 2021-06-10 01:12:08.288

I have used the query mentioned below, which is working fine, but I want to replace the 2 somehow joins used with a single join to fetch both these data. Can anyone think of a workaround query for that?

CREATE OR REPLACE table CUSTOMER AS
select CONCAT(NVL(ADD1,''),NVL(ADD2,'')) AS ADDRESS,
    CASE
WHEN NVL(GENDER,'')='F' THEN 'Female'
WHEN NVL(GENDER,'')='M' THEN 'Male'
WHEN NVL(GENDER,'')='U' Then 'Other'
ELSE ''
END AS GENDER,
***CLICK AS LAST_CLICK,
OPENS AS LAST_EMAIL_CLICK,***
B.PHONE AS PHONE,
A.SRC_CREATED AS SINCE,
from "HXP_SANDBOX"."BASE_RA_TEST"."IDENTITY" A
inner join "HXP_SANDBOX"."BASE_RA_TEST"."EMAIL_INTERACTIONS" B
on A.DIGITALID = B.DIGITALID 
***inner JOIN (SELECT DIGITALID,MAX(SRC_CREATED) CLICK
 from "HXP_SANDBOX"."BASE_RA_TEST"."EMAIL_INTERACTIONS" t
  where EVENT_TYPE = 'click'
 group by DIGITALID) C
on A.DIGITALID = C.DIGITALID
inner JOIN (SELECT DIGITALID,MAX(SRC_CREATED) OPENS
 from "HXP_SANDBOX"."BASE_RA_TEST"."EMAIL_INTERACTIONS" t
  where EVENT_TYPE = 'open'
 group by DIGITALID) D
on A.DIGITALID = D.DIGITALID;***

CodePudding user response:

You should be able to do the following, where you use a CASE inside the MAX to restrict it to the event types you want.

CREATE OR REPLACE TABLE CUSTOMER
AS
SELECT CONCAT(NVL(ADD1,''), NVL(ADD2,'')) AS [ADDRESS]
    , CASE
        WHEN NVL(GENDER,'')='F' THEN 'Female'
        WHEN NVL(GENDER,'')='M' THEN 'Male'
        WHEN NVL(GENDER,'')='U' Then 'Other'
        ELSE ''
    END AS GENDER
    , C.CLICK AS LAST_CLICK
    , C.OPENS AS LAST_EMAIL_CLICK
    , B.PHONE AS PHONE
    , A.SRC_CREATED AS SINCE
FROM "HXP_SANDBOX"."BASE_RA_TEST"."IDENTITY" A
INNER JOIN "HXP_SANDBOX"."BASE_RA_TEST"."EMAIL_INTERACTIONS" B ON A.DIGITALID = B.DIGITALID 
INNER JOIN (
    SELECT DIGITALID
      , MAX(CASE WHEN EVENT_TYPE = 'click' THEN SRC_CREATED ELSE NULL END) CLICK
      , MAX(CASE WHEN EVENT_TYPE = 'open' THEN SRC_CREATED ELSE NULL END) OPENS
    FROM "HXP_SANDBOX"."BASE_RA_TEST"."EMAIL_INTERACTIONS" t
    GROUP BY DIGITALID
) C ON A.DIGITALID = C.DIGITALID;

CodePudding user response:

with this CTE to provide data

    SELECT column1 as id, column2 as event_type, to_timestamp(column3) as src_created
    FROM VALUES
    (1,'click','2021-04-10 01:50:08'),
    (1,'click','2021-03-10 02:21:05'),
    (1,'open','2021-06-10 01:12:08'),
    (1,'click','2021-08-10 01:50:08')
)

the following SQL gives the answers you want:

SELECT
    id
    ,event_type 
    ,max(src_created)
FROM fake_data
GROUP BY 1,2
ORDER BY 1,2;

which gives:

ID EVENT_TYPE MAX(SRC_CREATED)
1 click 2021-08-10 01:50:08.000
1 open 2021-06-10 01:12:08.000

BUT what you asked for is not what your SQL does, and Dale's answer shows how to do that.

Which is the the same in my example data as:

SELECT
    id
    ,max(iff(event_type='click', src_created, null)) as last_click
    ,max(iff(event_type='open', src_created, null)) as last_click
FROM fake_data
GROUP BY 1

you can use CASE or IFF, I prefer the IFF as it's shorter and explicit. If you are going to CASE path, you can drop the ELSE NULL as that is the default behavior.

Thus Dale's example can be:

  ,MAX(CASE WHEN event_type = 'click' THEN src_createdEND) AS click
  ,MAX(CASE WHEN event_type = 'open' THEN src_createdEND) AS opens
`` 
 
  • Related