I am trying to create a rule to filter on events based on a partition, and I am trying to do it in a better way than the one I am doing:
I have the following events table:
EVENT_ID,USER_PROPERTIES_KEY,USER_PROPERTIES_VALUE
1,country,us
1,country_id,fr
2,country,uk
3,country_id,it
4,platform,Android
4,country,cn
The objective is to add a column country_code, based on the following rule:
If for the same event_id, USER_PROPERTIES_KEY = country and USER_PROPERTIES_KEY = country_id, country_code will take USER_PROPERTIES_VALUE, where USER_PROPERTIES_KEY = country_id, for EVENT_ID = 1, it will be 'fr', otherwise it takes USER_PROPERTIES_VALUE of USER_PROPERTIES_KEY of the event_id, else if none of the country, and country_id are respected, it takes null.
The final expected result would be:
EVENT_ID,USER_PROPERTIES_KEY,USER_PROPERTIES_VALUE,COUNTRY_CODE
1,country_id,fr,fr
2,country,uk,uk
3,country_id,it,it
4,platform,Android,null
4,country,cn,cn
I have to partition on the EVENT_ID following that query and filter on 'NA', but I am not able to do that.
select
EVENT_ID,
USER_PROPERTIES_KEY,
USER_PROPERTIES_VALUE,
CASE
WHEN USER_PROPERTIES_KEY like 'country%' THEN
CASE WHEN count(user_properties_key like 'country%') over(partition by EVENT_ID) > 1
THEN
CASE
WHEN user_properties_key = 'country_id'
THEN UPPER(user_properties_value) ELSE 'NA'
END
WHEN count(user_properties_key like 'country%') over(partition by EVENT_ID) = 1
THEN UPPER(user_properties_value) ELSE 'NA'
END
END
AS country_code
from events;
CodePudding user response:
You can try first_value
ordered by the priority requested. Also you can flag the rows you want with case
and row_number
select
EVENT_ID,
USER_PROPERTIES_KEY,
USER_PROPERTIES_VALUE,
country_code
from (
select
EVENT_ID,
USER_PROPERTIES_KEY,
USER_PROPERTIES_VALUE,
case when USER_PROPERTIES_KEY like 'country%' then
row_number()over(partition by EVENT_ID order by case USER_PROPERTIES_KEY when 'country_id' then 1 when 'country' then 2 else 3 end)
else 1 end flag,
first_value(case when USER_PROPERTIES_KEY like 'country%' THEN UPPER(user_properties_value) END)
over(partition by EVENT_ID order by case USER_PROPERTIES_KEY when 'country_id' then 1 when 'country' then 2 else 3 end)
AS country_code
from events
) t
where flag = 1;