Home > Blockchain >  Filtering on columns partitioning on unique id without duplicates
Filtering on columns partitioning on unique id without duplicates

Time:05-10

I am trying to create a rule to filter on events based on a partition, and I am trying to do it in a better way than the one I am doing:

I have the following events table:

EVENT_ID,USER_PROPERTIES_KEY,USER_PROPERTIES_VALUE
1,country,us
1,country_id,fr
2,country,uk
3,country_id,it
4,platform,Android
4,country,cn

The objective is to add a column country_code, based on the following rule:

If for the same event_id, USER_PROPERTIES_KEY = country and USER_PROPERTIES_KEY = country_id, country_code will take USER_PROPERTIES_VALUE, where USER_PROPERTIES_KEY = country_id, for EVENT_ID = 1, it will be 'fr', otherwise it takes USER_PROPERTIES_VALUE of USER_PROPERTIES_KEY of the event_id, else if none of the country, and country_id are respected, it takes null.

The final expected result would be:

EVENT_ID,USER_PROPERTIES_KEY,USER_PROPERTIES_VALUE,COUNTRY_CODE
1,country_id,fr,fr
2,country,uk,uk
3,country_id,it,it
4,platform,Android,null
4,country,cn,cn

I have to partition on the EVENT_ID following that query and filter on 'NA', but I am not able to do that.

select
    EVENT_ID,
    USER_PROPERTIES_KEY,
    USER_PROPERTIES_VALUE,
    CASE
        WHEN USER_PROPERTIES_KEY like 'country%' THEN
       CASE WHEN count(user_properties_key like 'country%') over(partition by EVENT_ID) > 1
            THEN
            CASE
                WHEN user_properties_key = 'country_id'
                THEN UPPER(user_properties_value) ELSE 'NA'
            END
        WHEN count(user_properties_key like 'country%') over(partition by EVENT_ID) = 1
                THEN UPPER(user_properties_value) ELSE 'NA'
            END
        END
    AS country_code
from events;

CodePudding user response:

You can try first_value ordered by the priority requested. Also you can flag the rows you want with case and row_number

select
    EVENT_ID,
    USER_PROPERTIES_KEY,
    USER_PROPERTIES_VALUE,
    country_code
from (
  select
    EVENT_ID,
    USER_PROPERTIES_KEY,
    USER_PROPERTIES_VALUE,
    case when USER_PROPERTIES_KEY like 'country%' then
        row_number()over(partition by EVENT_ID order by case USER_PROPERTIES_KEY when 'country_id' then 1 when 'country' then 2 else 3 end)
      else 1 end flag,   
    first_value(case when USER_PROPERTIES_KEY like 'country%' THEN UPPER(user_properties_value) END)
          over(partition by EVENT_ID order by case USER_PROPERTIES_KEY when 'country_id' then 1 when 'country' then 2 else 3 end)
       AS country_code
  from events
) t
where flag = 1;
  • Related