Home > Back-end >  Creating an indicator based on the same ID in different years in PROC SQL/SQL
Creating an indicator based on the same ID in different years in PROC SQL/SQL

Time:04-23

I'd like to create an indicator based on the ID and product type. My data:

Year ID Purchase_Category
2020 1  Kitchen
2020 2  Home
2020 2  Kitchen
2020 3  Home
2021 1  Home
2021 2  Kitchen
2021 3  Kitchen

If someone with the same ID purchased Kitchen in 2020 and then Home in 2021 or vice versa, then they are deemed holistic. ID 2 in this case is not holistic because Home and Kitchen were purchased in the same year. The output should look like this:

ID Indicator
1  Holistic
2  Not Holistic
3  Holistic

CodePudding user response:

Something like this might work:

SELECT ID, CASE COUNT(*) WHEN 1 THEN 'Not Holistic' ELSE 'Holistic' END AS INDICATOR
FROM (SELECT ID, YEAR, COUNT(*) FROM DATA GROUP BY ID, YEAR)
GROUP BY ID

First, determine the distinct years per ID, then from that set, if an ID appears only once then everything was purchased in same year, otherwise there were products purchased in different years.

CodePudding user response:

You just need a distinct count on the Year column per ID. No need for two steps.

select ID,
    case when count(distinct "Year") > 1
        then 'Holistic' else 'Not Holistic' end as Indicator
from T
group by ID

It would be just as easy to say:

    case when max("Year") > min("Year") then ...

I don't know which one seems more natural. If you have a lot of data the second approach is potentially faster.

  • Related