I have the following table:
ID Fruit
A apple,orange,grape
A apple,orange,cucumber
A apple,orange
B orange,grape
B apple
B grape
C grape,banana
C orange,banana
C banana
I am hoping to add new columns to denote whether a certain ID has had specific characters in the fruit row
ID Fruit apple banana orange
A apple,orange,grape yes no yes
A apple,orange,cucumber yes no yes
A apple,orange yes no yes
B orange,grape yes no yes
B apple yes no yes
B grape yes no yes
C grape,banana no yes yes
C orange,banana no yes yes
C banana no yes yes
I asked a similar question a while back here.
SELECT a.*,
case when count(case when a.Fruit like 'apple' then 1 end) over (partition by ID) > 0 then 'yes' else 'no' end [Apple],
case when count(case when a.Fruit like 'banana' then 1 end) over (partition by ID) > 0 then 'yes' else 'no' end [Banana],
case when count(case when a.Fruit like 'orange' then 1 end) over (partition by ID) > 0 then 'yes' else 'no' end [Orange]
FROM MyTable a
This, however, is being very resource heavy and slow, given the size of my tables. In terms of performance, is there a more efficient solution to this with coalesce
or join
or any other method?
CodePudding user response:
so your SQL needs some tweak to work in Snowflake, the [square bracket] alias form is only valid in T-SQL.
SELECT a.*,
case when count(case when a.Fruit like 'apple' then 1 end) over (partition by ID) > 0 then 'yes' else 'no' end as "Apple",
case when count(case when a.Fruit like 'banana' then 1 end) over (partition by ID) > 0 then 'yes' else 'no' end as "Banana",
case when count(case when a.Fruit like 'orange' then 1 end) over (partition by ID) > 0 then 'yes' else 'no' end as "Orange"
FROM MyTable as a
in fact you can use COUNT_IF and IFF to tighten up that to:
SELECT a.*,
iff(count_if(a.Fruit like 'apple') over (partition by a.ID) > 0, 'yes', 'no') as "Apple",
iff(count_if(a.Fruit like 'banana') over (partition by a.ID) > 0, 'yes', 'no') as "Banana",
iff(count_if(a.Fruit like 'orange') over (partition by a.ID) > 0, 'yes', 'no') as "Orange"
FROM MyTable as a
But this is SLOW, because you are do the count for every row even though the answer will be the same for all rows of A, B, etc. Thus you can join the aggregated data:
select *
FROM MyTable as a
natural join (
SELECT c.id,
iff(count_if(c.Fruit like 'apple') > 0, 'yes', 'no') as "Apple",
iff(count_if(c.Fruit like 'banana') > 0, 'yes', 'no') as "Banana",
iff(count_if(c.Fruit like 'orange') > 0, 'yes', 'no') as "Orange"
FROM MyTable as c
GROUP BY 1
) as b
ID | FRUIT | Apple | Banana | Orange |
---|---|---|---|---|
A | apple,orange,grape | no | no | no |
A | apple,orange,cucumber | no | no | no |
A | apple,orange | no | no | no |
B | orange,grape | yes | no | no |
B | apple | yes | no | no |
B | grape | yes | no | no |
C | grape,banana | no | yes | no |
C | orange,banana | no | yes | no |
C | banana | no | yes | no |