Home > Back-end >  snowflake help to create 3rd variable while grouping
snowflake help to create 3rd variable while grouping

Time:02-14

I have a dataset as below, for which i need to group by it by 3 variables also shown as below:

This is my sample data, here you have tokens assigned to each channel, for which few tokens are part of multiple channels, i need to group it by month,channel as show in below image

result needed as below:

I Need the data grouped as below with 3rd category to identify with tokens which is part of multiple channels

CodePudding user response:

I'm sure you I know how to group by month and channel and do a count so the only tricky thing here is to work out the "both"

If you group by token and take a look at the min vs max channel you'll either see the minmax be the same or different. If they're different it's "both"

SELECT
  MONTH(date) as mnth,
  CASE WHEN min(channel) = max(channel) THEN min(channel) ELSE 'both' END as channel
FROM t
  GROUP BY MONTH(date), token

When the token is 1, the min is "email" and the max is "text" - because they differ it will become "both". For token 3 the min and max are the same "email", so we just use the min value to deliver "email". The end result is a unique set of token values and either email/text/both. Even if there are repeated values so long as there are no typos then it works out (but if you have a token=3,channel=emial then the method cannot work; clean your data)

You can feed the results of this query into another grouping that does your month/channel/count (put all this query as a sub query or CTE of a query that does another group by)

  • Related