I need to figure out how to take the results from the table below and only show me the green highlighted rows in the image. Within each account_id/site_id combo (there are two shown in the example - 123/usa and 456/can), I only need to keep the group_ids where there are a larger number of Y (flag) for the brand (as seen in the side_id "can" - there are more Y in group_id 1 for that brand than there are in group_id 2). Within the account_id/site combo, if there are two group_ids that have more than two Y for a brand, then keep them both (as seen in the site_id "usa").
group_id | account_id | site_id | brand | flag |
---|---|---|---|---|
1 | 123 | usa | toyota | N |
1 | 123 | usa | toyota | N |
1 | 123 | usa | tesla | Y |
1 | 123 | usa | tesla | Y |
2 | 123 | usa | toyota | Y |
2 | 123 | usa | toyota | N |
2 | 123 | usa | tesla | Y |
2 | 123 | usa | tesla | N |
3 | 123 | usa | toyota | Y |
3 | 123 | usa | toyota | N |
3 | 123 | usa | tesla | Y |
3 | 123 | usa | tesla | N |
4 | 123 | usa | toyota | Y |
4 | 123 | usa | toyota | Y |
4 | 123 | usa | tesla | N |
4 | 123 | usa | tesla | N |
1 | 456 | can | honda | Y |
1 | 456 | can | honda | Y |
1 | 456 | can | honda | Y |
2 | 456 | can | honda | Y |
2 | 456 | can | honda | N |
2 | 456 | can | honda | Y |
CodePudding user response:
By using a combination of window functions, you can count the number of Y flags for each group/account/site/brand, then find the maximum number of Y flags of those partitioned with another analytic function, then only return the rows where those two values match.
Query
WITH
grps (GROUP_ID,
account_id,
site_id,
brand,
flag)
AS
(SELECT 1, 123, 'usa', 'toyota', 'N' FROM DUAL
UNION ALL
SELECT 1, 123, 'usa', 'toyota', 'N' FROM DUAL
UNION ALL
SELECT 1, 123, 'usa', 'tesla', 'Y' FROM DUAL
UNION ALL
SELECT 1, 123, 'usa', 'tesla', 'Y' FROM DUAL
UNION ALL
SELECT 2, 123, 'usa', 'toyota', 'Y' FROM DUAL
UNION ALL
SELECT 2, 123, 'usa', 'toyota', 'N' FROM DUAL
UNION ALL
SELECT 2, 123, 'usa', 'tesla', 'Y' FROM DUAL
UNION ALL
SELECT 2, 123, 'usa', 'tesla', 'N' FROM DUAL
UNION ALL
SELECT 3, 123, 'usa', 'toyota', 'Y' FROM DUAL
UNION ALL
SELECT 3, 123, 'usa', 'toyota', 'N' FROM DUAL
UNION ALL
SELECT 3, 123, 'usa', 'tesla', 'Y' FROM DUAL
UNION ALL
SELECT 3, 123, 'usa', 'tesla', 'N' FROM DUAL
UNION ALL
SELECT 4, 123, 'usa', 'toyota', 'Y' FROM DUAL
UNION ALL
SELECT 4, 123, 'usa', 'toyota', 'Y' FROM DUAL
UNION ALL
SELECT 4, 123, 'usa', 'tesla', 'N' FROM DUAL
UNION ALL
SELECT 4, 123, 'usa', 'tesla', 'N' FROM DUAL
UNION ALL
SELECT 1, 456, 'can', 'honda', 'Y' FROM DUAL
UNION ALL
SELECT 1, 456, 'can', 'honda', 'Y' FROM DUAL
UNION ALL
SELECT 1, 456, 'can', 'honda', 'Y' FROM DUAL
UNION ALL
SELECT 2, 456, 'can', 'honda', 'Y' FROM DUAL
UNION ALL
SELECT 2, 456, 'can', 'honda', 'N' FROM DUAL
UNION ALL
SELECT 2, 456, 'can', 'honda', 'Y' FROM DUAL)
SELECT g3.GROUP_ID,
g3.account_id,
g3.site_id,
g3.brand,
g3.flag
FROM (SELECT g2.*, MAX (g2.y_flag_cnt) OVER (PARTITION BY account_id) AS max_y_flag_cnt
FROM (SELECT g.*,
SUM (CASE WHEN flag = 'Y' THEN 1 ELSE 0 END)
OVER (PARTITION BY GROUP_ID,
account_id,
site_id,
brand) AS y_flag_cnt
FROM grps g) g2) g3
WHERE flag = 'Y' AND y_flag_cnt = max_y_flag_cnt
ORDER BY account_id, GROUP_ID;
Output
GROUP_ID ACCOUNT_ID SITE_ID BRAND FLAG
___________ _____________ __________ _________ _______
1 123 usa tesla Y
1 123 usa tesla Y
4 123 usa toyota Y
4 123 usa toyota Y
1 456 can honda Y
1 456 can honda Y
1 456 can honda Y