I have a table that contains 3 distinct names in one field, the respective account those names are associated to, Unique IDs, the total counts for each pair, and a rank column based on the total counts value. Example here:
LOGO | Account | ID | Count_Per_Logo | Rank
Walmart Acct_A ABC 3 1
Walmart Acct_A DEF 3 1
Walmart Acct_A GHI 3 1
Vudu Acct_A JKL 1 2
Bonobos Acct_A MNO 1 2
My goal is to 'consolidate' the LOGO field into just one Logo to one Account - so for each of the unique ID's - they should only be associated to 1 logo (instead of the 3 shown above).
Desired Output
LOGO | Account | ID | Count_Per_Logo | Rank | Consolidated_LOGO
Walmart Acct_A ABC 3 1 Walmart
Walmart Acct_A DEF 3 1 Walmart
Walmart Acct_A GHI 3 1 Walmart
Vudu Acct_A JKL 1 2 Walmart
Bonobos Acct_A MNO 1 2 Walmart
Right now I am using a count with partition by to get the rank - but I'm unsure how to create a new field that has the 'desired logo' based on the rank values - would a lag function be necessary for this? Any help would be greatly appreciated
Current logic
SELECT LOGO, ACCOUNT, ID FROM
(SELECT *, DENSE_RANK() OVER (PARTITION BY ACCOUNT ORDER BY LOGO_NAME_CNT DESC) AS LOGO_RANK
FROM
(SELECT *, count(ACCOUNT) over (partition by LOGO) AS LOGO_NAME_CNT FROM "TABLE1" WHERE ACCOUNT = 'Acct_A')
)
WHERE LOGO_RANK = 1
CodePudding user response:
So it looks like your base SQL is:
SELECT
a.*
,dense_rank() over (partition by account order by count_per_logo desc) as rank
FROM (
SELECT
logo
,account
,id
,count(logo) over (partition by account, logo) as count_per_logo
FROM VALUES
('walmart', 'acct_a', 'abc'),
('walmart', 'acct_a', 'def'),
('walmart', 'acct_a', 'ghi'),
('vudu', 'acct_a', 'jkl'),
('bonobos', 'acct_a', 'mno')
v(logo, account, id)
) AS a
ORDER BY 4 DESC, 5;
then all you need to add is a FIRST_VALUE like:
SELECT
a.*
,dense_rank() over (partition by account order by count_per_logo desc) as rank
,first_value(logo) over (partition by account order by count_per_logo desc) as Consolidated_LOGO
FROM (
SELECT
logo
,account
,id
,count(logo) over (partition by account, logo) as count_per_logo
FROM VALUES
('walmart', 'acct_a', 'abc'),
('walmart', 'acct_a', 'def'),
('walmart', 'acct_a', 'ghi'),
('vudu', 'acct_a', 'jkl'),
('bonobos', 'acct_a', 'mno')
v(logo, account, id)
) AS a
ORDER BY 4 DESC, 5;
which gives::
LOGO | ACCOUNT | ID | COUNT_PER_LOGO | RANK | CONSOLIDATED_LOGO |
---|---|---|---|---|---|
walmart | acct_a | abc | 3 | 1 | walmart |
walmart | acct_a | def | 3 | 1 | walmart |
walmart | acct_a | ghi | 3 | 1 | walmart |
vudu | acct_a | jkl | 1 | 2 | walmart |
bonobos | acct_a | mno | 1 | 2 | walmart |
CodePudding user response:
Simeon's answer should work as long as you don't have ties. Actually, if you don't care for ties, it gets even simpler
select logo,
account,
id,
mode(logo) over (partition by account) as consolidated_logo
from your_table;
If you want the result set to replicate for each instance of logo that ties on count, consider below
with cte as
(select logo, account
from your_table
group by logo, account
qualify max(count(*)) over (partition by account)=count(*))
select a.*, b.logo as consolidated_logo
from your_table a
join cte b on a.account=b.account
order by b.logo;