Home > Net >  How to find most frequent code(varchar) from a table
How to find most frequent code(varchar) from a table

Time:05-26

I would like to find most frequent code within CodeID which is in same code_group from a table.

For example, from original table

ID     CodeID      Name      Code     Code_group
1         1         A        101          0
2         1         A        102          0
3         1         B        102          0
4         2         C        201          0
5         2         C        201          0
6         2         D        202          0
7         2         E        202          0
8         3         F        101          1
9         3         G        103          1
10        3         G        104          1
11        3         G        104          1

I want output like the below.

ID     CodeID      Name      Code     Code_group    Selected_code
1         1         A        101          0             102
2         1         A        102          0             102
3         1         B        102          0             102
4         2         C        201          0             NULL
5         2         C        201          0             NULL
6         2         D        202          0             NULL
7         2         E        202          0             NULL
8         3         F        101          1             104
9         3         G        103          1             104
10        3         H        104          1             104 
11        3         H        104          1             104

Even though code of 8th ID is same in CodeID: 1,it is not in the same Code_group.

So For CodeID: 1, Selected_code would be 102. it must be counted within exactly same Code_group.

=======================================

I have tried it like the below. I should not use ID for this one. From TableA

with m as
(
    select 
        CodeID,
        Name,
        Code,
        Code_group,
        cnt,
        Selected_code = ROW_NUMBER() over (partition by Code_group order by cnt desc)

        from( select CodeID, Name, Code,Code_group
        ,count(*) over (partition by Code,CodeID) as cnt from tableA
        group by CodeID, Name, Code, Code_group,
    ) as t  
    group by  CodeID,
        Name,
        Code,
        Code_group, cnt
)
select a.CodeID,
        a.Name,
        a.Code,
        a.Code_group, b.Code as Selected_code, cnt 
from(select
     CodeID,
        Name,
        Code,
        Code_group,Selected_code,
        cnt
    from m) as a left outer join
    (select CodeID,
        Name,
        Code,
        Code_group,Selected_code,
        cnt
 from m where selected_Code=1) as b on a.CodeID = b.CodeID and a.Code_Group = b.Code_Group 
    order by a.CodeID, a.Code_Group

The problem of this is With statment makes my table distinct. It shows only one row if there is exactly same data such as ID 1,2. Also, I cannot make NULL if there is exactly same frequencies.


What should I add to get my desired output? Or is there any better approach for this?

CodePudding user response:

CTE cte find the highest frequency code by Code_group and CodeID using dense_rank()

CTE selected check for any Code with same frequency and exclude them.

Final query just select from the original table and LEFT JOIN the selected

with 
cte as
(
    select Code_group, CodeID, Code
    from
    (
        select Code_group, CodeID, Code, 
               r = dense_rank() over (partition by Code_Group, CodeID
                                          order by count(*) desc)
        from   tableA
        group by Code_group, CodeID,  Code
    ) c
    where c.r = 1
),
selected as
(
    select Code_group, CodeID, Code
    from
    (
        select Code_group, CodeID, Code, 
               cnt = count(*) over (partition by Code_group, CodeID)
        from   cte
    ) s
    where s.cnt = 1
)
select a.*, 
       Selected_Code = s.Code
from   tableA a
       left join selected s on  a.Code_Group = s.Code_Group
                            and a.CodeID     = s.CodeID;

db<>fiddle demo

  • Related