Home > front end >  Create an associated matching ID based on multiple columns in the same table
Create an associated matching ID based on multiple columns in the same table

Time:03-23

I have an interesting issue where I need to create a unique identifier based on match groups for a set of data. This is is based on multiple criteria, but generally what I need to happen is to take this input:

SOURCE_ID MATCH_ID PHONE
1 1 (999)9999999
1 2 (999)9999999
2 1 (999)9999999
213710 707187 (001)2548987
213710 759263 (100)8348243
213705 2416730 (156)6676200
213705 12116102 (132)3453523

And it needs to look like this as the output:

SOURCE_ID MATCH_ID PHONE GENERATED_ID
1 1 (999)9999999 1
1 2 (999)9999999 1
2 1 (999)9999999 1
213710 707187 (001)2548987 2
213710 759263 (100)8348243 2
213705 2416730 (156)6676200 3
213705 12116102 (132)3453523 3

I've utilized the DENSE_RANK() function to create two separate IDs, one sorting on PHONE, the other SOURCEID column. The PHONE sort gives me the correct output for lines 1-3, but incorrect for 4-7, while the SOURCE_ID sort works on lines 4-6, but not 1-3.

How can I combine these in a way that gives the desired output above? I've tried combining the columns in every format possible, but no luck there either.

Output from testing, highlighted correct results. Each TEST## column is noted below

The SQL for reference:

SELECT SOURCE_ID, 
    MATCH_ID, 
    PHONE, 
    DENSE_RANK() OVER(ORDER BY PHONE) AS PHONE_SORT
    DENSE_RANK() OVER(ORDER BY SOURCE_ID) AS SOURCE_ID_SORT
    DENSE_RANK() OVER(ORDER BY MATCH_ID, INTERNAL_ROW_ID) AS TEST1,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID) AS TEST2,
    DENSE_RANK() OVER(ORDER BY MATCH_ID) AS TEST3,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, SOURCE_ID, PHONE) AS TEST4,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE, SOURCE_ID) AS TEST5,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID, PHONE) AS TEST6,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE, MATCH_ID) AS TEST7,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID, MATCH_ID) AS TEST8,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID, SOURCE_ID) AS TEST9,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID) AS TEST10,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID) AS TEST11,     
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE) AS TEST12,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE) AS TEST13      
FROM MY_TABLE; 

TIA!

CodePudding user response:

that's some rather obtuse logic.

SELECT 
    column1
    ,column2
    ,column3
    ,dense_rank() over (order by rankable)
FROM (    
    SELECT *
        ,count(column1) over (partition by column1) c_c1
        ,count(column3) over (partition by column3) c_c3
        ,iff(c_c1> c_c3, column1::text, column3) as rankable
    FROM VALUES
    (1,1,'(999)9999999'),
    (1, 2,'(999)9999999'),
    (2, 1,'(999)9999999'),
    (213710,    707187,'(001)2548987'),
    (213710,    759263,'(100)8348243'),
    (213705,    2416730,'(156)6676200'),
    (213705,    12116102,'(132)3453523')
)

gives:

COLUMN1 COLUMN2 COLUMN3 DENSE_RANK() OVER (ORDER BY RANKABLE)
1 1 (999)9999999 1
1 2 (999)9999999 1
2 1 (999)9999999 1
213,705 2,416,730 (156)6676200 2
213,705 12,116,102 (132)3453523 2
213,710 707,187 (001)2548987 3
213,710 759,263 (100)8348243 3
  • Related