Create an associated matching ID based on multiple columns in the same table-CodePudding

I have an interesting issue where I need to create a unique identifier based on match groups for a set of data. This is is based on multiple criteria, but generally what I need to happen is to take this input:

SOURCE_ID	MATCH_ID	PHONE
1	1	(999)9999999
1	2	(999)9999999
2	1	(999)9999999
213710	707187	(001)2548987
213710	759263	(100)8348243
213705	2416730	(156)6676200
213705	12116102	(132)3453523

And it needs to look like this as the output:

SOURCE_ID	MATCH_ID	PHONE	GENERATED_ID
1	1	(999)9999999	1
1	2	(999)9999999	1
2	1	(999)9999999	1
213710	707187	(001)2548987	2
213710	759263	(100)8348243	2
213705	2416730	(156)6676200	3
213705	12116102	(132)3453523	3

I've utilized the DENSE_RANK() function to create two separate IDs, one sorting on PHONE, the other SOURCEID column. The PHONE sort gives me the correct output for lines 1-3, but incorrect for 4-7, while the SOURCE_ID sort works on lines 4-6, but not 1-3.

How can I combine these in a way that gives the desired output above? I've tried combining the columns in every format possible, but no luck there either.

Output from testing, highlighted correct results. Each TEST## column is noted below

The SQL for reference:

SELECT SOURCE_ID, 
    MATCH_ID, 
    PHONE, 
    DENSE_RANK() OVER(ORDER BY PHONE) AS PHONE_SORT
    DENSE_RANK() OVER(ORDER BY SOURCE_ID) AS SOURCE_ID_SORT
    DENSE_RANK() OVER(ORDER BY MATCH_ID, INTERNAL_ROW_ID) AS TEST1,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID) AS TEST2,
    DENSE_RANK() OVER(ORDER BY MATCH_ID) AS TEST3,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, SOURCE_ID, PHONE) AS TEST4,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE, SOURCE_ID) AS TEST5,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID, PHONE) AS TEST6,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE, MATCH_ID) AS TEST7,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID, MATCH_ID) AS TEST8,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID, SOURCE_ID) AS TEST9,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID) AS TEST10,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID) AS TEST11,     
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE) AS TEST12,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE) AS TEST13      
FROM MY_TABLE;

TIA!

CodePudding user response：

that's some rather obtuse logic.

SELECT 
    column1
    ,column2
    ,column3
    ,dense_rank() over (order by rankable)
FROM (    
    SELECT *
        ,count(column1) over (partition by column1) c_c1
        ,count(column3) over (partition by column3) c_c3
        ,iff(c_c1> c_c3, column1::text, column3) as rankable
    FROM VALUES
    (1,1,'(999)9999999'),
    (1, 2,'(999)9999999'),
    (2, 1,'(999)9999999'),
    (213710,    707187,'(001)2548987'),
    (213710,    759263,'(100)8348243'),
    (213705,    2416730,'(156)6676200'),
    (213705,    12116102,'(132)3453523')
)

gives:

COLUMN1	COLUMN2	COLUMN3	DENSE_RANK() OVER (ORDER BY RANKABLE)
1	1	(999)9999999	1
1	2	(999)9999999	1
2	1	(999)9999999	1
213,705	2,416,730	(156)6676200	2
213,705	12,116,102	(132)3453523	2
213,710	707,187	(001)2548987	3
213,710	759,263	(100)8348243	3