Home > Blockchain >  How to find records in PostgreSQL matching a combination of a pair of nullable String columns
How to find records in PostgreSQL matching a combination of a pair of nullable String columns

Time:11-11

Suppose a PostgreSQL table, articles, contains two nullable String columns of name and alt_name. Now, I want to find records (rows) in the table that have

  • a combination of String name and alt_name matches another combination of the same type in the same table:
    • i.e., [a.name, a.alt_name] is equal to either [b.name, b.alt_name] or [b.alt_name, b.name]
  • where name or alt_name may be NULL or an empty String, and in any circumstances NULL and an empty String should be treated as identical;
    • e.g., when [a.name, a.alt_name] == ["abc", NULL], a record of [b.name, b.alt_name] == ["", "abc"] should match, because one of them is "abc" and the other is NULL or empty String.

Is there any neat query to achieve this?

I thought if there is a way to concatenate both columns with a UTF-8 replacement character (U FFFD) in between, where NULL is converted into an empty String, that would solve the problem. Say, if the function were magic_fn(), the following would do a job, providing there is a unique column id:

SELECT * FROM articles a INNER JOIN places b ON a.id = b.id
  WHERE a.id <> b.id AND
    (   magic_fn(a.name, a.alt_name) =  magic_fn(b.name, b.alt_name)
     OR magic_fn(a.name, a.alt_name) =  magic_fn(b.alt_name, b.name));

However, concatnation is not a built-in function in PostgreSQL and I don't know how to do this.

Or, maybe there is simply a better way?

CodePudding user response:

try this

SELECT  *   FROM articles a
cross join articles b    
where  
(ARRAY[COALESCE(a.name,''),COALESCE(a.alt_name,'')]@> ARRAY[COALESCE(b.name,''),COALESCE(b.alt_name,'')]) and
 a.id<>b.id

CodePudding user response:

You can create a function which takes in the name and alt_name, then returns an aggregated string with nulls converted to empty strings and the results sorted:

create function magic_fn(a text, b text) returns text
  return (select json_agg(t.v) from (
    select t1.* from (
      select coalesce(a, '') v
      union all
      select coalesce(b, '') v) t1 
    order by t1.v) t);
create table articles (id int, name text, alt_name text);
insert into articles values (1, 'abc', null), (2, 'abc', ''), (3, null, 'abc'), (4, 'aaa', 'a'), (5, 'aaa', 'a'), (6, 'a', 'aaa')

Usage:

select * from articles a join articles b 
on a.id <> b.id and magic_fn(a.name, a.alt_name) = magic_fn(b.name, b.alt_name)

See fiddle

CodePudding user response:

you can try to use

  • coalesce for convert null to empty
  • || for concatenate string

and then compare string like this sql:

(coalesce(a.name,'') || coalesce(a.altname,'')) =  (coalesce(b.name,'') || coalesce(b.altname,'')) 
 or 
 (coalesce(a.name,'') || coalesce(a.altname,'')) =  (coalesce(b.altname,'') || coalesce(b.name,'')) 
  • Related