Suppose a PostgreSQL table, articles
, contains two nullable String columns of name
and alt_name
.
Now, I want to find records (rows) in the table that have
- a combination of String
name
andalt_name
matches another combination of the same type in the same table:- i.e.,
[a.name, a.alt_name]
is equal to either[b.name, b.alt_name]
or[b.alt_name, b.name]
- i.e.,
- where
name
oralt_name
may beNULL
or an empty String, and in any circumstancesNULL
and an empty String should be treated as identical;- e.g., when
[a.name, a.alt_name] == ["abc", NULL]
, a record of[b.name, b.alt_name] == ["", "abc"]
should match, because one of them is"abc"
and the other is NULL or empty String.
- e.g., when
Is there any neat query to achieve this?
I thought if there is a way to concatenate both columns with a UTF-8 replacement character (U FFFD
) in between, where NULL is converted into an empty String, that would solve the problem. Say, if the function were magic_fn()
, the following would do a job, providing there is a unique column id
:
SELECT * FROM articles a INNER JOIN places b ON a.id = b.id
WHERE a.id <> b.id AND
( magic_fn(a.name, a.alt_name) = magic_fn(b.name, b.alt_name)
OR magic_fn(a.name, a.alt_name) = magic_fn(b.alt_name, b.name));
However, concatnation is not a built-in function in PostgreSQL and I don't know how to do this.
Or, maybe there is simply a better way?
CodePudding user response:
try this
SELECT * FROM articles a
cross join articles b
where
(ARRAY[COALESCE(a.name,''),COALESCE(a.alt_name,'')]@> ARRAY[COALESCE(b.name,''),COALESCE(b.alt_name,'')]) and
a.id<>b.id
CodePudding user response:
You can create a function which takes in the name
and alt_name
, then returns an aggregated string with null
s converted to empty strings and the results sorted:
create function magic_fn(a text, b text) returns text
return (select json_agg(t.v) from (
select t1.* from (
select coalesce(a, '') v
union all
select coalesce(b, '') v) t1
order by t1.v) t);
create table articles (id int, name text, alt_name text);
insert into articles values (1, 'abc', null), (2, 'abc', ''), (3, null, 'abc'), (4, 'aaa', 'a'), (5, 'aaa', 'a'), (6, 'a', 'aaa')
Usage:
select * from articles a join articles b
on a.id <> b.id and magic_fn(a.name, a.alt_name) = magic_fn(b.name, b.alt_name)
CodePudding user response:
you can try to use
- coalesce for convert null to empty
- || for concatenate string
and then compare string like this sql:
(coalesce(a.name,'') || coalesce(a.altname,'')) = (coalesce(b.name,'') || coalesce(b.altname,''))
or
(coalesce(a.name,'') || coalesce(a.altname,'')) = (coalesce(b.altname,'') || coalesce(b.name,''))