I have a some_dictionary table and a some_dictionary_language table with data in multiple languages(and a reference to some_dictionary).
I need to get unique values on some_dictionary.id preferring the result for the given language and if not the default.
Given sql:
create table some_dictionary(
id bigserial primary key,
some_text varchar(5),
some_array integer[]
);
create table some_dictionary_language(
id bigserial primary key,
some_dictionary_id bigint not null REFERENCES some_dictionary(id),
language varchar(64) not null,
name varchar(128) not null
);
insert into some_dictionary (some_text, some_array)
values
('text2', '{1, 32, 2}'),
('text1', '{5, 9, 1}'),
('text4', '{1, 97, 4}'),
('text3', '{616, 1, 55}'),
('text5', '{8, 1}'),
('text6', '{1}');
insert into some_dictionary_language (some_dictionary_id, language, name)
values
(2, 'POLISH', 'nazwa2'),
(1, 'ENGLISH', 'name1'),
(3, 'ENGLISH', 'name3'),
(2, 'ENGLISH', 'name2'),
(1, 'POLISH', 'nazwa1'),
(1, 'SPANISH', 'nombre1'),
(4, 'SPANISH', 'nombre1'),
(5, 'ENGLISH', '5name'),
(6, 'ENGLISH', '6name'),
(6, 'POLISH', 'nazwa5'),
(5, 'POLISH', 'nazwa6');
Given conditions params:
langugage = 'POLISH' or if not, default = 'ENGLISH'
phrase in some_text or name = 'na'
element in some_array = 1
page = 1 size = 10
My select statement without distinct:
select d.id, d.some_text, d.some_array, dl.name, dl.language
from some_dictionary d
join some_dictionary_language dl on d.id = dl.some_dictionary_id
where dl."language" in ('POLISH', 'ENGLISH')
and (d.some_text ilike '%na%' or dl.name ilike '%na%')
and 1 = ANY(d.some_array)
select result:
d.id d.some_text d.some_array dl.name dl.lanugage
2 text1 {5,9,1} nazwa2 POLISH
1 text2 {1,32,2} name1 ENGLISH
3 text4 {1,97,4} name3 ENGLISH
2 text1 {5,9,1} name2 ENGLISH
1 text2 {1,32,2} nazwa1 POLISH
5 text5 {8,1} 5name ENGLISH
6 text6 {1} 6name ENGLISH
6 text6 {1} nazwa5 POLISH
5 text5 {8,1} nazwa6 POLISH
expected select result with distinct on d.id and prefered language POLISH else default ENGLISH:
d.id d.some_text d.some_array dl.name dl.lanugage
1 text2 {1,32,2} nazwa1 POLISH
2 text1 {5,9,1} name2 POLISH
3 text4 {1,97,4} name3 ENGLISH (default!)
5 text5 {8,1} nazwa6 POLISH
6 text6 {1} 6name POLISH
I tried to do something like this:
select distinct on (id) * from (
select d.id, d.some_text, d.some_array, dl.name, dl.language
from some_dictionary d
join some_dictionary_language dl on d.id = dl.some_dictionary_id
where dl."language" in ('POLISH', 'ENGLISH')
and (d.some_text ilike '%na%' or dl.name ilike '%na%')
and 1 = ANY(d.some_array)
order by case when dl."language" = 'POLISH' then 1 end
) sub offset 0 row fetch next 10 rows only;
but it did not work properly:
d.id d.some_text d.some_array dl.name dl.lanugage
1 text2 {1,32,2} nazwa1 POLISH
2 text1 {5,9,1} name2 ENGLISH
3 text4 {1,97,4} name3 ENGLISH
5 text5 {8,1} nazwa6 POLISH
6 text6 {1} 6name ENGLISH
CodePudding user response:
The ORDER BY that determine how the DISTINCT ON selects its record should be on the same level as the DISTINCT ON itself, not in a subquery.
If you did that, the error message would tell you the problem, you have to sort first by the DISTINCT ON columns, then the tie-breaker columns after. Like this:
select distinct on (id) d.id, d.some_text, d.some_array, dl.name, dl.language
from some_dictionary d
join some_dictionary_language dl on d.id = dl.some_dictionary_id
where dl."language" in ('POLISH', 'ENGLISH')
and (d.some_text ilike '%na%' or dl.name ilike '%na%')
and 1 = ANY(d.some_array)
order by id, case when dl."language" = 'POLISH' then 1 end;
When the DISTINCT ON query doesn't have an ORDER BY, it just makes one up consisting of only the DISTINCT ON columns, leaving the kept row within each group to be selected arbitrarily.