Home > Back-end >  PostgreSQL - Extract input number in range or specific number
PostgreSQL - Extract input number in range or specific number

Time:04-27

given an input number, for example 70025, I need to check with a query if it is contained in this string:

70014-70015;70000;70001;70002-70004

The numbers with - has to be treated as a range, so if 70025 in 70014-70015 or 70002-70004, while the ones between or after the ; as single and specific value, so 70025 is 70000 or 70001

Thanks in advice

CodePudding user response:

As Tim already commented, the data processing is not very easy with such a string. If you can't find another way, try a solution like this (works in postgresql only and expects the data to be in correct format):

with tmp as (
  select unnest(string_to_array('70014-70015;70000;70001;70002-70004', ';')) x
)
select
    generate_series(
      split_part(tmp.x,'-',1) :: bigint,
        coalesce(NULLIF(split_part(tmp.x,'-',2), '') ,
                 split_part(tmp.x,'-',1)):: bigint)
from tmp

CodePudding user response:

I would write a function for this:

create or replace function is_contained(p_value integer, p_list text)
  returns boolean
as
$$
  select exists (
    select *
    from (         
      select string_to_array(x.item, '-')::int[] as elements
      from unnest(string_to_array(p_list, ';')) as x(item)
    ) t 
    where case
            when cardinality(elements) = 1 then elements[1] = p_value
            else p_value between elements[1] and elements[2]
          end
  ); 
$$
language sql
immutable;

So the main part:

select *
from (         
  select string_to_array(x.element, '-')::int[] as elements
  from unnest(string_to_array('70014-70015;70000;70001;70002-70004', ';')) as x(element)
) t  

returns an array with one or two elements. This can then be used to write a CASE expression that checks if the input value matches at least one of the elements, e.g. this:

select elements,
       case
        when cardinality(elements) = 1 then elements[1] = 70015
        else 70015 between elements[1] and elements[2]
      end 
from (         
  select string_to_array(x.item, '-')::int[] as elements
  from unnest(string_to_array('70014-70015;70000;70001;70002-70004', ';')) as x(item)
) t  

returns:

elements      | case 
-------------- ------
{70014,70015} | true 
{70000}       | false
{70001}       | false
{70002,70004} | false

The CASE expression returns a boolean, so by moving it into the WHERE clause it only returns rows where the comparison value matches (where the CASE yields true).

As the SQL function should only return a single boolean, we use exists to test if there is at least one row based on the WHERE condition.

You can use it e.g. like this:

select *
from badly_designed_table
where is_contained(70014, the_column);

With Postgres 14 or later, another option is to create a function that converts the string value into a multirange which has the advantage that this can be used to create an index.

create function make_multirange(p_list text)
  returns int4multirange 
as
$$ 
  select range_agg(int4range(elements[1], coalesce(elements[2], elements[1]), '[]'))
  from (         
    select string_to_array(x.element, '-')::int[] as elements
    from unnest(string_to_array(p_list, ';')) as x(element)
  ) t  
$$
language sql
immutable;

Note that the ranges are defined as inclusive on both ends, so 70014-70015 includes both values in the range (which matches the BETWEEN operator I used in the first function). This will fail however if any of the input strings to to follow the format you have shown.

With that function you can query the table like this:

select *
from badly_designed_table
where make_multirange(the_column) @> 70014

The expression can be used to create an index:

create index on badly_designed_table
  using gist ((make_multirange(the_column));

The correct solution however, is to change your data model

  • Related