Home > Software engineering >  Postgres Date checking - is_date fn shortcircuit
Postgres Date checking - is_date fn shortcircuit

Time:11-03

I have a set of records in a table, with some records having invalid date. I wanted to ignore those invalid records and do a check with rest of the records. I framed a query like below but I don't find it working.

select * from tbl_name i
             where is_date(i.dob) and i.dob::date > CURRENT_DATE;

I got to know that sql doesn't short circuit so it also consider invalid record and end up in date/time out of range. Please help me alter this query in a way i could eliminate invalid dates and do date comparison on valid dates only.

CodePudding user response:

There is no guarantee for short-circuiting in Postgres. Neither in a "plain" WHERE clause, nor when using a derived table (from (select ...) where ...). One way to force the evaluation in two steps would be a materialized common table expressions:

with data as materialized (
  select * 
  from tbl_name i
  where is_date(i.dob)
)
select *
from data
where dob::date > CURRENT_DATE;

The materialized keyword prevents the optimizer from pushing the condition of the outer query into the CTE.

Obviously this assumes that is_date() will never return false positives

CodePudding user response:

Using CASE in the WHERE to differentiate between a valid date and an invalid one and run the > comparison for valid date otherwise return FALSE.

create or replace function is_date(s varchar) returns boolean as $$
begin
  if s is null then
     return false;
  end if;
  perform s::date;
  return true;
exception when others then
  return false;
end;
$$ language plpgsql;

create table date_str (id integer, dt_str varchar);

insert into date_str values (1, '2022-11-02'), (2, '1234'), (3, '2022-12-03');
insert into date_str values (4, 'xyz'), (5, '2022-01-01'), (6, '2023-02-02');

select * from date_str;
 id |   dt_str   
---- ------------
  1 | 2022-11-02
  2 | 1234
  3 | 2022-12-03
  4 | xyz
  5 | 2022-01-01
  6 | 2023-02-02

select current_date;
 current_date 
--------------
 11/02/2022


SELECT
    *
FROM
    date_str
WHERE
    CASE WHEN is_date (dt_str) = 't' THEN
        dt_str::date > CURRENT_DATE
    ELSE
        FALSE
    END;

 id |   dt_str   
---- ------------
  3 | 2022-12-03
  6 | 2023-02-02


  • Related