SQL - efficient way to aggregate boolean values (postgresql)

Time:01-27

Assume a table with 3 columns (the real one is large): id, is_deleted, date. I need to check whether each id is deleted and add a new column holding that value (TRUE or FALSE). Simplified, the table looks like this (before):

id  is_deleted  date
A   False       03-07-2022
A   True        04-07-2022
B   False       05-07-2022
B   False       06-07-2022
C   True        07-07-2022

(after):

id  is_deleted  date        deleted
A   False       03-07-2022  TRUE
A   True        04-07-2022  TRUE
B   False       05-07-2022  FALSE
B   False       06-07-2022  FALSE
C   True        07-07-2022  TRUE

So every row with id A or C should have TRUE in the new column. A given id can have more than one TRUE value in the is_deleted column. If an id has at least one TRUE value, all rows with that id should be marked as deleted (TRUE in the new column). I need to do this in the same query, without GROUP BY: grouping would force me to add another CTE and join back to it, which complicates the query and hurts performance.

I just want to add a single column with the new deleted value to this table.
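
For anyone who wants to reproduce this, the sample data above could be set up roughly as follows. This is only a sketch: the table name yourtabletable is a placeholder (borrowed from one of the answers), the column types are guesses, and the dates are assumed to be in DD-MM-YYYY format.

```sql
-- Sketch of the sample table; table name and column types are assumptions.
CREATE TABLE yourtabletable (
    id         VARCHAR(10),
    is_deleted BOOLEAN,
    date       DATE
);

-- The five rows of the "before" table, with dates read as DD-MM-YYYY.
INSERT INTO yourtabletable (id, is_deleted, date) VALUES
    ('A', FALSE, DATE '2022-07-03'),
    ('A', TRUE,  DATE '2022-07-04'),
    ('B', FALSE, DATE '2022-07-05'),
    ('B', FALSE, DATE '2022-07-06'),
    ('C', TRUE,  DATE '2022-07-07');
```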

I found the bool_or function, but it doesn't work as a window function in Redshift. My code:

bool_or(is_deleted) over(partition by id) as is_del

I can't use MAX or SUM on booleans, and casting bool to int hurts performance. Is there another way to do this with booleans while keeping good performance?

Thank you.

CodePudding user response:

It should be possible to emulate this behaviour with MIN/MAX and an explicit cast:

SELECT MAX(is_deleted::INT) OVER (PARTITION BY id)
FROM ...;
-- result is 0 if every is_deleted for the id is false, 1 otherwise

If the result should be a boolean, either compare it with 1:

MAX(is_deleted::INT) OVER (PARTITION BY id) = 1

or cast it back:

(MAX(is_deleted::INT) OVER (PARTITION BY id))::BOOLEAN
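
Put together, a complete query might look like the sketch below. It assumes a table named yourtabletable (a placeholder) with the three columns from the question. In plain PostgreSQL the cast is not needed, because any aggregate, including bool_or, can be used as a window function there; that form is shown as a comment.

```sql
SELECT
    id,
    is_deleted,
    date,
    -- TRUE for every row of an id that has at least one is_deleted = TRUE
    MAX(is_deleted::INT) OVER (PARTITION BY id) = 1 AS deleted
    -- PostgreSQL-only alternative without the cast:
    -- bool_or(is_deleted) OVER (PARTITION BY id) AS deleted
FROM yourtabletable  -- placeholder table name
ORDER BY id, date;
```

If is_deleted is NULL on every row of an id, both forms return NULL rather than FALSE for that id.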

CodePudding user response:

This select statement should give the needed output:

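select
   yt1.id,
   yt1.is_deleted,
   yt1.date,
   -- a match exists only for ids that have at least one deleted row
   yt2.id is not null as deleted
from yourtabletable yt1
-- join against distinct ids so an id with several deleted rows
-- does not duplicate the output rows
left join (
   select distinct id from yourtabletable where is_deleted
) yt2 on yt2.id = yt1.id;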
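
A different technique from the join, shown only as a sketch: in PostgreSQL a correlated EXISTS subquery also stays in booleans and cannot duplicate rows, because it yields exactly one TRUE/FALSE per outer row. The table name is again the placeholder yourtabletable, and how it performs on a large table has not been measured here.

```sql
SELECT
    yt1.id,
    yt1.is_deleted,
    yt1.date,
    -- TRUE if any row with the same id is flagged as deleted
    EXISTS (
        SELECT 1
        FROM yourtabletable yt2
        WHERE yt2.id = yt1.id
          AND yt2.is_deleted
    ) AS deleted
FROM yourtabletable yt1;
```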