Home > front end >  PostgreSQL Update Statement Performance
PostgreSQL Update Statement Performance

Time:10-15

I have a table with these columns:

id (int) 
col1 (int) 
col2 (varchar) 
date1 (date) 
col3 (int) 
cumulative_col3 (int) 

and about 750k rows.

I want to update the cumulative_col3 with the sum of col3 of same col1, col2 and previous to date of date1.

I have indexes on (date1), (date1, col1, col2) and (col1, col2).

I have tried the following query but it takes a long time to complete.

update table_name
set cumulative_col3 = (select sum(s.col3)
                       from table_name s
                       where s.date1 <= table_name.date1
                         and s.col1 = table_name.col1
                         and s.col2 = table_name.col2);

What can I do to improve the performance of this query?

CodePudding user response:

You might try adding the following index to your table:

CREATE INDEX idx ON table_name (date1, col1, col2, col3);

This index, if used, should allow the correlated sum subquery to be evaluated faster.

CodePudding user response:

You can try to calculate the running sum in a derived table instead:

update table_name
  set cumulative_col3 = t.cum_sum
from (
  select id,
         sum(s.col3) over (partition by col1, col2 order by date1) as cum_sum
  from table_name
) s  
where s.id = table_name.id;

This assumes that id is the primary key of the table.

  • Related