I have a table with these columns:
id (int)
col1 (int)
col2 (varchar)
date1 (date)
col3 (int)
cumulative_col3 (int)
and about 750k rows.
I want to update the cumulative_col3
with the sum of col3
of same col1, col2
and previous to date of date1
.
I have indexes on (date1)
, (date1, col1, col2)
and (col1, col2)
.
I have tried the following query but it takes a long time to complete.
update table_name
set cumulative_col3 = (select sum(s.col3)
from table_name s
where s.date1 <= table_name.date1
and s.col1 = table_name.col1
and s.col2 = table_name.col2);
What can I do to improve the performance of this query?
CodePudding user response:
You might try adding the following index to your table:
CREATE INDEX idx ON table_name (date1, col1, col2, col3);
This index, if used, should allow the correlated sum subquery to be evaluated faster.
CodePudding user response:
You can try to calculate the running sum in a derived table instead:
update table_name
set cumulative_col3 = t.cum_sum
from (
select id,
sum(s.col3) over (partition by col1, col2 order by date1) as cum_sum
from table_name
) s
where s.id = table_name.id;
This assumes that id
is the primary key of the table.