How do I flatten the complexity of this select statement in postgresql?


I have the following table:

create table account_values
(
    account_id   bigint    not null,
    timestamp    timestamp not null,
    value1       numeric   not null,
    value2       numeric   not null,
    primary key (timestamp, account_id)
);

I also have the following query, which, for each point in an evenly spaced generated series, produces an array containing that point's epoch and the value1 + value2 of the row with the closest timestamp at or before it:

select array [(trunc(extract(epoch from gs) * 1000))::text, COALESCE((equity.value1 + equity.value2), 0.000000)::text]
from generate_series((now() - '1 year'::interval)::timestamp, now(), interval '1 day') gs
         left join lateral (select value1, value2
                            from account_values
                            where timestamp <= gs and account_id = ?
                            order by timestamp desc
                            limit 1) equity on (TRUE);

The issue with this method of generating such an array becomes apparent when inspecting the output of explain analyse:

Nested Loop Left Join  (cost=0.45..3410.74 rows=1000 width=32) (actual time=0.134..3948.546 rows=366 loops=1)
  ->  Function Scan on generate_series gs  (cost=0.02..10.02 rows=1000 width=8) (actual time=0.075..0.244 rows=366 loops=1)
  ->  Limit  (cost=0.43..3.36 rows=1 width=26) (actual time=10.783..10.783 rows=1 loops=366)
        ->  Index Scan Backward using account_values_pkey on account_values  (cost=0.43..67730.27 rows=23130 width=26) (actual time=10.782..10.782 rows=1 loops=366)
"              Index Cond: ((""timestamp"" <= gs.gs) AND (account_id = 459))"
Planning Time: 0.136 ms
Execution Time: 3948.659 ms

Specifically: loops=366

This problem will only get worse if I ever decide to decrease my generated series interval time.

Is there a way to flatten this looped select into a more efficient query? If not, what are some other approaches I can take to improving the performance?

Edit:

One hard requirement is that the result of the statement cannot be altered. For example, I don't want the range to round to the closest day. The range should always start the second the statement is invoked, with each point in the series falling precisely one day before the previous one.
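
A side note on the plan itself: each of the 366 loops walks the (timestamp, account_id) primary key backwards, and account_id, as a non-leading column, can only be applied as an in-scan filter. A minimal sketch, assuming an extra index is acceptable (the index name is illustrative):

-- Hypothetical index, not part of the schema above: with account_id
-- leading, the lateral lookup (account_id = ? and timestamp <= gs,
-- order by timestamp desc, limit 1) becomes one short index descent
-- per generated point instead of a long backward walk over the PK.
create index account_values_account_id_timestamp_idx
    on account_values (account_id, timestamp);

This changes neither the query shape nor its result, so the hard requirement above is unaffected.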

CodePudding user response:

Based on Edouard's answer:

with a(timestamp, values_agg) as
    (select timestamp, array_agg(value1 + value2) as values_agg
     from account_values
     where account_id = ?
       and timestamp <@ tsrange(now()::timestamp - '1 year'::interval, now()::timestamp)
     group by 1)
select array [ (trunc(extract(epoch from timestamp) * 1000))::text
             , coalesce(values_agg[1], 0.000000)::text ]
from a;

CodePudding user response:

Not sure you will get the exact same result, but it should be faster:

select array [ (trunc(extract(epoch from date_trunc('day', timestamp)) * 1000))::text
             , (array_agg(value1 + value2 ORDER BY timestamp DESC))[1] :: text
             ]
  from account_values
 where account_id = ?
   and timestamp <@ tsrange(now()::timestamp - '1 year'::interval, now()::timestamp)
 group by date_trunc('day', timestamp);
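
If grouping by day drifts too far from the hard requirement above (the series must be anchored to the exact invocation time), a single-scan variant is to turn each stored row into a validity span with lead() and join the generated series against those spans. This is a sketch under the same schema, not tested against the original data:

-- Sketch: one pass over account_values builds [valid_from, valid_to)
-- spans per row; each generated point then joins to the row that was
-- current at that instant, mirroring the original lateral's semantics.
with spans as (
    select "timestamp"                                  as valid_from,
           lead("timestamp", 1, 'infinity'::timestamp)
               over (order by "timestamp")              as valid_to,
           value1 + value2                              as total
    from account_values
    where account_id = ?
)
select array [ (trunc(extract(epoch from gs) * 1000))::text
             , coalesce(s.total, 0.000000)::text ]
from generate_series((now() - interval '1 year')::timestamp,
                     now()::timestamp,
                     interval '1 day') gs
left join spans s on gs >= s.valid_from and gs < s.valid_to;

Whether this beats the original depends on row counts: it trades 366 index probes for one scan of the account's rows plus an in-memory join, so compare both with explain analyse.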