How to find any 10% drop for a certain period?-CodePudding

This task is overwhelming for me. I have a db with :

subject  date value
A        UNIX  6
B        UNIX  2
D        UNIX  4
A        UNIX  7
B        UNIX  9

Where each subject is a different time series with all dates of the year and values. (so each subject contain all unix dates)

I need to find for subject A and C, when did they dropped 10% in value. or:

during 2020 find when A or C fall 10% in value

I would usually use lag with a window like:

LAG(value,5) OVER (PARTITION BY name ORDER BY date_num)

but this will help to find a slop based on a certain window.

I need to find the 10% slop on any possible window between certain dates, and get the first time we have a 10% drop (in case it kept falling to 20%).

CodePudding user response：

select yt2.subject, min(yt2.date)
from yourtable yt1
join yourtable yt2
on yt1.subject = yt2.subject and
   yt1.date < yt2.date and
   0.9 * yt1.value >= yt2.value
left join yourtable nonexistent
on yt1.subject = nonexistent.subject and
   yt1.date < nonexistent.date and
   nonexistent.date < yt2.date
where yt1.subject in ('A', 'C') and
      nonexistent.subject is null
group by y2.subject;

Explanation:

we join yt1 and yt2 to have pairs of the same subject where 10% of the value dropped
we also join a hypothetical record, called nonexistent that has the same subject and whose date is between the date of yt1 and yt2, to see whether there is any change between the two
in the where clause we ensure that nonexistent is nonexistent indeed
we also ensure that the subject is in the set we are interested about
we group by y2.subject to get the separate values and aggregate in the select clause accordingly

CodePudding user response：

This will anchor your "drops" to each possible starting date and then return only the first successive date where the value had fallen relative to the anchor. If such drops occurs for different anchor dates they'll all be returned. There won't be checks to see if there are overlaps in those date ranges.

select t1.subject, t1.dt as date1, min(t2.dt) as date2
from T t1 inner join T t2
    on t2.subject = t1.subject and t2.dt > t1.dt
        and t2.value <= 0.9 * t1.value
where t1.subject in ('A', 'C')
group by t1.subject, t1.dt

A lateral join works pretty well to get all the values back:

select t1.subject, t1.dt as date1, date2, t1.value as value1, t2.value2,
    (t1.value - value2) * 1.0 / t1.value as pct
from T t1 inner join lateral (
    select distinct
        min(t2.dt) over () as date2,
        first_value(t2.value) over (order by t2.dt) as value2
    from T t2
    where t2.subject = t1.subject and t2.dt > t1.dt and t2.value <= 0.9 * t1.value
) t2 on true
where t1.subject in ('A', 'C')
order by t1.subject, t1.dt

https://dbfiddle.uk/?rdbms=postgres_10&fiddle=abe57498a15abcb53794818f32ae444e