This task is overwhelming for me. I have a db with :
subject date value
A UNIX 6
B UNIX 2
D UNIX 4
A UNIX 7
B UNIX 9
Where each subject
is a different time series with all dates of the year and values. (so each subject contain all unix dates)
I need to find for subject
A
and C
, when did they dropped 10% in value
.
or:
during 2020 find when A or C fall 10% in value
I would usually use lag
with a window like:
LAG(value,5) OVER (PARTITION BY name ORDER BY date_num)
but this will help to find a slop based on a certain window.
I need to find the 10% slop on any possible window between certain dates, and get the first time we have a 10% drop (in case it kept falling to 20%).
CodePudding user response:
select yt2.subject, min(yt2.date)
from yourtable yt1
join yourtable yt2
on yt1.subject = yt2.subject and
yt1.date < yt2.date and
0.9 * yt1.value >= yt2.value
left join yourtable nonexistent
on yt1.subject = nonexistent.subject and
yt1.date < nonexistent.date and
nonexistent.date < yt2.date
where yt1.subject in ('A', 'C') and
nonexistent.subject is null
group by y2.subject;
Explanation:
- we join
yt1
andyt2
to have pairs of the same subject where 10% of the value dropped - we also join a hypothetical record, called
nonexistent
that has the samesubject
and whosedate
is between thedate
ofyt1
andyt2
, to see whether there is any change between the two - in the
where
clause we ensure thatnonexistent
is nonexistent indeed - we also ensure that the
subject
is in the set we are interested about - we
group by y2.subject
to get the separate values and aggregate in theselect
clause accordingly
CodePudding user response:
This will anchor your "drops" to each possible starting date and then return only the first successive date where the value had fallen relative to the anchor. If such drops occurs for different anchor dates they'll all be returned. There won't be checks to see if there are overlaps in those date ranges.
select t1.subject, t1.dt as date1, min(t2.dt) as date2
from T t1 inner join T t2
on t2.subject = t1.subject and t2.dt > t1.dt
and t2.value <= 0.9 * t1.value
where t1.subject in ('A', 'C')
group by t1.subject, t1.dt
A lateral join works pretty well to get all the values back:
select t1.subject, t1.dt as date1, date2, t1.value as value1, t2.value2,
(t1.value - value2) * 1.0 / t1.value as pct
from T t1 inner join lateral (
select distinct
min(t2.dt) over () as date2,
first_value(t2.value) over (order by t2.dt) as value2
from T t2
where t2.subject = t1.subject and t2.dt > t1.dt and t2.value <= 0.9 * t1.value
) t2 on true
where t1.subject in ('A', 'C')
order by t1.subject, t1.dt
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=abe57498a15abcb53794818f32ae444e