I'm trying to classify time series data using SQL. I have data for a reference data point that occurs over 3 years. So the reference occurs 36 times, one for each month. Sometimes the quantity is 0, other times it may be 25 or even higher for each row. What I want to know is how to calculate these equations using SQL (MSSQL in particular).
Then, similarly, I want to classify the data into Erratic
, Smooth
, Lumpy
, and/or Intermittent
Please ask if you need more clarity. My SQL skills are barely basic.
CodePudding user response:
with data as (
select
CP_REF,
count(*) * 1.0 /
nullif(count(case when QUANTITY > 0 then 1 end), 0) as ADI,
stdevp(QUANTITY) / nullif(avg(QUANTITY), 0) as COV
from DF_ALL_DEMAND_BY_ROW_V
where parent is not null
group by CP_REF
)
select
CP_REF, ADI, COV,
case
when ADI < 1.32 and COV < 0.49 then 'Smooth'
when ADI >= 1.32 and COV < 0.49 then 'Intermittent'
when ADI < 1.32 and COV >= 0.49 then 'Erratic'
when ADI >= 1.32 and COV >= 0.49 then 'Lumpy'
else 'Smooth'
end as DEMAND
from data;
Double check that you want to use stdevp()
and not stdev
. I wish I were more knowledgeable about statistics.