Home > OS >  How do I repeat a value in a dataset until the next value appears - SQL (Teradata)
How do I repeat a value in a dataset until the next value appears - SQL (Teradata)

Time:05-27

I have a dataset based on products that change on certain days and some products that change value daily.However its possible for customers to purchase that product up until the date it changes. So when I pull through the data it looks like this EG:

 --------- ------- ------------ 
| Product | Value |    Date    |
 --------- ------- ------------ 
| B       |     5 | 21/05/2022 |
| A       |     1 | 27/05/2022 |
| B       |     2 | 28/05/2022 |
| C       |     3 | 27/05/2022 |
| C       |     4 | 28/05/2022 |
| A       |     7 | 29/05/2022 |
| C       |     5 | 29/05/2022 |
 --------- ------- ------------ 

I am trying to get it into this format:

 ------------ --- --- --- 
|    Date    | A | B | C |
 ------------ --- --- --- 
| 27/05/2022 | 1 | 5 | 3 |
| 28/05/2022 | 1 | 2 | 4 |
| 29/05/2022 | 7 | 2 | 5 |
 ------------ --- --- --- 

Whats the best way to do this in Teradata SQL (note the example is a bit small, its likely the minimum i would need to repeat certain products is 7 days)

CodePudding user response:

You could try the use of pivot e.g.

SEL date, a, b, c
FROM your_table 
PIVOT (
    MAX(value)
    FOR product IN ('a','b','c')
) piv;

CodePudding user response:

Pivot over all dates (or at least include the previous x days/weeks to get rows for products like 'B'), apppy LAST_VALUE IGNORE NULLS on each product and then filter the range of dates.

with cte as
 (
   select 
      date
     ,last_value(a ignore nulls) over (order by date) as a
     ,last_value(b ignore nulls) over (order by date) as b
     ,last_value(c ignore nulls) over (order by date) as c
   from tab
   PIVOT (
       MAX(value_)
       FOR product IN ('a' as a
                      ,'b' as b
                      ,'c' as c)
   ) as pvt
 )
select * 
from cte   
where date between date '2022-05-27' 
               and date '2022-05-29'

But using old-style MAX(CASE) will probably get a slightly better plan and it's easier create dynamically if needed.

select 
   date
  ,last_value(max(case when product = 'a' then value end) ignore nulls) over (order by date)
  ,last_value(max(case when product = 'b' then value end) ignore nulls) over (order by date)
  ,last_value(max(case when product = 'c' then value end) ignore nulls) over (order by date)
from tab
group by 1
qualify date between date '2022-05-27'
                 and date '2022-05-29'
  • Related