How to aggregate one record multiple times based on a condition

Time:09-01

I have a bunch of records in the table below.

product_id       produced_date    expired_date
123              2010-02-01       2012-05-31
234              2013-03-01       2014-08-04
345              2012-05-01       2018-02-25

...              ...             ...

I want the output to show, for each month, how many unexpired products we had in stock. A product still counts in the month it expires (for example, a product that expires on August 04 is counted in August stock).

Month             n_products
2010-02-01        10
2010-03-01        12

...
2022-07-01        25
2022-08-01        15

How should I do this in Presto or Hive? Thank you!

CodePudding user response:

You can use the SQL below.
Here we use CASE WHEN to flag a product as expired (the condition checked is produced_date >= expired_date), sum the flags to get the count of expired products, and then group the data by expiry month. Note that each product is counted only once, in its expiry month.

select 
TRUNC(expired_date, 'MM') expired_month,
SUM( case when produced_date >= expired_date then 1 else 0 end) n_products
from mytable
group by 1

CodePudding user response:

In Presto, we can use the unnest and sequence functions to build a derived table with one row per month. Left-joining that derived table to our table on each product's date range should give the desired result.

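    -- Presto; assumes the table is named mytable and both columns are DATEs
    select m.month, count(t.product_id) as n_products
    from (
        -- one row per month, from the earliest production month to the latest expiry month
        select x as month
        from (
            select min(date_trunc('month', produced_date)) as first_month,
                   max(date_trunc('month', expired_date)) as last_month
            from mytable
        ) b
        cross join unnest(sequence(b.first_month, b.last_month, interval '1' month)) as u(x)
    ) m
    -- a product counts in every month from its production month through its expiry month
    left join mytable t
        on m.month >= date_trunc('month', t.produced_date)
       and m.month <= t.expired_date
    group by 1
    order by 1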
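
The sequence and unnest functions are Presto functions, and Hive does not provide them. Since the question also asks about Hive, here is a rough sketch of the same idea using posexplode, which expands each product into one row per month it was in stock. It assumes the table is called mytable (the question does not name it) and that both columns are DATEs or 'yyyy-MM-dd' strings; the aliases month_start, p, i and x are made up for illustration.

```sql
-- Hive sketch; mytable is an assumed table name
select add_months(trunc(t.produced_date, 'MM'), p.i) as month_start,
       count(*) as n_products
from mytable t
-- space(n) is n blanks, so splitting on ' ' yields n + 1 elements: offsets 0..n
lateral view posexplode(
    split(space(cast(months_between(trunc(t.expired_date, 'MM'),
                                    trunc(t.produced_date, 'MM')) as int)), ' ')
) p as i, x
group by add_months(trunc(t.produced_date, 'MM'), p.i)
order by month_start
```

Because this version expands products rather than joining to a calendar of months, a month in which no product was in stock will simply not appear in the output instead of showing a count of 0.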