Home > Back-end >  unpivot columns, sum of value into one row group by other column based on start_month column that is
unpivot columns, sum of value into one row group by other column based on start_month column that is

Time:07-30

I have these data,

item start_month month_m_qty month_m1_qty month_m2_qty month_m3_qty month_m4_qty month_m5_qty
A 202207 1000 200 4000 480 4000 0
B 202207 10 2400 0 234 304 0
C 202207 300 2000 480 1000 0 0
D 202207 900 700 200 0 4000 0

I would like to pivot this table to have one column time_period et one column with qty and this group by Item.

The column month goes to month_m17_qty but, i would like to retrieve only month of the year of the start_month like this

result expected:

item time_period qty
A 202207 1000
A 202208 200
A 202209 4000
A 202210 480
A 202211 4000
A 202212 0
B 202207 10
B 202208 2400
B 202209 0
B 202210 234
B 202211 304
B 202212 0

At the moment i used this script:

DECLARE get_col_names STRING;
SET get_col_names = (
  SELECT CONCAT('(', STRING_AGG( column_name, ','), ')'),
From(
SELECT column_name, 
FROM path_table.INFORMATION_SCHEMA.COLUMNS
where table_name ="table" and column_name LIKE ("%quantity")));


EXECUTE IMMEDIATE format("""
CREATE OR REPLACE TABLE tableA AS (
select item, start_month, time_period, qty
from table
unpivot
(
  qty for time_period IN %s
) )
""", get_col_names);

WITH TEMP AS (
SELECT 
item, PARSE_DATE('%Y%m', CAST (start_month AS STRING)) as start_month, time_period, qty as forecast_invoices_qty
FROM tableA),

TEMP2 AS (
SELECT item, 
CASE 
WHEN time_period = 'month_m_quantity'  THEN start_month
WHEN time_period = 'month_m1_quantity' THEN DATE_ADD(start_month, INTERVAL 1 MONTH)
WHEN time_period = 'month_m2_quantity' THEN DATE_ADD(start_month, INTERVAL 2 MONTH)
WHEN time_period = 'month_m3_quantity' THEN DATE_ADD(start_month, INTERVAL 3 MONTH)
WHEN time_period = 'month_m4_quantity' THEN DATE_ADD(start_month, INTERVAL 4 MONTH)
WHEN time_period = 'month_m5_quantity' THEN DATE_ADD(start_month, INTERVAL 5 MONTH)
WHEN time_period = 'month_m6_quantity' THEN DATE_ADD(start_month, INTERVAL 6 MONTH)
WHEN time_period = 'month_m7_quantity' THEN DATE_ADD(start_month, INTERVAL 7 MONTH)
WHEN time_period = 'month_m8_quantity' THEN DATE_ADD(start_month, INTERVAL 8 MONTH)
WHEN time_period = 'month_m9_quantity' THEN DATE_ADD(start_month, INTERVAL 9 MONTH)
WHEN time_period = 'month_m10_quantity' THEN DATE_ADD(start_month, INTERVAL 10 MONTH)
WHEN time_period = 'month_m11_quantity' THEN DATE_ADD(start_month, INTERVAL 11 MONTH)
WHEN time_period = 'month_m12_quantity' THEN DATE_ADD(start_month, INTERVAL 12 MONTH)
END AS time_period,
start_month,
forecast_invoices_qty
FROM TEMP)

SELECT 
*
FROM TEMP2
WHERE EXTRACT(YEAR FROM time_period) = EXTRACT(YEAR FROM start_month)



Is there a better way to optimize the renaming in date format into new column like this month_m_qty -> 202207 , month_m1_qty -> 202208, month_m2_qty -> 202209 etc

How can i make this ?

Many thanks in advance for your help

CodePudding user response:

Consider below approach

select item, format_date('%Y%m', time_period) time_period, value
from (
  select *, 
    date_add(parse_date('%Y%m', '' || start_month), 
      interval ifnull(cast(regexp_extract(month, r'\d ') as int64), 0) month
    ) time_period
  from (
    select * from your_table
    unpivot (value for month in (
      month_m_qty, month_m1_qty, month_m2_qty, month_m3_qty, month_m4_qty, month_m5_qty
      # ,month_m6_qty, month_m7_qty, month_m8_qty, month_m9_qty, month_m10_qty, month_m11_qty
      # ,month_m12_qty, month_m13_qty, month_m14_qty, month_m15_qty, month_m16_qty, month_m17_qty
    ))
  )
)
where extract(year from parse_date('%Y%m', '' || start_month)) = extract(year from time_period)          

if applied to sample data in your question - output is

enter image description here

  • Related