I have these data,
item | start_month | month_m_qty | month_m1_qty | month_m2_qty | month_m3_qty | month_m4_qty | month_m5_qty |
---|---|---|---|---|---|---|---|
A | 202207 | 1000 | 200 | 4000 | 480 | 4000 | 0 |
B | 202207 | 10 | 2400 | 0 | 234 | 304 | 0 |
C | 202207 | 300 | 2000 | 480 | 1000 | 0 | 0 |
D | 202207 | 900 | 700 | 200 | 0 | 4000 | 0 |
I would like to pivot this table to have one column time_period et one column with qty and this group by Item.
The column month goes to month_m17_qty but, i would like to retrieve only month of the year of the start_month like this
result expected:
item | time_period | qty |
---|---|---|
A | 202207 | 1000 |
A | 202208 | 200 |
A | 202209 | 4000 |
A | 202210 | 480 |
A | 202211 | 4000 |
A | 202212 | 0 |
B | 202207 | 10 |
B | 202208 | 2400 |
B | 202209 | 0 |
B | 202210 | 234 |
B | 202211 | 304 |
B | 202212 | 0 |
At the moment i used this script:
DECLARE get_col_names STRING;
SET get_col_names = (
SELECT CONCAT('(', STRING_AGG( column_name, ','), ')'),
From(
SELECT column_name,
FROM path_table.INFORMATION_SCHEMA.COLUMNS
where table_name ="table" and column_name LIKE ("%quantity")));
EXECUTE IMMEDIATE format("""
CREATE OR REPLACE TABLE tableA AS (
select item, start_month, time_period, qty
from table
unpivot
(
qty for time_period IN %s
) )
""", get_col_names);
WITH TEMP AS (
SELECT
item, PARSE_DATE('%Y%m', CAST (start_month AS STRING)) as start_month, time_period, qty as forecast_invoices_qty
FROM tableA),
TEMP2 AS (
SELECT item,
CASE
WHEN time_period = 'month_m_quantity' THEN start_month
WHEN time_period = 'month_m1_quantity' THEN DATE_ADD(start_month, INTERVAL 1 MONTH)
WHEN time_period = 'month_m2_quantity' THEN DATE_ADD(start_month, INTERVAL 2 MONTH)
WHEN time_period = 'month_m3_quantity' THEN DATE_ADD(start_month, INTERVAL 3 MONTH)
WHEN time_period = 'month_m4_quantity' THEN DATE_ADD(start_month, INTERVAL 4 MONTH)
WHEN time_period = 'month_m5_quantity' THEN DATE_ADD(start_month, INTERVAL 5 MONTH)
WHEN time_period = 'month_m6_quantity' THEN DATE_ADD(start_month, INTERVAL 6 MONTH)
WHEN time_period = 'month_m7_quantity' THEN DATE_ADD(start_month, INTERVAL 7 MONTH)
WHEN time_period = 'month_m8_quantity' THEN DATE_ADD(start_month, INTERVAL 8 MONTH)
WHEN time_period = 'month_m9_quantity' THEN DATE_ADD(start_month, INTERVAL 9 MONTH)
WHEN time_period = 'month_m10_quantity' THEN DATE_ADD(start_month, INTERVAL 10 MONTH)
WHEN time_period = 'month_m11_quantity' THEN DATE_ADD(start_month, INTERVAL 11 MONTH)
WHEN time_period = 'month_m12_quantity' THEN DATE_ADD(start_month, INTERVAL 12 MONTH)
END AS time_period,
start_month,
forecast_invoices_qty
FROM TEMP)
SELECT
*
FROM TEMP2
WHERE EXTRACT(YEAR FROM time_period) = EXTRACT(YEAR FROM start_month)
Is there a better way to optimize the renaming in date format into new column like this month_m_qty -> 202207 , month_m1_qty -> 202208, month_m2_qty -> 202209 etc
How can i make this ?
Many thanks in advance for your help
CodePudding user response:
Consider below approach
select item, format_date('%Y%m', time_period) time_period, value
from (
select *,
date_add(parse_date('%Y%m', '' || start_month),
interval ifnull(cast(regexp_extract(month, r'\d ') as int64), 0) month
) time_period
from (
select * from your_table
unpivot (value for month in (
month_m_qty, month_m1_qty, month_m2_qty, month_m3_qty, month_m4_qty, month_m5_qty
# ,month_m6_qty, month_m7_qty, month_m8_qty, month_m9_qty, month_m10_qty, month_m11_qty
# ,month_m12_qty, month_m13_qty, month_m14_qty, month_m15_qty, month_m16_qty, month_m17_qty
))
)
)
where extract(year from parse_date('%Y%m', '' || start_month)) = extract(year from time_period)
if applied to sample data in your question - output is