I'm facing an issue where ARRAY_AGG is passed with 'IGNORE NULLS' but null values are still included in the output.
Is it because I have STRUCTs inside the ARRAY_AGGs?
Could someone please tell me what I'm doing wrong?
Sample code:
WITH t AS (
SELECT 1 as batch_id, 1 as sensor_id, 1 as product_part_id, 2 as defect_id, 5 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 1 as product_part_id, 2 as defect_id, 6 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 2 as product_part_id, 3 as defect_id, 7 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 3 as defect_id, 8 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 4 as defect_id, 9 as count_defects, null as event_date
UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 3 as product_part_id, 5 as defect_id, 10 as count_defects, '2018-7-1' as event_date
),
defect_nesting as (
SELECT
batch_id,
sensor_id,
product_part_id,
array_agg(STRUCT(defect_id, count_defects, event_date) IGNORE NULLS ORDER BY defect_id) defectInfo
FROM t
GROUP BY 1, 2, 3
),
product_nesting as (
SELECT
batch_id,
sensor_id,
array_agg(STRUCT(product_part_id, defectInfo) IGNORE NULLS ORDER BY product_part_id) productInfo
FROM defect_nesting
GROUP BY 1,2
)
SELECT
batch_id,
array_agg(STRUCT(sensor_id, productInfo) IGNORE NULLS ORDER BY sensor_id) sensorInfo
FROM product_nesting
GROUP BY 1
My end result would be just be blanks instead of nulls.
CodePudding user response:
Hi can you try this query?
Query:
WITH t AS (
SELECT 1 as batch_id, 1 as sensor_id, 1 as product_part_id, 2 as defect_id, 5 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 1 as product_part_id, 2 as defect_id, 6 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 2 as product_part_id, 3 as defect_id, 7 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 3 as defect_id, 8 as count_defects, '2018-7-1' as event_date
UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 4 as defect_id, 9 as count_defects, null as event_date
UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 3 as product_part_id, 5 as defect_id, 10 as count_defects, '2018-7-1' as event_date
),
t2 as (
SELECT
* except(event_date),
ifnull(event_date,'') as event_date
from t
),
defect_nesting as (
SELECT
batch_id,
sensor_id,
product_part_id,
array_agg(STRUCT(defect_id, count_defects, event_date) ORDER BY defect_id) defectInfo
FROM t2
GROUP BY 1, 2, 3
),
product_nesting as (
SELECT
batch_id,
sensor_id,
array_agg(STRUCT(product_part_id, defectInfo) ORDER BY product_part_id) productInfo
FROM defect_nesting
GROUP BY 1,2
)
SELECT
batch_id,
array_agg(STRUCT(sensor_id, productInfo) ORDER BY sensor_id) sensorInfo
FROM product_nesting
GROUP BY 1
I added a new CTE that replaces nulls with blanks in event_date column and used this table to the rest of the query. (Also removed ignore nulls as its just exlclude nulls as mentioned by ricco D.
(If the other columns is expected to have null values too, you can apply the same query/logic to other columns (IFNULL) )