Home > Enterprise >  SQL - NULL values included in array when using 'IGNORE NULLS'
SQL - NULL values included in array when using 'IGNORE NULLS'

Time:09-01

I'm facing an issue where ARRAY_AGG is passed with 'IGNORE NULLS' but null values are still included in the output.

Is it because I have STRUCTs inside the ARRAY_AGGs?

Could someone please tell me what I'm doing wrong?

Sample code:

WITH t AS (
  SELECT 1 as batch_id, 1 as sensor_id, 1 as product_part_id, 2 as defect_id,  5 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 1 as product_part_id, 2 as defect_id,  6 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 2 as product_part_id, 3 as defect_id,  7 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 3 as defect_id,  8 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 4 as defect_id,  9 as count_defects, null as event_date
  UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 3 as product_part_id, 5 as defect_id, 10 as count_defects, '2018-7-1' as event_date
),
defect_nesting as (
  SELECT 
    batch_id, 
    sensor_id, 
    product_part_id, 
    array_agg(STRUCT(defect_id, count_defects, event_date) IGNORE NULLS ORDER BY defect_id) defectInfo
  FROM t 
  GROUP BY 1, 2, 3
  ),

product_nesting as (  
  SELECT 
    batch_id,
    sensor_id,
    array_agg(STRUCT(product_part_id, defectInfo) IGNORE NULLS ORDER BY product_part_id) productInfo
  FROM defect_nesting
  GROUP BY 1,2
)

SELECT 
  batch_id,
  array_agg(STRUCT(sensor_id, productInfo) IGNORE NULLS ORDER BY sensor_id) sensorInfo
FROM product_nesting
GROUP BY 1

Sample output: nulls present in output even though IGNORE NULLS is called

My end result would be just be blanks instead of nulls.

CodePudding user response:

Hi can you try this query?

Query:

WITH t AS (
  SELECT 1 as batch_id, 1 as sensor_id, 1 as product_part_id, 2 as defect_id,  5 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 1 as product_part_id, 2 as defect_id,  6 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 2 as sensor_id, 2 as product_part_id, 3 as defect_id,  7 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 3 as defect_id,  8 as count_defects, '2018-7-1' as event_date
  UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 2 as product_part_id, 4 as defect_id,  9 as count_defects, null as event_date
  UNION ALL SELECT 1 as batch_id, 3 as sensor_id, 3 as product_part_id, 5 as defect_id, 10 as count_defects, '2018-7-1' as event_date
),
t2 as (
SELECT 
  * except(event_date), 
  ifnull(event_date,'') as event_date 
from t
),
defect_nesting as (
  SELECT 
    batch_id, 
    sensor_id, 
    product_part_id, 
    array_agg(STRUCT(defect_id, count_defects, event_date) ORDER BY defect_id) defectInfo
  FROM t2 
  GROUP BY 1, 2, 3
  ),

product_nesting as (  
  SELECT 
    batch_id,
    sensor_id,
    array_agg(STRUCT(product_part_id, defectInfo) ORDER BY product_part_id) productInfo
  FROM defect_nesting
  GROUP BY 1,2
)

SELECT 
  batch_id,
  array_agg(STRUCT(sensor_id, productInfo) ORDER BY sensor_id) sensorInfo
FROM product_nesting
GROUP BY 1

I added a new CTE that replaces nulls with blanks in event_date column and used this table to the rest of the query. (Also removed ignore nulls as its just exlclude nulls as mentioned by ricco D.

Output: enter image description here

(If the other columns is expected to have null values too, you can apply the same query/logic to other columns (IFNULL) )

  • Related