Home > database >  Flatten Nested Array and Aggregate in Snowflake
Flatten Nested Array and Aggregate in Snowflake

Time:07-06

My table column has nested arrays in a Snowflake database. I want to perform some aggregations using SQL (Snowflake SQL).

My table name is: DATA

The PROJ column is of VARIANT data type. The nested arrays will not always be 3, and I demonstrated that in the DATA table.


| ID |             PROJ              |
|----|-------------------------------|
| 1  |[[0, 4], [1, 30], [10, 20]]    |
| 2  |[[0, 2], [1, 20]]              |
| 3  |[[0, 8], [1, 10], [10, 100]]   |

Desired Output:

| Index | Min | Max | Mean|
|-------|-----|-----|-----|
| 0     |  2  |  8  | 4.66|
| 1     |  10 |  30 | 20  |
| 10    |  20 |  100| 60  |

CodePudding user response:

First the nested array should be flattened, then Index is the first element of subarray and Value is the second element(array is 0-based):

CREATE OR REPLACE TABLE DATA
AS
SELECT 1 AS ID, [[0, 4], [1, 30], [10, 20]]   AS PROJ UNION
SELECT 2 AS ID, [[0, 2], [1, 20]]             AS PROJ UNION
SELECT 3 AS ID, [[0, 8], [1, 10], [10, 100]]  AS PROJ;

Query:

SELECT s.VALUE[0]::INT AS Index,
       MIN(s.VALUE[1]::INT) AS MinValue,
       MAX(s.VALUE[1]::INT) AS MaxValue,
       AVG(s.VALUE[1]::INT) AS MeanValue
FROM DATA
,LATERAL FLATTEN(input=> PROJ) s
GROUP BY s.VALUE[0]::INT
ORDER BY Index;

Output:

enter image description here

  • Related