My table column has nested arrays in a Snowflake database. I want to perform some aggregations using SQL (Snowflake SQL).
My table name is: DATA
The PROJ column is of VARIANT data type. The nested arrays will not always be 3, and I demonstrated that in the DATA table.
| ID | PROJ |
|----|-------------------------------|
| 1 |[[0, 4], [1, 30], [10, 20]] |
| 2 |[[0, 2], [1, 20]] |
| 3 |[[0, 8], [1, 10], [10, 100]] |
Desired Output:
| Index | Min | Max | Mean|
|-------|-----|-----|-----|
| 0 | 2 | 8 | 4.66|
| 1 | 10 | 30 | 20 |
| 10 | 20 | 100| 60 |
CodePudding user response:
First the nested array should be flattened, then Index is the first element of subarray and Value is the second element(array is 0-based):
CREATE OR REPLACE TABLE DATA
AS
SELECT 1 AS ID, [[0, 4], [1, 30], [10, 20]] AS PROJ UNION
SELECT 2 AS ID, [[0, 2], [1, 20]] AS PROJ UNION
SELECT 3 AS ID, [[0, 8], [1, 10], [10, 100]] AS PROJ;
Query:
SELECT s.VALUE[0]::INT AS Index,
MIN(s.VALUE[1]::INT) AS MinValue,
MAX(s.VALUE[1]::INT) AS MaxValue,
AVG(s.VALUE[1]::INT) AS MeanValue
FROM DATA
,LATERAL FLATTEN(input=> PROJ) s
GROUP BY s.VALUE[0]::INT
ORDER BY Index;
Output: