Home > Back-end >  SQL BigQuery - SUM over ARRAY_AGG - Error: No matching signature for argument type: ARRAY
SQL BigQuery - SUM over ARRAY_AGG - Error: No matching signature for argument type: ARRAY

Time:03-03

I'm trying to run this line of code:

SUM(
array_agg(
        case when type = "jump" then 1 else 0 end
        )
    over (partition by case_id order by start_at_local_true_01
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
)
as jump_count_per_case_id_in_future,

But I'm running into this error:

No matching signature for aggregate function SUM
for argument types: ARRAY<INT64>.
Supported signatures: SUM(INT64); SUM(FLOAT64); SUM(NUMERIC); SUM(BIGNUMERIC); SUM(INTERVAL) at [33:1]

This is the table I'm starting with:

Row case_id start_at_local_true_01 type
1 12123 2022-02-01 11:00:00 null
2 12123 2022-02-01 11:15:00 run
3 12123 2022-02-01 11:21:00 jump
4 12123 2022-02-04 11:31:00 run
5 12123 2022-02-05 11:34:00 jump
6 12555 2022-02-08 12:38:00 credit
7 12555 2022-02-01 11:15:00 null

And this is the table I'd want to end with:

Row case_id start_at_local_true_01 type jump_count_per_case_id_in_future
1 12123 2022-02-01 11:00:00 null 2
2 12123 2022-02-01 11:15:00 run 2
3 12123 2022-02-01 11:21:00 jump 2
4 55555 2022-02-04 11:31:00 run 0
5 12123 2022-02-05 11:34:00 jump 1
6 12123 2022-02-08 12:38:00 credit 0
7 55555 2022-02-01 11:15:00 null 0

This is my code so far (including sample table):

with data_table as(
    select * FROM UNNEST(ARRAY<STRUCT<
    case_id INT64, start_at_local_true_01 DATETIME, type STRING>>
    [
         (12123, DATETIME("2022-02-01 11:00:00"), null)
        ,(12123, DATETIME("2022-02-01 11:15:00"), 'run')
        ,(12123, DATETIME("2022-02-01 11:21:00"), 'jump')
        ,(55555, DATETIME("2022-02-04 11:31:00"), 'run')
        ,(12123, DATETIME("2022-02-05 11:34:00"), 'jump')
        ,(12123, DATETIME("2022-02-08 12:38:00"), 'credit')
        ,(55555, DATETIME("2022-02-01 11:15:00"), null)
    ]
    )
)

select 
data_table.*,

array_agg(
        IFNULL(type,"-")
        )
    over (partition by case_id order by start_at_local_true_01
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
as type_array_per_case_id,

array_agg(
        case when type = "jump" then 1 else 0 end
        )
    over (partition by case_id order by start_at_local_true_01
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
as type_array_per_case_id_to_numbers,

SUM(
array_agg(
        case when type = "jump" then 1 else 0 end
        )
    over (partition by case_id order by start_at_local_true_01
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
)
as jump_count_per_case_id_in_future,

from data_table

Thanks for the help!

CodePudding user response:

use below instead

COUNTIF(type = "jump")
    over (partition by case_id order by start_at_local_true_01
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
)
as jump_count_per_case_id_in_future,
  • Related