I've made a query to calculate some ranges (or buckets) and now I want to count of many elements are inside in each one of them.
For example, this could be a set of rows:
id | tx_value | date |
---|---|---|
1 | 30 | 2022-03-04 |
2 | 0.30 | 2022-03-04 |
1 | 300 | 2022-03-03 |
4 | 3000 | 2022-03-05 |
5 | 30 | 2022-03-04 |
I've calculated the range with the following clause:
ARRAY(SELECT tx_value_range * avg_tx_value
FROM UNNEST([0.001, 0.01, 0.1, 1, 10, 100]) AS tx_value_range) AS tx_size_buckets
This is a possible range:
[0.003, 0.03, 0.30, 3.0, 30.0, 300.0]
Then what I'm struggling to get is to count the amount of rows placed into each bucket, something like:
0.003, 3
0.03, 1
0.30, 4
I simply can't come up with even a test query to calculate this, I think that I'll need to iterate the bucket array for each transaction row in order to determine where to place the row but I can't seem to articulate it into a query.
CodePudding user response:
Consider below
select any_value(ranges[offset(range_pos)]) `range` , count(*) rows_count
from your_table,
unnest([struct([0.003, 0.03, 0.30, 3.0, 30.0, 300.0] as ranges)]),
unnest([struct(range_bucket(tx_value, ranges) - 1 as range_pos)])
group by range_pos
if applied to sample data in your question - output is