Home > Software engineering >  Bigquery - How to get only one max value when a window function has all the same values
Bigquery - How to get only one max value when a window function has all the same values

Time:11-30

I am using a window function in Google BigQuery to the the biggest value attached to a customer like this:

SELECT customer_key, store_location,
    FIRST_VALUE(store_key) OVER (
        PARTITION BY customer_key
        ORDER BY visits DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS fav_store
FROM VISIT_DB
ORDER BY customer_key ASC

And it works just fine in giving me the store with more visits by customer. However there are cases when a customer has the same number of visits in 3 different stores and this function returns the same 3 values.

For example, when customer 111 has 3 visits in store A, B, and C, as they all have the same number of visits and I want any of them to be returned not all three of them.

I also tried using LAST_VALUE and MAX but when the number of visits is the same, the three of them return all the stores.

How can I make it so that it returns only one?

CodePudding user response:

I was able to get your expected output with this script:

SELECT customer_key, store_location, store_key as fav_store 
FROM (
    SELECT *, ROW_NUMBER() OVER(partition by customer_key order by visits desc) rn
    FROM mydataset.mytable
) t1
WHERE rn = 1
ORDER BY customer_key

Sample data:

customer_key    store_key   store_location  visits
111                 A             A           3
111                 C             C           3
111                 B             B           3 
111                 D             D           2
222                 D             D           5
222                 A             A           3
222                 B             B           3
222                 C             C           3
333                 B             B           3
333                 A             A           1
444                 C             C           4

Result:

enter image description here

CodePudding user response:

Consider below approach

select as value array_agg(t order by visits desc limit 1)[offset(0)]
from your_table t
group by customer_key
  • Related