Home > Enterprise >  SQL/BigQuery - How to eliminate duplicates based on two columns
SQL/BigQuery - How to eliminate duplicates based on two columns

Time:08-17

I have a table in BigQuery with the list of items sold together within a sales database:

original_SKU  bought_with  quantity
12345         98765        130
98765         12345        130
abcde         fghij        88
fghij         abcde        88

however you can see that the combinations are repeated... apparently an easy command but I'm having difficulties lol

thanks in advance

CodePudding user response:

Consider below approach

select any_value(struct(original_SKU, bought_with)).*, sum(quantity) quantity
from your_table
group by least(original_SKU, bought_with) || greatest(original_SKU, bought_with)              

if applied to sample data in your question - output is

enter image description here

  • Related