Home > Net >  Inner merge adds more rows when join on shared id BQ SQL
Inner merge adds more rows when join on shared id BQ SQL

Time:07-07

I have two tables that I want to merge. Both of them have shared ID I am merging on. And the same number of rows. I just want to add one more column from table B to table A. Here is table A:

date           custom_key                                            pageviews
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0           1
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0           1
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0           1
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0           1
2022-06-29     2022-06-29_13b6c84c-c369-4c27-92c1-612b87a8e6e0           2

Table B:

date           custom_key                                            custom_date
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0       2022-01-05
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0       2022-01-05
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0       2022-01-05
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0       2022-01-05
2022-06-29     2022-06-29_13b6c84c-c369-4c27-92c1-612b87a8e6e0       2022-01-05

The desired output is:

date           custom_key                                          custom_date  pageviews
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0     2022-01-05       1
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0     2022-01-05       1
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0     2022-01-05       1
2022-01-05     2022-01-05_13b6c84c-c369-4c27-92c1-612b87a8e6e0     2022-01-05       1
2022-06-29     2022-06-29_13b6c84c-c369-4c27-92c1-612b87a8e6e0     2022-01-05       2

It is crucial to keep the same number of rows as I am doing additional analysis afterward by counting the rows. I tried the following code but instead of 5 rows, it returned 17 rows.

select 
a.*,
b.custom_date
from 
 `table_a` a 
INNER JOIN  `table_b` b ON a.custom_key = b.custom_key

CodePudding user response:

Consider below approach

select a.* except(pos), b.custom_date
from (select *, row_number() over(partition by custom_key, date) pos from table_a) a
join (select *, row_number() over(partition by custom_key, date) pos from table_b) b
using(custom_key, date, pos)
      

if applied to sample data in your question - output is

enter image description here

  • Related