I have two tables Table_1 and Table_2 in the database as below:
The datatype of Name
column is STRING and that of Matches
column is ARRAY<INT64>. For each name, there can be one or more matches. For example, in Table_1
, N1 has matches [1, 2]. I want to combine both Table_1
and Table_2
, and generate a Result
table like below:
In the Result table, the distinct match ID will come from the Matches
column of both the tables and the Names
column will show the names from both table related to the specific ID. For example, In Table_1
, both N1
and N2
has a match with 1. So, in the Result
table, we will have a row for 1 with N1, N2 as a comma separated in the Names
column.
How can I achieve using Group BY and Aggregation in Google BigQuery?
CodePudding user response:
Consider below approach
select match, string_agg(name) names from (
select name, match from table_1 t, t.matches match
union all
select name, match from table_2 t, t.matches match
)
group by match
if applied to sample data in your question - output is
CodePudding user response:
WITH Table_1 AS
(
SELECT 'N1' AS Name, [1,2] AS Matches UNION ALL
SELECT 'N2' AS Name, [1,4] AS Matches UNION ALL
SELECT 'N3' AS Name, [2,3] AS Matches
),
Table_2 AS
(
SELECT 'N2' AS Name, [10,11] AS Matches UNION ALL
SELECT 'N4' AS Name, [12] AS Matches UNION ALL
SELECT 'N5' AS Name, [10] AS Matches
)
SELECT
Match,
STRING_AGG(Name,', ' ORDER BY Name ASC) AS Names
FROM
(
SELECT
Name,
Match
FROM
Table_1,
UNNEST(Table_1.Matches) AS Match
UNION ALL
SELECT
Name,
Match
FROM
Table_2,
UNNEST(Table_2.Matches) AS Match
)
GROUP BY
Match
A slight addition to STRING_AGG to explicitly specify delimiter format as requested in the question.