Home > Back-end >  SQL UNION ALL but with lots of columns on BigQuery?
SQL UNION ALL but with lots of columns on BigQuery?

Time:04-21

BG Collection

Above image is a screenshot of my table just as a quick initial reference.

The focal point are the multiple mech columns (mech1, mech2, mech3, and mech4).

Board games in this tables have multiple attributes called mechanisms so I've separated them into 4 different columns.

So I've learned how to combine columns vertically via UNION ALL so that I can query the count of all unique game mechanisms in my table.

However, it got me wondering if there's a shorter and more efficient way to achieve what I've done:

WITH mechanism_info AS
        (
            WITH
                mechanism_col_combined AS
                    (
                        SELECT mech1 AS all_mech_columns_combined
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        UNION ALL
                        ## There's no IS NOT NULL condition defined for column 'mech1' since there's at least one mechanism noted for a game.
                        SELECT mech2
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        WHERE mech2 IS NOT NULL
                        UNION ALL
                        SELECT mech3
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        WHERE mech3 IS NOT NULL
                        UNION ALL
                        SELECT mech4
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        WHERE mech4 IS NOT NULL
                    )
                    ## Temporary table with all mechanism column in the collection combined.
            SELECT DISTINCT(all_mech_columns_combined) AS unique_mechanisms, COUNT(*) AS count
            FROM mechanism_col_combined
            GROUP BY all_mech_columns_combined
            ORDER BY all_mech_columns_combined
        )
SELECT *
FROM mechanism_info

By querying this temp. table, SQL returns the information that I've anticipated as below:

unique_mechanisms | count
Acting            |   1
Action Points     |   3
Action Queue      |   1
Action Retrieval  |   1
Area Movement     |   1
Auction/Bidding   |   5
Bag Building      |   1
Betting & Bluffing|   2
Bingo             |   1
Bluffing          |   7

Now, I want to shorten my code and I know there has to be a way to shorten the repetitive process of combining columns with UNION ALL.

And if there's any other tips or methods on how to shorten my query, please let me know!

Thank you.

CodePudding user response:

You could send join into the table, but the performance would not improve and the query would be just as long.
You can simplify as follows:

SELECT
  mech_column,
  count(*) "number"
FROM  (
       SELECT mech1 AS mech_column
       FROM `ckda-portfolio-2022.bg_collection.base`
         UNION ALL
       SELECT mech2
       FROM `ckda-portfolio-2022.bg_collection.base`
         UNION ALL
       SELECT mech3
       FROM `ckda-portfolio-2022.bg_collection.base`
         UNION ALL
       SELECT mech4
       FROM `ckda-portfolio-2022.bg_collection.base`
       ) m
WHERE mech_column IS NOT NULL
GROUP BY mech_column
ORDER BY mech_column;
       

CodePudding user response:

You can convert the multiple columns [mech1, mech2, ...] into a column of array mech_arr and then using UNNEST to convert the column to have scalar value in each row.

For example:

WITH table1 AS (
    SELECT 'AA' AS mech1, 'BB' AS mech2, 'CC' AS mech3,
    UNION ALL SELECT 'AA' AS mech1, 'CC' AS mech2, 'EE' AS mech3
),
table2 AS (SELECT [mech1, mech2, mech3] AS mech_arr FROM table1)

SELECT mech, COUNT(*) AS mech_counts
FROM table2, UNNEST(mech_arr) AS mech
GROUP BY mech

Output

mech    mech_counts
AA  2
BB  1
CC  2
EE  1
  • Related