Home > Enterprise >  Create a new column with an array of column names based on condition
Create a new column with an array of column names based on condition

Time:09-28

I need to create a new column in a Bigquery table. This new column values must be an array of the colour column names based on a condition. My table "Users" looks like:

User Red Blue Green Yellow Other columns not related
First 1 1 0 1
Second 0 0 1 0
Third 1 1 0 0

and the expected table or view must be something like this:

User Red Blue Green Yellow Array_col
First 1 1 0 1 Red, Blue, Yellow
Second 0 0 1 0 Green
Third 1 1 0 0 Red, Blue

I am relatively new to SQL and Bigquery sintax, I've tried creating a view and also with

ALTER TABLE myproject.mydataset.users`

and I get the column names with this:

SELECT column_names
FROM `myproject.mydataset`.INFORMATION_SCHEMA.COLUMNS
WHERE (table_name = 'users')

But I can't find how to create the array list and make a column with them. I'm crashing against a wall.

CodePudding user response:

Maybe this will be ok:

with mytable as (
  select 'First' as user, 1 as Red, 1 as Blue, 0 as Green, 1 as Yellow union all
  select 'Second' as user, 0 as Red, 0 as Blue, 1 as Green, 0 as Yellow union all
  select 'Third' as user, 1 as Red, 1 as Blue, 0 as Green, 0 as Yellow
)
select *, split(trim(if(Red=1, 'Red', '') || if(Blue=1, ' Blue', '') || if(Green=1, ' Green', '') || if(Yellow=1, ' Yellow', ''), ' '), ' ')
from mytable

CodePudding user response:

Consider below approach

select *, 
  regexp_extract_all(translate(to_json_string(t), '{}"', ''), r'(\w ):1') as Array_col
from mytable t       

if applied to sample data in your question - output is

enter image description here

CodePudding user response:

Now you can try super new syntax for BigQuery! Let's do this with UNPIVOT statement:

WITH mytable AS (
  SELECT 'first' AS user, 1 AS Red, 1 AS Blue, 0 AS Green, 1 AS Yellow UNION ALL
  SELECT 'second' AS user, 0 AS Red, 0 AS Blue, 1 AS Green, 0 AS Yellow UNION ALL
  SELECT 'third' AS user, 1 AS Red, 1 AS Blue, 0 AS Green, 0 AS Yellow
)
SELECT t.*, Array_col FROM (
    SELECT user, ARRAY_AGG(color_name) Array_col FROM mytable 
    UNPIVOT(check_color FOR color_name IN (Red,Blue,Green,Yellow))
    WHERE check_color = 1
    GROUP BY user    
)
JOIN mytable t USING (user) 
  • Related