Home > database >  How to fin and extract substring BIGQUERY
How to fin and extract substring BIGQUERY

Time:02-16

A have a string column at BigQuery table for example:

name
WW_for_all_feed
EU_param_1_for_all_feed
AU_for_all_full_settings_18
WW_for_us_param_5_for_us_feed
WW_for_us_param_5_feed
WW_for_all_25

and also have a list of variables, for example :

param_1_for_all
param_5_for_us
param_5 
full_settings

And if string at column "name" contains one of this substrings needs to extract it :

name param
WW_for_all_feed None
EU_param_1_for_all_feed param_1_for_all
AU_for_all_full_settings_18 full_settings
WW_for_us_param_5_for_us_feed param_5_for_us
WW_for_us_param_5_feed param_5
WW_for_all_25 None

I want to try regexp and replace, but don't know pattern for find substring

CodePudding user response:

Use below

select name, param
from your_table
left join params 
on regexp_contains(name, param)    

if apply to sample data as in your question

with your_table as (
  select 'WW_for_all_feed' name union all
  select 'EU_param_1_for_all_feed' union all
  select 'AU_for_all_full_settings_18 ' union all
  select 'WW_for_us_param_5_for_us_feed' union all
  select 'WW_for_all_25 ' 
), params as (
  select 'param_1_for_all' param union all
  select 'param_5_for_us' union all
  select 'full_settings' 
)    

output is

enter image description here

but I have an another issue (updated question) If one of params is substring for another?

use below then

select name, string_agg(param order by length(param) desc limit 1) param
from your_table
left join params 
on regexp_contains(name, param)
group by name

if applied to your updated data sample - output is

enter image description here

  • Related