Home > Back-end >  Regex to retrieve dataset and table name using dot in SQL BigQUery
Regex to retrieve dataset and table name using dot in SQL BigQUery

Time:10-15

I have an input table which has a list of all queries run in Bigquery. I need to create two columns one with the dataset name and other with table name.

The queries example is as follows (some could be nested). I need to separate the dataset name and tablename using the dot in the string

Queries dataset table
select * from advp.accounts ; advp accounts
select * from gqd.customers where id in (select id from gwq.facts) ; gqd customers
gwq facts
create table hsq.sales (id int64, ss string); hsq sales

CodePudding user response:

You can try using REGEXP_EXTRACT_ALL to extract all instances of "<database>.<table>", then unnest the corresponding arrays.

WITH cte AS (
    SELECT *, REGEXP_EXTRACT_ALL(Queries, r'([a-zA-Z0-9] )\.') AS db_names, 
              REGEXP_EXTRACT_ALL(Queries, r'(\.[a-zA-Z0-9] )') AS table_names
    FROM tab
)
SELECT cte.Queries, dataset, table_
FROM cte, cte.db_names AS dataset, cte.table_names AS table_
  • Related