Home > Net >  How to apply 'where' filter to a transformed column in BigQuery?
How to apply 'where' filter to a transformed column in BigQuery?

Time:04-27

I have a column event_title with values:

event_title
2/24/2022 - event foo
2/25/2022 - event bar
event baz

I want to parse the date strings as a date, and filter out records where no date string is present.

My query is:

SELECT
    event_title,
    PARSE_DATE('%D', REGEXP_EXTRACT(event_title,r'\d{1,2}\/\d{1,2}\/\d{1,4}')) AS event_date
FROM my_table
WHERE event_date is not null;

with expected output being:

event_title event_date
2/24/2022 - event foo 2/24/2022
2/25/2022 - event bar 2/25/2022

I get an error saying Unrecognized name: "event_date"

Seems like it should be possible to filter by a transformed column with an alias, am I mistaken? Any suggestions on best practices for implementing this query?

CodePudding user response:

The answer by Mikhail above explains perfectly. You could also use your own column as a filter as well :

SELECT
  event_title,
  PARSE_DATE('%D',REGEXP_EXTRACT(event_title,r'\d{1,2}\/\d{1,2}\/\d{1,4}')) AS event_date
FROM my_table
WHERE PARSE_DATE('%D',REGEXP_EXTRACT(event_title,r'\d{1,2}\/\d{1,2}\/\d{1,4}')) IS NOT NULL;

OR the CTE:

WITH events AS
  (
     SELECT
       event_title,
       PARSE_DATE('%D',REGEXP_EXTRACT(event_title,r'\d{1,2}\/\d{1,2}\/\d{1,4}')) AS event_date
     FROM my_table
  )

SELECT * FROM events WHERE event_date IS NOT NULL;

CodePudding user response:

Column alias is available after result/output is created, while WHERE clause is used for that output creation.

SO, obviously there are many ways to rewrite your query - one way is below

SELECT
    event_title,
    PARSE_DATE('%m/%d/%Y', REGEXP_EXTRACT(event_title,r'\d{1,2}\/\d{1,2}\/\d{1,4}')) AS event_date
FROM my_table
WHERE REGEXP_CONTAINS(event_title,r'\d{1,2}\/\d{1,2}\/\d{1,4}')    

if applied to sample data in your question - output is

enter image description here

  • Related