Home > Software engineering >  Select rows in Big Query using CONTAINS_SUBSTR with multiple substrings
Select rows in Big Query using CONTAINS_SUBSTR with multiple substrings

Time:05-12

I am trying to unnest Google Analytics data in Google Big Query. My goal is to find page performance indicators for a selected group of pages of which I only have the code not the entire PagePath.

To do this I am using the CONTAINS_SUBSTR function in the WHERE clause.

When I run the code like this, it works perfectly:

WHERE
    _table_suffix BETWEEN "20210501"
    AND "20210831"
    AND  CONTAINS_SUBSTR(hits.page.PagePath, '/62150/')

However, I am interested in more pages than just /62150/ (~30 pages). Whenever I try to add more pages to the where contains_substring function, it doesnt work. E.g.

    _table_suffix BETWEEN "20210501"
    AND "20210831"
    AND  CONTAINS_SUBSTR(hits.page.PagePath, IN('/62150/', '/27000/'))

Does anyone have any suggestions?

Best,

CodePudding user response:

Consider this one instead of using CONTAINS_SUBSTR:

SELECT * 
  FROM `your_sharded_tables_*`
 WHERE _TABLE_SUFFIX BETWEEN "20210501" AND "20210831"
   AND REGEXP_CONTAINS(hits.page.PagePath, r'\/(62150|27000)\/')
  • Related