Trying use REGEXP_SUBSTR to get the type/fruit extracted and group on this.
Data sample e.g
www.fruits.com/type/banana?=asdasdasdwa
www.fruits.com/type/Apple?=asdasdasdwa
www.fruits.com/type/Apple?=asdasd21414
www.fruits.com/type/Orange?=asdasdasdwa
Using following SQL
SELECT DATE
COUNT(*) AS Sessions
,REGEXP_SUBSTR(hits.page.pagePath, '/type/?$',1, 1)
FROM
`159484249.ga_sessions_*` WITH_VISIT
LEFT JOIN UNNEST(hits) AS hits
LEFT JOIN UNNEST(hits.customdimensions) AS customdimension
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 YEAR)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
AND ((totals.pageviews>1 or totals.pageviews is null) or totals.totalTransactionRevenue is not null or totals.newVisits is null or totals.timeOnSite is not null)
AND hits.page.pagePath LIKE '%/type/%' AND hits.type = 'PAGE'
GROUP BY DATE,REGEXP_SUBSTR(hits.page.pagePath, '/type/?$',1, 1)
ORDER BY DATE ASC, Sessions ASC;
Desired result
Date | Sessions | Type
01/01/2021 | 22342 | type/apple
02/01/2021 | 1000 | type/orange
03/01/2021 | 2000 | type/banana
My query above returns null for Type
CodePudding user response:
Below you can see the usage of the regex string I provided in the comment to the question:
This would produce the correct substring you are looking for
with sample_data as (
select 'www.fruits.com/type/banana?=asdasdasdwa' as url union all
select 'www.fruits.com/type/Apple?=asdasdasdwa' union all
select 'www.fruits.com/type/Apple?=asdasd21414' union all
select 'www.fruits.com/type/Orange?=asdasdasdwa' union all
select 'www.fruits.com/type/ban-a-na?=asdasdasdwa'
)
select regexp_substr(url, r'type\/\w ') original
, regexp_substr(url, r'type\/([\w|-] )?') dashses
from sample_data
This produces the following