I'm trying to extract a string from multiple URLs that all have one thing in common even though they are built differently. Let me give you a few examples:
/cz/category/79478/productname
/https://www.store.net/de/category/49448/productname
/https://www.store.net/category/62448/productname
/category/79455/productname
I'm using BigQuery and I'm able to write a REGEXP_EXTRACT clause for individual examples. However, I'm looking for a single way of extracting the number (as a string) that comes after category/ (79478 from the first URL). All the addresses have the /category/ part in common, so it should be doable from my point of view.
Here's the expression that I've been trying to use:
regexp_extract(page_path, '[^category/]+/([^/]+)/')
But it doesn't work. Any idea what I'm doing wrong here?
CodePudding user response:
Use a non-capturing group for the leading /category/ and capture everything up to the next slash:
regexp_extract(page_path, '(?:/category/)([^/]+)')
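As for why the original attempt fails: [^category/] is a negated character class, so it matches any single character that is not one of c, a, t, e, g, o, r, y or /, rather than the literal text category/. Here is a minimal sketch for checking the pattern against the four sample paths from the question. The CTE name paths and the output column category_id are made up for illustration, and it assumes page_path is a STRING column.

```sql
-- Sample paths copied from the question; `paths` is a hypothetical CTE name.
WITH paths AS (
  SELECT page_path
  FROM UNNEST([
    '/cz/category/79478/productname',
    '/https://www.store.net/de/category/49448/productname',
    '/https://www.store.net/category/62448/productname',
    '/category/79455/productname'
  ]) AS page_path
)
SELECT
  page_path,
  -- (?:...) matches the literal prefix without capturing it;
  -- ([^/]+) captures everything up to the next slash, and that is what gets returned.
  REGEXP_EXTRACT(page_path, r'(?:/category/)([^/]+)') AS category_id
FROM paths;
```

This should return 79478, 49448, 62448 and 79455 as strings. If the ID is always numeric, swapping ([^/]+) for (\d+) would be a stricter alternative.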