I'm trying to find the substring of a list of URLs but having trouble due to an extra character (forward slash) at the end of one of the URL. Below are two examples of the data:
-
CodePudding user response:
You can just trim off the final slash like this:
select REGEXP_SUBSTR(trim(URL, '/'), '/([^/] )$', 1, 1, 'e', 1) as URL_EXTRACT
Parsing the URL to a JSON as Lukasz showed will cover many more possibilities, but this will be faster for large data sets if this is the only problem to correct.
CodePudding user response:
You can use Positive Lookahead for this task:
[^\/\n] (?=\/?$)
Explanation:
[^\/\n]
: any combination of characters other than slash and new line (your match)(?=\/?$)
: positive lookahead (checks if after your match there's)\/?
: optional slash$
: end string
Try it here.