I need a SQL query that will return the URLs in my table with https:// and / or www. removed

Time:03-25

I am running a query on a table to return the URLs of different organizations, and I am trying to remove any instance of 'https://', 'http://' and 'www.' at the beginning, and any '/' at the end.

For example, if a URL is currently returned as 'https://www.pizzatest.com/', I need it to be just 'pizzatest.com'.

This is for importing organization domains into Zendesk.

I have found a query that removes https:// and http:// and anything from the first / or ? after the host name, but I cannot figure out how to remove the 'www.' from the beginning.

The query was taken from the question "Extract hostname from a URL" (credit to Fenton):

    /* Get just the host name from a URL */
    SUBSTRING(@WebAddress,
        /* Starting Position (After any '//') */
        (CASE WHEN CHARINDEX('//', @WebAddress) = 0 THEN 1 ELSE CHARINDEX('//', @WebAddress) + 2 END),
        /* Length (ending on first '/' or on a '?') */
        CASE
            WHEN CHARINDEX('/', @WebAddress, CHARINDEX('//', @WebAddress) + 2) > 0 THEN CHARINDEX('/', @WebAddress, CHARINDEX('//', @WebAddress) + 2) - (CASE WHEN CHARINDEX('//', @WebAddress) = 0 THEN 1 ELSE CHARINDEX('//', @WebAddress) + 2 END)
            WHEN CHARINDEX('?', @WebAddress, CHARINDEX('//', @WebAddress) + 2) > 0 THEN CHARINDEX('?', @WebAddress, CHARINDEX('//', @WebAddress) + 2) - (CASE WHEN CHARINDEX('//', @WebAddress) = 0 THEN 1 ELSE CHARINDEX('//', @WebAddress) + 2 END)
            ELSE LEN(@WebAddress)
        END
    ) AS 'HostName'

Answer:

You can use regexp_replace() to remove the different parts, then use trim() to get rid of any leading or trailing /:

    trim(regexp_replace(the_column, '^(https?://)?(www\.)?', ''), '/')
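
On a system without regexp_replace(), the same cleanup can be done with plain string functions. The query in the question uses SQL Server functions (CHARINDEX, LEN), so here is a minimal sketch in that dialect. It assumes the scheme is written in lower case or the collation is case-insensitive, and the sample value and the aliases (Url, NoScheme, NoWww, Domain) are made up for illustration:

```sql
-- Sample input taken from the question; replace with your own column.
DECLARE @WebAddress varchar(400) = 'https://www.pizzatest.com/';

SELECT d.Domain
FROM (SELECT @WebAddress AS Url) AS u
CROSS APPLY (
    -- Step 1: drop a leading 'https://' (8 chars) or 'http://' (7 chars).
    SELECT CASE
               WHEN u.Url LIKE 'https://%' THEN STUFF(u.Url, 1, 8, '')
               WHEN u.Url LIKE 'http://%'  THEN STUFF(u.Url, 1, 7, '')
               ELSE u.Url
           END AS NoScheme
) AS s
CROSS APPLY (
    -- Step 2: drop a leading 'www.' (4 chars) only when it is a prefix.
    SELECT CASE
               WHEN s.NoScheme LIKE 'www.%' THEN STUFF(s.NoScheme, 1, 4, '')
               ELSE s.NoScheme
           END AS NoWww
) AS w
CROSS APPLY (
    -- Step 3: keep everything before the first '/' or '?'.
    -- Appending '/' guarantees PATINDEX always finds a match.
    SELECT LEFT(w.NoWww, PATINDEX('%[/?]%', w.NoWww + '/') - 1) AS Domain
) AS d;
```

To run it against a table, replace the derived table u with the real table and column. Because each step only strips a prefix, a 'www.' that appears later in the host name is left untouched.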