How to extract only domain country from URL (Data Studio / regex)

Time:05-16

I'm struggling to extract only the country code from a URL, for example .pl from https://www.google.pl.

At the moment I can extract google.pl from the provided URL using the following code:

TRIM(REGEXP_EXTRACT(REGEXP_REPLACE(REGEXP_REPLACE(URL, "https?://", ""), R"^(w{3}\.)?", ""), "([^/?]+)"))

What needs to change in this code so that it returns only .pl instead of google.pl?

Thanks in advance.

CodePudding user response:

You can use

REGEXP_EXTRACT(URL, r'https?://[^/]*\.([^/]+)')

Details:

  • https?:// - https:// or http://
  • [^/]* - zero or more chars other than /
  • \. - a . char
  • ([^/]+) - Group 1: one or more chars other than /.
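
To check the pattern outside Data Studio, here is a minimal Python sketch. It assumes Python's re module behaves the same way as Data Studio's regex engine for this particular pattern; extract_tld is a hypothetical helper name used only for illustration.

```python
import re

# Same pattern as in the answer. Python's re module is only a stand-in
# for Data Studio's REGEXP_EXTRACT here (an assumption for illustration).
PATTERN = re.compile(r"https?://[^/]*\.([^/]+)")


def extract_tld(url):
    """Return the text after the last dot of the host, or None if no match."""
    match = PATTERN.search(url)
    return match.group(1) if match else None


print(extract_tld("https://www.google.pl"))      # pl
print(extract_tld("http://example.co.uk/path"))  # uk
```

Note that Group 1 holds pl without the leading dot, because \. sits outside the capturing group. If the dot is needed in the result, one option is to move it inside the group: https?://[^/]*(\.[^/]+).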