Regex to match a domain that contains a certain subdomain

Time:06-06

I have this regex (not mine, taken from here)

^[^\.]+\.example\.org$

The regex matches *.example.org (e.g. sub.example.org) but leaves out sub-subdomains (e.g. sub.sub.example.org). That's great, and it is what I want.

But I have another requirement: I want to match subdomains that contain a specific string, in this case press. So the regex should match the following (literally any subdomain that has the word press in it).

free-press.example.org
press.example.org
press23.example.org

I'm having trouble finding the right syntax. I have looked on the internet, and most of what I found works only for standalone text, not for domains like this.

CodePudding user response:

Ok, let's break down what the "subdomain" part of your regex does:

[^\.]+ means "any character except for ., at least once".

You can break your "desired subdomain" up into three parts: "the part before press", "press itself", and "the part after press".

For the parts before and after press, the pattern is basically the same as before, except that you want to change the + (one or more) to a * (zero or more), because there might not be anything before or after press.

So your subdomain pattern will look like [^\.]*press[^\.]*.

Putting it all together, we have ^[^\.]*press[^\.]*\.example\.org$. If we put that into Regex101 we see that it works for your examples.
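
As a quick check outside Regex101, here is a minimal Python sketch that runs the same pattern against the examples from the question. The helper name matches_press and the extra non-matching inputs are illustrative assumptions, not something from the question.

```python
import re

# Pattern from the answer: one label containing "press",
# followed by the literal suffix ".example.org".
PRESS_SUBDOMAIN = re.compile(r"^[^\.]*press[^\.]*\.example\.org$")

def matches_press(domain: str) -> bool:
    """Return True if domain is a single-level subdomain of example.org containing 'press'."""
    # fullmatch avoids the Python quirk where $ also matches before a trailing newline.
    return PRESS_SUBDOMAIN.fullmatch(domain) is not None

if __name__ == "__main__":
    for domain in [
        "free-press.example.org",
        "press.example.org",
        "press23.example.org",
        "sub.example.org",          # no "press" in the subdomain
        "press.sub.example.org",    # sub-subdomain, should be rejected
    ]:
        print(domain, matches_press(domain))
```

The first three inputs should be accepted and the last two rejected, because [^\.]* can never consume a dot and therefore cannot span more than one label.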

Note that this isn't a very strict check for valid domains. It might be worth thinking about whether regexes are actually the best tool for the "subdomain checking" part of this task. You might instead want to do something like this:

  • Use a generic, more thorough, domain-validation regex to check that the domain name is valid.
  • Split the domain name into parts using String.split('.').
  • Check that the number of parts is correct (i.e. 3), and that the parts meet your requirements (i.e. the first contains the substring press, the second is example, and the third is org).
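
The steps above could be sketched roughly as follows in Python. This is only an illustration under some assumptions: the function name is_press_subdomain is hypothetical, the per-label regex is a simplified letters/digits/hyphens check standing in for a more thorough domain validator, and lowercasing the input assumes you want case-insensitive matching.

```python
import re

# Simplified label check (an assumption, not a full validator): 1 to 63
# characters, letters/digits/hyphens, no leading or trailing hyphen.
LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")

def is_press_subdomain(domain: str) -> bool:
    """Check that domain looks like <something-with-press>.example.org."""
    parts = domain.lower().split(".")

    # Step 1: every label must pass the basic validity check.
    if not all(LABEL.fullmatch(part) for part in parts):
        return False

    # Step 2: exactly three parts, so sub-subdomains are rejected.
    if len(parts) != 3:
        return False

    # Step 3: the first part contains "press", the rest is example.org.
    sub, name, tld = parts
    return "press" in sub and name == "example" and tld == "org"
```

Compared with a single regex, this keeps the "is it a valid name" question separate from the "does it meet my requirements" question, which may be easier to extend later.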

CodePudding user response:

If you're looking for a regex that matches domains whose subdomain contains the word press, then use

^[^\.]*press[^\.]*\.example\.org$

