Find all links with specific domain in the text with regex

Time:02-15

I have a text with links inside, and I am trying to match them with a regex, but I am missing the last step.

Link to regex - https://regex101.com/r/pXzZvA/1

The text:

Some text with many letters and some kind of bla bla text
With links - -https://sub.mydomain.com/products/art-for-selling-1   - another word

-https://sub.mydomain.com/products/art-for-selling-1 
https://sub.mydomain.com/products/art-for-selling-1 

paf paf

pew pew 

sub.mydomain.com/products/art-for-selling-1

Here is the regex I use:

/(?:https?:\/\/)?(?:[^\.]+\.)?sub.mydomain.com(\/.*)$/gm

What I am missing: among the matches I get "https://sub.mydomain.com/products/art-for-selling-1 - another word", with the trailing " - another word" included. I need the end of the regex to exclude whitespace, so that the match stops where the link ends.

CodePudding user response:

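Use: (?:https?:\/\/)?(?:\w+\.)?sub\.mydomain\.com\/(?:\w+-?\/?)+

(?:https?:\/\/)? : optionally matches the scheme, either http:// or https://

(?:\w+\.)? : optionally matches one word followed by a dot

sub\.mydomain\.com\/ : requires the literal sub.mydomain.com/ (the dots are escaped so they match only a dot)

(?:\w+-?\/?)+ : matches one or more path pieces of the form abc/abc/abc/..., each word optionally followed by a - and/or a /. Since it never matches whitespace, the match stops at the end of the link and " - another word" is left out.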
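To check the pattern outside regex101, here is a minimal Python sketch run against the sample text from the question. The names TEXT, LINK_RE and ALT_RE are only illustrative. ALT_RE is not from the answer: it is an assumed alternative that keeps the shape of the question's regex and replaces the trailing .*$ with \S* so the match stops at the first whitespace.

```python
import re

# Sample text from the question.
TEXT = """Some text with many letters and some kind of bla bla text
With links - -https://sub.mydomain.com/products/art-for-selling-1   - another word

-https://sub.mydomain.com/products/art-for-selling-1 
https://sub.mydomain.com/products/art-for-selling-1 

paf paf

pew pew 

sub.mydomain.com/products/art-for-selling-1
"""

# Pattern from the answer; "/" needs no escaping in a Python raw string.
LINK_RE = re.compile(r"(?:https?://)?(?:\w+\.)?sub\.mydomain\.com/(?:\w+-?/?)+")

# Assumed alternative: take every non-whitespace character after the domain.
ALT_RE = re.compile(r"(?:https?://)?(?:[^.\s]+\.)?sub\.mydomain\.com(?:/\S*)")

# Neither pattern has a capturing group, so findall returns the full matches.
links = LINK_RE.findall(TEXT)
alt_links = ALT_RE.findall(TEXT)

for link in links:
    print(link)
```

On this sample both patterns find the same four links, three with https:// and one without, and none of them includes " - another word". The \S* variant is more permissive: it also accepts characters such as ? or = in the path, and it would also keep trailing punctuation attached to a link.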
