How to convert a capturing group to a non-capturing group in this regex?

Time:12-19

I am trying to replace all URLs in a text with hyperlinks using a regular expression. The URLs must start with either http:// or https://, and they must contain a TLD, e.g. .com, .org, or .co.uk.

Below is my regex pattern in PHP:

$pattern = "/(http)+(s)?:\/\/(\S)+(\.){1}/i";

So if you use the following code:

$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

In the output (originally shown as a screenshot), the TLD part is not included in the hyperlink. So how can I convert the capturing group (\.){1} to a non-capturing group so that the match also covers the TLD?

CodePudding user response:

Use the following pattern:

/https?:\/\/[a-z]+\.[a-z]+[.a-z]*/i

  1. Keep the 's' in 'https' optional using ?
  2. Use [a-z]+ to match the first run of letters after '://'
  3. Require at least one '.' followed by one or more letters: \.[a-z]+
  4. The rest of the host name is optional: [.a-z]* matches zero or more further letters or dots
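
The steps above can be tried against the sample string from the question. This is a minimal sketch, assuming the quantifiers in the pattern are `+` where shown; the variable name `$linked` is only illustrative. Because the replacement uses `$0` (the whole match), no capturing groups are needed at all.

```php
<?php
// Pattern from this answer: scheme, letters, a dot, letters, then optional letters/dots.
$pattern = '/https?:\/\/[a-z]+\.[a-z]+[.a-z]*/i';

$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// $0 is the full match, so each URL is wrapped as a whole.
$linked = preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

// http://dd has no dot after the letters, so it is left untouched.
echo $linked, PHP_EOL;
```

Note that this pattern only accepts letters and dots after the scheme, so host names with digits or hyphens, and any path or query string, would be cut off or skipped.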


CodePudding user response:

You can try this:

$pattern = "/https?:\/\/(\S)+(\.\w{1,4})+/i";
echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);

In the pattern:

https? : http or https

(\.\w{1,4})+ : one or more TLD parts such as .co or .com, or a multi-part TLD such as .co.uk. Each part is limited to 4 characters here, but you can change that.
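
On the literal question of capturing versus non-capturing groups: in PCRE a group becomes non-capturing by writing `(?:...)` instead of `(...)`. That only changes what is stored in the numbered groups, not what the pattern matches, so the TLD has to be covered by the pattern itself. The sketch below is one possible variant of the pattern above with the groups made non-capturing; it is an illustration under that assumption, not the original poster's final code.

```php
<?php
// (?:...) groups without capturing; \S+ needs no group at all.
$pattern = '/https?:\/\/\S+(?:\.\w{1,4})+/i';

$str = "this http://dd is a String http://www.example.com and this a String https://anotherexample.co.uk";

// With no capturing groups, $matches contains only index 0 (the full matches).
preg_match_all($pattern, $str, $matches);
print_r($matches);

// The replacement still works because $0 always refers to the whole match.
echo preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str), PHP_EOL;
```

Since `$0` is available regardless of grouping, the choice between `(...)` and `(?:...)` here matters only if the numbered groups are used elsewhere.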
