How to match `.[domain]` for this regex that I have, rather than the first `.`

Given this URL regex:

((https?:\/\/?(www\.))?|(www\.))[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&//=]*)

How can I make it trigger for the following:

www.google.com
https://www.google.com
google.com

but also make it trigger only after the .[domain] part?

The current regex matches google.c, but it starts to match at www.g right off the bat.

Edit: new regex:

(https:\/\/(www\.)|(www\.))[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&//=]*)

The regex above works as expected for https://www.google.com and www.google.com, but does not yet work for google.com.
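
To make the difference between the two patterns concrete, here is a small Python sketch that runs both against partially typed input. It assumes the blank inside each character class was originally a `+` (the post appears to have lost that character), and the sample strings and the `triggers` helper are made up for illustration.

```python
import re

# Both patterns are taken from the question. Assumption: the blank inside the
# character classes was originally a "+", so it is written as "+" here.
ORIGINAL = re.compile(
    r"((https?:\/\/?(www\.))?|(www\.))[-a-zA-Z0-9@:%._+~#=]{1,256}"
    r"\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&//=]*)"
)
EDITED = re.compile(
    r"(https:\/\/(www\.)|(www\.))[-a-zA-Z0-9@:%._+~#=]{1,256}"
    r"\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&//=]*)"
)


def triggers(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


# Hypothetical inputs that simulate a URL being typed character by character.
samples = ["www.", "www.g", "www.google", "www.google.c", "google.c", "google.com"]
for s in samples:
    print(s, triggers(ORIGINAL, s), triggers(EDITED, s))
```

On these inputs the original pattern already matches at www.g, because its prefix group can match the empty string and `www` is then read as the domain name. The edited pattern first matches at www.google.c, but never matches google.c or google.com, since the prefix is now mandatory.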

CodePudding user response：

You just need to remove \b and replace * with +. Also, if you don't want to match www.google, anchor the pattern and add a negative lookahead:

^ - Matches the beginning of the string.
$ - Matches the end of the string.
(?!www) - Negative lookahead that fails the match if www comes next at that position.

^((https?:\/\/?(www\.))?|(www\.))(?!www)[-a-zA-Z0-9@:%_+~#=]{1,256}\.[a-zA-Z0-9()]{0,6}([-a-zA-Z0-9()@:%_+.~#?&//=]+)$

See result: https://regexr.com/6pmqb
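
As a quick check outside regexr, the anchored pattern can be tried in Python. This is only a sketch: it assumes the blanks in the posted pattern stand for `+` (both inside the character classes and as the final quantifier), and `is_url` is a hypothetical helper name.

```python
import re

# Pattern from this answer. Assumption: each blank in the posted pattern was
# originally a "+", so the final group reads "[...]+" rather than "[...] ".
ANCHORED = re.compile(
    r"^((https?:\/\/?(www\.))?|(www\.))(?!www)[-a-zA-Z0-9@:%_+~#=]{1,256}"
    r"\.[a-zA-Z0-9()]{0,6}([-a-zA-Z0-9()@:%_+.~#?&//=]+)$"
)


def is_url(text):
    """Return True if the whole string is accepted by the anchored pattern."""
    return ANCHORED.search(text) is not None


for s in ["www.google.com", "https://www.google.com", "google.com",
          "www.g", "www.google"]:
    print(s, is_url(s))
```

The three full URLs are accepted, while www.g and www.google are rejected: the lookahead stops `www` from being read as the domain name, and once `www.` has been consumed as the prefix there is no further dot to match.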

CodePudding user response：

You can use a positive lookahead to assert that what follows is a domain:

((https?:\/\/?(www\.))?|(www\.))[-a-zA-Z0-9@:%._+~#=]{1,256}\.(?=[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&\/=]*))[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&\/=]*)
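
A minimal Python sketch for trying this lookahead version on a few inputs. As above, it assumes the blanks in the character classes were originally `+`; the `triggers` helper and the sample strings are illustrative only.

```python
import re

# Pattern from this answer, split across lines for readability.
# Assumption: the blank inside each character class was originally a "+".
LOOKAHEAD = re.compile(
    r"((https?:\/\/?(www\.))?|(www\.))[-a-zA-Z0-9@:%._+~#=]{1,256}\."
    r"(?=[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&\/=]*))"
    r"[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&\/=]*)"
)


def triggers(text):
    """Return True if the lookahead pattern matches anywhere in text."""
    return LOOKAHEAD.search(text) is not None


for s in ["www.google.com", "https://www.google.com", "google.com",
          "google", "www.", "www.g"]:
    print(s, triggers(s))
```

The three full URLs match, and google and www. do not. Note that www.g still matches here, because a single `g` satisfies both the lookahead and the domain part that follows it, so it is worth testing this pattern against partially typed input before relying on it.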