I am trying to find a regex that matches URLs that may or may not include 'www', followed by valid strings that can include dots, but not two or more consecutive dots. For the sake of simplicity, I am limiting the problem to URLs with subdomains and with the .com domain. For example:
www.aBC.com #MATCH
abc.com #MATCH
a_bc.de8f.com #MATCH
a.com #MATCH
abc #NO MATCH
abc..com #NO MATCH
The closest I got with my regex is \w+.[\w]+.com
but this does not match a simple "a.com". I am using "\w" instead of "." because otherwise I don't know how to avoid two or more dots in sequence.
Any help is appreciated.
CodePudding user response:
Use
(?:\w+\.)*\w+\.com
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \w+                    word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
    \.                     '.'
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \.                       '.'
--------------------------------------------------------------------------------
  com                      'com'
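
To check the pattern against the examples from the question, here is a minimal Python sketch. The names URL_PATTERN and is_match are made up for illustration, and using re.fullmatch (so the whole string has to match, not just a part of it) is an assumption about how the regex is meant to be applied; the question does not say which language or matching mode is used.

```python
import re

# Pattern from the answer: zero or more "label." groups,
# then one more label, then a literal ".com".
# Note: in Python 3, \w also matches non-ASCII word characters
# unless the re.ASCII flag is passed.
URL_PATTERN = re.compile(r"(?:\w+\.)*\w+\.com")


def is_match(candidate: str) -> bool:
    """Return True if the whole string matches the pattern (hypothetical helper)."""
    return URL_PATTERN.fullmatch(candidate) is not None


if __name__ == "__main__":
    samples = ["www.aBC.com", "abc.com", "a_bc.de8f.com", "a.com", "abc", "abc..com"]
    for sample in samples:
        print(f"{sample!r}: {'MATCH' if is_match(sample) else 'NO MATCH'}")
```

Because every dot must be preceded by at least one word character (\w+), two dots in a row can never match, which is what rules out "abc..com".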