I need a URL validator regex that meets these criteria:
- protocol (HTTP, HTTPS) is optional. But if any protocol is given, it must be in the correct format, i.e. protocol:domain, or protocol://domain.
- www is optional
- a bare IP address is also allowed in place of a domain name.
So based on the criteria, these should pass:
- http://www.google.com
- google.com
- abc.def.ghi/hij
- https:216.239.38.120
- 216.239.38.120
These should not pass:
- hello
- hello/world
- abc://def.ghi
- ftp:google.com
The closest regex I've found is from here:
^((?:.|\n)*?)((http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)([-A-Z0-9.]+)(/[-A-Z0-9+&@#/%=~_|!:,.;]*)?(\?[A-Z0-9+&@#/%=~_|!:,.;]*)?)
But unfortunately, google.com doesn't pass. It needs to have www. as a prefix. Can you improve this regex so www. becomes optional?
CodePudding user response:
It looks like the following pattern matches your criteria:
^(?:https?:\/\/(?:www\.)?|https:(?:\/\/)?)?\w+(?:[-.]\w+)+(?:\/[^\/\s]+)*$
See the regex demo. Details:
- ^ - start of the string
- (?:https?:\/\/(?:www\.)?|https:(?:\/\/)?)? - an optional sequence of:
  - https?:\/\/(?:www\.)? - http or https, then ://, and then an optional www. substring
  - | - or
  - https:(?:\/\/)? - https: and then an optional // string
- \w+ - one or more word chars
- (?:[-.]\w+)+ - one or more sequences of a . or - followed by one or more word chars
- (?:\/[^\/\s]+)* - zero or more sequences of a / and then one or more chars other than / and whitespace
- $ - end of string.
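To check the pattern against the examples from the question, here is a minimal sketch in Python (the question does not name a language, so Python is an assumption, and `URL_RE` and `is_valid_url` are hypothetical names chosen for illustration). It uses `re.fullmatch` so that a trailing newline is not accepted, which a bare `$` would allow in Python.

```python
import re

# The pattern from the answer, kept verbatim in a raw string.
URL_RE = re.compile(
    r"^(?:https?:\/\/(?:www\.)?|https:(?:\/\/)?)?\w+(?:[-.]\w+)+(?:\/[^\/\s]+)*$"
)


def is_valid_url(text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return URL_RE.fullmatch(text) is not None


# Examples taken from the question.
should_pass = [
    "http://www.google.com",
    "google.com",
    "abc.def.ghi/hij",
    "https:216.239.38.120",
    "216.239.38.120",
]
should_fail = [
    "hello",
    "hello/world",
    "abc://def.ghi",
    "ftp:google.com",
]

for candidate in should_pass + should_fail:
    print(f"{candidate!r}: {is_valid_url(candidate)}")
```

One thing to be aware of: as written, the alternative without slashes only covers https:, so a string such as http:google.com is rejected. That case is not among the listed examples, but if the protocol:domain form should also work for http, the second alternative would need to become https?:(?:\/\/)?.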