I want to use the following code to extract all URLs from a string, but I only get three of them: http://www.google.com, https://www.twitter.com and www.msn.com.
I would like bing.com to be included in the result as well. How can I modify var expression = /(https?:\/\/(?:www\.| ... to achieve that?
function openURLs() {
  let links = "http://www.google.com Hello https://www.twitter.com The www.msn.com World bing.com";
  if (links) {
    var expression = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
    var url_array = links.match(expression);
    if (url_array != null) {
      url_array.forEach((url) => {
        // Matches without a scheme get a protocol-relative "//" prefix.
        const urlOK = url.match(/^https?:/) ? url : '//' + url;
        window.open(urlOK);
      });
    }
  }
}
CodePudding user response:
Going off of what you currently have, you can just append |[a-zA-Z0-9]+\.[^\s]{2,} to the end of your expression. The resulting line will look like this:
var expression = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})|[a-zA-Z0-9]+\.[^\s]{2,}/gi;
This could be cleaner, but it'll do what you're asking.
Edit:
If you're okay with something slightly more permissive that can pull the same URLs out, you can try this expression:
var expression = /(?:https?:\/\/)?(?:www\.)?[\w.-]+\.\S{2,}/gi;
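To check both expressions against the sample string from the question, here is a minimal sketch. It leaves out window.open so it can run outside a browser, and extractUrls is a hypothetical helper name introduced only for this illustration.

```javascript
// Sample text from the question.
const links = "http://www.google.com Hello https://www.twitter.com The www.msn.com World bing.com";

// The original expression with the extra alternative for bare domains appended.
const strict = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})|[a-zA-Z0-9]+\.[^\s]{2,}/gi;

// The shorter, more permissive expression.
const permissive = /(?:https?:\/\/)?(?:www\.)?[\w.-]+\.\S{2,}/gi;

// Hypothetical helper: returns every match, or an empty array when nothing matches.
function extractUrls(text, expression) {
  return text.match(expression) || [];
}

const strictUrls = extractUrls(links, strict);
const permissiveUrls = extractUrls(links, permissive);

console.log(strictUrls);
console.log(permissiveUrls);
```

On this input both calls should return the same four entries, with bing.com as the last one. The two expressions can still disagree on other text, since the permissive one also allows dots, hyphens and underscores before the final dot.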
CodePudding user response:
A permissive regular expression may be the following:
var expression = /(https?:\/\/)?[a-zA-Z0-9]+\.[a-zA-Z0-9]+\S*/
This expression is simpler and easier to debug. Furthermore, it will match any website, including ones with query params (example.com?param=value) or with non-ASCII characters (example.com/你好).
On the other hand, it will match things that aren't websites as soon as they contain a dot, so things like foo.bar will be matched. However, there is no reliable way to detect whether strings like foo.bar are actually websites.
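A small sketch of that trade-off follows. The sample sentence is made up for illustration, and the g flag is an addition here (the expression above has no flags) so that match() returns every hit instead of only the first.

```javascript
// The permissive expression, with the g flag added (an assumption for this demo).
const expression = /(https?:\/\/)?[a-zA-Z0-9]+\.[a-zA-Z0-9]+\S*/g;

// Made-up sample: two plausible URLs and one dotted token that is probably not a website.
const sample = "See example.com?param=value and example.com/你好 but also foo.bar";

// With the g flag, match() returns the full matches and ignores the capture group.
const matches = sample.match(expression) || [];

console.log(matches);
```

All three dotted tokens should come back, including foo.bar, so any filtering of false positives (for example against a list of known top-level domains) would have to happen after matching.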