I am trying to extract incomplete URLs from strings. Here are some examples of what I mean by an incomplete URL:
tny.sh/FJFCG8w
gka.co/cte3
google.com
cdn.ne/ecoe3
I have checked a bunch of solutions that use a regex to detect a prefix such as http://, but the links above have no prefix. Is it possible to extract them anyway?
This is the method I have tried for extracting the URLs from a string:
protected LinkedList<string> ExtractLink(string txt)
{
    // Only matches links that start with http://, https://, or www.
    var linkParser = new Regex(@"\b(?:https?://|www\.)\S+\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);
    LinkedList<string> urls = new LinkedList<string>();
    foreach (Match m in linkParser.Matches(txt))
        urls.AddFirst(m.Value);
    return urls;
}
And this is an example of calling the method:
ExtractLink("Hello, this is the link that you need to check tny.sh/FJFCG8w");
CodePudding user response:
You can use this regex instead:
[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}([-a-zA-Z0-9@:%_\+.~#?&\/=])*
If you also want to match URLs that include the http(s) protocol, use this:
(https?:\/\/)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}([-a-zA-Z0-9@:%_\+.~#?&\/=])*
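To show how the second pattern could be plugged into a method like the one in the question, here is a minimal sketch. The class name LinkExtractor and the method name ExtractLinks are made up for illustration, and the sketch returns a List<string> in order of appearance rather than the LinkedList with AddFirst used in the question. Treat it as a starting point, not a complete URL validator: the pattern will also match things like file names (report.txt), email addresses, or a trailing period after a link.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative, not from the question.
public static class LinkExtractor
{
    // Pattern from the answer above: optional http(s):// prefix,
    // a host part, a dot, a 2-6 letter TLD, then an optional path/query tail.
    private static readonly Regex LinkParser = new Regex(
        @"(https?:\/\/)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}([-a-zA-Z0-9@:%_\+.~#?&\/=])*",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Returns every match in the order it appears in the input text.
    public static List<string> ExtractLinks(string txt)
    {
        var urls = new List<string>();
        foreach (Match m in LinkParser.Matches(txt))
            urls.Add(m.Value);
        return urls;
    }
}
```

Calling LinkExtractor.ExtractLinks("Hello, this is the link that you need to check tny.sh/FJFCG8w") should give a list with the single entry tny.sh/FJFCG8w. If false positives are a problem for your input, you would probably need to filter the results afterwards, for example against a list of known TLDs.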