How to extract URLs from a string that doesn't contain https or www

Time:06-23

Consider this string:

let a = "I visit google.com often times but.. not amazon.uk";

How can I extract google.com and amazon.uk from the string above in JavaScript?

CodePudding user response:

Try this:

let a = "I visit google.com often times but.. not amazon.uk";
a.match(/("[^"]+"|[^"\s]+)/g);

Output:

[
    "I",
    "visit",
    "google.com",
    "often",
    "times",
    "but..",
    "not",
    "amazon.uk"
]
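Note that the output above contains every token, not only the two domains. One possible follow-up step is to filter the tokens by shape. The sketch below assumes that a "URL" here means dot-separated labels ending in an alphabetic suffix of two or more letters; the `domainLike` pattern is an illustrative assumption, not part of the original answer.

```javascript
const a = "I visit google.com often times but.. not amazon.uk";

// Split into whitespace-delimited tokens (quoted phrases stay together).
const tokens = a.match(/("[^"]+"|[^"\s]+)/g) || [];

// Assumption: a domain is one or more labels, each followed by a dot,
// then an alphabetic suffix of at least two letters.
const domainLike = /^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$/i;

// Keep only the tokens that look like a domain.
const domains = tokens.filter((t) => domainLike.test(t));

console.log(domains); // [ 'google.com', 'amazon.uk' ]
```

A token such as "but.." is rejected because nothing alphabetic follows its dot. A token with trailing punctuation, such as "google.com,", would also be rejected unless the punctuation is stripped first.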

CodePudding user response:

Here is one way to do it:

\s(\w+)(\.uk|\.com)\b

https://regex101.com/r/HFyxEJ/1

Result: [('google', '.com'), ('amazon', '.uk')]
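Since the question asks for JavaScript, the same idea could be applied with `String.prototype.matchAll`. This is a minimal sketch, not the answerer's code: `extractDomains` is a hypothetical helper, the suffix list is an assumption, and the leading `\s` is swapped for `\b` so that a domain at the very start of the string is also found.

```javascript
// Hypothetical helper: returns bare domains ending in one of the given suffixes.
function extractDomains(text, suffixes = ["com", "uk"]) {
  // Assumption: suffixes are plain letters, so they need no regex escaping.
  const pattern = new RegExp(`\\b(\\w+)\\.(${suffixes.join("|")})\\b`, "g");
  // matchAll requires the "g" flag; each match carries the two capture groups.
  return Array.from(text.matchAll(pattern), (m) => `${m[1]}.${m[2]}`);
}

const a = "I visit google.com often times but.. not amazon.uk";
console.log(extractDomains(a)); // [ 'google.com', 'amazon.uk' ]
```

This approach only finds domains whose suffix is in the list, so the list has to be extended for every extension you want to support.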

CodePudding user response:

To solve this problem, I've created an API that extracts URLs from a string or an array of strings.

Base URL: https://urlsparser.herokuapp.com/

GET https://urlsparser.herokuapp.com/url

Request body for a single string:

{
  "string" : "More here http://action.mySite.com/trk.php?mclic=P4CAB9542D7F151&urlrv=http://jeu-centerparcs.com/#!/?idfrom=8&urlv=517b975385e89dfb8b9689e6c2b4b93d text<br/>And more here http://action.mySite.com/trk.php?mclic=P4CAB9542D7F151&urlrv=http://jeu-centerparcs.com/#!/?idfrom=8&urlv=517b975385e89dfb8b9689e6c2b4b93d"
}

Request body for an array of strings:

{
  "string" : ["string1","string2"....]
}

Advantages:

  1. Recognizes more than 900 domain extensions [.com,.io,....]
  2. Fast: extracts results in less than 20 ms