Home > Mobile >  Python/Regex: Find all the links in a string, and modify them
Python/Regex: Find all the links in a string, and modify them

Time:06-21

I want to use the python re regex library to find all the links inside a string, and modify them. Ideally, I want to get those links, encode them, and place them inside my own URL. Simple example:

"Hello StackOverflow, http://localhost:2000/login.
Please help me with this one https://www.facebook.com/"

to

"Hello StackOverflow, test/http://localhost:2000/login.
Please help me with this one test/https://www.facebook.com/"

CodePudding user response:

I would use re.sub here:

inp = """Hello StackOverflow, http://localhost:2000/login.
Please help me with this one https://www.facebook.com/"""
output = re.sub(r'(https?://)', r'test/\1', inp)
print(output)

This prints:

Hello StackOverflow, test/http://localhost:2000/login.
Please help me with this one test/https://www.facebook.com/

CodePudding user response:

Try this:

re.findall(r'(https?://[^\s] )', myString)

Then you can modify them and create a new string with the new URLs

CodePudding user response:

Here is my answer

import re
  
def Find(string):
  
    # findall() has been used 
    # with valid conditions for urls in string
    regex = r"(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-] [.][a-z]{2,4}/)(?:[^\s()<>] |\(([^\s()<>] |(\([^\s()<>] \)))*\)) (?:\(([^\s()<>] |(\([^\s()<>] \)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))"
    url = re.findall(regex,string)      
    return [x[0] for x in url]
      

string = "Hello StackOverflow, http://localhost:2000/login. Please help me with this one https://www.facebook.com/"
urls = Find(string)

for url in urls:
    string = string.replace(url, 'test/' url)
print(string)

Here is the output

Hello StackOverflow, test/http://localhost:2000/login. Please help me with this one test/https://www.facebook.com/
  • Related