I have a big problem with this regex.
I have a string, which can contain multiple and differents links anywhere inside. I need to take those links and make a list of them, then i elaborate them with an url shortener. Then have to replace them sequentially in the string with the new link i have. For the first part i've done this:
links = []
links_in_message = re.findall(r'(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-] [a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-] [a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9] \.[^\s]{2,}|www\.[a-zA-Z0-9] \.[^\s]{2,})', message.text)
if links_in_message:
links.extend(links_in_message)
And for example this string:
string = 'Hello www.fb.com/home how are you https://twitter.it/home ?'
should become (where the link are not a substitution of the domain with rere.me, but every link is taken sequentially from my links list):
//Result = 'Hello www.rere.me/home how are you https://rere.me/home ?'
I'm thinking about deleting the links from the string and help me saving link index in string to compose a new string but i was wondering if there was another way. Thank you.
CodePudding user response:
You could use string.replace
since you have the exact text (links) that you want to replace.
For example:
for link in links:
string = string.replace(link, shorten_link(link), 1)