Remove part of a string within a loop in Python

Keep in mind this is within a loop.

How can I remove everything after the "?"?

So that "something_else_1" gets deleted:

Url_before = "https:www.something.com?something_else_1"


Url_wanted = "https:www.something.com?"

In practice it looks kinda like this:

find_href = driver.find_elements(By.CSS_SELECTOR, 'img.MosaicAsset-module__thumb___yvFP5')

with open("URLS/text_urls.txt", "a+") as textFile:
    for my_href in find_href:
        textFile.write(str(my_href.get_attribute("src")) + "#do_something_to_remove_part_after_?_in_find_href" + "\n")

CodePudding user response：

Provided there's only one instance of "?" in the string and you want to remove everything after it, you could find the index of this character with

i = Url_before.index("?")

and then remove everything after it:

Url_wanted = Url_before[:i + 1]
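
One thing to watch with this approach: str.index raises ValueError when the string has no "?" at all, which would stop the loop on the first such URL. A minimal sketch of a safer variant, assuming URLs without a "?" should be left unchanged (the helper name keep_through_question_mark is made up for illustration):

```python
def keep_through_question_mark(url: str) -> str:
    """Return url cut just after the first "?", or unchanged if it has none."""
    i = url.find("?")  # find() returns -1 instead of raising when "?" is absent
    return url if i == -1 else url[: i + 1]


print(keep_through_question_mark("https:www.something.com?something_else_1"))
print(keep_through_question_mark("https:www.something.com"))
```

If there is more than one "?", this keeps everything up to and including the first one.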

CodePudding user response：

Use re:

import re
Url_before = "https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?k=20&m=503337620&s=612x612&w=0&h=3G6G_9rzGuNYLOm9EG4yiZkGWNWS7yadVoAen2N80IQ="
re.sub('\\?.+', '', Url_before) + "?"
'https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?'

Alternatively you could split the string on ? and keep the first part:

Url_before.split("?")[0] + "?" # again adding the question mark
'https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?'

EDIT: Added "?" because I realised you wanted to keep it.
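
To tie this back to the loop in the question, here is a rough sketch of the split approach wrapped in a small helper. The list srcs is a hypothetical stand-in for the values that my_href.get_attribute("src") would return, and trim_query is a name invented for this example:

```python
def trim_query(url: str) -> str:
    # split("?", 1) splits at most once; [0] is everything before the first "?"
    return url.split("?", 1)[0] + "?"


# Hypothetical stand-ins for the src attributes collected by Selenium
srcs = [
    "https:www.something.com?something_else_1",
    "https:www.something.com/other?something_else_2",
]

trimmed = [trim_query(src) for src in srcs]
for line in trimmed:
    print(line)
```

Inside the original loop, the write call would then become textFile.write(trim_query(str(my_href.get_attribute("src"))) + "\n"). Note that this helper always appends a "?", even when the URL did not contain one.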