I have a list of urls:
urls = ['https://www.website.com/407161a', 'https://www.website.com/359426a', 'https://www.website.com/441885a', 'https://www.website.com/331791a']
I'm trying to split each url string on / and get the last element, for the following output: ['407161a', '359426a', '441885a', '331791a'].
I use this code to get the above: [u.rpartition('/')[2] for u in urls]. The problem: it's somewhat slow on larger lists. It takes ~0.75 seconds on a list with 2 million urls on my machine. I'm trying to find a faster method since I'll be running this multiple times on lists containing 10 million elements.
Is there a way to make this code faster?
CodePudding user response:
You can use str.rsplit with the maxsplit parameter set to 1 (passing maxsplit just avoids unnecessary splits), then take the last element. You can use a list comprehension to apply this to each item.
>>> [i.rsplit('/',1)[-1] for i in urls]
['407161a', '359426a', '441885a', '331791a']
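To illustrate what maxsplit changes, here is a small sketch on one of the URLs from the question. The variable names are only for illustration:

```python
url = 'https://www.website.com/407161a'

# Without maxsplit, every '/' produces a split.
all_parts = url.rsplit('/')
print(all_parts)   # ['https:', '', 'www.website.com', '407161a']

# With maxsplit=1, only the rightmost '/' is used.
two_parts = url.rsplit('/', 1)
print(two_parts)   # ['https://www.website.com', '407161a']

# Either way the last element is the part after the final '/'.
print(two_parts[-1])   # 407161a
```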
CodePudding user response:
You can use str.rindex to find the index of the last '/', then slice out all characters after that index up to the end of the string.
Try this:
[u[u.rindex('/') + 1:] for u in urls]
# ['407161a', '359426a', '441885a', '331791a']
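Which of these is fastest may depend on your Python version and data, so it is probably worth timing them yourself. Below is a rough sketch using timeit; the function names and the repeated sample list are made up for the comparison, and no particular result is implied:

```python
import timeit

# Hypothetical test data: the four URLs from the question, repeated.
urls = [
    'https://www.website.com/407161a',
    'https://www.website.com/359426a',
    'https://www.website.com/441885a',
    'https://www.website.com/331791a',
] * 50_000


def with_rpartition(items):
    # Approach from the question.
    return [u.rpartition('/')[2] for u in items]


def with_rsplit(items):
    # Split once from the right and keep the last piece.
    return [u.rsplit('/', 1)[-1] for u in items]


def with_rindex(items):
    # Slice everything after the last '/'.
    return [u[u.rindex('/') + 1:] for u in items]


for fn in (with_rpartition, with_rsplit, with_rindex):
    # Take the best of three runs to reduce noise.
    best = min(timeit.repeat(lambda: fn(urls), number=1, repeat=3))
    print(f'{fn.__name__}: {best:.3f} s')
```

One behavioural difference to keep in mind: if a string contains no '/', rindex raises ValueError, while the rpartition and rsplit versions return the whole string.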