I am trying to extract certain characters from a variable using Python indexing and slicing. I have the following variable:
myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'
I am trying to extract
29112013 FDD_Exec Summary 29 Nov 2013
I have tried indexing such as
grab = myvar[:10]
but the result is not 29112013 FDD_Exec Summary 29 Nov 2013.
Any thoughts?
CodePudding user response:
I might suggest using str.rpartition so that you don't need to separately go through the work of figuring out which indices to slice on:
>>> myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'
>>> myvar.rpartition("/")[2]
'29112013 FDD_Exec Summary 29 Nov 2013.pdf'
>>> myvar.rpartition("/")[2].rpartition(".")[0]
'29112013 FDD_Exec Summary 29 Nov 2013'
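One caveat with the second call: if the last segment contains no ".", rpartition(".") puts the whole string in the final slot and [0] comes back empty. A minimal sketch of a guard for that case; the helper name base_name and the second URL are made up for illustration:

```python
def base_name(url):
    """Return the last "/"-separated segment of url without its extension."""
    filename = url.rpartition("/")[2]
    stem, dot, _ext = filename.rpartition(".")
    # With no "." present, rpartition returns ("", "", filename),
    # so fall back to the whole filename.
    return stem if dot else filename


myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'
print(base_name(myvar))
print(base_name('https://example.com/container/README'))  # hypothetical URL
```

Checking the separator slot (dot) rather than the stem avoids guessing whether an empty result means "no extension" or "nothing before the dot".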
CodePudding user response:
I recommend using the built-in pathlib for anything having to do with file paths. It seems to work fine for URLs too:
import pathlib
filename = pathlib.Path(myvar).name
Output:
'29112013 FDD_Exec Summary 29 Nov 2013.pdf'
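Since pathlib.Path is designed for filesystem paths and its flavour depends on the operating system, a somewhat more defensive variant is to pull out the URL's path component first and pass only that to PurePosixPath. This is a sketch, assuming the URL needs no percent-decoding; it uses .stem rather than .name so the extension is dropped as well:

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'

# Keep only the path part of the URL (no scheme, host, or query string).
url_path = urlsplit(myvar).path

# PurePosixPath always treats "/" as the separator, whatever the OS.
stem = PurePosixPath(url_path).stem
print(stem)
```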
CodePudding user response:
If you want the last 10 characters, you could use:
myvar[-10:]
That is, start from the 10th-to-last character and go to the end.
For a more general solution, look at your string's structure and split it (e.g. by spaces), then take the part you need.
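If you want to stay with plain slicing, the indices can be computed instead of hard-coded. A rough sketch, assuming the string contains at least one "/" and that the last "." is the one before the extension:

```python
myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'

start = myvar.rfind('/') + 1  # index just past the last "/"
end = myvar.rfind('.')        # index of the last "."

grab = myvar[start:end]
print(grab)
```

Note that rfind returns -1 when nothing is found, so a string without a "." would silently lose its last character here.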
CodePudding user response:
myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'
# Split the string by '/' and you get a list of
# ['https:', '', 'mystorageacct.blob.core.windows.net', 'testcontainer','29112013 FDD_Exec Summary 29 Nov 2013.pdf']
# [-1] index is to pick the last one
# .replace('.pdf','') is to remove the '.pdf'
extract = myvar.split('/')[-1].replace('.pdf','')
print(extract)
Output:
29112013 FDD_Exec Summary 29 Nov 2013
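One thing to keep in mind: str.replace removes every occurrence of '.pdf', not only the one at the end. If only the trailing extension should go, str.removesuffix (available from Python 3.9) is one option. A small sketch; the second URL is a made-up example to show the difference:

```python
myvar = 'https://mystorageacct.blob.core.windows.net/testcontainer/29112013 FDD_Exec Summary 29 Nov 2013.pdf'

# Same split as above, but only a trailing '.pdf' is stripped.
extract = myvar.split('/')[-1].removesuffix('.pdf')
print(extract)

# Hypothetical name that contains '.pdf' twice.
tricky = 'https://example.com/container/report.pdf.backup.pdf'
with_replace = tricky.split('/')[-1].replace('.pdf', '')
with_removesuffix = tricky.split('/')[-1].removesuffix('.pdf')
print(with_replace)       # report.backup
print(with_removesuffix)  # report.pdf.backup
```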