Unable to use urllib.request to download file from website

Time:04-20

I am trying to use Python's urllib.request library to download .pdb (Protein Data Bank) files containing the full predicted molecular structure of a given protein from the AlphaFold website. In this example, I am trying to download the protein with the UniProt ID Q9BY15. The entry page:

[image]

The manually downloaded file has the following naming format:

[image]

Here is the code I am using, in its simplest form:

import os
import urllib
import urllib.request

url = 'https://alphafold.ebi.ac.uk/entry/'
prot = 'Q9BY15'
alphaname = 'AF-' + prot + '-F1-model_v2.pdb'
urllib.request.urlretrieve(url + prot, alphaname)

And here is the file that I get when I run the code:

[image]

As you can see, the file is far smaller than the real file (despite having exactly the same name), and it is effectively empty when viewed in protein identification programs. How would I rewrite this code to pull the actual file?

Answer:

I'm not sure if this will solve your problem, but the correct URL for downloading the PDB file of Q9BY15 is https://alphafold.ebi.ac.uk/files/AF-Q9BY15-F1-model_v2.pdb

Try replacing /entry/ in the link with /files/.
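A minimal sketch of that change, assuming the file name always follows the AF-<UniProt ID>-F1-model_v2.pdb pattern from the question. The helper names (alphafold_filename, alphafold_file_url, download_pdb) are made up for illustration, and the v2 suffix is taken from this thread, so it may need adjusting if the site serves a different model version.

```python
import urllib.request

# Base URL taken from the answer above: /files/ rather than /entry/.
BASE_URL = "https://alphafold.ebi.ac.uk/files/"


def alphafold_filename(uniprot_id, version=2):
    """Build the file name in the format shown in the question (hypothetical helper)."""
    return f"AF-{uniprot_id}-F1-model_v{version}.pdb"


def alphafold_file_url(uniprot_id, version=2):
    """Build the direct download URL for the .pdb file (hypothetical helper)."""
    return BASE_URL + alphafold_filename(uniprot_id, version)


def download_pdb(uniprot_id, version=2):
    """Download the .pdb file into the current directory and return its local name."""
    filename = alphafold_filename(uniprot_id, version)
    urllib.request.urlretrieve(alphafold_file_url(uniprot_id, version), filename)
    return filename
```

Calling download_pdb('Q9BY15') would then request the /files/ URL and save the result under the same name the question uses. The /entry/ URL is the web page for the protein, which would explain why the original code saved a small file that structure viewers could not read.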
