I'm trying to use Python to download XML files from this site:
But both of the following examples leave me with just an empty XML file. The first avoids the "InsecureRequestWarning" message, but the outcome of both is the same.
import shutil
import requests

# Example 1: verify against a locally saved certificate
r = requests.get('https://media.waec.wa.gov.au/2022 North West Central By-Election - LA VERBOSE RESULTS.xml', verify='~ file path for locally saved site certificate PEM file ~')
r.raw.decode_content = True
with open('~ file path for saved file ~', 'wb') as f:
    shutil.copyfileobj(r.raw, f)

# Example 2: skip certificate verification
r = requests.get('https://media.waec.wa.gov.au/2022 North West Central By-Election - LA VERBOSE RESULTS.xml', verify=False)
r.raw.decode_content = True
with open('~ file path for saved file ~', 'wb') as f:
    shutil.copyfileobj(r.raw, f)
CodePudding user response:
You get an empty file because the request did not succeed. When I tried your snippet I received an HTTP 403 status code. This happened because the site didn't accept the request without an explicit User-Agent header.
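One way to spot this kind of failure early is to check the status code before writing anything to disk. The sketch below is only an illustration: `save_response` is a hypothetical helper name, and it assumes it is handed an object that looks like a `requests.Response` (it only uses `.status_code` and `.content`).

```python
from pathlib import Path


def save_response(response, path):
    """Write the response body to `path` only if the request succeeded.

    `response` is assumed to behave like a requests.Response.
    Returns True if a file was written, False otherwise.
    """
    if response.status_code != 200:
        # For example 403 Forbidden: the body is not the XML we asked for
        print(f"Request failed with HTTP {response.status_code}; nothing saved")
        return False
    # Write raw bytes so the XML's declared encoding is left untouched
    Path(path).write_bytes(response.content)
    return True
```

With `requests` this could be called as `save_response(requests.get(url, headers=headers), 'my_file.xml')`, where `url` and `headers` are whatever you are already passing.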
The code below let me save the result to an XML file.
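import requests

# Send an explicit User-Agent so the server does not reject the request
headers = {'User-Agent': 'Python User Agent'}
url = 'http://media.waec.wa.gov.au/2022 North West Central By-Election - LA VERBOSE RESULTS.xml'
res = requests.get(url, headers=headers)
res.raise_for_status()  # stop here on an error status such as 403
# Write the raw bytes so the XML's own encoding is preserved
with open('my_file.xml', 'wb') as file:
    file.write(res.content)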
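If you would rather keep a streaming approach like the one in the question, note that as far as I know `r.raw` is only meant to be read when the request is made with `stream=True`; otherwise `requests` has already consumed the body. A rough sketch of a streamed save, using `iter_content` instead of `r.raw`, could look like this. `stream_to_file` is a made-up helper name, and it assumes the object passed in behaves like a `requests.Response` opened with `stream=True`.

```python
def stream_to_file(response, path, chunk_size=8192):
    """Write a streamed response body to `path` chunk by chunk.

    `response` is assumed to provide iter_content(chunk_size=...),
    as requests.Response does. Returns the number of bytes written.
    """
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                written += f.write(chunk)
    return written


# Intended use (not executed here; needs network access and `import requests`):
#   with requests.get(url, headers=headers, stream=True) as r:
#       r.raise_for_status()
#       stream_to_file(r, 'my_file.xml')
```

For a single XML file the non-streaming version is probably enough; streaming mainly helps when the files are large.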