I am trying to figure out how to use Python to download files listed in a CSV file and use the CSV file to name the download. So my CSV file would look like this:
HTTP://www.example.com/filetodownload.jpg,mypicture1.jpg
HTTP://www.example.com/2ndfiletodownload.jpg,mypicture2.jpg
The script would read the CSV file row by row, download the file from the URL in the first field, and save it under the name given in the second field. It would continue until it reaches the end of the CSV file.
Does anyone have any suggestions?
EDIT: I didn't include what I have so far...sorry about that. This will download the files but does not rename them with the value after the comma.
import csv, sys
import requests
import urllib2
import os

filename = 'test.csv'
with open(filename, 'rb') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            if 'http' in row[0]:
                #print row
                rev = row[0][::-1]
                i = rev.index('/')
                tmp = rev[0:i]
                #print tmp[::-1]
                rq = urllib2.Request(row[0])
                res = urllib2.urlopen(rq)
                if not os.path.exists("./" + tmp[::-1]):
                    pdf = open("./" + tmp[::-1], 'wb')
                    pdf.write(res.read())
                    pdf.close()
                else:
                    print "file: ", tmp[::-1], "already exist"
    except csv.Error as e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
Answer:
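You can use the requests module to download the files and the csv module to read the CSV file. There is no need to rename anything afterwards (which is what you would use the os module for): open the output file under the name from the second field and write the downloaded bytes to it. Here is a very simple example: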
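import csv
import requests

filename = "file.csv"
with open(filename, newline="") as file:
    # Each row holds one or more (url, target name) pairs
    csvFile = csv.reader(file)
    for item in csvFile:
        # Number of complete (url, name) pairs in this row
        pairs = len(item) // 2
        for i in range(pairs):
            url, target = item[2 * i], item[2 * i + 1]
            try:
                # A 0.5 s timeout would skip most real downloads
                dat = requests.get(url, timeout=10)
                dat.raise_for_status()
            except requests.RequestException:
                # Skip URLs that cannot be downloaded
                continue
            # Save the body under the name given in the CSV
            with open(target, "wb") as download:
                download.write(dat.content)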
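If you would rather keep the "already exists" check from your own script, or avoid holding each file in memory, a variation along these lines might work. This is only a sketch: it swaps requests for the standard-library urllib.request so that it has no third-party dependency, and the names download_from_csv, fetch_to_file, and the dest_dir and fetch parameters are made up for illustration. It assumes one URL and one target name per row.

```python
import csv
import os
import shutil
import urllib.request


def fetch_to_file(url, target, timeout=10):
    """Stream the response body for `url` into the file `target`."""
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(target, "wb") as out:
        shutil.copyfileobj(response, out)


def download_from_csv(csv_path, dest_dir=".", fetch=fetch_to_file):
    """Download every (url, name) row of `csv_path` into `dest_dir`.

    Returns the list of file paths that were written.
    `fetch` is a hypothetical hook so the download step can be replaced.
    """
    written = []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            # Skip blank or incomplete rows
            if len(row) < 2:
                continue
            url = row[0].strip()
            # basename() stops a name in the CSV from escaping dest_dir
            name = os.path.basename(row[1].strip())
            if not url or not name:
                continue
            target = os.path.join(dest_dir, name)
            if os.path.exists(target):
                print("file:", target, "already exists")
                continue
            try:
                fetch(url, target)
            except OSError as e:
                # URLError, HTTPError and socket timeouts are all OSError
                print("could not download", url, "-", e)
                continue
            written.append(target)
    return written
```

Calling download_from_csv("test.csv") would then save each file in the current directory under the name from the second field. Note that a download that fails part-way may leave a partial file behind, which the next run would treat as already existing.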