Python Download Using CSV file

Time:08-26

I am trying to figure out how to use Python to download files listed in a CSV file and use the CSV file to name the download. So my CSV file would look like this:

HTTP://www.example.com/filetodownload.jpg,mypicture1.jpg
HTTP://www.example.com/2ndfiletodownload.jpg,mypicture2.jpg

The script would read the CSV file, download each file from the URL in the first field, and save it under the name given in the second field. It would repeat this for every row until the end of the CSV file.

Does anyone have any suggestions?

EDIT: I didn't include what I have so far...sorry about that. This will download the files but does not rename them with the value after the comma.

import csv, sys
import requests
import urllib2
import os

filename = 'test.csv'
with open(filename, 'rb') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            if 'http' in row[0]:
                #print row
                rev  = row[0][::-1]
                i  = rev.index('/')
                tmp = rev[0:i]
                #print tmp[::-1]
                rq = urllib2.Request(row[0])
                res = urllib2.urlopen(rq)
                if not os.path.exists("./" + tmp[::-1]):
                    pdf = open("./" + tmp[::-1], 'wb')
                    pdf.write(res.read())
                    pdf.close()
                else:
                    print "file: ", tmp[::-1], "already exist"
    except csv.Error as e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))

CodePudding user response:

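You can use the requests module to download the files and the csv module to read the CSV file. No separate rename step is needed: writing each download to a file opened under the name from the second field names it directly. Here is a very simple example: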

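import csv
import requests

filename = "file.csv"

with open(filename, 'r', newline='') as file:
    # Each row is expected to hold one or more (url, name) pairs
    csvFile = csv.reader(file)

    for item in csvFile:
        # number of complete (url, name) pairs in this row
        pairs = len(item) // 2
        for i in range(pairs):
            url = item[2 * i]
            name = item[2 * i + 1]
            try:
                # 10 seconds is an arbitrary timeout; adjust as needed
                dat = requests.get(url, timeout=10)
                dat.raise_for_status()
            except requests.RequestException as e:
                # skip files that fail to download, but report them
                print("skipping", url, e)
                continue
            # save the content under the name from the CSV
            with open(name, "wb") as download:
                download.write(dat.content)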
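
If you also want the "already exists" check from the question and some protection against malformed rows, something along these lines might work. This is only a sketch under a few assumptions: the names read_pairs, fetch_url and download_all are made up for illustration, each row is assumed to hold exactly one URL followed by one file name, and it uses urllib.request from the standard library instead of requests so that it has no third-party dependency. The fetch parameter is there only so the download step can be swapped out (for testing, or for requests).

```python
import csv
import io
import os
import urllib.request


def read_pairs(csv_text):
    """Yield (url, name) tuples from CSV text, skipping malformed rows."""
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue
        url = row[0].strip()
        # basename() drops any directory part, so "../x.jpg" becomes "x.jpg"
        name = os.path.basename(row[1].strip())
        if url.lower().startswith("http") and name:
            yield url, name


def fetch_url(url):
    """Default fetcher (hypothetical helper): return the body of url as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def download_all(csv_text, dest_dir=".", fetch=fetch_url):
    """Save each URL under the name from the second field; return saved paths."""
    saved = []
    for url, name in read_pairs(csv_text):
        path = os.path.join(dest_dir, name)
        if os.path.exists(path):
            print("file:", name, "already exists")
            continue
        try:
            data = fetch(url)
        except OSError as e:  # urllib.error.URLError is a subclass of OSError
            print("skipping", url, e)
            continue
        with open(path, "wb") as out:
            out.write(data)
        saved.append(path)
    return saved
```

To run it against the file from the question, read test.csv as text (for example with open('test.csv', newline='') and f.read()) and pass the result to download_all. Files that are already present are left untouched, and rows without a usable URL and name are skipped rather than raising an error.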
