Reading in csv file error: sys.argv[1], IndexError: list index out of range

I work with a student-run nonprofit that pairs college students with elementary schoolers for mentoring. A former member sent me a Python script they wrote to create carpool groups of volunteers, and I'm trying to get it working again on my laptop. It's supposed to read three CSV files (one for new applications, one for returning applications, and one for available schools), create the groups, and write them to an output text file.

I'm having trouble figuring out how the program is supposed to read in the CSV files in the folder. I pasted the entire code below, but I believe the issue is in roughly the first 10 lines. I also included the error message I'm getting at the end.

from csv import reader
import sys

# USE LIKE:
# python sealsort.py /path/to/newm/survey /path/to/retm/survey /path/to/school/doc
newm_path = sys.argv[1] #'./s18_new.csv'
returnm_path = sys.argv[2] #'./s18_return.csv'
schools_path = sys.argv[3] #'./s18_schools.csv'

def psble(schools, mems):
  # Calculates the number of people available for volunteering
  avail = [0]*len(schools['schools'])
  espan = [0]*len(schools['schools'])
  engav = [0]*len(schools['schools'])
  # Ignore drivers in number of people
  for i in range(len(mems['names'])):
    if(mems['assgn'][i] or mems['drive'][i] > 0):
      continue
    for j in range(len(mems['avail'][i])):
      ts = mems['avail'][i][j]
      # Iterate over volunteering slots
      for k in range(len(schools['schools'])):
        if(schools['tid'][k] == ts):
          avail[k] += 1
          
  for i in range(len(mems['names'])):
    if(mems['assgn'][i] or mems['drive'][i] > 0):
      continue
    for j in range(len(mems['avail'][i])):
      ts = mems['avail'][i][j]
      # Iterate over volunteering slots
      for k in range(len(schools['schools'])):
        if(schools['tid'][k] == ts):
          espan[k] += mems['espan'][i]
          
  for i in range(len(mems['names'])):
    if(mems['assgn'][i] or mems['drive'][i] > 0 or mems['espan'][i] > 0):
      continue
    for j in range(len(mems['avail'][i])):
      ts = mems['avail'][i][j]
      # Iterate over volunteering slots
      for k in range(len(schools['schools'])):
        if(schools['tid'][k] == ts):
          engav[k] += 1
          
  return avail, espan, engav
        
def parsenew(newcsv, schools):
  # Parses the new member CSV file and returns a list of lists
  # Make lists to hold info
  order = []
  names = []
  avail = []
  drive = []
  espan = []
  assgn = []
  tasgn = []
  nasgn = []

  c = 0
  with open(newcsv,'rU') as f:
    for line in reader(f):
      if(line == []):
        continue
      order.append(c)
      c += 1
      names.append(line[1])
      avail.append(line[6])
      drive.append(line[7])
      espan.append(line[14])
      assgn.append(False)
      tasgn.append(-1)
      nasgn.append(-1)
  
  # Map spanish options to numbers
  for i in range(len(espan)):
    if(espan[i] == 'Not at all'):
      espan[i] = 0
      continue
    elif(espan[i] == 'Un poquito'):
      espan[i] = 0
      #names[i] += ' SPN-PQTO'
      continue
    elif(espan[i] == 'Conversational'):
      espan[i] = 2
      names[i] += ' SPN-CONV'
      continue
    elif(espan[i] == 'Fluent'):
      espan[i] = 4
      names[i] += ' SPN-FLNT'
      continue
   
  # Map time availability to numbers
  for i in range(len(avail)):
    alist = avail[i].split(',')
    avail[i] = [];
    for time in alist:
      time = time.split('(')[0].replace(' ','')
      if time in schools['tdict']:
        avail[i].append(schools['tdict'][time])
      else:
        print("Invalid time",time)
        
  # Map driving to numbers
  for i in range(len(drive)):
    print(drive[i])
    if(len(drive[i]) == 0):
      drive[i] = 0
      continue
    if(drive[i][0] != 'Y'):
      drive[i] = 0
    else:
      drive[i] = int(filter(str.isdigit, drive[i]))
      names[i] += ' DRIVER(' + str(drive[i]) + ')'
  
  odict = {'names': names, 'order': order, 'avail': avail, 'drive': drive, 'espan': espan, 'assgn': assgn, 'tasgn': tasgn, 'nasgn': nasgn}
  return odict

def parsereturn(returncsv, schools):
  # Parses the returning member CSV file and returns a dict of lists
  # Make lists to hold info
  order = []
  names = []
  avail = []
  drive = []
  espan = []
  assgn = []
  tasgn = []
  nasgn = []

  c = 0
  with open(returncsv,'rU') as f:
    for line in reader(f):
      if(line == []):
        continue
      order.append(c)
      c += 1
      names.append(line[2])
      avail.append(line[5])
      drive.append(line[6])
      espan.append(line[7])
      assgn.append(False)
      tasgn.append(-1)
      nasgn.append(-1)
  
  # Map spanish options to numbers
  for i in range(len(espan)):
    if(espan[i] == 'Not at all'):
      espan[i] = 0
      continue
    elif(espan[i] == 'Un poquito'):
      espan[i] = 0
      #names[i] += ' SPN-PQTO'
      continue
    elif(espan[i] == 'Conversational'):
      espan[i] = 2
      names[i] += ' SPN-CONV'
      continue
    elif(espan[i] == 'Fluent'):
      espan[i] = 4
      names[i] += ' SPN-FLNT'
      continue
   
  # Map time availability to numbers
  for i in range(len(avail)):
    alist = avail[i].split(',')
    avail[i] = [];
    for time in alist:
      time = time.split('(')[0].replace(' ','')
      if time in schools['tdict']:
        avail[i].append(schools['tdict'][time])
      else:
        print("Invalid time",time)
        
  # Map driving to numbers
  for i in range(len(drive)):
    if(len(drive[i]) == 0):
      drive[i] = 0
      continue
    if(drive[i][0] != 'Y'):
      drive[i] = 0
    else:
      drive[i] = int(filter(str.isdigit, drive[i]))
      names[i] += ' DRIVER(' + str(drive[i]) + ')'
  
  odict = {'names': names, 'order': order, 'avail': avail, 'drive': drive, 'espan': espan, 'assgn': assgn, 'tasgn': tasgn, 'nasgn': nasgn}
  return odict
  
def parseschools(schoolcsv):
  # Parses the schools CSV file and returns a list of lists
  # Make lists to hold info
  nvolu = []
  voday = []
  vtime = []
  schools = []
  
  with open(schoolcsv,'rU') as f:
    for line in reader(f):
      schools.append(line[1])
      nvolu.append(int(line[2]))
      voday.append(line[3].replace(' ',''))
      vtime.append(line[4].replace(' ',''))
  
  tidc = 0    
  tid = []
  tdict = {} 
  
  for i in range(len(voday)):
    key = voday[i] + vtime[i]
    if(key in tdict):
        tid.append(tdict[key])
    else:
      tdict[key] = tidc
      tidc += 1
      tid.append(tdict[key])
  
  odict = {'schools': schools, 'nvolu': nvolu, 'voday': voday, 'vtime': vtime, 'tid': tid, 'tdict': tdict}
  return odict
  
# Parse schools and make data structure
schools = parseschools(schools_path)
#for i in range(len(schools['schools'])):
#  print(schools['schools'][i],schools['nvolu'][i],schools['voday'][i],schools['vtime'][i],schools['tid'][i])
# Parse new members
newm = parsenew(newm_path, schools)

# Parse returning members
retm = parsereturn(returnm_path, schools)

print('total vols',len(retm['names']) + len(newm['names']))
mems = dict(retm)

for i in range(len(newm['names'])):
  mems['names'].append(newm['names'][i])
  mems['order'].append(newm['order'][i])
  mems['avail'].append(newm['avail'][i])
  mems['drive'].append(newm['drive'][i])
  mems['espan'].append(newm['espan'][i])
  mems['assgn'].append(newm['assgn'][i])
  mems['tasgn'].append(newm['tasgn'][i])
  mems['nasgn'].append(newm['nasgn'][i])

# Dict to hold assignments
assign = {}
assign['schools'] = schools['schools']
assign['voday'] = schools['voday']
assign['vtime'] = schools['vtime']
assign['vters'] = [ [] for i in range(len(schools['vtime']))]
assign['drvrs'] = [0]*len(schools['vtime'])
assign['cspot'] = [0]*len(schools['vtime'])
assign['espan'] = [0]*len(schools['vtime'])
assign['tid'] = schools['tid']

# Greedy algorithm to optimize driving assignment
# Iterate over their available times, find the volunteering loc
# with the minimum number of people available to fill it, but
# still equal to or greater than the number of people the car
# can carry
psvol, psesp, pseng = psble(schools, mems)
for i in range(len(mems['drive'])):
  #for i in range(len(assign['schools'])):     
  #  print(assign['schools'][i], assign['voday'][i], assign['vtime'][i], psvol[i], psesp[i])
  # If they can't drive, or are already assigned, don't assign them
  if(mems['drive'][i] == 0) or mems['assgn'][i]:
    continue
  #print(mems['names'][i])
  minp = 999999
  minpi = -1
  for j in range(len(mems['avail'][i])):
    ts = mems['avail'][i][j]
    # Iterate over volunteering slots
    for k in range(len(schools['schools'])):
      # Find the loc with the minimum possible people, 
      if(schools['tid'][k] == ts and psvol[k] < minp and psvol[k] >= mems['drive'][i] and psesp >= 4 and schools['nvolu'][k] > mems['drive'][i] and pseng[k] > (mems['drive'][i] - 2)):
        minp = psvol[k]
        minpi = k
  #print("CHOICE")
  if(minpi == -1):
    #print("NO SOLUTION")
    continue
  #print(assign['schools'][minpi], assign['voday'][minpi], assign['vtime'][minpi], psvol[minpi], psesp[minpi])
  # Assign driver to place with min spots      
  assign['drvrs'][minpi] += 1
  #print(assign['cspot'][minpi], mems['drive'][i])
  assign['cspot'][minpi] += mems['drive'][i]
  schools['nvolu'][minpi] -= 1
  assign['vters'][minpi].append(mems['names'][i])
  assign['espan'][minpi] += mems['espan'][i]
  mems['assgn'][i] = True
  mems['tasgn'][i] = schools['tid'][minpi]
  mems['nasgn'][i] = minpi
  #print(assign['schools'][minpi], assign['voday'][minpi], assign['vtime'][minpi], assign['vters'][minpi],schools['nvolu'][minpi],assign['cspot'][minpi],psvol[minpi],psesp[minpi])
  
  # Assign non driving spanish speakers if necessary, first look for 1 fluent, then 2 conversational
  if(assign['espan'][minpi] < 4*assign['drvrs'][minpi]):
    for i1 in range(len(mems['espan'])):
      # If they can drive, are assigned, or are not fluent, skip them
      if(mems['drive'][i1] != 0 or mems['espan'][i1] != 4 or mems['assgn'][i1]):
        continue
        
      if(schools['tid'][minpi] in mems['avail'][i1]):
        assign['cspot'][minpi] -= 1
        schools['nvolu'][minpi] -= 1
        assign['vters'][minpi].append(mems['names'][i1])
        assign['espan'][minpi] += mems['espan'][i1]
        mems['assgn'][i1] = True
        mems['tasgn'][i1] = schools['tid'][minpi]
        mems['nasgn'][i1] = minpi
        break
        
  # Now try twice to put a conversational spanish person in if necessary
  if(assign['espan'][minpi] < 4*assign['drvrs'][minpi]):
    for i1 in range(len(mems['espan'])):
      # If they can drive, are assigned, or are not conversational, skip them
      if(mems['drive'][i1] != 0 or mems['espan'][i1] != 2 or mems['assgn'][i1]):
        continue
        
      if(schools['tid'][minpi] in mems['avail'][i1]):
        assign['cspot'][minpi] -= 1
        schools['nvolu'][minpi] -= 1
        assign['vters'][minpi].append(mems['names'][i1])
        assign['espan'][minpi] += mems['espan'][i1]
        mems['assgn'][i1] = True
        mems['tasgn'][i1] = schools['tid'][minpi]
        mems['nasgn'][i1] = minpi
        break
        
  # Second time on conversational
  if(assign['espan'][minpi] < 4*assign['drvrs'][minpi]):
    for i1 in range(len(mems['espan'])):
      # If they can drive, are assigned, or are not conversational, skip them
      if(mems['drive'][i1] != 0 or mems['espan'][i1] != 2 or mems['assgn'][i1]):
        continue
        
      if(schools['tid'][minpi] in mems['avail'][i1]):
        assign['cspot'][minpi] -= 1
        schools['nvolu'][minpi] -= 1
        assign['vters'][minpi].append(mems['names'][i1])
        assign['espan'][minpi] += mems['espan'][i1]
        mems['assgn'][i1] = True
        mems['tasgn'][i1] = schools['tid'][minpi]
        mems['nasgn'][i1] = minpi
        break
        
  # Fill in with non spanish speakers, try one time for each open seat
  for i in range(assign['cspot'][minpi]):
    #print("NEED MORE PEOPLE")
    for i1 in range(len(mems['espan'])):
      # If they can drive, are assigned, or are spanish speaking, skip them
      if(mems['drive'][i1] != 0 or mems['espan'][i1] > 0 or mems['assgn'][i1]):
        continue
        
      if(schools['tid'][minpi] in mems['avail'][i1]):
        assign['cspot'][minpi] -= 1
        schools['nvolu'][minpi] -= 1
        assign['vters'][minpi].append(mems['names'][i1])
        assign['espan'][minpi] += mems['espan'][i1]
        mems['assgn'][i1] = True
        mems['tasgn'][i1] = schools['tid'][minpi]
        mems['nasgn'][i1] = minpi
        break
    
  psvol, psesp, pseng = psble(schools, mems) 
  #print(assign['schools'][minpi], assign['voday'][minpi], assign['vtime'][minpi], psvol[minpi], psesp[minpi])
    
with open('output.txt','w') as f:
  for i in range(len(assign['schools'])):     
    f.write(assign['schools'][i] + " ")
    f.write(assign['voday'][i] + " ")
    f.write(assign['vtime'][i] + "\n")
    for j in range(len(assign['vters'][i])):
      f.write("\t" + assign['vters'][i][j] + "\n")
    f.write("\n")
  f.write("\n\nUNASSIGNED:\n")
    #print(assign['vters'][i], schools['nvolu'][i],assign['cspot'][i])
    #print(assign['cspot'][i])
    #print(schools['nvolu'][i])
    
  for i in range(len(mems['names'])):
    if(not mems['assgn'][i]):
      f.write(mems['names'][i] + "\n")

Here is the error:

Traceback (most recent call last):
  File "/Users/myname/Desktop/sealsort/sealsort.py", line 9, in <module>
    newm_path = sys.argv[1] #'./s18_new.csv'
IndexError: list index out of range

CodePudding user response：

As noted in the link posted by mkrieger1, sys.argv[1] refers to the first command-line argument. So instead of running a command like this:

python program.py

You need to run it with three arguments like this:

python program.py /path/To/first.csv /path/To/second.csv /path/To/third.csv

Note: line 5 of the script gives a usage example:

python sealsort.py /path/to/newm/survey /path/to/retm/survey /path/to/school/doc
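
If you would rather see a usage message than a traceback when the arguments are missing, a guard along these lines could replace the three bare sys.argv lookups at the top of the script. This is only a minimal sketch: the helper name get_csv_paths and the USAGE text are hypothetical and not part of the original script.

```python
import sys

# Hypothetical usage text; adjust the wording to taste.
USAGE = "usage: python sealsort.py NEW_CSV RETURN_CSV SCHOOLS_CSV"

def get_csv_paths(argv):
    """Return the three CSV paths from argv, or exit with a usage message.

    Hypothetical helper (not in the original script). argv[0] is the
    script name, so three arguments means len(argv) == 4.
    """
    if len(argv) != 4:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit(USAGE)
    return argv[1], argv[2], argv[3]

# In the script you would call it like this:
# newm_path, returnm_path, schools_path = get_csv_paths(sys.argv)
```

With this in place, running python sealsort.py with no arguments prints the usage line instead of raising IndexError.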

CodePudding user response：

It looks like you are running this script on macOS, and I'm assuming the three CSV files are in the same folder as the Python script.

You need to navigate, in the terminal, to the folder where the files are stored. First open the Terminal application, then change to that folder with cd /Users/myname/Desktop/sealsort (the same path that appears in your question). Then run the script as described in its opening comments:

# USE LIKE:
# python sealsort.py /path/to/newm/survey /path/to/retm/survey /path/to/school/doc

Supposing the files are s18_new.csv, s18_return.csv, and s18_schools.csv, run the script with those file names as its arguments. If you omit any of the required arguments, at least one of the indexes 1, 2, and 3 will not exist in sys.argv, and you will get that error (IndexError: list index out of range).

So, the correct command would be: python sealsort.py ./s18_new.csv ./s18_return.csv ./s18_schools.csv

This way, the element at index 0 of sys.argv will be sealsort.py, index 1 will be ./s18_new.csv, index 2 will be ./s18_return.csv, and index 3 will be ./s18_schools.csv.
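
To make the index mapping concrete, here is a small sketch that writes the two argument lists out by hand. The lists and the describe helper are illustrative only; in the real script Python fills sys.argv for you.

```python
# What sys.argv holds for:
#   python sealsort.py ./s18_new.csv ./s18_return.csv ./s18_schools.csv
with_args = ["sealsort.py", "./s18_new.csv", "./s18_return.csv", "./s18_schools.csv"]

# What sys.argv holds for a bare:
#   python sealsort.py
without_args = ["sealsort.py"]

def describe(argv):
    """Try the same index-1 lookup the script does and report the outcome."""
    try:
        return "new-member CSV: " + argv[1]
    except IndexError as err:
        return "IndexError: " + str(err)

print(describe(with_args))     # new-member CSV: ./s18_new.csv
print(describe(without_args))  # IndexError: list index out of range
```

The second call fails for the same reason as the traceback in the question: the list has only one element, so index 1 does not exist.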

I hope this helps.