How to solve "IndexError: list index out of range" error when reading a CSV file

Time:08-27

I am not too sure how to ask; I am new to Python and to programming as a whole, but here is my question. I hope it makes sense.

I am currently working on populating a Postgres database. I have a loop that iterates while reading a CSV file to get a certain output and then inserts/queries that into a table ("basically creating a handle").

But I get an error, which I assume is due to empty values in the CSV files. I have learned Pandas in the past to clean up data, but the GitHub code that I took this from does not seem to use Pandas to do so.

In the video from YouTuber/engineer Part Time Larry, "Tracking ARK Invest ETFs with Python and PostgreSQL", he uses the exact same code but does not get the same IndexError. I am not sure how to bypass this error. I have read and watched videos, but they don't explain this specific scenario.

Here is the index list:

['date', 'fund', 'company', 'ticker', 'cusip', 'shares', 'market value ($)', 'weight (%)']

PS. I do understand that, generally, the index range of a list is 0 to n-1, with n being the total number of values in the list.

Here the range is from 0 to 7, making it 8 values in the index list.

Here is the GitHub link in case you want to check it out, and the YT link. Created a handle from the postgres loop along with the csv file.

I get a portion of the rows, but then I get the IndexError, which I believe is due to empty values.

CSV FILE

CodePudding user response:

If you're sure that a valid row has 8 elements, you could add a check before proceeding to print, for example in this section:

import csv

with open(f"Resources/{current_date}/{etf['symbol']}.csv") as f:
    reader = csv.reader(f)
    for row in reader:
        if len(row) == 8:    # add this check: skip rows that don't have all 8 columns
            ticker = row[3]
            if ticker:
                print(row)
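
For a version you can run without the downloaded files, here is a minimal sketch of the same length check against made-up in-memory data. The sample rows, the `rows_with_ticker` helper, and the choice to skip the header row are all assumptions for illustration, not part of the original repository:

```python
import csv
import io

# Made-up sample: a header, two 8-column rows, and a footer line with no commas.
SAMPLE = (
    "date,fund,company,ticker,cusip,shares,market value ($),weight (%)\n"
    "08/27/2021,ARKX,EXAMPLE CORP,EXMP,000000000,100,70000.00,10.0\n"
    "08/27/2021,ARKX,EXAMPLE CASH POSITION,,000000001,50,50.00,0.1\n"
    "Footer text with no commas\n"
)

def rows_with_ticker(f, expected_len=8):
    """Return data rows that have the expected column count and a non-empty ticker."""
    reader = csv.reader(f)
    next(reader, None)  # assumption: the first line is the header, so skip it
    kept = []
    for row in reader:
        if len(row) != expected_len:
            continue  # short or malformed line: row[3] might not exist
        if row[3]:  # an empty ticker is '' and is skipped, as in the code above
            kept.append(row)
    return kept

rows = rows_with_ticker(io.StringIO(SAMPLE))
print(rows)
```

Only the first data row should survive: the second has an empty ticker, and the footer line is dropped by the length check before `row[3]` is ever touched.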

Edit: To answer your question on how to fix your code, I guess you could start by understanding how list[index] works. For example, run the following, see where the error happens, and understand why it happens:

lis = ['date', 'fund', 'company']
for i in [0, 1, 2]:
    print(i)
    print(lis[i])

for i in [0, 1, 2, 3, 4, 5, 6, 7, 8]:
    print(i)
    print(lis[i])    # raises "index out of range" at i = 3, because `lis` only has indexes 0, 1 and 2
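
To tie this back to the CSV: with `csv.reader`, an empty value between two commas still comes back as an empty string, so the row keeps its full length. It is a line with fewer commas that produces a shorter list. A small sketch with made-up lines (the sample text is an assumption, not taken from the real file):

```python
import csv
import io

# Made-up lines: one row with two empty fields, and one line with no commas at all.
sample = io.StringIO(
    "08/27/2021,ARKX,EXAMPLE CORP,,,100,50.00,0.1\n"
    "Footer text with no commas\n"
)

lengths = []
for row in csv.reader(sample):
    lengths.append(len(row))
    try:
        ticker = row[3]  # works for the first row, even though the value is ''
        print(len(row), repr(ticker))
    except IndexError:
        # the footer line parses to a one-element list, so index 3 is out of range
        print(len(row), "row[3] does not exist")
```

So empty values alone would not raise the IndexError; a line that is not a proper 8-column row would.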

CodePudding user response:

Line 33 of the CSV is the issue:
Energy or investment product

You can remove that line and rerun the code. The approach mentioned by @perpetualstudent will also work, but if a stray line like this happens to contain a list of values separated by commas, it can still get through and your data will be polluted. It is always good practice to sanitize your data with reasonable checks, such as checking that the first column is a date.
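
As a rough sketch of that kind of check, the helper below keeps a row only if it has 8 columns and its first column parses as a date. The `is_data_row` name, the sample lines, and the `%m/%d/%Y` date format are assumptions for illustration; adjust the format to whatever the real file uses:

```python
import csv
import io
from datetime import datetime

def is_data_row(row, expected_len=8, date_format="%m/%d/%Y"):
    """Return True if the row has the expected length and starts with a valid date."""
    if len(row) != expected_len:
        return False
    try:
        datetime.strptime(row[0], date_format)  # date_format is an assumed layout
    except ValueError:
        return False
    return True

# Made-up sample: header, one real-looking row, a footer that happens to
# contain 7 commas (so it passes a length-only check), and a short footer.
sample = io.StringIO(
    "date,fund,company,ticker,cusip,shares,market value ($),weight (%)\n"
    "08/27/2021,ARKX,EXAMPLE CORP,EXMP,000000000,100,70000.00,10.0\n"
    "Footer,text,that,just,happens,to,contain,commas\n"
    "Footer text with no commas\n"
)

clean = [row for row in csv.reader(sample) if is_data_row(row)]
print(clean)
```

A side effect of the date check is that the header row is filtered out as well, since "date" does not parse as a date.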

Github flagging the issue
