Error in Python Code Trying To Open and Access a File

Here is info from the .txt file I am trying to access:
Movies: Drama
Possession, 2002
The Big Chill, 1983
Crimson Tide, 1995

Here is my code:

fp = open("Movies.txt", "r")  
lines = fp.readlines()
for line in lines:  
    values = line.split(", ")   
    year = int(values[1])
    if year < 1990:  
        print(values[0])

I get an error message "IndexError: list index out of range". Please explain why or how I can fix this. Thank you!

CodePudding user response:

Assuming your .txt file includes the "Movies: Drama" line as listed, the error occurs because that first line contains no comma. Splitting it on ", " produces a list with only one element (index 0), so there is no values[1] for that line.

It's not unusual for data files to have a header line that doesn't contain actual data. Libraries such as pandas typically treat the first row as a header by default, but open() and readlines() return every line as-is, header included.

The easiest thing to do is just slice your list variable (lines) so you don't include the first line in your loop:

fp = open("Movies.txt", "r")
lines = fp.readlines()
for line in lines[1:]:  
    values = line.split(", ")   
    year = int(values[1])
    if year < 1990:  
        print(values[0])

Note the "lines[1:]" modification. This way you only loop starting from the second line (the first line is lines[0]) and go to the end.
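As a minimal sketch of the same slicing idea, the loop can be wrapped in a small function so it can be tried without the file on disk. The function name titles_before, the cutoff parameter, and the sample list are hypothetical, introduced only for illustration; the sample lines are copied from the question.

```python
def titles_before(lines, cutoff=1990):
    """Return titles released before `cutoff`, skipping the header in lines[0]."""
    titles = []
    for line in lines[1:]:  # lines[0] is the "Movies: Drama" header
        values = line.strip().split(", ")
        if int(values[1]) < cutoff:
            titles.append(values[0])
    return titles


# Sample data copied from the question's Movies.txt
sample = [
    "Movies: Drama\n",
    "Possession, 2002\n",
    "The Big Chill, 1983\n",
    "Crimson Tide, 1995\n",
]
print(titles_before(sample))  # ['The Big Chill']
```

To run it on the real file, read the lines inside a `with open("Movies.txt") as fp:` block, which closes the file automatically, and pass `fp.readlines()` to the function. Note that this still assumes every line after the header has the "title, year" shape; a blank line at the end of the file would raise the same IndexError.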

CodePudding user response:

The program will run if you change the contents of the Movies.txt file to the following:

Possession, 2002
The Big Chill, 1983
Crimson Tide, 1995

The Movies.txt file should contain nothing other than the data you intend to parse.

CodePudding user response:

The first line of the text file does not contain ", ", so splitting on it gives a list of length 1. Accessing the second element with values[1] then reaches past the end of the list, hence the IndexError. Check the line before making an assumption about the size of the list. Some options:

Check the length of values and continue if it's too short.
Check that ', ' is in the line before splitting on it.
Use a regex that requires the ', ' to be present and also ensures that the text after the comma is a number.
Preemptively strip off the first line in lines if you know that it's the header.
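The regex option might look something like the sketch below. The pattern, the names MOVIE_LINE, parse_movie, and old_titles, and the assumption that a year is always four digits are all illustrative choices, not part of the original code. Lines that do not match (the header, blank lines) are skipped instead of raising an error.

```python
import re

# Assumed line shape: "<title>, <4-digit year>" with optional trailing whitespace
MOVIE_LINE = re.compile(r"(.+), (\d{4})\s*$")


def parse_movie(line):
    """Return (title, year) for a data line, or None if the line does not match."""
    match = MOVIE_LINE.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def old_titles(lines, cutoff=1990):
    """Collect titles released before `cutoff`, ignoring non-matching lines."""
    titles = []
    for line in lines:
        parsed = parse_movie(line)
        if parsed is None:
            continue  # header, blank line, or anything else without a year
        title, year = parsed
        if year < cutoff:
            titles.append(title)
    return titles


sample = ["Movies: Drama\n", "Possession, 2002\n", "The Big Chill, 1983\n", "\n"]
print(old_titles(sample))  # ['The Big Chill']
```

The simpler length check from the first option would replace parse_movie with a test such as `len(values) < 2` followed by `continue`; the regex version additionally rejects lines where the text after the comma is not a number.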

CodePudding user response:

The first line of your text file has no second element, so values[1] raises the IndexError. Simply change your code to:

fp = open("Movies.txt", "r")
lines = fp.readlines()
for line in lines:
    try:  # <---- Here
        values = line.split(", ")
        year = int(values[1])
        if year < 1990:
            print(values[0])
    except (IndexError, ValueError):  # <---- And here: skip lines without a valid year
        pass
fp.close()