List Indexing using range() and len() not working

Time:11-20

I am trying to parse a data structure using a for loop that steps a variable i through range(). I originally set the range to the number of records, 25,173, but I kept receiving an IndexError:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In [63], line 41
     37         input_list.append(HOME_AVG_PTS)
     39     return input_list
---> 41 Data["AVG_PTS_HOME"]= extract_wl(AVG_PTS_HOME)

Cell In [63], line 28, in extract_wl(input_list)
     26         if i==25172:
     27             HOME_AVG_PTS[0]= HOME_AVG_PTS[0]/count
---> 28 elif int(Data(["TEAM_ID_AWAY"][i])) == const_home_team_id:
     29     if type(PTS_AWAY[i]) is int:
     30         count = 1

IndexError: list index out of range

So I tried changing my for loop to use the length of the column on the line that raises the error, i.e.

for i in range(len(Data["TEAM_ID_AWAY"])):

But I still receive the same error.

The Data variable holds the contents of a CSV file, which I read with the pandas module. You can assume that all the column headers I have used are valid and that every column has 25,173 entries. [Image: the range and values of Data["TEAM_ID_HOME"]]

AVG_PTS_HOME = []

def extract_wl(input_list):
    for j in range(25173):
        const_season_id = Data["SEASON_ID"][j]
        #print(const_season_id)
        const_game_id = int(Data["GAME_ID"][j])
        #print(const_game_id)
        const_home_team_id = Data["TEAM_ID_HOME"][j]
        #print(const_home_team_id)

        #if j==10:
            #break
        
        print("Iteration #", j)
        print(len(Data["TEAM_ID_AWAY"]))
        count = 0
        HOME_AVG_PTS=[0.0]

        for i in range(len(Data["TEAM_ID_AWAY"])):
            if (int(Data["GAME_ID"][i]) < const_game_id and int(Data["SEASON_ID"][i]) == const_season_id):
                if int(Data["TEAM_ID_HOME"][i]) == const_home_team_id:
                    if type(PTS_HOME[i]) is int:
                        count = 1
                        HOME_AVG_PTS[0] = PTS_HOME[i]
                        if i==25172:
                            HOME_AVG_PTS[0]= HOME_AVG_PTS[0]/count
                elif int(Data(["TEAM_ID_AWAY"][i])) == const_home_team_id:
                    if type(PTS_AWAY[i]) is int:
                        count = 1
                        HOME_AVG_PTS[0] = PTS_AWAY[i]
                        if i==25172:
                            HOME_AVG_PTS[0]= float(HOME_AVG_PTS[0]/count)
                
    
        print(HOME_AVG_PTS)
        input_list.append(HOME_AVG_PTS)

    return input_list

Data["AVG_PTS_HOME"]= extract_wl(AVG_PTS_HOME)

Can anyone point out why I am getting this error or help me resolve it? In the meantime, I think I am going to create a separate function that takes a list of all the AWAY_IDs and parses through that instead.

CodePudding user response:

Hardcoding the length of the list is likely to lead to bugs down the line even if you're able to get it all working correctly with a particular data set. Iterating over the lists directly is much less error-prone and generally produces code that's easier to modify. If you want to iterate over multiple lists in parallel, use the zip function. For example, this:

    for j in range(25173):
        const_season_id = Data["SEASON_ID"][j]
        const_game_id = int(Data["GAME_ID"][j])
        const_home_team_id = Data["TEAM_ID_HOME"][j]
        # do stuff with const_season_id, const_game_id, const_home_team_id

can be written as:

    for (const_season_id, game_id, const_home_team_id) in zip(
        Data["SEASON_ID"], Data["GAME_ID"], Data["TEAM_ID_HOME"]
    ):
        const_game_id = int(game_id)
        # do stuff with const_season_id, const_game_id, const_home_team_id

The zip function will never give you an IndexError because it stops iterating as soon as it reaches the end of the shortest input list. (If you want to keep iterating until the end of the longest list instead, there's itertools.zip_longest.)
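
As a rough illustration of that difference, here is a minimal sketch using two made-up lists of unequal length. The names season_ids and game_ids and their values are hypothetical stand-ins, not the asker's data:

```python
from itertools import zip_longest

# Hypothetical stand-in columns of unequal length (not the asker's data).
season_ids = [22019, 22019, 22020]
game_ids = ["101", "102"]

# zip stops at the end of the shortest input, so no index can run past a list.
pairs = list(zip(season_ids, game_ids))

# zip_longest continues to the end of the longest input and pads with fillvalue.
padded = list(zip_longest(season_ids, game_ids, fillvalue=None))

print(pairs)
print(padded)
```

Note that zip silently drops the unmatched third season ID, so if the columns are supposed to have equal lengths it may still be worth checking that separately.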

CodePudding user response:

elif int(Data(["TEAM_ID_AWAY"][i])) == const_home_team_id

Your parentheses are in the wrong place.

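You have parentheses around (["TEAM_ID_AWAY"][i]), so Python tries to take the i'th element of the single-element list ["TEAM_ID_AWAY"]. That list only has index 0, so any i of 1 or more raises an IndexError, no matter how long the column itself is.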

You want Data["TEAM_ID_AWAY"][i], not Data(["TEAM_ID_AWAY"][i]).
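
The difference can be reproduced without pandas. The sketch below assumes a plain dict of lists named data as a stand-in for the DataFrame, with invented team IDs; misplaced and corrected are hypothetical helper names used only for this illustration:

```python
# Hypothetical stand-in for the DataFrame: a dict holding one column as a list.
data = {"TEAM_ID_AWAY": [1610612737, 1610612738, 1610612739]}


def misplaced(i):
    # Mirrors the inner part of the buggy expression: this indexes the
    # one-element list of column names, not the column itself.
    return ["TEAM_ID_AWAY"][i]


def corrected(i):
    # Looks up the column first, then takes its i'th element.
    return data["TEAM_ID_AWAY"][i]


first = misplaced(0)  # index 0 exists, but it is just the column name

try:
    misplaced(1)  # the one-element list has no index 1
    error = None
except IndexError as exc:
    error = str(exc)

print(first, error, corrected(1))
```

This is also why switching to range(len(Data["TEAM_ID_AWAY"])) made no difference: the list being indexed is the literal ["TEAM_ID_AWAY"], whose length is always 1.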
