list index out of range for nested for loop

Time:05-30

This is my DataFrame, df.

I created a function below to get the trigram based on the part of speech tag of the reviews.

def get_trigram(pos_1, pos_2, pos_3):
    all_trigram = []

    for j in range(len(df)):

        trigram = []

        for i in range(len(df['pos'][j]['pos'])):

            if [value for value in df['pos'][j]['pos']][i-2] == pos_1 and [value for value in df['pos'][j]['pos']][i-1] == pos_2 and [value for value in df['pos'][j]['pos']][i] == pos_3:
                trigram.append([value for value in df['pos'][j]['word']][i-2] + " " + [value for value in df['pos'][j]['word']][i-1] + " " + [value for value in df['pos'][j]['word']][i])

        all_trigram.append(trigram)
      
    return all_trigram

There is no error when I define the function, but when I call it

tri_adv_adj_noun = get_trigram('ADV', 'ADJ', 'NOUN')

it gives an error: IndexError: list index out of range

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-149-12b4d4ffff3d> in <module>()
----> 1 tri_adv_adj_noun = get_trigram('ADV', 'ADJ', 'NOUN')
      2 tri_noun_adv_adj = get_trigram('NOUN', 'ADV', 'ADJ')
      3 
      4 trigram = tri_adv_adj_noun + tri_noun_adv_adj

<ipython-input-148-60ed39e749d0> in get_trigram(pos_1, pos_2, pos_3)
      8         for i in range(len(df_long['pos'][j]['pos'])):
      9 
---> 10             if [value for value in df_long['pos'][j]['pos']][i-2] == pos_1 and [value for value in df_long['pos'][j]['pos']][i-1] == pos_2 and [value for value in df_long['pos'][j]['pos']][i] == pos_3:
     11                 trigram.append([value for value in df_long['pos'][j]['word']][i-2] + " " + [value for value in df_long['pos'][j]['word']][i-1] + " " + [value for value in df_long['pos'][j]['word']][i])
     12 

IndexError: list index out of range

Fyi,

df['pos'][0] returns a dictionary of 2 lists


CodePudding user response:

I'd assume that your problem is in this part:

[value for value in df_long['pos'][j]['pos']][i-2]

First, it may be that some of the dictionaries in your 'pos' column hold fewer tokens than a trigram needs, in which case you should check the list length before indexing. When i is 0 or 1, i-2 is negative, and Python counts a negative index back from the end of the list. If the list doesn't have enough elements to count back that far, it throws the "list index out of range" error. For example, on a one-element list, i-2 is -2 when i is 0, and there is no second-to-last element to return. Ex:

if len(df['pos'][j]['pos']) >= 3:
    # start at 2 so that i-2 never goes negative and wraps around to the end of the list
    for i in range(2, len(df['pos'][j]['pos'])):
        ...
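
To see why the error only shows up on some rows, here is a small sketch of how negative indices behave. The tag lists are made up for illustration and are not taken from your DataFrame:

```python
# Hypothetical tag lists, not taken from the original DataFrame.
one_tag = ['NOUN']
three_tags = ['ADV', 'ADJ', 'NOUN']

# Negative indices count from the end, so with i = 0 the lookup
# silently wraps around on a list that is long enough.
i = 0
wrapped = three_tags[i - 2]  # same as three_tags[-2]

# A one-element list has no second-to-last element,
# so the same expression raises IndexError.
try:
    one_tag[i - 2]
    raised = False
except IndexError:
    raised = True

print(wrapped, raised)
```

So a row with a single token triggers the exception, while longer rows don't fail but can still produce a bogus match that joins the end of the list to its start.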

Second, building a new list with a comprehension just to index it is redundant, since df_long['pos'][j]['pos'] is already a list. You could just write:

if df_long['pos'][j]['pos'][i-2] == pos_1 and df_long['pos'][j]['pos'][i-1] == pos_2 ...

Or improve its readability even more by adding a variable with a descriptive name:

for j in range(len(df)):

    trigram = []
    pos_list = df['pos'][j]['pos']

    if len(pos_list) >= 3:
        # start at 2 so that i-2 never goes negative
        for i in range(2, len(pos_list)):
            if pos_list[i-2] == pos_1 and pos_list[i-1] == pos_2 ...
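
Putting both points together, a complete version might look like the sketch below. It assumes every entry in the 'pos' column is a dictionary with 'pos' and 'word' lists of equal length. The name get_trigrams, the rows parameter, and the sample data are made up for illustration:

```python
def get_trigrams(rows, pos_1, pos_2, pos_3):
    """Collect word trigrams whose tags match (pos_1, pos_2, pos_3).

    `rows` is assumed to be an iterable of dicts shaped like
    {'pos': [...], 'word': [...]}, for example df['pos'].
    """
    target = (pos_1, pos_2, pos_3)
    all_trigrams = []
    for row in rows:
        tags, words = row['pos'], row['word']
        trigrams = []
        # Start at 2 so i - 2 is never negative; rows with fewer than
        # three tokens give an empty range and are skipped.
        for i in range(2, len(tags)):
            if (tags[i - 2], tags[i - 1], tags[i]) == target:
                trigrams.append(' '.join(words[i - 2:i + 1]))
        all_trigrams.append(trigrams)
    return all_trigrams


# Made-up sample rows, for illustration only.
sample = [
    {'pos': ['ADV', 'ADJ', 'NOUN'], 'word': ['very', 'good', 'food']},
    {'pos': ['NOUN'], 'word': ['pizza']},
]
result = get_trigrams(sample, 'ADV', 'ADJ', 'NOUN')
print(result)
```

With your DataFrame this would be called as get_trigrams(df['pos'], 'ADV', 'ADJ', 'NOUN'), provided the column has the shape described above. Because the loop starts at index 2, the separate length check is no longer needed.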

Hope this helps!
