Calling function from for loop - issue with parameters: NameError: name 'row' is not defined

Time:05-08

I am calculating commute distances (home to offices) for all employees. This works for a single office with this piece of code:

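# Assumption: distance comes from geopy, which provides distance.distance(a, b).km
from geopy import distance

# Calculate the distance from home to the office for one row
def distances_to_offices(row):
    row['distance_to_office'] = round(distance.distance(row['Home_geocode'], row['Office_geocode']).km, 0)
    return row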

df_joined.apply(distances_to_offices, axis=1)

I have many offices, though, and would like to loop over them, creating a new distance column for each office. In the working version, the function is passed to apply() without an argument, even though "row" is a parameter in the function definition. When I try to pass a city name as well, I have to call the function with the same number of arguments as in the definition, but that fails because "row" is not recognized as an argument:

NameError: name 'row' is not defined

I don't understand why "row" works as a parameter in the function definition but not when I try to call the function with it as an argument. Can anyone shed some light on this? I am thinking of something like this but am struggling to choose the right arguments:

# throws NameError:
def distances_to_offices(row, city):
    col_name = city + "_distance"
    row[col_name] = round(distance.distance(row['Home_geocode'], row['Office_geocode']).km,0)
    return row

offices = ['NY', 'Rio', 'Tokyo']
for city in offices:
    df_joined.apply(distances_to_offices(row, city), axis=1)

Answer:

I can't say for sure because I don't know what df_joined is, but it looks like an object representing a data table (most likely a pandas DataFrame) on which you can call apply(), passing in a function. apply() then loops through the table and calls the function you supplied once for every row; in pandas with axis=1, the argument it passes is the row itself.

The first version works because you are passing the function distances_to_offices itself (just its NAME, without parentheses) to apply(), and apply() calls it for you with a row that it defines inside its own loop.

The second version fails because you are no longer passing the NAME of distances_to_offices, but instead CALLING it with arguments row and city and passing the result of that call to apply(). Python has to evaluate distances_to_offices(row, city) BEFORE apply() runs, and at that point no variable named row exists, hence the NameError. apply() never gets the opportunity to define row.
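To make the difference concrete, here is a minimal pure-Python sketch. apply_rows and describe are hypothetical stand-ins (not pandas code) that only mimic how apply() supplies the row on each call:

```python
def apply_rows(rows, func):
    """Toy stand-in for apply(): calls func once per row and collects the results."""
    return [func(row) for row in rows]


def describe(row):
    return f"home={row['home']}"


rows = [{"home": "A"}, {"home": "B"}]

# Passing the function object: apply_rows supplies `row` on each call.
results = apply_rows(rows, describe)

# Calling it first: Python evaluates describe(row) before apply_rows runs,
# and no variable named `row` exists at this point.
try:
    apply_rows(rows, describe(row))
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(results)
print(error_message)
```

The second call never reaches apply_rows at all; the NameError is raised while Python is still building the argument list.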

I'm not sure how df_joined.apply() works, so without more information I can't say the right way to do this. One strategy would be to create a new data table object in each iteration of the loop by filtering df_joined to keep only the rows whose city matches city.
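If df_joined is in fact a pandas DataFrame, another option is to keep passing the function itself and let apply() forward the extra argument through its args parameter. The sketch below rests on several assumptions that are not in the question: the office_geocodes lookup and the sample coordinates are made up, and km_between is a hypothetical haversine stand-in for geopy's distance.distance(a, b).km so that the example runs without geopy.

```python
import math

import pandas as pd


def km_between(a, b):
    """Hypothetical stand-in for geopy's distance.distance(a, b).km (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * math.asin(math.sqrt(h))


# Assumed lookup of office coordinates; the question does not show one.
office_geocodes = {
    "NY": (40.71, -74.01),
    "Rio": (-22.91, -43.17),
    "Tokyo": (35.68, 139.69),
}

# Made-up sample data with one (lat, lon) tuple per employee.
df_joined = pd.DataFrame({"Home_geocode": [(40.73, -73.99), (35.66, 139.70)]})


def distance_to_office(row, city):
    """Return the rounded distance in km from this row's home to the given office."""
    return round(km_between(row["Home_geocode"], office_geocodes[city]), 0)


for city in office_geocodes:
    # args=(city,) is forwarded after the row, so apply() still supplies `row`.
    df_joined[city + "_distance"] = df_joined.apply(
        distance_to_office, axis=1, args=(city,)
    )

print(df_joined)
```

Writing df_joined.apply(lambda row: distance_to_office(row, city), axis=1) should behave the same way. In both forms the result of apply() is assigned to a new column, because apply() returns a new object instead of modifying df_joined in place.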
