How can I solve the error "'float' object is not iterable"?

I'm new to Python and am trying to categorize places in df1 by their distance to places in df2, but something is going wrong.

I have two DataFrames with the coordinates of places:

import pandas as pd
import geopy.distance

df1 = pd.DataFrame([['a', 55.88, 37.48],
                    ['b', 55.88, 37.53],
                    ['c', 55.89, 37.45]],
                   columns=['name', 'lat', 'lng'])

df2 = pd.DataFrame([['f', 55.81, 37.12],
                    ['g', 55.79, 37.23],
                    ['h', 55.23, 37.21]],
                   columns=['name', 'lat', 'lng'])
print(df1)
print(df2)

df1

name	lat	lng
a	55.88	37.48
b	55.88	37.53
c	55.89	37.45

df2

name	lat	lng
f	55.81	37.12
g	55.79	37.23
h	55.23	37.21

So I'm trying to calculate the distance from a to f, g, and h. If the distance to any one of these places is less than 1000 m, the row should get the category 'close', otherwise 'far'. I want to do this for each name in df1.

I want this DataFrame:

print(df1)

name	lat	lng	dist_to_place
a	55.88	37.48	far
b	55.88	37.53	close
c	55.89	37.45	far

I tried this construction:

def dist(df1):
    for i in range(len(df1)):
        for j in range(len(df2)):
            if geopy.distance.geodesic(
                tuple(df1[['lat','lng']].iloc[i]),
                tuple(df2[['lat','lng']].iloc[j])).m <1000:
                    return 'close'
            else: return 'far'


df1['dist_to_place'] = df1.apply(dist, axis=1)

But I got the error 'float' object is not iterable.

help me please :C

solution

def dist(df1_row):        
    for j in range(len(df2)):
        if geopy.distance.geodesic(
            tuple(df1_row[['lat','lng']]),
            tuple(df2[['lat','lng']].iloc[j])).m <1000:
                return 'close'
    return 'far'


df1['dist_to_place'] = df1.apply(dist, axis=1)

CodePudding user response：

import pandas as pd
import geopy.distance

df1 = pd.DataFrame({'name':['a', 'b'],'lat':[56.34, 76.56], 'lng':[23.42, 45.34]})
df2 = pd.DataFrame({'name':['f', 'g'],'lat':[56.45, 76.55], 'lng':[27.42, 40.34]})

def dist(df1_row):        
    for j in range(len(df2)):
        if geopy.distance.geodesic(
            tuple(df1_row[['lat','lng']]),
            tuple(df2[['lat','lng']].iloc[j])).m <1000:
                return 'close'
    return 'far'


df1['dist_to_place'] = df1.apply(dist, axis=1)

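When you use apply with axis=1, each row is passed to your function, not the whole DataFrame.
You also need to return 'far' after the loop has finished, not inside it. Otherwise the function returns after checking only the first place in df2.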
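
A small sketch of the first point, using a made-up two-row frame (the names sample, row_types and single_value are only for illustration). It shows that with axis=1 the function receives one row as a Series, and that taking .iloc[0] of the row's lat/lng pair gives a single number that tuple() cannot iterate, which appears to be where the error in the question comes from.

```python
import pandas as pd

# Made-up sample frame, only for illustration.
sample = pd.DataFrame({'name': ['a', 'b'],
                       'lat': [55.88, 55.88],
                       'lng': [37.48, 37.53]})

# With axis=1, apply calls the function once per row and passes a Series.
row_types = sample.apply(lambda row: type(row).__name__, axis=1)

# Indexing a row's lat/lng pair by position gives one number, not a pair.
first_row = sample.iloc[0]
single_value = first_row[['lat', 'lng']].iloc[0]

try:
    tuple(single_value)      # a lone number cannot be turned into a tuple
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The whole pair converts fine, which is what geodesic() needs.
point = tuple(first_row[['lat', 'lng']])
```

So inside the function the row itself already holds the coordinates, and no loop over the rows of df1 is needed.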

CodePudding user response：

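The error happens because df1.apply(dist, axis=1) calls dist once per row and passes that row as a Series, not the whole DataFrame. Inside the function, indexing the row with [['lat','lng']].iloc[i] yields a single float, and calling tuple() on a float raises 'float' object is not iterable. Returning a string from the function is not the problem: apply collects one return value per row into a new Series. As an alternative to returning early, the function can collect one result per place in df2 in a list, which is then reduced to a single value for the new column.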

import geopy.distance

def dist(row):
    # Collect one label per place in df2 for this row of df1.
    result = []
    for j in range(len(df2)):
        if geopy.distance.geodesic(
            (row['lat'], row['lng']),
            (df2.loc[j, 'lat'], df2.loc[j, 'lng'])).m < 1000:
            result.append('close')
        else:
            result.append('far')
    return result

# A row is 'close' if at least one place in df2 is within 1000 m.
df1['dist_to_place'] = df1.apply(dist, axis=1).apply(
    lambda labels: 'close' if 'close' in labels else 'far')

The code above calculates the distance from each place in df1 to every place in df2 and assigns 'close' to the new column if any of those distances is under 1000 m, otherwise 'far'. Note that df2.loc[j, ...] assumes df2 has the default 0..n-1 index.
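
For comparison, here is a minimal sketch of the "close if any place is within 1000 m" rule written with any(). To keep it free of extra dependencies it swaps geopy's geodesic for a plain haversine (great-circle) formula on a sphere of radius 6371 km, which is only an approximation. The helper names haversine_m and label_row and the demo coordinates are made up for illustration and are not from the question.

```python
import math
import pandas as pd

def haversine_m(lat1, lng1, lat2, lng2):
    """Approximate great-circle distance in metres (sphere, r = 6371 km)."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def label_row(row, places, limit_m=1000):
    # 'close' as soon as any place is nearer than limit_m, else 'far'.
    near = any(
        haversine_m(row['lat'], row['lng'], p.lat, p.lng) < limit_m
        for p in places.itertuples(index=False)
    )
    return 'close' if near else 'far'

# Made-up demo data: 'a' is roughly 0.5 km from 'f', 'b' is several km away.
demo_df1 = pd.DataFrame({'name': ['a', 'b'],
                         'lat': [55.880, 55.900],
                         'lng': [37.480, 37.600]})
demo_df2 = pd.DataFrame({'name': ['f'], 'lat': [55.885], 'lng': [37.480]})

demo_df1['dist_to_place'] = demo_df1.apply(
    lambda row: label_row(row, demo_df2), axis=1)
```

Because any() stops at the first match, this behaves like the early return 'close' in the first answer. Where geodesic accuracy matters, geopy.distance.geodesic can be used in place of the haversine helper.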