Hi, I'm having trouble using numpy.where to modify certain values in a pandas DataFrame. To be clear: if I run the statement directly in a notebook cell, it works, but if I put it inside a function, it doesn't. The condition I want to implement is: if a row has a certain value in one column, fill the NaN value in another column of the same row; otherwise leave the value unchanged. If I write:
df.speed_limit = np.where(df.way.str.contains('link'), df.speed_limit.fillna(40), df.speed_limit)
it runs and does what I want, but if I write:
def change_speed_values(df):
    df.speed_limit = np.where(df.way.str.contains('link'), df.speed_limit.fillna(40), df.speed_limit)
    df.speed_limit = np.where(df.way.str.contains('track'), df.speed_limit.fillna(50), df.speed_limit)
    return df
it runs, but doesn't actually change anything. Could you help me understand why this happens?
Thank you for your patience and support. Have a good day!
CodePudding user response:
I tested it and it works well for me. I also added an alternative solution with numpy.select:
import numpy as np
import pandas as pd
df = pd.DataFrame({'way': ['link1', 'link2', 'track1', 'track2'],
                   'speed_limit': [np.nan, 2] * 2})
print(df)
      way  speed_limit
0   link1          NaN
1   link2          2.0
2  track1          NaN
3  track2          2.0
# the first matching condition wins; rows matching neither keep their value
df.speed_limit = np.select([df.way.str.contains('link'),
                            df.way.str.contains('track')],
                           [df.speed_limit.fillna(40),
                            df.speed_limit.fillna(50)],
                           df.speed_limit)
print(df)
      way  speed_limit
0   link1         40.0
1   link2          2.0
2  track1         50.0
3  track2          2.0
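As a further alternative, the same fill can be written with a boolean mask and .loc, which only touches the rows that are both matching and missing. This is a minimal sketch, not a claim about what went wrong in the question; the loop over (pattern, default) pairs and the na=False argument (treating a missing way as "no match") are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'way': ['link1', 'link2', 'track1', 'track2'],
                   'speed_limit': [np.nan, 2] * 2})

# For each pattern, select rows whose way matches AND whose speed_limit is NaN,
# then assign the default to exactly those rows.
for pattern, default in [('link', 40), ('track', 50)]:
    rows = df['way'].str.contains(pattern, na=False) & df['speed_limit'].isna()
    df.loc[rows, 'speed_limit'] = default

print(df)
```

Because the assignment goes through df.loc, the DataFrame is modified in place and existing non-missing values are never rewritten.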
df = pd.DataFrame({'way': ['link1', 'link2', 'track1', 'track2'],
                   'speed_limit': [np.nan, 2] * 2})
# print (df)
def change_speed_values(df):
    df.speed_limit = np.where(df.way.str.contains('link'), df.speed_limit.fillna(40), df.speed_limit)
    df.speed_limit = np.where(df.way.str.contains('track'), df.speed_limit.fillna(50), df.speed_limit)
    return df
df = change_speed_values(df)
print(df)
      way  speed_limit
0   link1         40.0
1   link2          2.0
2  track1         50.0
3  track2          2.0
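If more patterns are needed later, the function can be generalized. The sketch below is only one way to do it: the name fill_speed_limits and the defaults mapping are hypothetical and not from the question. It works on a copy and returns it, so the result only shows up if the return value is assigned to a variable.

```python
import numpy as np
import pandas as pd

def fill_speed_limits(df, defaults):
    """Return a copy of df with missing speed_limit values filled per pattern.

    defaults is a hypothetical mapping {substring of way: fill value}.
    """
    out = df.copy()
    missing = out['speed_limit'].isna()
    # One condition per pattern: way contains the substring and the value is NaN.
    conditions = [missing & out['way'].str.contains(pattern, na=False, regex=False)
                  for pattern in defaults]
    choices = list(defaults.values())
    # The first matching condition wins; all other rows keep their current value.
    out['speed_limit'] = np.select(conditions, choices, default=out['speed_limit'])
    return out

df = pd.DataFrame({'way': ['link1', 'link2', 'track1', 'track2'],
                   'speed_limit': [np.nan, 2] * 2})
result = fill_speed_limits(df, {'link': 40, 'track': 50})
print(result)
```

Here df itself is left untouched and result holds the filled values, which makes it easy to compare the two frames when checking whether the function did anything.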