I have a pandas DataFrame with shape 12000x100. I am attempting to apply a function to the rows where a column's values are NaN. My function is an API call that may return different responses.
import numpy as np
import pandas as pd

# API call function: returns the API response, or NaN if the call raises
def api_call(v):
    try:
        return api(v)
    except Exception:
        return np.nan

df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4],
    'v': ['a', 'b', 'c', 'd', 'e'],
    'w': [10, np.nan, np.nan, np.nan, np.nan]
})
# Apply function
df['w'].fillna(df.apply(lambda x: api_call(x['v']), axis=1), inplace=True)
If a match is found in the API call, I get an integer score between 1 and 100; otherwise, if there is an exception, I get {"status": 41}.
Here's a sample response:
54
{"status": 41}
{"status": 41}
39
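Since the responses mix integers and status dictionaries, it may help to normalize them before filling. The sketch below assumes the failure case comes back as a dict such as {"status": 41} rather than as a raised exception; to_score is a hypothetical helper name, not part of the original code.

```python
import numpy as np

def to_score(response):
    # Assumption: valid scores are numbers in the range 1..100;
    # anything else (for example {"status": 41}) is treated as missing.
    is_number = isinstance(response, (int, float)) and not isinstance(response, bool)
    if is_number and 1 <= response <= 100:
        return float(response)
    return np.nan

# The sample responses from above
responses = [54, {"status": 41}, {"status": 41}, 39]
scores = [to_score(r) for r in responses]
```

With this kind of wrapper, a status dictionary never ends up stored in the w column in place of a NaN.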
When I run this on my real pandas DataFrame, I do not see the NaNs being populated. The DataFrame is exactly the same in terms of the number of NaNs in the pandas Series. I can't figure out why it is not saving / replacing the NaNs.
It's a bit tricky to write reproducible code as the data is large and API calls require configuration etc.
CodePudding user response:
Because you slice your DataFrame (df['w']) and fill the NaN values in place (inplace=True), you fill a copy, not the original DataFrame. Assign the result back instead:
df['w'] = df['w'].fillna(df.apply(lambda x: api_call(x['v']), axis=1))
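A related variant, sketched here under assumptions, assigns through a boolean mask so that the API is only called for rows where w is actually NaN (which may matter with 12000 rows). The api function below is a hypothetical stand-in for the real API, with made-up scores, so that the example runs on its own.

```python
import numpy as np
import pandas as pd

def api(v):
    # Hypothetical stand-in for the real API: scores some values, fails on others.
    scores = {"b": 54, "e": 39}
    if v not in scores:
        raise ValueError("no match")
    return scores[v]

def api_call(v):
    try:
        return api(v)
    except Exception:
        return np.nan

df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "v": ["a", "b", "c", "d", "e"],
    "w": [10, np.nan, np.nan, np.nan, np.nan],
})

# Select only the rows that still need a value, then write the results back
# into the original DataFrame through .loc instead of using inplace=True.
mask = df["w"].isna()
df.loc[mask, "w"] = df.loc[mask, "v"].apply(api_call)
```

Rows where the stand-in API fails stay NaN, and the existing value in row 0 is left untouched.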
CodePudding user response:
   id  v     w     w
0   0  a  10.0  10.0
1   1  b   NaN   NaN
2   2  c   NaN   NaN
3   3  d   NaN   NaN
4   4  e   NaN   NaN
Are you certain that your w column is unique? One possibility is that df['w'] is actually returning multiple columns.
df['w'].fillna(0, inplace=True)
print(df)
Output:
<stdin>:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
   id  v     w     w
0   0  a  10.0  10.0
1   1  b   NaN   NaN
2   2  c   NaN   NaN
3   3  d   NaN   NaN
4   4  e   NaN   NaN
But if this were the case, you should have gotten a warning, provided your pandas version is up to date.
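One way to check for this directly, rather than relying on the warning, is to look for duplicated column labels. This is a minimal sketch on a two-row toy frame that mimics the duplicated w column; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy frame with the label "w" appearing twice
df = pd.DataFrame(
    [[0, "a", 10.0, 10.0], [1, "b", np.nan, np.nan]],
    columns=["id", "v", "w", "w"],
)

# Labels that occur more than once in the column index
dupes = df.columns[df.columns.duplicated()].tolist()

# With a duplicated label, df["w"] is a DataFrame rather than a Series
selected = df["w"]
is_frame = isinstance(selected, pd.DataFrame)
```

If dupes is non-empty on the real data, renaming or dropping the extra column before filling would be the first thing to try.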