I am trying to store an API response in a pandas column, but only under a condition: the function return value / API response should be stored in column a only for rows where m is in mlst and c is np.nan.
I tried the following, but it doesn't store the response in the a column.
Here's an example:
import pandas as pd
import numpy as np
# mock dict
d = dict({'v1': 'xxx1', 'v2': 'xxx2'})
def api_res(val):
    # make api call
    res = d.get(val)
    return res

df = pd.DataFrame({
    'm': ['M1','M2','M3'],
    'v': ['v1','v2','v3'],
    'c': [np.nan, 1, np.nan],
    'a': [np.nan, np.nan, np.nan]
})
mlst = ['M1','M2']
df[(df['m'].isin(mlst)) & (df['c'].isna())]['a'].fillna(df.apply(lambda x: api_res(x['v']), axis=1), inplace=True)
Expected output:
a
xxx1
np.nan
np.nan
CodePudding user response:
The problem arises from chained indexing. You are indexing df twice: df[...] with a boolean mask returns a new DataFrame, and ['a'] then selects a column of that new object. So you are in fact filling, in place, the NaNs of a different object, not of the original df. Therefore the original df remains the same.
Try this:
mask = df['m'].isin(mlst) & df['c'].isna()
df.loc[mask, 'a'] = df.loc[mask, 'a'].fillna(df.loc[mask, 'v'].apply(api_res))
Output:
>>> df
    m   v    c     a
0  M1  v1  NaN  xxx1
1  M2  v2  1.0   NaN
2  M3  v3  NaN   NaN
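To make the copy-versus-original point concrete, here is a minimal sketch that reuses the question's mock data. Two things in it are my own assumptions rather than part of the original code: the helper name make_df, and creating column a with dtype=object so that assigning strings into it does not trigger the incompatible-dtype warning (or error) that newer pandas versions emit for a float column.

```python
import numpy as np
import pandas as pd

d = {'v1': 'xxx1', 'v2': 'xxx2'}  # mock lookup standing in for the API

def api_res(val):
    return d.get(val)

def make_df():
    # Hypothetical helper: rebuilds the question's frame.
    # 'a' is created as object dtype (assumption) so strings can be stored in it.
    return pd.DataFrame({
        'm': ['M1', 'M2', 'M3'],
        'v': ['v1', 'v2', 'v3'],
        'c': [np.nan, 1, np.nan],
        'a': pd.Series([np.nan] * 3, dtype=object),
    })

mlst = ['M1', 'M2']
df = make_df()
mask = df['m'].isin(mlst) & df['c'].isna()

# Boolean indexing already returns a new object; .copy() only makes that explicit.
subset = df[mask].copy()
subset['a'] = subset['v'].map(api_res)

# The subset was filled, but the original frame was not.
original_untouched = bool(df['a'].isna().all())

# A single .loc call with the row mask and the column label writes into df itself.
df.loc[mask, 'a'] = df.loc[mask, 'v'].map(api_res)
```

After the last line, only row 0 of df has a value in a. The intermediate subset shows why the chained version appeared to do nothing.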
CodePudding user response:
def api_res(v, m):
    return d.get(v) + m
mlst = ['M1','M2']
mask = df.m.isin(mlst) & df.c.isna()
df.loc[mask, 'a'] = df[mask].apply(lambda x: api_res(x.v, x.m), axis=1)
Output:
    m   v    c       a
0  M1  v1  NaN  xxx1M1
1  M2  v2  1.0     NaN
2  M3  v3  NaN     NaN
If you explicitly want fillna-like functionality, that can be achieved by just adding one more requirement to the mask:
mask = df.m.isin(mlst) & df.c.isna() & df.a.isna()
df.loc[mask, 'a'] = df[mask].apply(lambda x: api_res(x.v, x.m), axis=1)
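As a rough illustration of what the extra df.a.isna() condition buys you, the sketch below uses hypothetical data (not the question's frame) in which one row already has a value in a. Column a is created as object dtype, which is my assumption to keep string assignment clean across pandas versions. Note that d.get(v) returns None for an unknown key, so this version of api_res would fail on such rows; the sample data only uses known keys.

```python
import numpy as np
import pandas as pd

d = {'v1': 'xxx1', 'v2': 'xxx2'}  # mock lookup standing in for the API

def api_res(v, m):
    # Assumes v is a known key; d.get(v) would be None otherwise.
    return d.get(v) + m

# Hypothetical frame: row 0 already holds a value that must survive.
df = pd.DataFrame({
    'm': ['M1', 'M2', 'M1'],
    'v': ['v1', 'v2', 'v2'],
    'c': [np.nan, np.nan, np.nan],
    'a': pd.Series(['keep', np.nan, np.nan], dtype=object),
})

mlst = ['M1', 'M2']

# Only rows that match the condition AND are still empty in 'a' are selected.
mask = df['m'].isin(mlst) & df['c'].isna() & df['a'].isna()
df.loc[mask, 'a'] = df[mask].apply(lambda x: api_res(x['v'], x['m']), axis=1)

result = df['a'].tolist()
```

Row 0 keeps its existing value, and the function is only called for the two rows that were still empty, which also avoids unnecessary API calls.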