Pandas column fillna from a function with condition

Time:07-02

I am trying to store an API response in a pandas column, but only for rows that meet a condition: column a should receive the function's return value (the API response) where m is in mlst and c is np.nan.

I tried the following, but it doesn't store the response in column a.

Here's an example:

import pandas as pd
import numpy as np

# mock dict
d = dict({'v1': 'xxx1', 'v2': 'xxx2'})

def api_res(val):

    # make api call
    res = d.get(val)

    return res

df = pd.DataFrame({
                   'm': ['M1','M2','M3'],
                   'v': ['v1','v2','v3'],
                   'c': [np.nan, 1, np.nan],
                   'a': [np.nan, np.nan, np.nan]
                 })

mlst = ['M1','M2']

df[(df['m'].isin(mlst)) & (df['c'].isna())]['a'].fillna(df.apply(lambda x: api_res(x['v']), axis=1), inplace=True)

Expected output:

a 

xxx1
np.nan
np.nan

CodePudding user response:

The problem arises from chained indexing. The expression df[...]['a'] indexes df twice: the boolean selection returns a new DataFrame, so fillna(..., inplace=True) fills the NaNs of that temporary object rather than those of the original df. The original df therefore stays unchanged.
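To make the copy behaviour concrete, here is a minimal sketch (not the asker's exact code). It assumes a trimmed-down frame, a placeholder value 'filled' instead of a real API response, and an object-dtype column a, since recent pandas versions warn when strings are assigned into a float64 column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'm': ['M1', 'M2', 'M3'],
    'c': [np.nan, 1, np.nan],
    # Assumption: object dtype so that strings can be stored without upcasting.
    'a': pd.Series([np.nan, np.nan, np.nan], dtype=object),
})
mask = df['m'].isin(['M1', 'M2']) & df['c'].isna()

# Boolean selection builds a new DataFrame; .copy() only makes that explicit.
subset = df[mask].copy()
subset['a'] = 'filled'             # changes the copy only
untouched = df['a'].isna().all()   # the original column is still all NaN

# A single .loc call selects rows and column at once and writes into df itself.
df.loc[mask, 'a'] = 'filled'
after_loc = df['a'].tolist()
```

Only the .loc assignment reaches df, because it resolves the row mask and the column label in one indexing operation.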

Try this:

mask = df['m'].isin(mlst) & df['c'].isna()

df.loc[mask, 'a'] = df.loc[mask, 'a'].fillna(df.loc[mask, 'v'].apply(api_res))

Output:

>>> df

    m   v    c     a
0  M1  v1  NaN  xxx1
1  M2  v2  1.0   NaN
2  M3  v3  NaN   NaN

CodePudding user response:

def api_res(v, m):
    return d.get(v) + m

mlst = ['M1','M2']

mask = df.m.isin(mlst) & df.c.isna()
df.loc[mask, 'a'] = df[mask].apply(lambda x: api_res(x.v, x.m), axis=1)

Output:

    m   v    c       a
0  M1  v1  NaN  xxx1M1
1  M2  v2  1.0     NaN
2  M3  v3  NaN     NaN

If you explicitly want fillna-like functionality, you can achieve it by adding one more condition to the mask:

mask = df.m.isin(mlst) & df.c.isna() & df.a.isna()
df.loc[mask, 'a'] = df[mask].apply(lambda x: api_res(x.v, x.m), axis=1)
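
As a rough illustration of what the extra condition buys you, the sketch below uses a variant of the example data in which row 0 already holds a value in a. The value 'keep', the all-NaN column c, and the object dtype for a are assumptions made for this demonstration only.

```python
import numpy as np
import pandas as pd

d = {'v1': 'xxx1', 'v2': 'xxx2'}  # mock lookup standing in for the API

def api_res(v, m):
    return d.get(v) + m

df = pd.DataFrame({
    'm': ['M1', 'M2', 'M3'],
    'v': ['v1', 'v2', 'v3'],
    'c': [np.nan, np.nan, np.nan],
    # Assumption: row 0 already has a value that must not be overwritten.
    'a': pd.Series(['keep', np.nan, np.nan], dtype=object),
})
mlst = ['M1', 'M2']

# The third condition restricts the update to rows where a is still missing.
mask = df.m.isin(mlst) & df.c.isna() & df.a.isna()
df.loc[mask, 'a'] = df[mask].apply(lambda x: api_res(x.v, x.m), axis=1)
result = df['a'].tolist()
```

Here only row 1 satisfies all three conditions, so api_res runs once, and the existing value in row 0 is left alone. Because the function is applied to df[mask] rather than to the whole frame, no call is made for rows outside the mask.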