Applying a function to pandas cell and storing multiple returned values in the same row

Time:10-15

So I'm on a mission to hunt down some memory hogs in my script, and having read about the problems with iterrows()/itertuples() I'm currently trying to figure out how to use vectorization to do what is probably quite simple.

The issue boils down to:

A dataframe:

df = pd.DataFrame(data=d)

   col1  col2  col3
0     1   NaN   NaN
1     2   NaN   NaN

A function returning two different values based on a single input:

def func(a):
    return a + 1, a + 5

Now I would like to apply this function to each cell in col1 that is equal to 1, and store the result in col2 and col3 of the same row.

At the moment, I'm doing this with a more extended version of this abomination:

for row in df.loc[df['col1'] == 1].itertuples():
    a,b = func(df.loc[row.Index, 'col1'])
    df.loc[row.Index,'col2'] = a
    df.loc[row.Index,'col3'] = b

Resulting in:

   col1  col2  col3
0     1   2.0   6.0
1     2   NaN   NaN
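For anyone who wants to reproduce this, here is a minimal self-contained sketch of the setup. The contents of `d` are not shown in the post, so the dict below is an assumption chosen to match the printed frame, and `func` is assumed to be `a + 1, a + 5`, which is consistent with the 2.0 and 6.0 in the result.

```python
import numpy as np
import pandas as pd

# Assumed contents of `d`; the original post does not show them.
d = {"col1": [1, 2], "col2": [np.nan, np.nan], "col3": [np.nan, np.nan]}
df = pd.DataFrame(data=d)

def func(a):
    # Two values derived from a single input
    return a + 1, a + 5

# Row-by-row loop from the question, unchanged in behaviour
for row in df.loc[df["col1"] == 1].itertuples():
    a, b = func(df.loc[row.Index, "col1"])
    df.loc[row.Index, "col2"] = a
    df.loc[row.Index, "col3"] = b

print(df)
```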

How would you re-write this to be vectorized/performant? Thanks

CodePudding user response:

Try:

df['col2'] = df.where(df["col1"] == 1)["col1"].apply(lambda x: func(x)[0])
df['col3'] = df.where(df["col1"] == 1)["col1"].apply(lambda x: func(x)[1])
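Since the function in the question is plain arithmetic, it should also accept a whole Series, so one possible variation is to call it once on the matching rows rather than twice per cell. The sketch below assumes `func` is `a + 1, a + 5`; the three-row frame is made up for illustration. It only carries over to the real, more extended function if that function is vectorizable as well.

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original post
df = pd.DataFrame({"col1": [1, 2, 1], "col2": np.nan, "col3": np.nan})

def func(a):
    # Works on scalars and on a whole Series alike
    return a + 1, a + 5

mask = df["col1"] == 1              # rows to update
a, b = func(df.loc[mask, "col1"])   # one call, two Series back
df.loc[mask, "col2"] = a            # assignment aligns on the index
df.loc[mask, "col3"] = b
```

Rows where the mask is False are never touched, so they keep whatever they held before (NaN here).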

CodePudding user response:

def func(a):
    return a + 1, a + 5

df['col2'], df['col3'] = np.where(
    df['col1'] == 1,    # condition
    func(df['col1']),   # if true, do this
    np.nan              # if false, do this
)
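It may not be obvious why a single np.where call can fill two columns. As far as I can tell it comes down to broadcasting: the tuple returned by the function becomes a 2 x n array, the condition has length n, and the result is again 2 x n, which unpacks into two rows. A small sketch with made-up data and intermediate names (`stacked`, `cond`, `out`) that are not in the answer:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original post
df = pd.DataFrame({"col1": [1, 2, 1]})

def func(a):
    return a + 1, a + 5

stacked = np.asarray(func(df["col1"]))   # shape (2, 3): one row per returned value
cond = (df["col1"] == 1).to_numpy()      # shape (3,)
out = np.where(cond, stacked, np.nan)    # cond broadcasts across both rows
df["col2"], df["col3"] = out             # unpack the two rows into two columns
```

Note that the function is evaluated for every row here, not only the matching ones; the condition just decides which results are kept.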

CodePudding user response:

For your example you can use apply and put the condition inside the function itself:

df = pd.DataFrame([[1, 0, 0], [2, 0, 0]], columns=['col1', 'col2', 'col3'])

def funct(a):
    if a == 1:
        return (a + 1, a + 5)
    return (np.nan, np.nan)  # rows that do not match get NaN in both columns

df[['col2', 'col3']] = df.col1.apply(funct).tolist()
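If the real function cannot take a whole Series, a variation on this idea is to keep the function free of the condition and run apply only on the rows that match. This is a sketch under assumptions: `func` is taken to be `a + 1, a + 5`, the data is made up, and `mask` and `pairs` are names introduced here.

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original post
df = pd.DataFrame({"col1": [1, 2, 1], "col2": np.nan, "col3": np.nan})

def func(a):
    return a + 1, a + 5

mask = df["col1"] == 1
# apply runs once per matching cell and yields a list of (col2, col3) tuples
pairs = df.loc[mask, "col1"].apply(func).tolist()
# reshape(-1, 2) keeps a two-column shape even if the list is empty
df.loc[mask, ["col2", "col3"]] = np.array(pairs, dtype=float).reshape(-1, 2)
```

This still calls the function in Python for each matching row, so it mainly saves work when only a small share of rows meets the condition.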