Modifying columns in a dataframe based off of other values in the row

Time:11-05

With this data frame

A B C
'data' 2 3
'dota' 3 4

I would like to modify the values in columns B and C based on the value in column A. Currently I can do it with something like:

df['B'] = df['A'].map(lambda x: cache[x]['mapVal'])
df['C'] = df['A'].map(lambda x: cache[x]['diffMapVal'])
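
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the setup. The contents of `cache` are invented for illustration; only the keys `mapVal` and `diffMapVal` come from the snippet above.

```python
import pandas as pd

# Hypothetical cache: the numeric values are made up for illustration.
cache = {
    "data": {"mapVal": 10, "diffMapVal": 20},
    "dota": {"mapVal": 30, "diffMapVal": 40},
}

df = pd.DataFrame({"A": ["data", "dota"], "B": [2, 3], "C": [3, 4]})

# Look up each row's A value in the cache, using a different key per column.
df["B"] = df["A"].map(lambda x: cache[x]["mapVal"])
df["C"] = df["A"].map(lambda x: cache[x]["diffMapVal"])
```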

In this case the values of columns B and C depend on the value of column A. Each column also needs its own key into the cache (e.g. B has its own key and C has its own key).

To throw another twist into the mix, there are certain conditions under which B and C should not be modified: when foo(val(B)) == True, where foo is a predicate and val gives the value of column B on the current row (and likewise for C).

So essentially the task is

if foo(B) and foo(C) then continue
set value of B equal to cache[val(A)][keyOfB]
set value of C equal to cache[val(A)][keyOfC]

After looking at a few questions, this question seems relevant, but I can't quite figure out how to access the value of A in the dataframe, given that it's being filtered by the where statement.

Thanks

CodePudding user response:

Use apply with axis=1 to apply your logic row-wise:

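def func(row):
    # Overwrite B only when the predicate does not protect its current value
    if not foo(row['B']):
        row['B'] = cache[row['A']]['mapVal']
    # Same check for C, using C's own key
    if not foo(row['C']):
        row['C'] = cache[row['A']]['diffMapVal']
    return row

df = df.apply(func, axis=1)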
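
If the frame is large, a column-wise variant with `Series.where` may be worth trying instead of a row-wise apply. The sketch below is only an illustration under assumptions: the `cache` values and the body of `foo` are placeholders I made up, and like the function above it checks B and C independently. The lookup on A is computed for the whole column first, so the value of A stays available for every row even though only some rows end up replaced.

```python
import pandas as pd

# Hypothetical stand-ins: the cache values and foo are invented for illustration.
cache = {
    "data": {"mapVal": 10, "diffMapVal": 20},
    "dota": {"mapVal": 30, "diffMapVal": 40},
}

def foo(value):
    # Placeholder predicate: treat even values as "do not modify".
    return value % 2 == 0

df = pd.DataFrame({"A": ["data", "dota"], "B": [2, 3], "C": [3, 4]})

for col, key in [("B", "mapVal"), ("C", "diffMapVal")]:
    keep = df[col].map(foo)                     # True where the current value must stay
    new = df["A"].map(lambda a: cache[a][key])  # replacement looked up from column A
    df[col] = df[col].where(keep, new)          # keep where True, otherwise take new
```

`Series.where(cond, other)` keeps the original value where `cond` is True and takes the value from `other` elsewhere, aligned on the index.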