With this data frame
A | B | C |
---|---|---|
'data' | 2 | 3 |
'dota' | 3 | 4 |
I would like to modify the values in columns B and C based on the value in column A. Currently I can do it using something like:
df['B'] = df['A'].map(lambda x: cache[x]['mapVal'])
df['C'] = df['A'].map(lambda x: cache[x]['diffMapVal'])
In this case the values of columns B and C depend on the value of column A. Each column also needs its own key into the cache (e.g. B has its own key and C has its own key).
To throw another twist into the mix, B and C should not be modified when foo(val(B)) == True, where foo is a predicate and val gives the value of column B on whatever row we're on.
So essentially the task is
if foo(B) and foo(C) then continue
set value of B equal to cache[val(A)][keyOfB]
set value of C equal to cache[val(A)][keyOfC]
After looking at a few questions, this question seems relevant, but I can't work out how to access the value of A in the dataframe, given that the rows are being filtered by the where statement.
Thanks
CodePudding user response:
Use apply with axis=1 to apply your logic row-wise:
def func(row):
    if not foo(row.B):
        row['B'] = cache[row.A]['mapVal']
    if not foo(row.C):
        row['C'] = cache[row.A]['diffMapVal']
    return row

df = df.apply(func, axis=1)
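If the frame is large, a column-wise version may be worth trying, since apply with axis=1 calls a Python function once per row. Below is a minimal sketch of that idea using Series.map and Series.where. The contents of cache, the body of foo, and the keys dict are made up for illustration, because the question does not show them; swap in your own.

```python
import pandas as pd

# Hypothetical stand-ins: the thread never shows `cache` or `foo`,
# so these values exist only to make the example runnable.
cache = {
    "data": {"mapVal": 10, "diffMapVal": 20},
    "dota": {"mapVal": 30, "diffMapVal": 40},
}


def foo(value):
    """Assumed predicate: True means 'leave this cell unchanged'."""
    return value >= 4


df = pd.DataFrame({"A": ["data", "dota"], "B": [2, 3], "C": [3, 4]})

# Column name -> key to read from cache[val(A)], as described in the question.
keys = {"B": "mapVal", "C": "diffMapVal"}

for col, key in keys.items():
    # Boolean Series: True on rows where foo holds for this column.
    keep = df[col].map(foo)
    # Replacement values looked up from the value of A on the same row.
    looked_up = df["A"].map(lambda a: cache[a][key])
    # Series.where keeps the original value where `keep` is True
    # and takes the looked-up value everywhere else.
    df[col] = df[col].where(keep, looked_up)

print(df)
```

With these made-up values, both rows of B are replaced, and only the first row of C is replaced because foo(4) is True. Like the apply version, this checks foo on each column independently; if a row should be skipped only when foo holds for both B and C, compute a single mask such as df['B'].map(foo) & df['C'].map(foo) and pass it to where for both columns.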