I am trying to replace a value in one column based on the value in another column of the same row, but I can't figure out how to do it. My code looks like this:
def get_new_value_b(db, value_a):
    # Take a value_a and return value_b
    ...
a_list = ["a", "b", "c"]
df.loc[df["value_a"] in a_list, "value_b"] = get_new_value_b(db,df["value_a"])
What I want to do here is set value_b for the rows where value_a exists in a_list, but the condition raises an error:
Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
Also, if the condition is True, I want to get a new value by passing value_a to a function (get_new_value_b), but it seems to refer to the entire column, not the specific value in the row.
How can I fix this issue?
CodePudding user response:
Use Series.isin and Series.apply:
df.loc[df["value_a"].isin(a_list), "value_b"] = df["value_a"].apply(get_new_value_b)
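As a small illustration of why the original condition fails: `in` on a plain list compares each list element with the whole Series and then asks for a single truth value, which pandas refuses to give. `Series.isin` instead returns one boolean per row. The Series `s` and the variable `error_message` below are made up for this sketch.

```python
import pandas as pd

s = pd.Series(["a", "b", "d"])   # hypothetical stand-in for df["value_a"]
a_list = ["a", "b", "c"]

error_message = None
try:
    # list.__contains__ evaluates `"a" == s`, which is a boolean Series,
    # and then needs one True/False from it, so pandas raises ValueError.
    s in a_list
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Element-wise membership test: one boolean per row, usable as a mask in df.loc.
mask = s.isin(a_list)
print(mask.tolist())
```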
It is better to call the function only for the matched rows:
import pandas as pd

df = pd.DataFrame({"value_a": ["a", "b", "d", "f", "c"]})

def get_new_value_b(value_a):
    # Take a value_a and return value_b
    return '_' + value_a

a_list = ["a", "b", "c"]
# Boolean mask: True for rows whose value_a is in a_list
m = df["value_a"].isin(a_list)
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(get_new_value_b)
print(df)
  value_a value_b
0       a      _a
1       b      _b
2       d     NaN
3       f     NaN
4       c      _c
EDIT: If you need to pass multiple parameters, use a lambda function:
m = df["value_a"].isin(a_list)
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(lambda x: get_new_value_b(db, x))
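For a version that can be run end to end, here is a minimal sketch of the two-parameter case. The question never shows what `db` is, so it is assumed here to be a plain dict, and the body of `get_new_value_b` is a made-up lookup; only the masking and `apply` pattern comes from the answer above. `functools.partial` is shown as one possible alternative to the lambda.

```python
from functools import partial

import pandas as pd

# Assumption: `db` is a plain dict mapping value_a to value_b.
db = {"a": "alpha", "b": "beta", "c": "gamma"}

def get_new_value_b(db, value_a):
    # Hypothetical body: look value_a up in db (None if it is missing).
    return db.get(value_a)

df = pd.DataFrame({"value_a": ["a", "b", "d", "f", "c"]})
a_list = ["a", "b", "c"]

# Call the function only for rows whose value_a is in a_list.
m = df["value_a"].isin(a_list)
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(lambda x: get_new_value_b(db, x))

# Same call with the first argument bound up front instead of a lambda.
via_partial = df.loc[m, "value_a"].apply(partial(get_new_value_b, db))

print(df)
```

Rows that are not in a_list are never passed to the function and keep NaN in value_b.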