Pandas: How to refer to the row value while replacing a value

Time:05-31

I am trying to replace a value in a column based on another value in the same row, but I can't figure out how to do it. My code is below.

def get_new_value_b(db, value_a):
    # Take a value_a and return value_b
    ...


a_list = ["a", "b", "c"]
df.loc[df["value_a"] in a_list, "value_b"] = get_new_value_b(db,df["value_a"]) 

What I want to do here is replace value_b when value_a exists in a_list. But the condition raises an error:

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Also, when the condition is True, I want to get the new value by passing that row's value_a to a function (get_new_value_b), but df["value_a"] seems to refer to the entire column, not the specific value in the row.

How can I fix this issue?

Answer:

Use Series.isin and Series.apply:

df.loc[df["value_a"].isin(a_list), "value_b"] = df["value_a"].apply(get_new_value_b)
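
To illustrate why the original condition fails, here is a minimal sketch using a small sample frame (the data is an assumption for illustration). The `in` operator needs a single True/False, which a Series cannot provide, whereas `Series.isin` checks each row and returns a boolean mask:

```python
import pandas as pd

# Sample data, assumed for illustration only
df = pd.DataFrame({"value_a": ["a", "b", "d", "f", "c"]})
a_list = ["a", "b", "c"]

# `in` asks for one True/False answer; comparing a whole Series
# cannot be reduced to a single boolean, so pandas raises ValueError.
try:
    df["value_a"] in a_list
    in_error = None
except ValueError as exc:
    in_error = str(exc)
print(in_error)

# `isin` tests each row separately and returns a boolean Series (a mask).
mask = df["value_a"].isin(a_list)
print(mask.tolist())  # [True, True, False, False, True]
```

The mask can then be used as the row selector in `df.loc[mask, "value_b"]`.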

It is better to call the function only for the matched rows:

import pandas as pd

df = pd.DataFrame({"value_a": ["a", "b", "d", "f", "c"]})

def get_new_value_b(value_a):
    # Take a value_a and return value_b
    return '_' + value_a

a_list = ["a", "b", "c"]
# Boolean mask: True for rows whose value_a is in a_list
m = df["value_a"].isin(a_list)
# Apply the function only to the matched rows
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(get_new_value_b)
print(df)
  value_a value_b
0       a      _a
1       b      _b
2       d     NaN
3       f     NaN
4       c      _c
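
The difference between the two variants can be made visible by recording which values reach the function. This is only a sketch: the `calls` list is hypothetical bookkeeping added for the demonstration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"value_a": ["a", "b", "d", "f", "c"]})
a_list = ["a", "b", "c"]
calls = []  # hypothetical bookkeeping: records every value passed to the function

def get_new_value_b(value_a):
    calls.append(value_a)
    return "_" + value_a

m = df["value_a"].isin(a_list)

# Unmasked right-hand side: the function runs for every row, and index
# alignment on assignment then drops the results for unmatched rows.
df.loc[m, "value_b"] = df["value_a"].apply(get_new_value_b)
calls_unmasked = list(calls)

calls.clear()

# Masked right-hand side: the function runs only for the matched rows.
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(get_new_value_b)
calls_masked = list(calls)

print(calls_unmasked)  # ['a', 'b', 'd', 'f', 'c']
print(calls_masked)    # ['a', 'b', 'c']
```

Both assignments produce the same column. The masked version simply avoids the unnecessary calls, which matters when the function is slow or fails on values outside a_list.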

EDIT: If you need to pass multiple parameters, use a lambda function:

m = df["value_a"].isin(a_list)
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(lambda x: get_new_value_b(db, x))
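
A runnable sketch of the multi-parameter case, assuming the two-argument signature `get_new_value_b(db, value_a)` from the question. The question never shows what `db` is, so the dict used here is a hypothetical stand-in. `functools.partial` is shown as an equivalent alternative to the lambda:

```python
from functools import partial

import pandas as pd

df = pd.DataFrame({"value_a": ["a", "b", "d", "f", "c"]})
a_list = ["a", "b", "c"]

# Hypothetical stand-in for `db`: a plain dict used as a lookup table.
db = {"a": "alpha", "b": "beta", "c": "gamma"}

def get_new_value_b(db, value_a):
    # Take a value_a and return value_b (a dict lookup here, as an assumption)
    return db[value_a]

m = df["value_a"].isin(a_list)

# Lambda: `db` is fixed, and apply supplies each row's value as `x`.
df.loc[m, "value_b"] = df.loc[m, "value_a"].apply(lambda x: get_new_value_b(db, x))

# Equivalent with functools.partial, which pre-fills the first argument.
df.loc[m, "value_b_partial"] = df.loc[m, "value_a"].apply(partial(get_new_value_b, db))
print(df)
```

Because the function is applied only to the masked rows, the lookup never sees "d" or "f", which are missing from the sample dict and would raise a KeyError.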