I need to apply one of two functions to a single column, depending on the value in another column. My DataFrame looks like the one below. I want to apply veg_pro to the produce column if the category is vegetable, and fruit_pro if it is fruit.
Produce          Category
apple is good    fruit
corn is bad      vegetable
beans is good    vegetable
grape if good    fruit
My functions look like this:
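from nltk.tokenize import RegexpTokenizer

def veg_pro(text):
    # gaps=True: the pattern describes the separators between tokens
    reg_tokenizer = RegexpTokenizer(r'\s ', gaps=True)
    terms = reg_tokenizer.tokenize(text)
    return terms

def fruit_pro(text):
    # default gaps=False: the pattern describes the tokens themselves
    reg_tokenizer = RegexpTokenizer(r'[\w .] ')
    terms = reg_tokenizer.tokenize(text)
    return terms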
This is how I tried to apply them:
df['produce'] = df['produce'].apply(
    lambda x: veg_pro(x) if df['Category'] == 'vegetable' else fruit_pro(x))
which raises:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
CodePudding user response:
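Inside your lambda, df['Category'] == 'vegetable' compares the whole column at once, so the if receives a Series of booleans instead of a single True or False. Instead of calling apply on one column, call it on the DataFrame with axis=1, so the lambda gets one row at a time and can read both columns, like this: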
df['produce'] = df.apply(
    lambda x: veg_pro(x["produce"]) if x["Category"] == "vegetable"
    else fruit_pro(x["produce"]),
    axis=1)
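For reference, here is a minimal, self-contained sketch of the same idea, assuming pandas is installed. The two tokenizers are swapped for deliberately simple placeholder functions (not the NLTK versions from the question) so that it is visible which branch ran, and the "tokens" column name is made up for illustration. It also reproduces the original error and shows, as an alternative that is not part of the answer above, a plain zip over both columns.

```python
import pandas as pd


# Placeholder functions (assumption): stand-ins for the NLTK tokenizers in
# the question, chosen so the output shows which one handled each row.
def veg_pro(text):
    # Split on whitespace, keep the original case.
    return text.split()


def fruit_pro(text):
    # Split on whitespace after upper-casing.
    return text.upper().split()


df = pd.DataFrame(
    {
        "produce": ["apple is good", "corn is bad", "beans is good", "grape if good"],
        "Category": ["fruit", "vegetable", "vegetable", "fruit"],
    }
)

# Why the original attempt fails: the comparison yields one boolean per row,
# and a multi-element Series cannot be reduced to a single True/False.
error_message = None
try:
    bool(df["Category"] == "vegetable")
except ValueError as exc:
    error_message = str(exc)

# Row-wise apply: with axis=1 the lambda receives one row at a time,
# so row["Category"] is a single string.
row_wise = df.apply(
    lambda row: veg_pro(row["produce"])
    if row["Category"] == "vegetable"
    else fruit_pro(row["produce"]),
    axis=1,
)

# Alternative sketch: iterate over both columns together without apply.
zipped = [
    veg_pro(text) if category == "vegetable" else fruit_pro(text)
    for text, category in zip(df["produce"], df["Category"])
]

# "tokens" is a hypothetical column name used only in this sketch.
df["tokens"] = row_wise
print(df)
```

Both approaches call the function once per row, so neither is vectorised. The zip version just skips building a row object for each row, which may matter on large frames.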