My pandas dataframe has a "SNIPPET" column and a "Q2" column. For each row, I want to replace the value in Q2 with "positive" if the term "xxpos" occurs in the "SNIPPET" column and the value in Q2 == 1. Likewise, I want to replace the value in Q2 with "negative" if the term "xxneg" occurs in the "SNIPPET" column and the value in Q2 == 1, and so on.
I tried a few things, including the following but without success:
df['Q2'] = np.where(("xxpos" in df["SNIPPET"]) & (df['Q2'] == 1) ,"Positive", df['Q2'])
What would be the easiest solution to deal with the multiple conditions?
CodePudding user response:
You can try the following code.
df.loc[(df['Q2']==1) & (df['SNIPPET'].str.contains('xxpos')), 'Q2'] = 'Positive'
df.loc[(df['Q2']==1) & (df['SNIPPET'].str.contains('xxneg')), 'Q2'] = 'Negative'
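For reference, here is a minimal runnable sketch of this approach on made-up data (the sample rows are an assumption, since the original dataframe is not shown). It also shows why the attempt in the question has no effect: "xxpos" in df["SNIPPET"] tests the index labels of the Series, not its values. The cast to object and the na=False and regex=False arguments are additions that are not part of the answer above; they let the column hold both integers and strings and treat missing snippets as non-matches.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
df = pd.DataFrame({
    "SNIPPET": ["foo xxpos bar", "xxneg here", "xxpos again", None, "nothing"],
    "Q2": [1, 1, 2, 1, 1],
})

# `in` on a Series checks the index labels (0..4 here), not the cell values,
# so this is a single False rather than a per-row mask.
in_check = "xxpos" in df["SNIPPET"]

# Let the column hold integers and strings side by side.
df["Q2"] = df["Q2"].astype(object)

# Per-row boolean masks; na=False treats missing snippets as "no match".
is_one = df["Q2"] == 1
has_pos = df["SNIPPET"].str.contains("xxpos", na=False, regex=False)
has_neg = df["SNIPPET"].str.contains("xxneg", na=False, regex=False)

df.loc[is_one & has_pos, "Q2"] = "Positive"
df.loc[is_one & has_neg, "Q2"] = "Negative"

print(df)
```

Because is_one is computed once before any assignment, a snippet containing both terms would end up as "Negative", since the later assignment wins.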
CodePudding user response:
Use np.select. This should be the most performant.
conds = [(df['SNIPPET'].str.contains('xxpos')) & (df['Q2'].eq(1)), (df['SNIPPET'].str.contains('xxneg')) & (df['Q2'].eq(1))]
choices = ['Positive', 'Negative']
output = np.select(conds, choices, default=df['Q2'])
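As a self-contained sketch of the np.select approach, again on made-up sample data (an assumption, since the original dataframe is not shown), the result can be written back to the column like this. Passing the default as an object array is my own addition rather than part of the answer above; it is meant to keep the untouched integer values as they are when they are mixed with the string labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
df = pd.DataFrame({
    "SNIPPET": ["foo xxpos bar", "xxneg here", "xxpos again", None, "nothing"],
    "Q2": [1, 1, 2, 1, 1],
})

is_one = df["Q2"].eq(1)

# One boolean array per condition; the first matching condition wins.
conds = [
    (df["SNIPPET"].str.contains("xxpos", na=False) & is_one).to_numpy(dtype=bool),
    (df["SNIPPET"].str.contains("xxneg", na=False) & is_one).to_numpy(dtype=bool),
]
choices = ["Positive", "Negative"]

# Rows matching no condition keep their current Q2 value.
default = df["Q2"].astype(object).to_numpy()

df["Q2"] = np.select(conds, choices, default=default)

print(df)
```

Unlike two successive .loc assignments, np.select picks the first condition that matches, so a snippet containing both terms would become "Positive" here.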