I have a data frame df2 and want to generate a new column called 'tag'
based on if/else logic applied to two existing columns.
import pandas as pd

df2 = pd.DataFrame({
    'NOTES': ["PREPAID_HOME_SCREEN_MAMO", "SCREEN_MAMO",
              "> Unable to connect internet>4G Compatible>Set",
              "No>Not Barred>Active>No>Available>Others>",
              "Internet Not Working>>>Unable To Connect To"],
    'col_1': ["voice", "voice", "data", "other", "voice"],
    'col_2': ["DATA", "voice", "VOICE", "VOICE", "voice"]})
The logic and my attempt are:
df2['Tag'] =
    if df['col_1']=='data':
        return "Yes"
    elif df['col_2']:
        return "Yes"
    else:
        return "No"
CodePudding user response:
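The problem is that you are trying to assign the result of an if statement. In Python, if is a statement rather than an expression, so it cannot appear on the right-hand side of an assignment, which is what causes the syntax error.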
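To illustrate the difference, here is a minimal sketch on a plain scalar. The variable col_1 below is a made-up stand-in for a single cell, not part of your data frame. A conditional expression produces a value, so it can be assigned:

```python
col_1 = "data"  # hypothetical scalar standing in for one cell of col_1

# A conditional expression ("X if cond else Y") yields a value,
# so it is allowed on the right-hand side of an assignment.
tag = "Yes" if col_1 == "data" else "No"
print(tag)
```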
There are many ways to do this; here is one using pandas.DataFrame.apply.
trans_fn = lambda row: "Yes" if row['col_1'] == 'data' and row['col_2'] else "No"
df2['tag'] = df2.apply(trans_fn, axis=1) # apply trans_fn to each row
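As one of the other ways, the same rule can probably be written without a row-wise apply. The sketch below uses numpy.where on a boolean mask. It assumes col_2 holds strings, so that "truthy" is taken to mean "not an empty string", and it rebuilds only the two columns the rule needs.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'col_1': ["voice", "voice", "data", "other", "voice"],
    'col_2': ["DATA", "voice", "VOICE", "VOICE", "voice"],
})

# Boolean mask: col_1 equals 'data' AND col_2 is not an empty string
# (assumption: col_2 contains strings, so non-empty stands in for truthy).
mask = (df2['col_1'] == 'data') & (df2['col_2'] != '')

# Pick "Yes" where the mask is True, "No" elsewhere.
df2['tag'] = np.where(mask, "Yes", "No")
print(df2)
```

On larger frames a vectorized mask like this usually runs faster than apply with axis=1, since the comparison is done column-wise rather than one row at a time.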