I want to insert a new column called "Sponsor" whose values come from multiple columns.
> Current Data
Program  Source        Region  Owner
A        Global        ECAN    Girl
B        Regional      US      Boy
C        Delta Global  EMEA    Girl
> Insert a Sponsor column whose values are based on the logic below
If (Program == "A" OR Program == "B") AND the Source column contains "Global", use the value from the Owner column; otherwise, keep the value from the Source column.
I tried the following, but I'm a little confused:
def SetSponsor(row):
    if str(row['Source']).contains('Global') & (row['Program'] == 'A') | (row['Program'] == 'B') :
        return (row['Owner'])
    else :
        return (row['Source'])

df['Sponsor'] = df.apply(lambda row: SetSponsor(row), axis=1)
CodePudding user response:
Use np.where for complex conditions:
- contains is a pandas string method, so it must be called through the .str accessor: Series.str.contains(). A plain Python str has no contains method.
- to check whether an element is one of a list of values, Series.isin(values) is convenient.
import numpy as np

df['Sponsor'] = np.where(df['Source'].str.contains('Global') & df['Program'].isin(['A', 'B']),
                         df['Owner'], df['Source'])
  Program        Source Region Owner       Sponsor
0       A        Global   ECAN  Girl          Girl
1       B      Regional     US   Boy      Regional
2       C  Delta Global   EMEA  Girl  Delta Global
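
For anyone who wants to reproduce this end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the sample data in the question, and it assumes the intended grouping is (Program is A or B) AND Source contains "Global". The set_sponsor function and the Sponsor_apply column are illustrative names, not part of the original post; they show how the row-wise attempt could be written with plain Python operators (in, and) if apply is preferred over np.where.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "Program": ["A", "B", "C"],
    "Source": ["Global", "Regional", "Delta Global"],
    "Region": ["ECAN", "US", "EMEA"],
    "Owner": ["Girl", "Boy", "Girl"],
})

# Vectorized version: build a boolean mask, then pick Owner or Source per row.
# regex=False treats "Global" as a literal substring.
mask = df["Source"].str.contains("Global", regex=False) & df["Program"].isin(["A", "B"])
df["Sponsor"] = np.where(mask, df["Owner"], df["Source"])

# Row-wise alternative (hypothetical helper): inside apply, each row holds
# scalar values, so use Python's `in` and `and` rather than .contains, & or |.
def set_sponsor(row):
    if "Global" in str(row["Source"]) and row["Program"] in ("A", "B"):
        return row["Owner"]
    return row["Source"]

df["Sponsor_apply"] = df.apply(set_sponsor, axis=1)

print(df)
```

Both columns should agree on this sample. The np.where version is usually the better choice on larger frames because it avoids calling a Python function once per row.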