New to Python. I am trying to change all the values in one column of my DataFrame where the text contains the word "employed" in any case (for example "Employed" or "EMPLOYED"). Should I use a lambda function to loop through the column? If not, what is the best way to do this?
import pandas as pd

df = pd.DataFrame([
    ['Self-employed', 1, 1],
    ['Self employed contract labour', 1, 1],
    ['Self Employed', 1, 0],
    ['N/A(Self employed)', 1, 0],
    ['SELF EMPLOYED', 1, 0]
], columns=['A', 'B', 'C'])
df
Expected Output:
['Self Employed',1,1],
['Self Employed',1,1],
['Self Employed',1,0],
['Self Employed',1,0],
['Self Employed',1,0]
Answer:
No lambda or explicit loop is needed here. str.contains combined with boolean indexing should do the trick:
df.loc[df['A'].str.contains('employed', case=False), 'A'] = 'Self Employed'
Output:
               A  B  C
0  Self Employed  1  1
1  Self Employed  1  1
2  Self Employed  1  0
3  Self Employed  1  0
4  Self Employed  1  0
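
If the column might also contain missing values or rows that should be left alone, the same approach can be sketched as below. The sample rows ("Retired" and the NaN entry) are made up for illustration and are not part of the original data. Passing na=False is an assumption about how missing values should be handled: it makes them count as "no match".

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two rows that should change, one that should not,
# and one missing value.
df = pd.DataFrame({
    "A": ["Self-employed", "SELF EMPLOYED", "Retired", np.nan],
    "B": [1, 1, 1, 1],
})

# case=False makes the match case-insensitive;
# na=False treats missing values as non-matching, so the mask is purely boolean.
mask = df["A"].str.contains("employed", case=False, na=False)

# Overwrite only the matching rows in column A.
df.loc[mask, "A"] = "Self Employed"

print(df)
```

Note that this is a substring match, so a value such as "Unemployed" would also be replaced. If that matters for your data, a stricter pattern would be needed.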