I have two columns: the first contains strings that include X or F, and the second is empty. For each row, if there is any X in Column1 I want to assign 'YES' to Column2; if there is no X, I want to assign 'NO'. Every time I run my code, it assigns 'YES' to all of the rows.
This is an example of how it should look:
My code:
for row in df['Column2']:
    if df['Column1'].str.contains('X').any():
        df['Column2'] = 'YES'
    else:
        df['Column2'] = 'NO'
CodePudding user response:
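You are executing a vectorized operation on every pass through the loop. The condition df['Column1'].str.contains('X').any() looks at the whole column, not the current row, so it is True as soon as any row contains an X. Each iteration then assigns 'YES' to the entire Column2.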
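To make the explanation above concrete, here is a minimal sketch of what the loop body evaluates. The two-row DataFrame is hypothetical sample data modelled on the question, not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data: one row with X, one row without.
df = pd.DataFrame({"Column1": ["....X.X.X", "....F.F.F"]})

# str.contains gives one boolean per row.
mask = df["Column1"].str.contains("X")

# .any() collapses that mask into a single boolean for the whole column.
whole_column_has_x = bool(mask.any())

# Assigning a scalar to a column broadcasts it to every row,
# which is why every row ended up as 'YES' in the question.
df["Column2"] = "YES" if whole_column_has_x else "NO"
print(df)
```

The per-row information lives in `mask`; it is lost as soon as `.any()` is called.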
Using numpy, you could do:
import numpy as np
df['Column2'] = np.where(df['Column1'].str.contains('X'), 'YES', 'NO')
print(df)
Result
Column1 Column2
0 ....X.X.X.X..X.X. YES
1 ....X.X.X.X..X.X. YES
2 ....X.X.X.X..X.X. YES
3 ....X.X.X.X..X.X. YES
4 ....F.F.F NO
5 ....F.F.F NO
6 ....F.F.F NO
7 ....F.F.F NO
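For anyone who wants to reproduce this end to end, the following is a self-contained sketch. The sample rows are assumptions based on the output shown above, and the `regex=False` and `na=False` arguments are optional additions (not part of the original answer) that treat 'X' as a literal substring and count missing values as "no X".

```python
import numpy as np
import pandas as pd

# Assumed sample data, including one missing value to show the na=False behaviour.
df = pd.DataFrame({"Column1": ["....X.X.X.X..X.X.", "....F.F.F", None]})

# One boolean per row; missing values are treated as not containing 'X'.
has_x = df["Column1"].str.contains("X", regex=False, na=False)

# np.where picks 'YES' or 'NO' element by element, so no loop is needed.
df["Column2"] = np.where(has_x, "YES", "NO")
print(df)
```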
CodePudding user response:
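You can also use str.find to look for 'X'. It does a plain substring search (not a regex) and returns the position of the first match, or -1 if there is no match, so the test is >= 0 rather than > 1:
df['Column1'].str.find('X') >= 0
This is already a per-row boolean Series, so you can avoid the loop entirely:
df['Column2'] = (df['Column1'].str.find('X') >= 0).map({True: 'YES', False: 'NO'})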
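As a quick check of how `str.find` behaves, here is a small sketch on hypothetical strings where the X sits at different positions. It shows why the comparison should be `>= 0`: a threshold such as `> 1` would miss an X in the first two characters.

```python
import pandas as pd

# Hypothetical strings with X at index 0, 1, 2, and absent.
s = pd.Series(["X...", ".X..", "..X.", "...."])

# Position of the first 'X' in each string, or -1 when it is absent.
positions = s.str.find("X")

# A threshold of > 1 misses matches at index 0 and 1.
too_strict = positions > 1

# >= 0 is True exactly when 'X' occurs somewhere in the string.
found = positions >= 0

labels = found.map({True: "YES", False: "NO"})
print(pd.DataFrame({"text": s, "position": positions, "label": labels}))
```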