I have two columns. I want to fill text in the second column based on the values in the first column.
Here is my code:
import numpy as np
import pandas as pd
df = pd.DataFrame({'value':[100,10,-5,2],'text':['fine','good',np.nan,np.nan]})
df['text'] = np.where(df['value']<5,'bad')
Present output:
ValueError: either both or neither of x and y should be given
Expected output:
df =
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
What is the issue with my code?
Update: I timed the three approaches from the answers below, and np.where is the fastest. These timings are on my original df, which has about a quarter million rows:
%timeit df['text'] = np.where(df['value']<5,'bad',df['text'])
18.2 ms ± 1.4 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df.loc[df['value']<5,'text'] = 'bad'
31.3 ms ± 4.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit df['text'] = df['text'].mask(df['value']<5, 'bad')
22.8 ms ± 602 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
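Before comparing timings it is worth confirming that the three approaches really produce the same column. The sketch below does that on a synthetic frame; the seed, the value range, the placeholder text 'ok', and the helper names with_where, with_loc and with_mask are all assumptions made for illustration, not the original data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the original df (assumed shape: 250,000 rows).
rng = np.random.default_rng(0)
n = 250_000
base = pd.DataFrame({'value': rng.integers(-100, 100, size=n), 'text': 'ok'})

def with_where(df):
    # np.where with both the "true" and the "false" value supplied
    out = df.copy()
    out['text'] = np.where(out['value'] < 5, 'bad', out['text'])
    return out

def with_loc(df):
    # boolean-indexed assignment on the selected rows
    out = df.copy()
    out.loc[out['value'] < 5, 'text'] = 'bad'
    return out

def with_mask(df):
    # Series.mask replaces values where the condition is True
    out = df.copy()
    out['text'] = out['text'].mask(out['value'] < 5, 'bad')
    return out

results = [f(base)['text'].tolist() for f in (with_where, with_loc, with_mask)]
same = results[0] == results[1] == results[2]
print(same)
```

Each helper works on a copy, so the functions can be passed to %timeit repeatedly without one run changing the input of the next; note that the copy itself is then included in the measured time.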
CodePudding user response:
Assign 'bad' directly to the matching rows with .loc:
df.loc[df['value']<5, 'text'] = 'bad'
df
Out[67]:
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
CodePudding user response:
You need to supply the value to use where the condition is False:
df['text'] = np.where(df['value'] < 5, 'bad', df['text'])
print(df)
# Output
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
CodePudding user response:
As the error message states, you should provide a second value that np.where
should map to if the condition doesn't hold:
df['text'] = np.where(df['value']<5, 'bad', df['text'])
This outputs:
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
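To make the "both or neither" wording of the error concrete, here is a small sketch of the call forms of np.where on the frame from the question. The variable names cond, idx, filled and raised are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [100, 10, -5, 2],
                   'text': ['fine', 'good', np.nan, np.nan]})
cond = df['value'] < 5

# Neither x nor y: returns a tuple of index arrays where cond is True.
idx = np.where(cond)

# Both x and y: takes 'bad' where cond is True, df['text'] elsewhere.
filled = np.where(cond, 'bad', df['text'])

# Only x: this is the call from the question, and it raises.
try:
    np.where(cond, 'bad')
    raised = False
except ValueError:
    raised = True

print(idx[0], list(filled), raised)
```

Passing df['text'] as the third argument is what preserves the existing 'fine' and 'good' entries.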
CodePudding user response:
You can try Series.mask, which replaces the values where the condition is True:
df['text'] = df['text'].mask(df['value'] < 5, 'bad')
print(df)
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
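For comparison, Series.where is the mirror image of Series.mask: it keeps values where the condition is True and replaces the rest. A short sketch on the same frame, with masked and kept as illustrative names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [100, 10, -5, 2],
                   'text': ['fine', 'good', np.nan, np.nan]})
cond = df['value'] < 5

# mask: replace with 'bad' where cond is True
masked = df['text'].mask(cond, 'bad')

# where: keep where the condition is True, so the condition is inverted
kept = df['text'].where(~cond, 'bad')

print(masked.tolist())
print(masked.equals(kept))
```

Both calls return a new Series and leave df unchanged until the result is assigned back to df['text'].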