Python dataframe: fill text in certain rows if other columns satisfied

Time:05-27

I have two columns. I want to fill text in the second column based on the values in the first column.

Here is my code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'value':[100,10,-5,2],'text':['fine','good',np.nan,np.nan]})
df['text'] = np.where(df['value']<5,'bad')

Present output:

ValueError: either both or neither of x and y should be given

Expected output:

df = 

   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad

What is the issue with my code?

Update: I timed the three approaches from the answers below, and numpy takes the cake. This is on my original df containing a quarter million rows:

%timeit df['text'] = np.where(df['value']<5,'bad',df['text'])
18.2 ms ± 1.4 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit df.loc[df['value']<5,'text'] = 'bad'
31.3 ms ± 4.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit df['text'] = df['text'].mask(df['value']<5, 'bad')
22.8 ms ± 602 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
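
For anyone who wants to repeat this comparison without IPython, here is a rough sketch using the standard timeit module. The data is synthetic (the original df is not shown), so the row count, value range, seed, and the helper names with_where, with_loc and with_mask are all assumptions made for illustration. Each run works on a fresh copy, so the reported times include the cost of the copy and will not match the numbers above.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for the original data (assumed shape and contents).
rng = np.random.default_rng(0)
n = 250_000  # "quarter million rows", per the question
base = pd.DataFrame({
    'value': rng.integers(-100, 100, size=n),
    'text': rng.choice(['fine', 'good'], size=n),
})

def with_where(df):
    # Three-argument np.where: 'bad' where True, existing text otherwise.
    df['text'] = np.where(df['value'] < 5, 'bad', df['text'])
    return df

def with_loc(df):
    # Boolean-mask assignment on the selected rows only.
    df.loc[df['value'] < 5, 'text'] = 'bad'
    return df

def with_mask(df):
    # Series.mask replaces values where the condition is True.
    df['text'] = df['text'].mask(df['value'] < 5, 'bad')
    return df

results = {}
runs = 5
for fn in (with_where, with_loc, with_mask):
    results[fn.__name__] = fn(base.copy())
    seconds = timeit.timeit(lambda: fn(base.copy()), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e3:.1f} ms per run (copy included)")
```

Whatever the timings turn out to be on a given machine, all three helpers should produce the same text column, which is worth checking before comparing speed.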

CodePudding user response:

Select the matching rows with a boolean mask in df.loc and assign the new value to them:

df.loc[df['value']<5, 'text'] = 'bad'
df
Out[67]: 
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad

CodePudding user response:

You need to supply the value to use where the condition is False, which is the third argument of np.where:

df['text'] = np.where(df['value'] < 5, 'bad', df['text'])
print(df)

# Output
   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad

CodePudding user response:

As the error message states, you should provide a second value that np.where should map to if the condition doesn't hold:

df['text'] = np.where(df['value']<5, 'bad', df['text'])

This outputs:

   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
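
To make the error message concrete, here is a small sketch of the three ways np.where can be called, using the df from the question. The variable names cond, idx, raised and result are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [100, 10, -5, 2],
                   'text': ['fine', 'good', np.nan, np.nan]})
cond = df['value'] < 5

# Condition only: returns a tuple of index arrays where cond is True.
idx = np.where(cond)

# Condition plus one value: NumPy rejects this with a ValueError,
# which is the error shown in the question.
try:
    np.where(cond, 'bad')
    raised = False
except ValueError:
    raised = True

# Condition plus both values: 'bad' where True, the old text elsewhere.
result = np.where(cond, 'bad', df['text'])
print(idx, raised, result)
```

So np.where accepts either the condition alone or the condition with both replacement values, never just one of them.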

CodePudding user response:

You can try Series.mask, which replaces the values where the condition is True:

df['text'] = df['text'].mask(df['value'] < 5, 'bad')
print(df)

   value  text
0    100  fine
1     10  good
2     -5   bad
3      2   bad
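
As a side note, Series.mask is the mirror image of Series.where: mask replaces values where the condition is True, while where replaces them where it is False. A minimal sketch on the same df (the names cond, masked and kept are just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [100, 10, -5, 2],
                   'text': ['fine', 'good', np.nan, np.nan]})
cond = df['value'] < 5

# mask: replace where cond is True.
masked = df['text'].mask(cond, 'bad')

# where: keep where the condition is True, so negate cond to get the same result.
kept = df['text'].where(~cond, 'bad')

print(masked.equals(kept))

# Both calls return a new Series; df['text'] is unchanged until assigned back.
print(df['text'].isna().sum())
```

Since neither call modifies df in place, the assignment back to df['text'] in the answer above is what actually updates the column.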