Home > front end >  "False" is being inserted to column instead of stated value when using fillna in pandas, p
"False" is being inserted to column instead of stated value when using fillna in pandas, p

Time:04-19

I'm trying to fill in values in one column with a specific value based on what value is in another column. E.g I want small to have one value in and medium nulls to have another value.

| Size | Number |
|:-----|:-------|
|Small | 34 |
|Small |  |
|Medium| 45|
|Large| 72|
|Medium|  |
|Small|  |
|Large| 89|
|Small| 45|
|Medium| 32|
|Medium|  |

I'm using

df['Number'].fillna(df.groupby(df['Size']=='Small') == 10, inplace = True)

but this resulting in:

| Size | Number |
|:-----|:-------|
|Small | 34 |
|Small | FALSE |
|Medium| 45|
|Large| 72|
|Medium|  |
|Small|  FALSE|
|Large| 89|
|Small| 45|
|Medium| 32|
|Medium|  |

I'm expecting '10' to be added where False has been put. I've been stuck on this for ages. Can anyone tell me why it isn't inserting the value? Its finding the right rows to fill in but just not adding the value I want.

Please note that I'm working with a much larger database loaded from file and this is just an example df for my issue. Thank you.

CodePudding user response:

Consider that empty values, are not necessarily considered as Nulls, and therefore fillna will fail. Also, because of the evaluation you are making, you are creating a boolean pd.Series which is being passed as argument to the fillna() function, and not the strict value 10 which I assumed you intend to use. I would go with np.where():

df['Number'] = np.where((df['Number'].isna()) & (df['Size'] == 'Small'),10,
                        np.where((df['Number'].isna()) & (df['Size']=='Medium'),50,
                                 df['Number']))

This will return specific values (10 and 50) when the conditions are met, otherwise return the original number.

Returning:

0   Small    34.0
1   Small    10.0
2  Medium    45.0
3   Large    72.0
4  Medium    50.0
5   Small    10.0
6   Large    89.0
7   Small    45.0
8  Medium    32.0
9  Medium    50.0

CodePudding user response:

You can try fillna on selected rows

mask = df['Size'].eq('Small')
df.loc[mask, 'Number'] = df.loc[mask, 'Number'].fillna(10)
print(df)

     Size  Number
0   Small    34.0
1   Small    10.0
2  Medium    45.0
3   Large    72.0
4  Medium     NaN
5   Small    10.0
6   Large    89.0
7   Small    45.0
8  Medium    32.0
9  Medium     NaN
  • Related