I'm trying to fill in values in one column with a specific value based on what value is in another column. E.g I want small to have one value in and medium nulls to have another value.
| Size | Number |
|:-----|:-------|
|Small | 34 |
|Small | |
|Medium| 45|
|Large| 72|
|Medium| |
|Small| |
|Large| 89|
|Small| 45|
|Medium| 32|
|Medium| |
I'm using
df['Number'].fillna(df.groupby(df['Size']=='Small') == 10, inplace = True)
but this resulting in:
| Size | Number |
|:-----|:-------|
|Small | 34 |
|Small | FALSE |
|Medium| 45|
|Large| 72|
|Medium| |
|Small| FALSE|
|Large| 89|
|Small| 45|
|Medium| 32|
|Medium| |
I'm expecting '10' to be added where False has been put. I've been stuck on this for ages. Can anyone tell me why it isn't inserting the value? Its finding the right rows to fill in but just not adding the value I want.
Please note that I'm working with a much larger database loaded from file and this is just an example df for my issue. Thank you.
CodePudding user response:
Consider that empty values, are not necessarily considered as Nulls, and therefore fillna will fail. Also, because of the evaluation you are making, you are creating a boolean pd.Series which is being passed as argument to the fillna() function, and not the strict value 10
which I assumed you intend to use. I would go with np.where()
:
df['Number'] = np.where((df['Number'].isna()) & (df['Size'] == 'Small'),10,
np.where((df['Number'].isna()) & (df['Size']=='Medium'),50,
df['Number']))
This will return specific values (10 and 50) when the conditions are met, otherwise return the original number.
Returning:
0 Small 34.0
1 Small 10.0
2 Medium 45.0
3 Large 72.0
4 Medium 50.0
5 Small 10.0
6 Large 89.0
7 Small 45.0
8 Medium 32.0
9 Medium 50.0
CodePudding user response:
You can try fillna
on selected rows
mask = df['Size'].eq('Small')
df.loc[mask, 'Number'] = df.loc[mask, 'Number'].fillna(10)
print(df)
Size Number
0 Small 34.0
1 Small 10.0
2 Medium 45.0
3 Large 72.0
4 Medium NaN
5 Small 10.0
6 Large 89.0
7 Small 45.0
8 Medium 32.0
9 Medium NaN