I'm trying to solve this problem on Stepik: The dataframe with the name my_stat contains 4 columns: session_value, group, time, n_users. In the variable n_users, replace all negative values with the median value of n_users (excluding negative values, of course). Here's what I wrote:
import pandas as pd
import numpy as np
my_stat = my_stat['session_value'].replace(np.nan, 0)
my_stat.loc[my_stat['n_users'] < 0, 'n_users'] = my_stat['n_users'].median()
But I get this error:
Error:
Traceback (most recent call last):
File "jailed_code", line 25, in <module>
med = my_stat['n_users'].median()
File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/pandas/core/series.py", line 871, in __getitem__
result = self.index.get_value(self, key)
File "/home/stepic/instances/master-plugins/sandbox/python3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4405, in get_value
return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
File "pandas/_libs/index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'n_users'
How can I fix it?
CodePudding user response:
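Your first line assigns a single column (a Series) back to my_stat, so my_stat is no longer a DataFrame. After that, my_stat['n_users'] looks for an index label 'n_users' in that Series, which is why you get the KeyError. Change your line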
my_stat = my_stat['session_value'].replace(np.nan, 0)
to this
my_stat['session_value'] = my_stat['session_value'].replace(np.nan, 0)
Or, even better, use fillna:
my_stat['session_value'] = my_stat['session_value'].fillna(0)
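Also note that the task asks for the median excluding negative values, while my_stat['n_users'].median() is computed over the whole column. Here is a rough sketch of how both steps could fit together. The sample data is made up for illustration (on Stepik, my_stat is already provided), and the names negative and median_n_users are my own:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; on Stepik the real my_stat is given to you.
my_stat = pd.DataFrame({
    "session_value": [1.5, np.nan, 3.0, np.nan],
    "group": ["A", "B", "A", "B"],
    "time": [10, 20, 30, 40],
    "n_users": [4, -1, 10, -7],
})

# Assign back to the column, not to my_stat, so my_stat stays a DataFrame.
my_stat["session_value"] = my_stat["session_value"].fillna(0)

# Boolean mask marking the rows with a negative n_users.
negative = my_stat["n_users"] < 0

# Median computed only over the rows that are not negative.
median_n_users = my_stat.loc[~negative, "n_users"].median()

# Overwrite the negative entries with that median.
my_stat.loc[negative, "n_users"] = median_n_users
```

With this sample data the non-negative values are 4 and 10, so both negative entries become 7.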