I have a data set with the following null-value counts:
df.isnull().sum()
country 0
country_long 0
name 0
gppd_idnr 0
capacity_mw 0
latitude 46
longitude 46
primary_fuel 0
other_fuel1 0
other_fuel2 0
other_fuel3 908
commissioning_year 380
owner 0
source 0
url 0
geolocation_source 0
wepp_id 908
year_of_capacity_data 388
generation_gwh_2013 524
generation_gwh_2014 507
generation_gwh_2015 483
generation_gwh_2016 471
generation_gwh_2017 465
generation_data_source 0
estimated_generation_gwh 908
I tried filling with the mean, mode, max, min, and std, but the null values are not removed. For example, when I try
df['wepp_id']=df['wepp_id'].replace(np.NAN,df['wepp_id'].mean())
it does not work. The same thing happens with the median, std, min, and max.
CodePudding user response:
Try df['wepp_id']=df['wepp_id'].fillna(df['wepp_id'].mean())
If this does not work, the column is probably not a numeric type. If it holds strings, you need to convert it first:
df['wepp_id'] = df['wepp_id'].astype(float)
Then run the fillna command from the first line again.
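One more thing worth checking: in the counts above, wepp_id has 908 nulls, which is also the highest count in the list. If 908 is the number of rows, the column is completely empty, so its mean is NaN, and filling NaN with NaN changes nothing. A minimal sketch on made-up data (the values below are illustrative assumptions, not taken from the real data set), which also uses pd.to_numeric with errors="coerce" as a more forgiving alternative to astype(float) when some strings are not numbers:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; all values are made up.
df = pd.DataFrame({
    "latitude": [10.0, np.nan, 30.0],                  # partly missing
    "wepp_id": [np.nan, np.nan, np.nan],               # entirely missing
    "commissioning_year": ["2001", "unknown", "2011"],  # numbers stored as strings
})

# Partly missing numeric column: the mean exists, so fillna works.
df["latitude"] = df["latitude"].fillna(df["latitude"].mean())

# Entirely missing column: the mean is itself NaN, so nothing changes.
wepp_mean = df["wepp_id"].mean()
df["wepp_id"] = df["wepp_id"].fillna(wepp_mean)

# String column: convert first; unparseable entries become NaN, then fill.
df["commissioning_year"] = pd.to_numeric(df["commissioning_year"], errors="coerce")
df["commissioning_year"] = df["commissioning_year"].fillna(
    df["commissioning_year"].mean()
)

print(df.isnull().sum())
```

If a column turns out to be entirely empty, there is no statistic to fill it with; dropping it with df.drop(columns=['wepp_id']) or filling it with a constant you choose are the usual options.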