Question about using interpolate to fill missing values

Time:02-19

I have a dataset that looks like this:

import numpy as np  # needed for np.nan
import pandas as pd

test = pd.DataFrame({
    'date': ['2018-01-03 00:00:00', '2018-01-04 00:00:00', '2018-01-05 00:00:00'],
    'coal': [2669.0, np.nan, 2731.0],
    'hydro': [222.0, np.nan, 230.0],
    'unit': ['Gwh', 'Gwh', 'Gwh'],
})

test['date'] = pd.to_datetime(test['date'])

When I try to fill the nulls using the interpolate method:

for x in test.columns.to_list():
    test[x] = test[x].fillna(test[x].interpolate())

I get this error:

Invalid fill method. Expecting pad (ffill) or backfill (bfill). Got linear

When I remove this line:

test['date'] = pd.to_datetime(test['date'])

it works just fine, so I don't know why the error happens :(

Also, this works on the sample DataFrame, but on my own dataset it raises no error and the null values are still there, as if fillna() had never been called :(

And when I run the line below before the fillna code:

test['date'] = test['date'].astype(object)

it works on my sample but not on my own dataset :( My own dataset still looks as if fillna() had never been applied.

I'm quite confused now :( Could someone explain why this happens? I tried to Google it but found nothing,

or maybe I just don't know what to search for :(

CodePudding user response:

Is this what you're trying to do?

In [8]: test[["coal", "hydro"]].apply(pd.Series.interpolate)
Out[8]:
     coal  hydro
0  2669.0  222.0
1  2700.0  226.0
2  2731.0  230.0
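
If you would rather not hard-code the column names, here is a minimal sketch that picks the numeric columns by dtype, so that 'date' and 'unit' are never passed to interpolate. The variable name numeric_cols is only illustrative, and the frame is rebuilt from the question so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Same sample frame as in the question, with 'date' already parsed.
test = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-03", "2018-01-04", "2018-01-05"]),
    "coal": [2669.0, np.nan, 2731.0],
    "hydro": [222.0, np.nan, 230.0],
    "unit": ["Gwh", "Gwh", "Gwh"],
})

# Select only numeric columns; datetime and string columns are skipped.
numeric_cols = test.select_dtypes(include="number").columns

# Linear interpolation (the default) on the numeric columns only.
test[numeric_cols] = test[numeric_cols].interpolate()
print(test)
```

The loop in the question runs over every column, including 'date' and 'unit', which is likely where the error comes from; restricting the call to numeric columns sidesteps that.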

CodePudding user response:

You just need to interpolate the numeric columns only:

cols_to_interpolate = ["coal", "hydro"]
test[cols_to_interpolate] = test[cols_to_interpolate].interpolate()
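
As for the nulls that remain in your own dataset: the real data isn't shown, so this is only a guess, but one possible cause is that the columns hold numbers stored as text. A minimal sketch under that assumption, using a hypothetical frame called raw, converts the column with pd.to_numeric before interpolating:

```python
import pandas as pd

# Hypothetical data: numbers stored as strings, with one missing entry.
raw = pd.DataFrame({"coal": ["2669.0", None, "2731.0"]})

# Check the dtypes first; a non-numeric dtype here would be a hint.
print(raw.dtypes)

# Convert to a numeric dtype; unparseable values become NaN.
raw["coal"] = pd.to_numeric(raw["coal"], errors="coerce")

# Now linear interpolation can fill the gap.
raw["coal"] = raw["coal"].interpolate()
print(raw)
```

Checking test.dtypes on your own dataset should tell you whether this applies.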