Interpolating data for missing values pandas python

Time:02-23

I am having trouble interpolating my missing values. I am using the following code to interpolate:

df=pd.read_csv(filename, delimiter=',')
#Interpolating the nan values
df.set_index(df['Date'],inplace=True)
df2=df.interpolate(method='time')

Water=(df2['Water'])
Oil=(df2['Oil'])
Gas=(df2['Gas'])

Whenever I run my code I get the following message: "time-weighted interpolation only works on Series or DataFrames with a DatetimeIndex"

My data consists of several columns with a header row. The first column is named Date, and every value in it looks like 12/31/2009. I am new to Python and to time series in general. Any tips will help.

Sample of CSV file

CodePudding user response:

Try this, assuming the first column of your csv is the one with date strings:

df = pd.read_csv(filename, index_col=0, parse_dates=[0], infer_datetime_format=True)
df2 = df.interpolate(method='time', limit_direction='both')

This should 1) parse your first column into actual datetime objects and 2) set that column as the index of the dataframe, all in one step. That gives the dataframe the DatetimeIndex that method='time' requires. The infer_datetime_format=True argument is optional. On older pandas versions it can speed up parsing by quite a bit when the dates are in a consistent format; it is deprecated as of pandas 2.0, where format inference is the default behavior, so you can drop it there.
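To make the conversion step concrete, here is a minimal sketch that can be run without the original file. The CSV text and its values are made up for illustration (only the Date, Water, Oil and Gas column names come from the question), and it assumes the dates really are in month/day/year form.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV; the values are invented.
csv_text = (
    "Date,Water,Oil,Gas\n"
    "12/31/2009,1.0,10.0,5.0\n"
    "01/01/2010,,12.0,\n"
    "01/02/2010,3.0,,7.0\n"
)

# Option A: parse the dates and set the index while reading.
df_a = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=[0])

# Option B: read first, then convert the existing 'Date' column.
# An explicit format avoids any month/day ambiguity.
df_b = pd.read_csv(io.StringIO(csv_text))
df_b["Date"] = pd.to_datetime(df_b["Date"], format="%m/%d/%Y")
df_b = df_b.set_index("Date")

# Both frames now have a DatetimeIndex, so method='time' is accepted.
print(isinstance(df_a.index, pd.DatetimeIndex))
print(isinstance(df_b.index, pd.DatetimeIndex))

df2 = df_b.interpolate(method="time")
print(df2)
```

Option B is closer to the code in the question: the only change is converting the strings with pd.to_datetime before calling set_index.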

The limit_direction='both' should also fill any NaNs in the first row (by default, interpolation only fills forward from the first valid value), but because you haven't provided a copy-paste-able sample of your data, I cannot confirm on my end.
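Since there is no sample data to test against, here is a small sketch of what limit_direction changes, using an invented Water series whose first value is missing. The numbers are hypothetical; only the column name and the date style come from the question.

```python
import numpy as np
import pandas as pd

# Invented data: the first row is missing, and the dates are unevenly spaced.
idx = pd.to_datetime(
    ["12/31/2009", "01/01/2010", "01/03/2010", "01/04/2010"],
    format="%m/%d/%Y",
)
water = pd.Series([np.nan, 2.0, np.nan, 5.0], index=idx, name="Water")

# Default direction: only gaps after the first valid value are filled,
# so the leading NaN stays.
forward_only = water.interpolate(method="time")

# 'both': the leading NaN is filled too, using the nearest valid value.
both = water.interpolate(method="time", limit_direction="both")

print(forward_only)
print(both)
```

Note that method='time' weights by the actual gap between dates: the missing value on 01/03/2010 sits two days after 01/01/2010 and one day before 01/04/2010, so it lands two thirds of the way between 2.0 and 5.0.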

Reading the documentation can be incredibly helpful and can usually answer questions faster than you'll get answers from Stack Overflow!
