enter image description here[enter image description here][2]I am having trouble interpolating my missing values. I am using the following code to interpolate
df=pd.read_csv(filename, delimiter=',')
#Interpolating the nan values
df.set_index(df['Date'],inplace=True)
df2=df.interpolate(method='time')
Water=(df2['Water'])
Oil=(df2['Oil'])
Gas=(df2['Gas'])
Whenever I run my code I get the following message: "time-weighted interpolation only works on Series or DataFrames with a DatetimeIndex"
My Data consist of several columns with a header. The first column is named Date and all the rows look similar to this 12/31/2009. I am new to python and time series in general. Any tips will help.
CodePudding user response:
Try this, assuming the first column of your csv is the one with date strings:
df = pd.read_csv(filename, index_col=0, parse_dates=[0], infer_datetime_format=True)
df2 = df.interpolate(method='time', limit_direction='both')
It theoretically should 1) convert your first column into actual datetime
objects, and 2) set the index of the dataframe to that datetime
column, all in one step. You can optionally include the infer_datetime_format=True
argument. If your datetime format is a standard format, it can help speed up parsing by quite a bit.
The limit_direction='both'
should back fill any NaN
s in the first row, but because you haven't provided a copy-paste-able sample of your data, I cannot confirm on my end.
Reading the documentation can be incredibly helpful and can usually answer questions faster than you'll get answers from Stack Overflow!