I have a dataframe with missing rows that I resample and interpolate. Is there a way to get the index of the rows that are added to the dataframe when I resample it?
This is how I create/resample/interpolate the dataframe:
import numpy as np
import pandas as pd
from datetime import datetime
# Create df and drop a few rows
rng = pd.date_range('2000-01-01', periods=365, freq='D')
df = pd.DataFrame({'Val': np.random.randn(len(rng))}, index=rng)
df = df.drop([datetime(2000,1,5),datetime(2000,1,24)])
df = df.resample('D').interpolate(method='linear')
CodePudding user response:
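You can get the added index entries by taking the difference between the new index and the old one. Keep the resampled result in a separate variable (df_new here) rather than overwriting df, so the original index is still available: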
In [16]: df_new = df.resample('D').interpolate(method='linear')
In [17]: df_new.index.difference(df.index)
Out[17]: DatetimeIndex(['2000-01-05', '2000-01-24'], dtype='datetime64[ns]', freq=None)
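If you would rather keep overwriting df as in the question, one possible variation is to save the original index first and compare against it afterwards. This is only a sketch: the names original_index, added and is_added are made up for illustration, and the boolean mask is an optional extra for selecting the interpolated rows directly.

```python
import numpy as np
import pandas as pd

# Same setup as in the question: daily data with two rows dropped
rng = pd.date_range('2000-01-01', periods=365, freq='D')
df = pd.DataFrame({'Val': np.random.randn(len(rng))}, index=rng)
df = df.drop([pd.Timestamp('2000-01-05'), pd.Timestamp('2000-01-24')])

# Save the index before df is overwritten (illustrative name)
original_index = df.index

# Resample to daily frequency and fill the gaps by linear interpolation
df = df.resample('D').interpolate(method='linear')

# Index entries that exist only after resampling
added = df.index.difference(original_index)

# Boolean mask marking the added rows, usable with df[is_added]
is_added = ~df.index.isin(original_index)

print(added)
print(df[is_added])
```

Here added holds the two dropped dates, and df[is_added] shows the interpolated values that were filled in for them.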