I have a pandas DataFrame with several time series (ta, tb, etc., below) and corresponding measurements, given here by av, bv, and so on.
import numpy as np
import pandas as pd

ta = np.arange(0, 1, 0.01)
av = np.random.rand(ta.shape[0])
tb = np.arange(0, 1, 0.015)
bv = np.random.rand(tb.shape[0])
d = {'ta': ta, 'a_val': av, 'tb': tb, 'b_val': bv}
# Wrapping each array in a Series lets columns of unequal length coexist (shorter ones are padded with NaN)
df = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})
The time series all run from 0 to 1. I want to stretch and interpolate the shorter data so that they have the same number of rows.
I was going to use pandas' resample(), but it seems the data has to be in a date/time format for that.
CodePudding user response:
Split the problem into two steps:
- Stretch the shorter arrays (tb and bv)
After creating the arrays, and following mccandar's suggestion, extend tb and bv to the length of ta using:
to = len(ta)  # target length: the number of rows in the longest series
# Evenly spaced integer positions in the longer array; tb and bv have the same length, so one index serves both
foreign = np.linspace(0, to - 1, len(tb)).round().astype(int)
tb_resized = np.full(to, np.nan)
tb_resized[foreign] = tb
bv_resized = np.full(to, np.nan)
bv_resized[foreign] = bv
d = {'ta': ta, 'a_val': av, 'tb': tb_resized, 'b_val': bv_resized}
- Interpolate.
Luckily, pandas makes this step simple with the interpolate method, which by default fills the NaN gaps linearly between the known values:
pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in d.items()])).interpolate()
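For reference, the two steps can be put together into one self-contained sketch. The helper name stretch is hypothetical (it is not part of pandas or NumPy), and the sketch assumes the target length is at least as large as the array being stretched, so that no two values land on the same position. The seeded random generator is only there to make the run reproducible.

```python
import numpy as np
import pandas as pd


def stretch(values, target_len):
    """Hypothetical helper: spread `values` over `target_len` slots, NaN in the gaps.

    Assumes target_len >= len(values); otherwise positions would collide.
    """
    values = np.asarray(values, dtype=float)
    out = np.full(target_len, np.nan)
    # Evenly spaced integer positions; the first and last values land on the ends.
    idx = np.linspace(0, target_len - 1, len(values)).round().astype(int)
    out[idx] = values
    return out


rng = np.random.default_rng(0)  # seeded only for reproducibility
ta = np.arange(0, 1, 0.01)
av = rng.random(ta.shape[0])
tb = np.arange(0, 1, 0.015)
bv = rng.random(tb.shape[0])

# Step 1: stretch the shorter series. Step 2: fill the NaN gaps linearly.
df = pd.DataFrame({
    'ta': ta,
    'a_val': av,
    'tb': stretch(tb, len(ta)),
    'b_val': stretch(bv, len(ta)),
}).interpolate()
```

Because the first and last values of each stretched array sit at the first and last rows, the default linear interpolation has known values on both sides of every gap and leaves no NaN behind. Note that it interpolates by row position, so it treats the rows as equally spaced.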