I'm trying to use np.interp to interpolate a float value based on pandas Timestamp data. However, I noticed that np.interp works if the input x is a pandas Series of Timestamps, but not if it's a single Timestamp object.
Here's the code to illustrate this:
import pandas as pd
import numpy as np
coarse = pd.DataFrame({'start': ['2016-01-01 07:00:00.00000+00:00',
                                 '2016-01-01 07:30:00.00000+00:00']})
fine = pd.DataFrame({'start': ['2016-01-01 07:00:02.156657+00:00',
                               '2016-01-01 07:00:15+00:00',
                               '2016-01-01 07:00:32+00:00',
                               '2016-01-01 07:11:17+00:00',
                               '2016-01-01 07:14:00+00:00',
                               '2016-01-01 07:15:55+00:00',
                               '2016-01-01 07:33:04+00:00'],
                     'price': [0, 1, 2, 3, 4, 5, 6]})
coarse['start'] = pd.to_datetime(coarse['start'])
fine['start'] = pd.to_datetime(fine['start'])
np.interp(x=coarse.start, xp=fine.start, fp=fine.price) # works
np.interp(x=coarse.start.iloc[-1], xp=fine.start, fp=fine.price) # doesn't work
The latter gives the error
TypeError: float() argument must be a string or a number, not 'Timestamp'
Why does the former work while the latter doesn't?
CodePudding user response:
interp cannot convert a bare Timestamp to a float, but it can convert a Series of them. Index with a list, .iloc[[-1]], so the result stays a one-element Series:
np.interp(x=coarse.start.iloc[[-1]], xp=fine.start, fp=fine.price)
Output: array([5.82118562])
CodePudding user response:
Look at what you get when selecting an item from the Series:
In [8]: coarse.start
Out[8]:
0   2016-01-01 07:00:00+00:00
1   2016-01-01 07:30:00+00:00
Name: start, dtype: datetime64[ns, UTC]
In [9]: coarse.start.iloc[-1]
Out[9]: Timestamp('2016-01-01 07:30:00+0000', tz='UTC')
With the list index, it's a Series:
In [10]: coarse.start.iloc[[-1]]
Out[10]:
1   2016-01-01 07:30:00+00:00
Name: start, dtype: datetime64[ns, UTC]
I was going to scold you for not showing the full error message, but I see that the error is raised by a compiled piece of code. Keep in mind that interp is a numpy function: it works with numpy arrays, and for math like this, with float dtype arrays.
So it's a good guess that interp is trying to make a float array from your argument.
In [14]: np.asarray(coarse.start, dtype=float)
Out[14]: array([1.4516316e+18, 1.4516334e+18])
In [15]: np.asarray(coarse.start.iloc[[1]], dtype=float)
Out[15]: array([1.4516334e+18])
In [16]: np.asarray(coarse.start.iloc[1], dtype=float)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[16], line 1
----> 1 np.asarray(coarse.start.iloc[1], dtype=float)
TypeError: float() argument must be a string or a number, not 'Timestamp'
It can't make a float value from a single pandas Timestamp object.