I ran into the problem below. When I run the following code on Google Colab, it works normally:
df['temps'] = df['temps'].view(int).div(1e9).diff().fillna(0).abs()
print(df)
But when I run it in a local Jupyter notebook, the following error appears:
ValueError Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 df3['rebounds'] = pd.Series(df3['temps'].view(int).div(1e9).diff().fillna(0))
File C:\Python310\lib\site-packages\pandas\core\series.py:818, in Series.view(self, dtype)
815 # self.array instead of self._values so we piggyback on PandasArray
816 # implementation
817 res_values = self.array.view(dtype)
--> 818 res_ser = self._constructor(res_values, index=self.index)
819 return res_ser.__finalize__(self, method="view")
File C:\Python310\lib\site-packages\pandas\core\series.py:442, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
440 index = default_index(len(data))
441 elif is_list_like(data):
--> 442 com.require_length_match(data, index)
444 # create/copy the manager
445 if isinstance(data, (SingleBlockManager, SingleArrayManager)):
File C:\Python310\lib\site-packages\pandas\core\common.py:557, in require_length_match(data, index)
553 """
554 Check the length of data matches the length of the index.
555 """
556 if len(data) != len(index):
--> 557 raise ValueError(
558 "Length of values "
559 f"({len(data)}) "
560 "does not match length of index "
561 f"({len(index)})"
562 )
ValueError: Length of values (830) does not match length of index (415)
Any suggestions for resolving this?
CodePudding user response:
Here are two ways to get this to work:
df3['rebounds'] = pd.Series(df3['temps'].view('int64').diff().fillna(0).div(1e9))
... or:
df3['rebounds'] = pd.Series(df3['temps'].astype('int64').diff().fillna(0).div(1e9))
For the following sample input:
df3.dtypes:
temps datetime64[ns]
dtype: object
df3:
temps
0 2022-01-01
1 2022-01-02
2 2022-01-03
... both of the above code samples give this output:
df3.dtypes:
temps datetime64[ns]
rebounds float64
dtype: object
df3:
temps rebounds
0 2022-01-01 0.0
1 2022-01-02 86400.0
2 2022-01-03 86400.0
The issue is probably that view() essentially reinterprets the raw data of the existing Series as a different data type. For this to work, according to the Series.view() docs (see also the numpy.ndarray.view() docs), the two data types must have the same number of bytes. Since the original data is datetime64 (8 bytes per value), passing int to view() may not have met this requirement: if int resolves to a 4-byte integer, each value is read as two, which would match the error above (830 = 2 x 415). Explicitly specifying int64 should meet it. Alternatively, using astype() with int64 instead of view() also works.
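As a rough illustration of the byte-size point, here is a minimal sketch that uses a plain NumPy array rather than your DataFrame (the sample dates and variable names are made up for the example). It shows that viewing 8-byte datetime64 values as int64 keeps the length, while viewing them as int32 doubles it:

```python
import numpy as np

# Three sample timestamps stored as datetime64[ns] (8 bytes per element).
raw = np.array(["2022-01-01", "2022-01-02", "2022-01-03"], dtype="datetime64[ns]")

# Same item size: one integer per timestamp, so the length is unchanged.
as_i64 = raw.view("int64")

# Half the item size: each timestamp is read as two 4-byte integers.
as_i32 = raw.view("int32")

print(len(raw), len(as_i64), len(as_i32))  # 3 3 6
```

A doubled length like this cannot be matched to the original index, which is consistent with the "Length of values (830) does not match length of index (415)" message.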
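As to why this works in Colab and not in your local Jupyter notebook, I can't say for certain. Perhaps the two environments use different versions of pandas and NumPy, or run on different platforms, and so treat int differently. One plausible cause: your traceback shows a Windows path, and NumPy versions before 2.0 map int to a 32-bit integer on Windows but to int64 on Linux.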
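One way to check how int is interpreted in a given environment is to ask NumPy directly. This is only a diagnostic sketch (the variable names are mine), and the first printed line will differ between machines, so no expected output is shown for it:

```python
import numpy as np

# The NumPy dtype that the Python built-in int maps to on this machine.
int_dtype = np.dtype(int)
print(int_dtype, int_dtype.itemsize)

# datetime64 values always occupy 8 bytes each.
dt_itemsize = np.dtype("datetime64[ns]").itemsize
print(dt_itemsize)

# view(int) only preserves the length when both item sizes are equal.
same_size = int_dtype.itemsize == dt_itemsize
print(same_size)
```

Running this in both Colab and the local notebook should show whether the two environments disagree about the size of int.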
I do know that in my environment, if I try the following:
df3['rebounds'] = pd.Series(df3['temps'].astype('int').diff().fillna(0).div(1e9))
... then I get this error:
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
This suggests that int means int32. It would be interesting to see if this works on Colab.
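If the goal is just the number of seconds between consecutive timestamps, another option is to skip the integer conversion entirely. This is a different technique from the two fixes above, not something from your original code: it takes the difference of the datetimes and converts the resulting timedeltas with dt.total_seconds(). The sample frame below reuses the three dates from this answer:

```python
import pandas as pd

# Sample input: the same three dates used earlier in this answer.
df3 = pd.DataFrame({"temps": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"])})

# diff() yields timedeltas (NaT for the first row); total_seconds() turns
# them into floats, and fillna(0) replaces the leading NaN.
df3["rebounds"] = df3["temps"].diff().dt.total_seconds().fillna(0).abs()

print(df3)
```

Because no integer dtype is involved, this should not depend on how int is interpreted on a given platform.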