What am I doing wrong?
This is all the code needed to reproduce.
import pandas as pd
g = pd.Grouper('datetime', freq='D')
Result:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [1], line 2
1 import pandas as pd
----> 2 g = pd.Grouper('datetime', freq='D')
TypeError: TimeGrouper.__init__() got multiple values for argument 'freq'
Pandas version 1.5.1, Python version 3.10.6.
CodePudding user response:
This seems to be a bug
It looks like the weirdness is because Grouper.__new__() instantiates a TimeGrouper if you pass freq as a kwarg, but not if you pass freq as a positional argument. I don't know why it does that, and it's not documented, so it seems like a bug.
The reason for the error is that TimeGrouper.__init__()'s first parameter is freq, not key, so the positional 'datetime' is bound to freq and then collides with the freq='D' keyword. (In fact it doesn't even have a key parameter; it specifies **kwargs, which are then sent to Grouper.__init__() at the end.)
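The mechanism described above can be reproduced without pandas. This is a minimal sketch, not the pandas source: the class names mirror the real ones, but the bodies are a simplified assumption that keeps only the dispatch in __new__() and the parameter order of __init__().

```python
class Grouper:
    """Toy stand-in for pd.Grouper (hypothetical, simplified)."""

    def __new__(cls, *args, **kwargs):
        # The subclass is chosen only when freq arrives as a keyword.
        if kwargs.get("freq") is not None:
            cls = TimeGrouper
        return super().__new__(cls)

    def __init__(self, key=None, level=None, freq=None):
        self.key = key
        self.level = level
        self.freq = freq


class TimeGrouper(Grouper):
    """Toy stand-in for TimeGrouper: freq comes first, key travels in **kwargs."""

    def __init__(self, freq="Min", **kwargs):
        super().__init__(freq=freq, **kwargs)


# Positional key + keyword freq: 'datetime' lands in TimeGrouper's freq slot.
try:
    Grouper("datetime", freq="D")
except TypeError as exc:
    print("TypeError:", exc)

# Both as keywords: dispatches to TimeGrouper and initialises cleanly.
print(type(Grouper(key="datetime", freq="D")).__name__)

# All positional: no dispatch, so a plain Grouper comes back.
print(type(Grouper("datetime", None, "D")).__name__)
```

Because __new__() returns an instance of a subclass, Python still calls __init__() with the original arguments, which is why the first positional argument ends up in the wrong parameter.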
Workaround
Pass key as a kwarg too:
g = pd.Grouper(key='datetime', freq='D')
All-positional syntax is also broken
In cottontail's answer, they suggested using positional arguments, with None for level, but this doesn't work fully. For example, you can't specify an origin:
pd.Grouper('datetime', None, 'D', origin='epoch')
TypeError: __init__() got an unexpected keyword argument 'origin'
(I'm using Python 3.9 and Pandas 1.4.4, so the error might look a bit different, but the same error should occur on your version.)
Even worse, the resulting Grouper doesn't work. For example:
>>> df = pd.DataFrame({
...     'datetime': pd.to_datetime([
...         '2022-11-29T15', '2022-11-30T15', '2022-11-30T16']),
...     'v': [1, 2, 3]})
>>> g = pd.Grouper('datetime', None, 'D')
>>> df.groupby(g).sum()
                     v
datetime
2022-11-29 15:00:00  1
2022-11-30 15:00:00  2
2022-11-30 16:00:00  3
Compared to:
>>> g1 = pd.Grouper(key='datetime', freq='D')
>>> df.groupby(g1).sum()
            v
datetime
2022-11-29  1
2022-11-30  5
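Putting the workaround together with the origin case from earlier: a short sketch, assuming a pandas version where pd.Grouper accepts origin (the same data as above, with the timestamps written out in full). With every argument passed by keyword, the time-based option is accepted and the rows are grouped by day.

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2022-11-29 15:00", "2022-11-30 15:00", "2022-11-30 16:00"]),
    "v": [1, 2, 3],
})

# key, freq and origin are all keywords, so nothing is bound
# to the wrong parameter.
g = pd.Grouper(key="datetime", freq="D", origin="epoch")

daily = df.groupby(g).sum()
print(daily)
```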
CodePudding user response:
Try specifying the keyword argument key:
g = pd.Grouper(key='datetime', freq='D')
print(type(g))
# pandas.core.resample.TimeGrouper
As explained in wjandrea's post, TimeGrouper is instantiated only if the freq kwarg is passed. For example, the following instantiates a Grouper:
g1 = pd.Grouper('datetime', None, pd.offsets.Day())
print(type(g1))
# pandas.core.groupby.grouper.Grouper
even though g1.freq is an instance of pd._libs.tslibs.offsets.Day. So it looks like passing the arguments via keywords is the only way.