TypeError: TimeGrouper.__init__() got multiple values for argument 'freq'

Time:12-01

What am I doing wrong?

This is all the code needed to reproduce.

import pandas as pd
g = pd.Grouper('datetime', freq='D')

Result:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [1], line 2
      1 import pandas as pd
----> 2 g = pd.Grouper('datetime', freq='D')

TypeError: TimeGrouper.__init__() got multiple values for argument 'freq'

Pandas version 1.5.1, Python version 3.10.6.

CodePudding user response:

This seems to be a bug

The odd behavior seems to come from Grouper.__new__(), which instantiates a TimeGrouper if you pass freq as a keyword argument, but not if you pass freq positionally. I don't know why it does that, and it's not documented, so it seems like a bug.

The reason for the error is that TimeGrouper.__init__()'s first parameter is freq, not key, so the positional 'datetime' is bound to freq and then collides with the freq='D' keyword. (In fact it doesn't even have a key parameter; it accepts **kwargs, which are then passed on to Grouper.__init__() at the end.)
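
The dispatch described above can be reproduced without pandas. This is a minimal sketch, not the actual pandas source: MiniGrouper, MiniTimeGrouper, and try_construct are hypothetical names, and the signatures are trimmed to the parameters discussed here.

```python
class MiniGrouper:
    """Hypothetical, simplified stand-in for pd.Grouper."""

    def __new__(cls, *args, **kwargs):
        # Only a *keyword* freq triggers the switch to the subclass;
        # positional arguments are never inspected here.
        if kwargs.get("freq") is not None:
            cls = MiniTimeGrouper
        return super().__new__(cls)

    def __init__(self, key=None, level=None, freq=None):
        self.key, self.level, self.freq = key, level, freq


class MiniTimeGrouper(MiniGrouper):
    """Hypothetical stand-in for TimeGrouper: freq comes first, no key parameter."""

    def __init__(self, freq="Min", **kwargs):
        # Everything else (key, level) is forwarded to the base class.
        super().__init__(freq=freq, **kwargs)


def try_construct(*args, **kwargs):
    """Return the resulting class name, or the TypeError message."""
    try:
        return type(MiniGrouper(*args, **kwargs)).__name__
    except TypeError as exc:
        return f"TypeError: {exc}"


# Positional key + keyword freq: 'datetime' lands in freq, then freq='D' collides.
print(try_construct("datetime", freq="D"))
# All keywords: the subclass is built cleanly.
print(try_construct(key="datetime", freq="D"))
# All positional: __new__ never sees a freq keyword, so the base class is kept.
print(try_construct("datetime", None, "D"))
```

Because Python calls __init__ on whatever object __new__ returns, with the original arguments, the first call fails in the subclass's __init__ even though the caller wrote the base class name.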

Workaround

Pass key as a kwarg too:

g = pd.Grouper(key='datetime', freq='D')

All-positional syntax is also broken

cottontail's answer suggested passing everything positionally, with None for level, but this doesn't fully work. For example, you can't specify an origin:

pd.Grouper('datetime', None, 'D', origin='epoch')
TypeError: __init__() got an unexpected keyword argument 'origin'

(I'm using Python 3.9 and Pandas 1.4.4, so the error might look a bit different, but the same error should occur on your version.)

Even worse, the resulting Grouper doesn't work, for example:

>>> df = pd.DataFrame({
...     'datetime': pd.to_datetime([
...         '2022-11-29T15', '2022-11-30T15', '2022-11-30T16']),
...     'v': [1, 2, 3]})
>>> g = pd.Grouper('datetime', None, 'D')
>>> df.groupby(g).sum()
                     v
datetime              
2022-11-29 15:00:00  1
2022-11-30 15:00:00  2
2022-11-30 16:00:00  3

Compared to:

>>> g1 = pd.Grouper(key='datetime', freq='D')
>>> df.groupby(g1).sum()
            v
datetime     
2022-11-29  1
2022-11-30  5
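
For completeness, the keyword form should also accept origin, which the all-positional call rejected. A minimal sketch, assuming a pandas version whose Grouper supports origin (added in pandas 1.1); g2 and daily are names introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2022-11-29T15", "2022-11-30T15", "2022-11-30T16"]),
    "v": [1, 2, 3],
})

# Every argument is passed by keyword, so the freq keyword triggers
# the TimeGrouper path and origin is received by TimeGrouper.__init__().
g2 = pd.Grouper(key="datetime", freq="D", origin="epoch")
print(type(g2).__name__)

# Group into daily bins and sum the values in each bin.
daily = df.groupby(g2)["v"].sum()
print(daily)
```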

CodePudding user response:

Try specifying the keyword argument key:

g = pd.Grouper(key='datetime', freq='D')
print(type(g))
# <class 'pandas.core.resample.TimeGrouper'>

As explained in wjandrea's post, a TimeGrouper is instantiated only if freq is passed as a keyword argument. For example, the following instantiates a plain Grouper:

g1 = pd.Grouper('datetime', None, pd.offsets.Day())
print(type(g1))
# <class 'pandas.core.groupby.grouper.Grouper'>

even though g1.freq is an instance of pd._libs.tslibs.offsets.Day. So it looks like passing the arguments as keywords is the only way.
