solving nested renamer is not supported with dynamic arguments

Time:05-27

if cat_vars:
    df["static_cat"] = (
        df.groupby("group_col")
        .agg({i: "first" for i in cat_vars})
        .values.tolist()
    )

Error:

packages\pandas\core\groupby\generic.py in aggregate(self, func, *args, **kwargs)
        926         func = _maybe_mangle_lambdas(func)
        927 
    --> 928         result, how = self._aggregate(func, *args, **kwargs)
        929         if how is None:
        930             return result
    packages\pandas\core\base.py in _aggregate(self, arg, *args, **kwargs)
        355                     obj.columns.intersection(keys)
        356                 ) != len(keys):
    --> 357                     raise SpecificationError("nested renamer is not supported")
        358 
        359             from pandas.core.reshape.concat import concat
    
    SpecificationError: nested renamer is not supported

A similar question is solved here, but I want the call to be dynamic, i.e. the code should adapt to whatever elements are in cat_vars. For example, if cat_vars = [var1, var2], I can pass agg(var1="first", var2="first") to solve the problem. But what if it has three variables?

I really appreciate any help you can provide.

CodePudding user response:

Data:

import pandas as pd

df = pd.DataFrame({'group_col': [1, 1, 2, 2, 3],
                   'var1': range(5),
                   'var2': list('abcde')})

cat_vars = ['var1','var2']

If you need only one aggregate function, the simpler option is to select the columns and call it directly:

df1 = df.groupby("group_col")[cat_vars].first()

Or use named aggregation by unpacking a dictionary built from cat_vars:

df1 = df.groupby("group_col").agg(**{i:(i, "first") for i in cat_vars})
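To illustrate that the dictionary-unpacking form adapts to any number of columns, here is a minimal sketch with three variables. The column var3 and the "_first" suffix on the output names are assumptions added for illustration; they are not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "group_col": [1, 1, 2, 2, 3],
    "var1": range(5),
    "var2": list("abcde"),
    "var3": [10.0, 20.0, 30.0, 40.0, 50.0],  # hypothetical third column
})
cat_vars = ["var1", "var2", "var3"]

# Build {output_name: (input_column, function)} for however many
# columns cat_vars contains, then unpack it as keyword arguments.
named = {f"{col}_first": (col, "first") for col in cat_vars}
df1 = df.groupby("group_col").agg(**named)
print(df1)
```

Because the keyword names are generated in the comprehension, the same line works unchanged whether cat_vars holds two names or twenty.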

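Your original dictionary approach should work too, provided every name in cat_vars is an existing column of df (the check shown in the traceback compares the dictionary keys against df.columns):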

df1 = df.groupby("group_col").agg({i: "first" for i in cat_vars})
print (df1)
           var1 var2
group_col           
1             0    a
2             2    c
3             4    e
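The traceback in the question shows the error being raised when the number of dictionary keys found in the DataFrame's columns differs from the number of keys passed, which suggests that at least one name in cat_vars was not a column. The sketch below checks for that before aggregating. The name var3 is a deliberately missing, hypothetical column, and skipping missing names (rather than failing) is an assumption about the desired behaviour. Other pandas versions may report this situation with a different exception.

```python
import pandas as pd

df = pd.DataFrame({
    "group_col": [1, 1, 2, 2, 3],
    "var1": range(5),
    "var2": list("abcde"),
})
cat_vars = ["var1", "var2", "var3"]  # "var3" is deliberately absent from df

# Split the requested names into those that exist and those that do not.
missing = [col for col in cat_vars if col not in df.columns]
present = [col for col in cat_vars if col in df.columns]
if missing:
    print(f"Skipping columns not found in df: {missing}")

# Aggregate only the columns that actually exist.
df1 = df.groupby("group_col").agg({col: "first" for col in present})
print(df1)
```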

EDIT:

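To add the per-group values as new columns on the original rows, use transform, which returns a result aligned to the original index: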

df = pd.DataFrame({'group_col': [1, 1, 2, 2, 3],
                   'var1': range(5),
                   'var2': list('abcde')})

cat_vars = ['var1','var2']
df2 = df.join(df.groupby("group_col")[cat_vars].transform('first').add_prefix('new_'))
print (df2)
   group_col  var1 var2  new_var1 new_var2
0          1     0    a         0        a
1          1     1    b         0        a
2          2     2    c         2        c
3          2     3    d         2        c
4          3     4    e         4        e
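The question's code assigned the result to a single static_cat column holding one list per row. A possible way to get that, assuming this was the intent, is to take the transform result and convert each row to a list. Wrapping the lists in a Series with the original index is a choice made here to keep the row alignment explicit.

```python
import pandas as pd

df = pd.DataFrame({
    "group_col": [1, 1, 2, 2, 3],
    "var1": range(5),
    "var2": list("abcde"),
})
cat_vars = ["var1", "var2"]

# transform("first") yields one row per original row, so the result
# has the same length and index as df (unlike agg, which yields one
# row per group and cannot be assigned back directly).
firsts = df.groupby("group_col")[cat_vars].transform("first")

# One list of per-group first values for each original row.
df["static_cat"] = pd.Series(firsts.values.tolist(), index=df.index)
print(df)
```

This also works for any length of cat_vars, since the column selection is driven by the list itself.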