When using 'df.groupby(column).apply()' get the groupby column within the 'apply'

Time:10-18

I want to access the groupby column, i.e. the column passed to df.groupby as the by argument (df.groupby(by=column)), from inside the function passed to the apply call that follows (df.groupby(by=column).apply(<here>)).

For example,

import pandas as pd

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
                              'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})
# Pseudocode: inside apply, I want to know that the groupby column is 'Animal'
# df.groupby(['Animal']).apply(<here>)
df
   Animal  Max Speed
0  Falcon      380.0
1  Falcon      370.0
2  Parrot       24.0
3  Parrot       26.0

Of course, I could add one more line of code, or simply pass the groupby column to the apply context separately (e.g. .apply(lambda df_: some_function(df_, s='Animal'))), but I am curious whether this can be done in a single line, ideally with a pandas function built for this purpose.
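For reference, the explicit workaround mentioned above might look like the following minimal sketch. The function name fastest_in_group and its by parameter are hypothetical, made up for illustration. The sketch relies on groupby(...).apply forwarding extra keyword arguments to the applied function.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})

def fastest_in_group(group, by):
    # `by` is the groupby column name, passed in explicitly by the caller
    # (hypothetical helper, not part of pandas).
    top_speed = float(group['Max Speed'].max())
    return f"grouped by {by}: {top_speed}"

by = 'Animal'
# Keyword arguments after the function are forwarded to it for every group.
result = df.groupby(by).apply(fastest_in_group, by=by)
```

The column name is written once and reused, so the applied function never has to guess which column was used for grouping.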

CodePudding user response:

I just figured out a one-liner solution:

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
                              'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})
df.groupby(['Animal']).apply(lambda df_: df_.apply(lambda x: all(x==df_.name)).loc[lambda x: x].index.tolist())

This returns the groupby column within each groupby.apply context:

Animal
Falcon    [Animal]
Parrot    [Animal]

Since it is quite a long one-liner (uses 3 lambdas!), it is better to wrap it in a separate function, as shown below:

def get_groupby_column(df_):
    # Keep the columns whose values all equal the group key (df_.name)
    return df_.apply(lambda x: all(x == df_.name)).loc[lambda x: x].index.tolist()
df.groupby(['Animal']).apply(get_groupby_column)

Note of caution: this solution detects the groupby column by checking which columns have all of their values equal to the group key, so it gives inaccurate results whenever another column also matches the key within a group, e.g. if, for some group, every value in another column happened to equal that group's Animal value, that column would be reported too.
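To make the caveat above concrete, here is a small sketch of the same "all values equal the group key" check. It is written as an explicit loop over the groups rather than inside apply. The Nickname column and the helper columns_matching_key are hypothetical, added only to trigger the false positive.

```python
import pandas as pd

df = pd.DataFrame({
    'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
    # Hypothetical extra column: for the Falcon group it repeats the group key.
    'Nickname': ['Falcon', 'Falcon', 'Polly', 'Parrot'],
    'Max Speed': [380., 370., 24., 26.],
})

def columns_matching_key(group, key):
    # Flag every column whose values all equal the group key.
    matches = group.apply(lambda col: bool((col == key).all()))
    return matches[matches].index.tolist()

# Iterating over the groupby yields (key, sub-DataFrame) pairs.
detected = {key: columns_matching_key(group, key)
            for key, group in df.groupby('Animal')}
```

For the Falcon group, detected lists both Animal and Nickname, so the check cannot tell which of the two was the actual groupby column. For the Parrot group only Animal is listed.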

CodePudding user response:

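You could use grouper.names (note that pandas 2.2 and later deprecate the public grouper attribute, so this may warn or stop working on recent versions):

>>> df.groupby('Animal').grouper.names
['Animal']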

With apply:

grouped = df.groupby('Animal')
grouped.apply(lambda x: grouped.grouper.names)