I want to get the groupby column, i.e. the column that is supplied to df.groupby as the by argument (df.groupby(by=column)), within the apply context that comes after groupby (df.groupby(by=column).apply(Here)).
For example,
import pandas as pd

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
                              'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})
# Inside the function passed to apply, I want to know that the groupby column is 'Animal'
df.groupby(['Animal']).apply(lambda df_: ...)
df
   Animal  Max Speed
0  Falcon      380.0
1  Falcon      370.0
2  Parrot       24.0
3  Parrot       26.0
Of course, I can do this with one more line of code, or simply by supplying the groupby column to the apply context separately (e.g. .apply(lambda df_: some_function(df_, s='Animal'))), but I am curious whether this can be done in a single line, e.g. possibly using a pandas function built for this.
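For reference, the workaround of supplying the column separately could look roughly like the sketch below. The function name describe_group and its group_col parameter are made up for illustration; the only pandas feature relied on is that apply forwards extra keyword arguments to the function. The ['Max Speed'] selection is there so that the example does not depend on whether apply hands the grouping column to the function.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})

def describe_group(group, group_col):
    # group_col is passed in explicitly; it is not discovered from the group itself
    top_speed = float(group['Max Speed'].max())
    return f"grouped by {group_col}: max speed {top_speed}"

by = 'Animal'
# Extra keyword arguments given to apply are forwarded to describe_group
result = df.groupby(by)[['Max Speed']].apply(describe_group, group_col=by)
print(result)
```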
CodePudding user response:
I just figured out a one-liner solution:
df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
'Parrot', 'Parrot'],
'Max Speed': [380., 370., 24., 26.]})
df.groupby(['Animal']).apply(lambda df_: df_.apply(lambda x: all(x==df_.name)).loc[lambda x: x].index.tolist())
This returns the groupby column within each groupby.apply context:
Animal
Falcon [Animal]
Parrot [Animal]
Since it is quite a long one-liner (it uses 3 lambdas!), it is better to wrap it in a separate function, as shown below:
def get_groupby_column(df_):
    # Keep only the columns whose values all equal the group key (df_.name)
    return df_.apply(lambda x: all(x == df_.name)).loc[lambda x: x].index.tolist()
df.groupby(['Animal']).apply(get_groupby_column)
Note of caution: this solution gives inaccurate results if other columns of the dataframe also contain the items from the groupby column. More precisely, if every value of another column within a group equals that group's key (its Animal value), that column is reported as a groupby column too.
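To make the caveat concrete, here is a small sketch that runs the same "all values equal the group key" check in a plain loop over the groups. The Nickname column is a hypothetical addition, not part of the original data, chosen so that it collides with the key for the Falcon group only.

```python
import pandas as pd

# 'Nickname' is an invented column that repeats the key for the Falcon rows
df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'Nickname': ['Falcon', 'Falcon', 'Polly', 'Kiwi'],
                   'Max Speed': [380., 370., 24., 26.]})

found = {}
for key, group in df.groupby('Animal'):
    # Same test as the one-liner: a column matches if all its values equal the key
    found[key] = [col for col in group.columns if (group[col] == key).all()]

print(found)
```

For the Falcon group the check picks up Nickname in addition to Animal, while for the Parrot group it reports only Animal, so the result differs from group to group.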
CodePudding user response:
You could use grouper.names:
>>> df.groupby('Animal').grouper.names
['Animal']
>>>
With apply:
grouped = df.groupby('Animal')
grouped.apply(lambda x: grouped.grouper.names)
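Depending on the pandas version, accessing grouper may emit a deprecation warning or may not be available at all, so here is a hedged alternative sketch that reads the key names from the index of an aggregation instead. The variable names group_cols and result are only illustrative, and the ['Max Speed'] selection is an assumption made so the example does not depend on whether apply passes the grouping column to the function.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})

grouped = df.groupby('Animal')

# The index of an aggregated result is named after the grouping key(s)
group_cols = list(grouped.size().index.names)

# The lambda closes over group_cols, so the names are visible inside apply
result = grouped[['Max Speed']].apply(
    lambda g: f"{group_cols[0]} group with {len(g)} rows"
)
print(group_cols)
print(result)
```

Like the grouper.names version, this still needs a reference to the groupby object outside of apply, so it is two statements rather than a true one-liner.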