Pandas' groupby().apply() method

Time:09-17

I am using pandas' groupby().apply() method as follows:

import pandas as pd
dat = pd.DataFrame({'name' : ['A', 'C', 'A', 'B', 'C'], 'val' : [1,2,1,2,4]})
dat.groupby('name').apply(lambda x : x)

This returns:

  name  val
0    A    1
1    C    2
2    A    1
3    B    2
4    C    4

However, I want to get the DataFrame corresponding to the i-th group. For example, for the first group I expect a DataFrame containing only the rows where name = 'A', and so on.

Is there a method that does this?

CodePudding user response:

Try as follows:

# wrap groupby object in `list`, then in `dict`
groups = dict(list(dat.groupby('name')))

# all keys
print(groups.keys())
dict_keys(['A', 'B', 'C'])

# value for first key
print(groups['A'])

  name  val
0    A    1
2    A    1
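
If only one group is needed and its key is known, building the full dictionary may be unnecessary. A minimal sketch using the groupby object's `get_group` method (the variable names `grouped` and `group_a` are only illustrative):

```python
import pandas as pd

dat = pd.DataFrame({'name': ['A', 'C', 'A', 'B', 'C'], 'val': [1, 2, 1, 2, 4]})

grouped = dat.groupby('name')

# Fetch the rows of a single group by its key
group_a = grouped.get_group('A')
print(group_a)
```

This selects the same rows as `groups['A']` above, i.e. the rows with index 0 and 2.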

Suppose you were grouping by both name and val; you might then end up with awkward tuple keys. E.g.:

groups = dict(list(dat.groupby(['name','val'])))

print(groups.keys())
dict_keys([('A', 1), ('B', 2), ('C', 2), ('C', 4)])

In this case, you might consider using a dictionary comprehension, like this:

groups = {idx: group for idx, (_, group) in enumerate(dat.groupby(['name', 'val']))}

print(groups.keys())
dict_keys([0, 1, 2, 3])

print(groups[0])

  name  val
0    A    1
2    A    1

More generally, if you need further custom processing, you can iterate over the groupby object, which yields each group key together with its sub-DataFrame:

for name, group in dat.groupby(['name','val']):
    print(name)
    print(group)
    break

('A', 1) # `name`, i.e. the key of the first group

# `group`, i.e. the DataFrame of rows belonging to that key
  name  val
0    A    1
2    A    1
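
The loop above stops at the first group; the same idea can be extended to fetch the i-th group by position. The sketch below is one possible approach, not part of the original answer: `nth_group` is a hypothetical helper name, and it assumes "i-th" means the order in which the groupby object yields its groups (sorted by key by default).

```python
from itertools import islice

import pandas as pd


def nth_group(df, by, i):
    """Return (key, sub-DataFrame) for the i-th group in iteration order.

    Hypothetical helper for illustration; `i` is zero-based.
    """
    grouped = df.groupby(by)
    if not 0 <= i < grouped.ngroups:
        raise IndexError(f"group index {i} out of range for {grouped.ngroups} groups")
    # Skip the first i groups and take the next one
    return next(islice(iter(grouped), i, None))


dat = pd.DataFrame({'name': ['A', 'C', 'A', 'B', 'C'], 'val': [1, 2, 1, 2, 4]})

key, group = nth_group(dat, ['name', 'val'], 2)
print(key)
print(group)
```

Because the groups are consumed lazily, this avoids materializing every sub-DataFrame when only one is needed.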