I am using pandas' groupby.apply method as follows:
import pandas as pd
dat = pd.DataFrame({'name' : ['A', 'C', 'A', 'B', 'C'], 'val' : [1,2,1,2,4]})
dat.groupby('name').apply(lambda x : x)
This gives result as
name val
0 A 1
1 C 2
2 A 1
3 B 2
4 C 4
However, I want to get the dataframe corresponding to the i-th group. For example, for the first group I expect the dataframe containing only the rows with name = 'A', and so on.
Is there a method available to achieve this?
Answer:
Try as follows:
# wrap groupby object in `list`, then in `dict`
groups = dict(list(dat.groupby('name')))
# all keys
print(groups.keys())
dict_keys(['A', 'B', 'C'])
# value for first key
print(groups['A'])
name val
0 A 1
2 A 1
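If only one group is needed and its key is already known, building the whole dictionary may be unnecessary. A minimal sketch using `GroupBy.get_group`, assuming the same `dat` as in the question (the variable names `grouped` and `group_a` are just illustrative):

```python
import pandas as pd

dat = pd.DataFrame({'name': ['A', 'C', 'A', 'B', 'C'], 'val': [1, 2, 1, 2, 4]})
grouped = dat.groupby('name')

# Fetch a single group by its key, without materializing all groups
group_a = grouped.get_group('A')
print(group_a)
```

This returns the same rows as `groups['A']` above, but it requires the key itself rather than a position.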
Suppose you were grouping by both `name` and `val`; you might then end up with awkward tuple keys. E.g.:
groups = dict(list(dat.groupby(['name','val'])))
print(groups.keys())
dict_keys([('A', 1), ('B', 2), ('C', 2), ('C', 4)])
In this case, you might consider using a dictionary comprehension, like this:
groups = {idx:group[1] for idx, group in enumerate(dat.groupby(['name','val']))}
print(groups.keys())
dict_keys([0, 1, 2, 3])
print(groups[0])
name val
0 A 1
2 A 1
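Another way to select a group by position, without building a dictionary, might be `GroupBy.ngroup()`, which numbers the groups from 0 upward in the order they are iterated (sorted key order by default). The helper name `ith_group` below is hypothetical, not part of pandas:

```python
import pandas as pd

dat = pd.DataFrame({'name': ['A', 'C', 'A', 'B', 'C'], 'val': [1, 2, 1, 2, 4]})

def ith_group(df, by, i):
    """Return the rows of the i-th group (0-based, in group iteration order)."""
    # ngroup() labels every row with the number of the group it belongs to
    group_number = df.groupby(by).ngroup()
    return df[group_number == i]

first = ith_group(dat, 'name', 0)            # rows of the first group by `name`
third = ith_group(dat, ['name', 'val'], 2)   # rows of the third (name, val) group
print(first)
print(third)
```

Because this is a boolean mask on the original frame, the original row index is preserved, as in the dictionary approach.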
More generally, if you want to customize processing any further, you can access the groups as follows:
for name, group in dat.groupby(['name','val']):
    print(name)
    print(group)
    break
('A', 1) # `name`, i.e. the key of the first group
# `group`, i.e. the `df` rows belonging to that key
name val
0 A 1
2 A 1
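The same iteration can be used to stop at the i-th group instead of the first one. A small sketch with `itertools.islice`, assuming 0-based positions (the variable `i` is only for illustration):

```python
from itertools import islice

import pandas as pd

dat = pd.DataFrame({'name': ['A', 'C', 'A', 'B', 'C'], 'val': [1, 2, 1, 2, 4]})

i = 1  # position of the wanted group, counting from 0

# Skip the first i (key, group) pairs and take the next one
key, group = next(islice(dat.groupby(['name', 'val']), i, None))
print(key)
print(group)
```

If `i` is not smaller than the number of groups, `next` raises `StopIteration`, so a bounds check may be worth adding in real code.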