Let's say I have a dataframe like this:
import pandas as pd

df = pd.DataFrame([
    ('a', 'aa'),
    ('b', 'aa'),
    ('c', 'bb'),
    ('d', 'bb'),
    ('e', 'cc'),
    ('f', 'cc'),
    ('h', 'cc')
], columns=['group', 'id'])
I use groupby to get the count of unique values per id, and separately the unique values themselves. Here is what I am doing now:
df1 = df.groupby(["id"])["group"].nunique()
print(df1)
id
aa 2
bb 2
cc 3
df2 = df.groupby(['id'])['group'].agg(['unique'])
print(df2)
id
aa [a, b]
bb [c, d]
cc [e, f, h]
However, I would like to show these two results side by side, with one column for the count and one for the values, as shown below. Is there any way to get that?
id count values
aa 2 [a, b]
bb 2 [c, d]
cc 3 [e, f, h]
CodePudding user response:
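You can use named aggregation to compute both columns in a single call. Each keyword becomes an output column name, and its value is the aggregation to apply:
res = df.groupby('id')['group'].agg(count='nunique', values='unique')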
Output
>>> res
    count     values
id
aa      2     [a, b]
bb      2     [c, d]
cc      3  [e, f, h]
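For reference, here is a self-contained sketch of the same approach, run on the sample data from the question. The last two steps are optional additions that are not part of the answer above: converting the arrays returned by 'unique' into plain lists, and calling reset_index() so that id becomes a regular column, as in the layout the question asks for. It assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

df = pd.DataFrame(
    [("a", "aa"), ("b", "aa"), ("c", "bb"), ("d", "bb"),
     ("e", "cc"), ("f", "cc"), ("h", "cc")],
    columns=["group", "id"],
)

# Named aggregation: each keyword is an output column name,
# and its value names the aggregation applied to the "group" column.
res = df.groupby("id")["group"].agg(count="nunique", values="unique")

# Optional: "unique" yields NumPy arrays; convert them to plain lists.
# Bracket access is used because res.values is a DataFrame attribute.
res["values"] = res["values"].map(list)

# Optional: move "id" out of the index so it is a regular column.
res = res.reset_index()
print(res)
```

If you prefer to keep id as the index, simply skip the reset_index() call.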