I have a pandas DataFrame in which I am trying to group by one column and get the unique values of another column.
id name
a-1 sfdad
a-1 sfdad
a-1 oiuoi
a-2 oqrwq
a-2 oqrwq
a-2 ljlsg
a-2 uoire
I do the group by using:
df = df.groupby('id')['name'].agg(['unique'])
df = df.reset_index()
Then, when I count the unique values using the statement below, the result does not align with df['unique']. The lengths of df['unique'] and of the result of the statement below seem to differ.
df.groupby('id')['name'].nunique()
Result
id unique count
a-1 [sfdad,oiuoi] 2
a-2 [oqrwq,ljlsg,uoire] 3
CodePudding user response:
You can compute several aggregations at once with agg. The results will necessarily be aligned:
df.groupby('id')['name'].agg(['unique', 'nunique'])
output:
unique nunique
id
a-1 [sfdad, oiuoi] 2
a-2 [oqrwq, ljlsg, uoire] 3
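For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question, and the names `result` and `lengths` are only illustrative. It also stores the aggregate under a new name instead of overwriting `df`, on the assumption (not confirmed in the question) that reusing `df` for the aggregated frame may have contributed to the confusion.

```python
import pandas as pd

# Sample data reconstructed from the rows shown in the question
df = pd.DataFrame({
    "id": ["a-1", "a-1", "a-1", "a-2", "a-2", "a-2", "a-2"],
    "name": ["sfdad", "sfdad", "oiuoi", "oqrwq", "oqrwq", "ljlsg", "uoire"],
})

# Compute both aggregations in a single pass so they share one index,
# and keep the original df untouched by assigning to a new variable
result = df.groupby("id")["name"].agg(["unique", "nunique"]).reset_index()

# Optional cross-check: the length of each array of unique values
# should match the nunique count on the same row
lengths = result["unique"].map(len)

print(result)
print((lengths == result["nunique"]).all())
```

Because both columns come from the same `agg` call, each count sits on the same row as the array it describes, so there is nothing to realign afterwards.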