I have data for a TV game show that records each question's round and category. I grouped the questions by round and counted the categories in each round with the following code:
data.groupby(['Round']).Category.value_counts()
data.groupby(['Round']).Category.value_counts().head(n)
When I call head(n) on the result, it only shows the first n rows overall, which all come from the first group. I would like to get the n most frequent categories in each group instead.
How can I do this?
CodePudding user response:
Reverse the order of operations: count the values first, then take the top n rows per "Round":
data[["Round", "Category"]].value_counts().groupby(level="Round").head(n)
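To see why this works, here is a minimal sketch on made-up data (the rows and the value of n below are hypothetical, since the question does not show the real dataset). value_counts sorts the counts in descending order, so the first n rows of each "Round" group are that round's n most frequent categories:

```python
import pandas as pd

# Hypothetical toy data standing in for the game show dataset.
data = pd.DataFrame({
    "Round": ["R1"] * 6 + ["R2"] * 6,
    "Category": [
        "History", "History", "History", "Science", "Science", "Art",
        "Sports", "Sports", "Sports", "Music", "Music", "Film",
    ],
})
n = 2  # assumed value, for illustration only

# Count every (Round, Category) pair; the result is sorted by count, descending.
counts = data[["Round", "Category"]].value_counts()

# head(n) on the grouped Series keeps the first n rows of each Round,
# which are the n largest counts because of the sort above.
top_n = counts.groupby(level="Round").head(n)
print(top_n)
```

With this toy data the result keeps History and Science for R1 and Sports and Music for R2. If two categories are tied at the cutoff, which one is kept depends on their order after sorting.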
CodePudding user response:
You can use groupby.apply here:
data.groupby('Round')['Category'].apply(lambda g: g.value_counts().head(n))
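As a rough illustration of the apply approach, the sketch below runs it on made-up data (the rows and n are hypothetical, not taken from the question). Each group's "Category" column is counted separately, so head(n) is applied once per round rather than once to the whole result:

```python
import pandas as pd

# Hypothetical toy data standing in for the game show dataset.
data = pd.DataFrame({
    "Round": ["R1"] * 6 + ["R2"] * 6,
    "Category": [
        "History", "History", "History", "Science", "Science", "Art",
        "Sports", "Sports", "Sports", "Music", "Music", "Film",
    ],
})
n = 2  # assumed value, for illustration only

# For each Round, count its categories and keep the n most frequent ones.
top_n_apply = data.groupby("Round")["Category"].apply(
    lambda g: g.value_counts().head(n)
)
print(top_n_apply)
```

The result is a Series indexed by round and category. On large datasets, apply with a Python lambda may be slower than counting once and then grouping, so it is worth timing both on the real data.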