ID Color Size Category
1 Green Large High
2 Red Small Low
3 Green Small High
4 Green Small Low
5 Red Large High
... ... ... ...
I have a pandas data frame that is set up like this.
What I want is to collapse all rows with the same specifications into one row with a frequency like this:
Color Size Category Frequency
Green Large High 4
Green Large Low 5
Green Small High 3
Green Small Low 7
Red Large High 14
... ... ...
For example, in the whole data frame, there are 4 individuals with Green, Large, and High as their values.
How do I make this transition?
CodePudding user response:
Count distinct ID
for each combination of 'Color', 'Size', and 'Category', and rename result to Frequency
:
res = (df.groupby(['Color', 'Size', 'Category'], as_index=False)['ID']
.nunique()
.rename(columns={'ID': 'Frequency'}))
CodePudding user response:
IIUC, you can try groupby
then count
out = df.groupby(['Color', 'Size', 'Category'], as_index=False).count().rename(columns={'ID': 'Frequency'})
CodePudding user response:
Just use value_counts
:
output = df.set_index("ID").value_counts()