Collapse rows to frequency in Python-CodePudding

ID       Color       Size    Category  
1       Green       Large   High       
2       Red         Small   Low 
3       Green       Small   High      
4       Green       Small   Low       
5       Red         Large   High      
...      ...      ...      ...

I have a pandas data frame that is set up like this.

What I want is to collapse all rows with the same specifications into one row with a frequency like this:

Color    Size    Category   Frequency 
Green    Large   High       4
Green    Large   Low        5 
Green    Small   High       3 
Green    Small   Low        7
Red      Large   High       14
...      ...      ...

For example, in the whole data frame, there are 4 individuals with Green, Large, and High as their values.

How do I make this transition?

CodePudding user response：

Count distinct ID for each combination of 'Color', 'Size', and 'Category', and rename result to Frequency:

res = (df.groupby(['Color', 'Size', 'Category'], as_index=False)['ID']
         .nunique()
         .rename(columns={'ID': 'Frequency'}))

CodePudding user response：

IIUC, you can try groupby then count

out = df.groupby(['Color', 'Size', 'Category'], as_index=False).count().rename(columns={'ID': 'Frequency'})

CodePudding user response：

Just use value_counts:

output = df.set_index("ID").value_counts()