DataFrame groupby 2 columns und count occurence in third-CodePudding

I'm trying to group a DataFrame by two columns and count the difference occurence in the third column. What I've got so far is:

import pandas as pd

df = pd.DataFrame({'colA': ['name1', 'name2', 'name2', 'name4', 'name2', 'name5'], 'colB': ['red', 'yellow', 'yellow', 'black', 'yellow', 'green'], 'colC': ['value1', 'vlaue2', 'value2', 'value4', 'value5', 'value6']})

df_grouped = df.groupby(["colA", "colB"])["colC"].count()

print(df_grouped)

Where the output is:

colA   colB  
name1  red       1
name2  yellow    3
name4  black     1
name5  green     1

What I'm trying to do is, see colA and colB as 1 colum and count (and show) the occurance of the different values in colC. So in this case it shoud be:

    colA    colB    colC  count  
0  name1     red  value1  1
1  name2  yellow  vlaue2  2
2                 value5  1
3  name4   black  value4  1
4  name5   green  value6  1

Thanks in advance.

Edit: What I also tried is this:

df["colAB"] = df["colA"].map(str)   " "   df["colB"]
df_grouped = df.groupby(["colAB", "colC"])["colC"].count().reset_index(name="count")

But it also didn't bring the right result. To be more specific. What I need is every value (colC) and the number how often it occurs for the combination of colA and colB. Even if there are 2 columns for colA and ColB the combination of the two is one distinct value.

CodePudding user response：

If I understand correctly you just need to group by 3 col A,B and C:

df_grouped = df.groupby(["colA", "colB","colC"])["colC"].count().reset_index(name="count")

Output :
>
    colA    colB    colC  count
0  name1     red  value1      1
1  name2  yellow  value2      2
2  name2  yellow  value5      1
3  name4   black  value4      1
4  name5   green  value6      1

df_grouped = df.groupby(["colA", "colB","colC"])["colC"].count()

> 
colA   colB    colC  
name1  red     value1    1
name2  yellow  value2    2
               value5    1
name4  black   value4    1
name5  green   value6    1