For each unique Pandas series value, count an other field-CodePudding

I have a Data Frame like that :

df = pd.DataFrame({"id": [1, 2, 2, 1, 3, 3, 2, 1, 2], 
                   "colors": ['green', 'blue', 'green', 'yellow', 'yellow', 'yellow', 'green', 'red', 'black']

So there are two series with same length but no logical patern between id and colors. Eg id 1 can be blue green, red etc.

That I want to do is to create an other Data Frame with a first row of each unique id and next rows with the count of all possible colors.

In my example, I want the result to be :

id	green	blue	yellow	red	black
1	1	0	1	1	0
2	2	1	0	0	1
3	0	0	2	0	0

I try : but the == doesn't work.

print(df.loc[df.id.unique() == df["id"], "colors"].value_counts())

CodePudding user response：

You could use pivot_table:

df.pivot_table(index="id", columns="colors", aggfunc=len, fill_value=0)

# out:
colors  black  blue  green  red  yellow
id                                     
1           0     0      1    1       1
2           1     1      2    0       0
3           0     0      0    0       2