Home > other >  For each unique Pandas series value, count an other field
For each unique Pandas series value, count an other field

Time:06-06

I have a Data Frame like that :

df = pd.DataFrame({"id": [1, 2, 2, 1, 3, 3, 2, 1, 2], 
                   "colors": ['green', 'blue', 'green', 'yellow', 'yellow', 'yellow', 'green', 'red', 'black']

So there are two series with same length but no logical patern between id and colors. Eg id 1 can be blue green, red etc.

That I want to do is to create an other Data Frame with a first row of each unique id and next rows with the count of all possible colors.

In my example, I want the result to be :

id green blue yellow red black
1 1 0 1 1 0
2 2 1 0 0 1
3 0 0 2 0 0

I try : but the == doesn't work.

print(df.loc[df.id.unique() == df["id"], "colors"].value_counts())

CodePudding user response:

You could use pivot_table:

df.pivot_table(index="id", columns="colors", aggfunc=len, fill_value=0)

# out:
colors  black  blue  green  red  yellow
id                                     
1           0     0      1    1       1
2           1     1      2    0       0
3           0     0      0    0       2
  • Related