I have
df = pd.DataFrame({"A": [1,2,3,4,5,6,7,8], "B": [1,1,2,2,3,3,4,4], "C": [1,1,1,1,2,3,2,2] })
A B C
0 1 1 1
1 2 1 1
2 3 2 1
3 4 2 1
4 5 3 2
5 6 3 3
6 7 4 2
7 8 4 2
I would like to know, for each value b of column B, how many unique values c of column C there that are in rows where B=b.
So I'd like a series that tells me something like {1:1, 2:2, 3:2, 4:1}
meaning that, for example, when B=3, there are two unique values of C (namely 2 and 3).
How do I do this? Thanks
CodePudding user response:
Try this:
df.groupby("B")["C"].nunique()
> B
1 1
2 1
3 2
4 1
Name: C, dtype: int64
CodePudding user response:
df.groupby("B")["C"].nunique().to_dict()
output:
{1: 1, 2: 1, 3: 2, 4: 1}
how does it work?
- every time you want to calculate something in one column based on values in another,
groupby
is coming... so use it and pass all values grouped. - what do you want? number of unique values in
C
... so use["C"].nunique()
that return number of unique values. - and at last, you want
dict
, so convert your resultto_dict()