Home > database >  Determine number of unique values of one column for each value of another column
Determine number of unique values of one column for each value of another column

Time:07-20

I have

df = pd.DataFrame({"A": [1,2,3,4,5,6,7,8], "B": [1,1,2,2,3,3,4,4], "C": [1,1,1,1,2,3,2,2] })

    A   B   C
0   1   1   1
1   2   1   1
2   3   2   1
3   4   2   1
4   5   3   2
5   6   3   3
6   7   4   2
7   8   4   2

I would like to know, for each value b of column B, how many unique values c of column C there that are in rows where B=b.

So I'd like a series that tells me something like {1:1, 2:2, 3:2, 4:1} meaning that, for example, when B=3, there are two unique values of C (namely 2 and 3).

How do I do this? Thanks

CodePudding user response:

Try this:

df.groupby("B")["C"].nunique()

> B
  1    1
  2    1
  3    2
  4    1
  Name: C, dtype: int64

CodePudding user response:

df.groupby("B")["C"].nunique().to_dict()

output:

{1: 1, 2: 1, 3: 2, 4: 1}

how does it work?

  1. every time you want to calculate something in one column based on values in another, groupby is coming... so use it and pass all values grouped.
  2. what do you want? number of unique values in C... so use ["C"].nunique() that return number of unique values.
  3. and at last, you want dict, so convert your result to_dict()
  • Related