View set of values of column C for each value of column B

I have

import pandas as pd
df = pd.DataFrame({"A": [1,2,3,4,5,6,7,8], "B": [1,1,2,2,3,3,4,4], "C": [1,1,1,1,2,3,2,2] })

    A   B   C
0   1   1   1
1   2   1   1
2   3   2   1
3   4   2   1
4   5   3   2
5   6   3   3
6   7   4   2
7   8   4   2

For each value b of column B, I would like to see the set of values of column C that appear in the rows where B=b.

So I'd like something like a Series or dict of the form {1:[1], 2:[1], 3:[2,3], 4:[2]}, meaning that, for example, when B=3, the values of C are 2 and 3.

How do I do this? Thanks

CodePudding user response:

You can group by B and aggregate column C as a set:

df.groupby('B')['C'].agg(set).to_dict()
# or, as lists
# df.groupby('B')['C'].agg(lambda x: list(set(x))).to_dict()

Output:

{1: {1}, 2: {1}, 3: {2, 3}, 4: {2}}
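A minimal, self-contained sketch of the set aggregation above, using the DataFrame from the question. The `sorted` variant is an addition, not part of the original answer: it assumes you want the list form in ascending order, since iterating over a `set` does not promise any particular order.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [1, 1, 2, 2, 3, 3, 4, 4],
    "C": [1, 1, 1, 1, 2, 3, 2, 2],
})

# One set of distinct C values per value of B
as_sets = df.groupby("B")["C"].agg(set).to_dict()

# Assumption: ascending order is wanted for the list form
as_sorted_lists = df.groupby("B")["C"].agg(lambda x: sorted(set(x))).to_dict()
```

Sorting only makes sense when the values of C are mutually comparable, which holds for the integers here.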

To keep the values in their original order of appearance, deduplicate with dict.fromkeys instead of set:

df.groupby('B')['C'].agg(lambda x: list(dict.fromkeys(x))).to_dict()

Output:

{1: [1], 2: [1], 3: [2, 3], 4: [2]}
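Since the question asks for a Series, one option is simply to drop `.to_dict()`, which leaves the result as a Series indexed by B. The sketch below shows that, plus a possible alternative that is not in the original answer: the groupby `unique()` method, which also keeps values in order of first appearance but returns a NumPy array per group rather than a list.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [1, 1, 2, 2, 3, 3, 4, 4],
    "C": [1, 1, 1, 1, 2, 3, 2, 2],
})

# Series indexed by B; each entry is a list of distinct C values
# in the order they first appear within that group
ordered = df.groupby("B")["C"].agg(lambda x: list(dict.fromkeys(x)))

# Alternative: unique() gives one NumPy array of distinct values per group
uniques = df.groupby("B")["C"].unique()

# Look up the distinct C values for a single value of B
values_for_3 = ordered.loc[3]
```

Which form is more convenient depends on what you do next: lists are easier to serialize, while arrays work directly with NumPy operations.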