I have a use case that I cannot solve cleanly in pandas. My data looks like this:
order_id asset_id
1 A
1 B
1 C
2 A
2 C
3 A
4 B
4 C
- I would like to know in how many orders I have the asset A alone. In that case: 1 time (in order 3).
- I would like to know in how many orders I have the asset A with others. In that case: 2 times (in orders 1 and 2).
It would be great to have some help with that. I can't figure out how to do it.
CodePudding user response:
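If you need to count, per order_id group, the orders in which A appears alone or with other assets, first aggregate asset_id into sets and then compare each set with {'A'}. Here == tests for an exact match and > tests for a strict superset, i.e. A plus at least one other asset: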
s = df.groupby('order_id')['asset_id'].agg(set)
print (s)
order_id
1 {A, B, C}
2 {A, C}
3 {A}
4 {B, C}
Name: asset_id, dtype: object
alone = (s == {'A'}).sum()
print (alone)
1
with_others = (s > {'A'}).sum()
print (with_others)
2
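For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the DataFrame is called df and is typed in by hand from the sample in the question; the name target is only illustrative. The comparisons are applied per element with apply, which is a slightly more explicit variant of the direct Series comparison above rather than a different technique.

```python
import pandas as pd

# Sample data copied from the question; the name `df` is an assumption.
df = pd.DataFrame({
    "order_id": [1, 1, 1, 2, 2, 3, 4, 4],
    "asset_id": ["A", "B", "C", "A", "C", "A", "B", "C"],
})

# One set of assets per order.
s = df.groupby("order_id")["asset_id"].agg(set)

# Hypothetical helper name for the asset we are interested in.
target = {"A"}

# Orders whose asset set is exactly {A}.
alone = s.apply(lambda assets: assets == target).sum()

# Orders whose asset set strictly contains {A} (A plus something else).
with_others = s.apply(lambda assets: assets > target).sum()

print(alone, with_others)
```

With the sample data this gives 1 and 2, matching the counts asked for in the question.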
CodePudding user response:
Use groupby.agg with set operations:
(df.groupby('order_id')['asset_id']
.agg(alone=lambda x: set(x)=={'A'},
others=lambda x: set(x)>{'A'}
)
.sum()
)
Output:
alone 1
others 2
dtype: int64
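If you would rather avoid building Python sets and lambdas, a possible alternative is to combine two plain groupby reductions: whether the order contains A at all, and how many distinct assets it has. This is a different technique from the one above, sketched under the assumption that the DataFrame is called df and holds the sample from the question; the names has_a and n_assets are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; the name `df` is an assumption.
df = pd.DataFrame({
    "order_id": [1, 1, 1, 2, 2, 3, 4, 4],
    "asset_id": ["A", "B", "C", "A", "C", "A", "B", "C"],
})

# True for every order that contains asset A at least once.
has_a = df["asset_id"].eq("A").groupby(df["order_id"]).any()

# Number of distinct assets in each order.
n_assets = df.groupby("order_id")["asset_id"].nunique()

# A alone: the order contains A and nothing else.
alone = int((has_a & n_assets.eq(1)).sum())

# A with others: the order contains A plus at least one other asset.
others = int((has_a & n_assets.gt(1)).sum())

print(alone, others)
```

Both intermediate Series are indexed by order_id, so the & aligns them order by order. On the sample data this again gives 1 and 2.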