I have a data frame like this:
df = pd.DataFrame({"product": [1,2,3,4,5], "company": ["A", "B","B","A","B"], "state": ["CA", "NY", "CA", "CA", "NY"]})
company state product
0 A CA 1
1 B NY 2
2 B CA 3
3 A CA 4
4 B NY 5
I would like a bool that picks out just the rows that correspond to states which have only one company in them. In this case that would be only NY which has only company B, so the desired bool would be [False, True, False, False, True]
Alternatively, I would like to know the set of states that have only one company in them. I guess I could do that e.g. using value_counts
once I have the bool.
How do I do this?
Thanks!
CodePudding user response:
You can use groupby_nunique
with transform
to broadcast the result over rows then just check if the result is equal to 1:
df['flag'] = df.groupby('state')['company'].transform('nunique').eq(1)
print(df)
# Output
product company state flag
0 1 A CA False
1 2 B NY True
2 3 B CA False
3 4 A CA False
4 5 B NY True