I have a grouped data frame df_grouped, and I would like to create a new boolean column df_grouped["Unique"]. Within each group, this column should be True if the row's value of Location is unique within the group, and False if it is not.
import pandas as pd

dataset = {
    'ID': ['One', 'One', 'One', 'Five', 'Five', 'Five', 'Four'],
    'Day': [2, 2, 2, 1, 1, 1, 0],
    'Location': ['London', 'London', 'Paris', 'London', 'Paris', 'Paris', 'Berlin']}
df = pd.DataFrame(dataset)
# Group by 'ID' and 'Day' (the frame has no 'Name' column)
df_grouped = df.groupby(['ID', 'Day'])
Expected output for the unique column:
'Unique': [False, False, True, True, False, False, True]
CodePudding user response:
Use DataFrame.duplicated with keep=False, which marks every row whose ['ID', 'Day', 'Location'] combination occurs more than once, and invert the mask with ~:
df['Unique'] = ~df.duplicated(['ID','Day', 'Location'], keep=False)
print(df)
     ID  Day Location  Unique
0   One    2   London   False
1   One    2   London   False
2   One    2    Paris    True
3  Five    1   London    True
4  Five    1    Paris   False
5  Five    1    Paris   False
6  Four    0   Berlin    True
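
For comparison, the same column can also be computed with a groupby, which is closer to how the question is phrased. This is only a minimal sketch, not part of the original answer: it assumes the sample data from the question, and the name `counts` is introduced here for illustration. It counts how many rows share each ID/Day/Location combination; a count of 1 means the location is unique within its group.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'ID': ['One', 'One', 'One', 'Five', 'Five', 'Five', 'Four'],
    'Day': [2, 2, 2, 1, 1, 1, 0],
    'Location': ['London', 'London', 'Paris', 'London', 'Paris', 'Paris', 'Berlin'],
})

# Number of rows sharing the same (ID, Day, Location), broadcast back to every row
counts = df.groupby(['ID', 'Day', 'Location'])['Location'].transform('size')

# A location is unique within its (ID, Day) group when it occurs exactly once
df['Unique'] = counts.eq(1)
print(df)
```

On this data the result should match the duplicated approach. The duplicated version is shorter; the groupby version may be easier to extend if the counts themselves are also needed.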