I wish to condense my dataset so that each distinct row appears only once. Essentially it is a groupby.
Data
id box status
aa box11 hey
aa box11 hey
aa box11 hey
aa box11 hey
aa box5 hello
aa box5 hello
aa box5 hello
aa box5 hello
aa box5 hello
bb box8 no
bb box8 no
Desired
id box status
aa box11 hey
aa box5 hello
bb box8 no
Doing
df1 = df.groupby(["id"])["box"].agg()
CodePudding user response:
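You can do this with drop_duplicates rather than a groupby. To be explicit about which columns define a duplicate, pass them with the subset keyword; here "id" is excluded, so only "box" and "status" are compared: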
df1 = df.drop_duplicates(subset = ['box', 'status'])
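For reference, here is a minimal self-contained sketch that rebuilds the sample data from the question and compares two ways of getting the desired output. The names dedup and grouped are just illustrative, and the groupby variant assumes "status" is constant within each (id, box) pair, as it is in the sample data.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "id": ["aa"] * 9 + ["bb"] * 2,
    "box": ["box11"] * 4 + ["box5"] * 5 + ["box8"] * 2,
    "status": ["hey"] * 4 + ["hello"] * 5 + ["no"] * 2,
})

# Option 1: drop rows that are identical across all columns,
# keeping the first occurrence, then renumber the index.
dedup = df.drop_duplicates().reset_index(drop=True)

# Option 2: the groupby the question was reaching for.
# Assumption: "status" does not vary within an (id, box) group,
# so taking the first value per group loses nothing.
grouped = df.groupby(["id", "box"], as_index=False, sort=False)["status"].first()

print(dedup)
print(grouped)
```

Note that with subset=['box', 'status'] the "id" column is ignored when comparing rows, so two different ids sharing the same box and status would collapse into one row. If that is not what you want, call drop_duplicates() with no subset, as in the sketch.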