How to find which columns have duplicate values within each group of a grouped DataFrame in Python?

Time:01-01

First, I have a df. When I group it by a column, will that remove duplicate values? Second, how can I tell which groups have duplicate values? (I tried to find out which columns of a df have duplicate values but couldn't find anything; the answers only talk about whether each individual element is duplicated or not.)

For example, I have a grouped df like this:
     B   C
1    2   3
1    4   3
2    2   2
2    3   4
2    2   3

and this is the result I want, showing which group and column contain duplicates:

     B       C
1    False   True
2    True    False


I tried to find a way like df.groupby(A).agg(find_duplicate), where A is the column to group by. Thanks for the help.

CodePudding user response:

You can pass a lambda function to GroupBy.agg that checks, for each column in each group, whether the number of values differs from the number of unique values. Series.size gives the number of values in the group and Series.nunique gives the number of unique ones; if the two differ, the group contains at least one duplicate.

df.groupby(level=0).agg(lambda x: x.size!=x.nunique())

#        B      C
# 1  False   True
# 2   True  False
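For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question. It assumes the grouping key (1, 1, 2, 2, 2) is the index, which is what groupby(level=0) implies; the names sizes and has_dup are only for illustration. It also touches on the first part of the question: grouping by itself does not drop any rows, as the group sizes show.

```python
import pandas as pd

# Sample data from the question; the grouping key is assumed to be the index.
df = pd.DataFrame(
    {"B": [2, 4, 2, 3, 2], "C": [3, 3, 2, 4, 3]},
    index=[1, 1, 2, 2, 2],
)

# groupby only partitions the rows; duplicates are still there.
sizes = df.groupby(level=0).size()
print(sizes)  # group 1 keeps 2 rows, group 2 keeps 3 rows

# A column has duplicates in a group when it holds more rows than distinct values.
has_dup = df.groupby(level=0).agg(lambda x: x.size != x.nunique())
print(has_dup)
```

If the key is an ordinary column rather than the index, df.groupby("A") with the same lambda should work the same way.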

CodePudding user response:

Let us try

out = df.groupby(level=0).agg(lambda x: x.duplicated().any())

#        B      C
# 1  False   True
# 2   True  False
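The two answers agree on the sample data, but they can differ when a group contains missing values. Series.nunique skips NaN by default, so the size-versus-nunique check can flag a group that has no repeated value, while Series.duplicated only flags actual repeats. A small sketch with made-up data (df_nan, by_count and by_dup are illustrative names, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 1 has one NaN and no repeats, group 2 has a real repeat.
df_nan = pd.DataFrame({"B": [2.0, np.nan, 5.0, 5.0]}, index=[1, 1, 2, 2])

# size counts the NaN but nunique() ignores it, so group 1 is flagged too.
by_count = df_nan.groupby(level=0).agg(lambda x: x.size != x.nunique())

# duplicated() marks only repeated values, so group 1 is not flagged.
by_dup = df_nan.groupby(level=0).agg(lambda x: x.duplicated().any())

print(by_count)
print(by_dup)
```

If NaN can appear in the data, the duplicated().any() form is probably the safer choice; passing dropna=False to nunique should also bring the first approach in line.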