I have a DataFrame with a 'customer' column and want to list the details of customers that occur more…

Time:12-11

This is a DataFrame with a 'customer' column that contains repeated values:

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'customer': ['a', 'b', 'c', 'b', 'b', 'b', 'd', 'e', 'e', 'f'], 'address': ['xx', 'yy', 'rr', 'yy', 'oo', 'ee', 'vv', 'zz', 'nn', 'cc']})


I want the details of customers whose value repeats more than 3 times:

df.groupby(['customer']).count()>3

Result: I am getting boolean values instead of the matching rows:

             id  address
customer
a         False    False
b          True     True
c         False    False
d         False    False
e         False    False
f         False    False

Expected result:

   id customer address
1   2        b      yy

CodePudding user response:

You can use GroupBy.filter() on the dataframe and then .drop_duplicates() on the "customer" column:

x = (
    df.groupby("customer")
    .filter(lambda x: len(x) > 3)
    .drop_duplicates("customer")
)

print(x)

Prints:

   id customer address
1   2        b      yy
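
To see what each step contributes, here is a minimal sketch that splits the chain in two. It assumes the df from the question; the names all_rows and first_rows are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "customer": ["a", "b", "c", "b", "b", "b", "d", "e", "e", "f"],
    "address": ["xx", "yy", "rr", "yy", "oo", "ee", "vv", "zz", "nn", "cc"],
})

# filter() keeps every row of each group that has more than 3 rows
all_rows = df.groupby("customer").filter(lambda g: len(g) > 3)

# drop_duplicates() then keeps only the first row for each customer
first_rows = all_rows.drop_duplicates("customer")

print(all_rows)
print(first_rows)
```

If you want every matching row rather than one row per customer, the intermediate all_rows is probably what you need.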

CodePudding user response:

You can use groupby.transform and boolean indexing:

df[df.groupby('customer')['customer'].transform('count').gt(3)]

Output:

   id customer address
1   2        b      yy
3   4        b      yy
4   5        b      oo
5   6        b      ee
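
The one-liner above can be unpacked as follows. This is a sketch assuming the df from the question; counts and mask are illustrative names. transform('count') returns a Series aligned with the original index, so each row carries the size of its own group and can be compared directly.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "customer": ["a", "b", "c", "b", "b", "b", "d", "e", "e", "f"],
    "address": ["xx", "yy", "rr", "yy", "oo", "ee", "vv", "zz", "nn", "cc"],
})

# One value per original row: the number of rows sharing that customer
counts = df.groupby("customer")["customer"].transform("count")

# Boolean mask, True where the customer occurs more than 3 times
mask = counts.gt(3)

# Boolean indexing keeps only the rows where the mask is True
result = df[mask]
print(result)
```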

CodePudding user response:

You can fix your code with isin:

s = df.groupby(['customer'])['id'].count() > 3
out = df.loc[df['customer'].isin(s[s].index)]
print(out)

Output:

   id customer address
1   2        b      yy
3   4        b      yy
4   5        b      oo
5   6        b      ee
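
The s[s] step may look odd at first, so here is a short sketch of it, assuming the df from the question; frequent is an illustrative name. Indexing the boolean Series with itself keeps only the entries that are True, and the index of what remains holds the customers that passed the test.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "customer": ["a", "b", "c", "b", "b", "b", "d", "e", "e", "f"],
    "address": ["xx", "yy", "rr", "yy", "oo", "ee", "vv", "zz", "nn", "cc"],
})

# Boolean Series indexed by customer: True where the count exceeds 3
s = df.groupby(["customer"])["id"].count() > 3

# s[s] drops the False entries; the remaining index labels are the customers
frequent = s[s].index

# isin() marks the rows whose customer is among those labels
out = df.loc[df["customer"].isin(frequent)]
print(out)
```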