Below is the dataframe:
import pandas as pd

df = pd.DataFrame({'Cust_Pincode':[487551,487551,639207,452001,484661,484661],
                   'REGIONAL_GROUPING':['WEST I','WEST II','TN II','WEST I','WEST I','WEST II'],
                   'C_LATITUDE':[22.89831,23.74881,10.72208,22.69875,23.88280,23.88280],
                   'C_LONGITUDE':[78.75441,79.48472,77.94168,75.88575,80.98250,80.98250],
                   'Region_dist_lim':[33.577743,33.577743,36.812093,33.577743,33.577743,33.577743]})
   Cust_Pincode  REGIONAL_GROUPING  C_LATITUDE  C_LONGITUDE  Region_dist_lim
0        487551             WEST I    22.89831     78.75441        33.577743
1        487551            WEST II    23.74881     79.48472        33.577743
2        639207              TN II    10.72208     77.94168        36.812093
3        452001             WEST I    22.69875     75.88575        33.577743
4        484661             WEST I    23.88280     80.98250        33.577743
5        484661            WEST II    23.88280     80.98250        33.577743
I'm trying to write code that returns each unique Cust_Pincode that has more than one distinct REGIONAL_GROUPING. In other words, group by Cust_Pincode and REGIONAL_GROUPING and return only the rows where a Cust_Pincode maps to multiple REGIONAL_GROUPING values. The expected output dataframe is below:
   Cust_Pincode  REGIONAL_GROUPING
0  487551        WEST I
                 WEST II
1  484661        WEST I
                 WEST II
The code I've written is below:
df.groupby(['Cust_Pincode','REGIONAL_GROUPING']).filter(lambda x: len(x) > 1)
The above code returns an empty dataframe.
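A likely reason for the empty result: in this sample data, every (Cust_Pincode, REGIONAL_GROUPING) pair occurs exactly once, so each group has a single row and `len(x) > 1` is never true. The sketch below checks this on the two relevant columns; the names `sizes` and `filtered` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Cust_Pincode': [487551, 487551, 639207, 452001, 484661, 484661],
    'REGIONAL_GROUPING': ['WEST I', 'WEST II', 'TN II', 'WEST I', 'WEST I', 'WEST II'],
})

# Row count of each (pincode, region) group.
sizes = df.groupby(['Cust_Pincode', 'REGIONAL_GROUPING']).size()
print(sizes)

# Same filter as in the question: keeps only groups with more than one row.
filtered = df.groupby(['Cust_Pincode', 'REGIONAL_GROUPING']).filter(lambda x: len(x) > 1)
print(filtered.empty)
```

Since no group has more than one row, the filter drops everything.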
CodePudding user response:
You can try this solution, which first keeps the pincodes that appear in more than one row and then groups by both columns:
df = df.groupby(['Cust_Pincode']).filter(lambda x: len(x) > 1)
print(df.groupby(['Cust_Pincode', 'REGIONAL_GROUPING']).first())
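Filtering on `len(x) > 1` counts rows per pincode, which matches the sample data because no pincode repeats the same region. If a pincode could appear several times with the same REGIONAL_GROUPING, counting distinct regions may be safer. The following is a minimal sketch of that alternative (not the method from the answer above), using `transform('nunique')`; `region_counts` and `multi_region` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Cust_Pincode': [487551, 487551, 639207, 452001, 484661, 484661],
    'REGIONAL_GROUPING': ['WEST I', 'WEST II', 'TN II', 'WEST I', 'WEST I', 'WEST II'],
})

# Number of distinct regions per pincode, broadcast back onto every row.
region_counts = df.groupby('Cust_Pincode')['REGIONAL_GROUPING'].transform('nunique')

# Keep rows whose pincode maps to more than one region, without repeated pairs.
multi_region = (
    df.loc[region_counts > 1, ['Cust_Pincode', 'REGIONAL_GROUPING']]
      .drop_duplicates()
)
print(multi_region)
```

On the sample data this keeps only the rows for 487551 and 484661. Calling `multi_region.set_index(['Cust_Pincode', 'REGIONAL_GROUPING'])` afterwards should give a grouped display similar to the expected output.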
CodePudding user response:
Why use filter()? You can just use first(), like this:
df.groupby(['Cust_Pincode','REGIONAL_GROUPING']).first()
Note that on its own this lists every (Cust_Pincode, REGIONAL_GROUPING) pair, including pincodes such as 639207 and 452001 that have only one region, so it still needs a filter on Cust_Pincode to match the expected output.