How to drop duplicates for the following data-frame based on multiple conditions?-CodePudding

i am having a dataframe as follows:

df

cylce	csec	dist	vel
1	-40.1	9.87	2.7
1	-40.1	9.89	2.2
2	-39.1	14.07	2.0
2	-39.1	14.09	2.8
3	-38.7	18.09	3.2
4	-36.6	15.37	0.5
4	-38.01	16.23	1.8
4	-38.4	16.66	3.1

i have to drop the duplicate cycle based on some conditions:  
    -if the csec is same then  
            -look for the  dist and keep the row with highest dist  
                   -if dist are same check the vel ,keep the row with highest vel  
    -if csec is different
            -keep the row with highest csec

output

cylce	csec	dist	vel
1	-40.1	9.89	2.2
2	-39.1	14.09	2.8
3	-38.7	18.09	3.2
4	-36.6	15.37	0.5

I was able to get the dupicate rows with following code

    duplicate_cyle = df[df.duplicated('cycle',keep = False)]

I would like to know how to drop the rows based on the conditions.

CodePudding user response：

Sort your dataframe in descending order of csec, dist and vel, then drop duplictes, eg:

out = (
    df.sort_values(['cycle', 'csec', 'dist', 'vel'], ascending=[True, False, False, False])
    .drop_duplicates(subset=['cycle'])
)