How to group by one or more index columns in pandas

Id    Freq    ID2    PSC
a1    0       xy     33
a1    0       yz     35
a1    1       xz     60
a2    0       pq     70
a2    1       qr     75
a2    0       rs     80

The output should be:

Id    Freq    ID2    PSC
a1    0       xy     33
              yz     35
a1    1       xz     60
a2    0       pq     70
              rs     80
a2    1       qr     75

After that, I want to check that PSC is unique within each group, for example for Id = a1 and Freq = 0.

CodePudding user response：

What exactly do you want to do?

If you want to mask the duplicated values with an empty string (or NA) to highlight consecutive duplicates, the way they would appear with a MultiIndex, you could use:

df = df.sort_values(by=['Id', 'Freq'])
m = df.duplicated(['Id', 'Freq'])

df.loc[m, ['Id', 'Freq']] = ''

output:

   Id Freq ID2  PSC
0  a1    0  xy   33
1           yz   35
2  a1    1  xz   60
3  a2    0  pq   70
5           rs   80
4  a2    1  qr   75

Note that this overwrites the real Id and Freq values in the masked rows, so you should only do it for display purposes.
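
To keep the original data intact, the masking can be done on a copy. The following is a minimal sketch, assuming the sample data from the question; the names sorted_df, mask and display_df are only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id":   ["a1", "a1", "a1", "a2", "a2", "a2"],
    "Freq": [0, 0, 1, 0, 1, 0],
    "ID2":  ["xy", "yz", "xz", "pq", "qr", "rs"],
    "PSC":  [33, 35, 60, 70, 75, 80],
})

# Bring rows with the same (Id, Freq) pair next to each other.
sorted_df = df.sort_values(by=["Id", "Freq"])

# True for every row that repeats an earlier (Id, Freq) pair.
mask = sorted_df.duplicated(["Id", "Freq"])

# Display-only copy: cast to object so '' can sit next to the integers.
display_df = sorted_df.astype({"Id": object, "Freq": object})
display_df.loc[mask, ["Id", "Freq"]] = ""

print(display_df.to_string(index=False))
```

Here df still holds the real values, and only display_df is blanked out.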

Another option is to set the columns as a MultiIndex:

df.set_index(['Id', 'Freq']).sort_index()

The display will hide consecutive duplicates in the outer level (Id), but the innermost level (Freq) is always printed in full, so it is not exactly what you want.
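
As a rough, self-contained sketch of the MultiIndex option (assuming the sample data from the question; the name indexed is only illustrative), sorting the index also makes it easy to pull out one (Id, Freq) group:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id":   ["a1", "a1", "a1", "a2", "a2", "a2"],
    "Freq": [0, 0, 1, 0, 1, 0],
    "ID2":  ["xy", "yz", "xz", "pq", "qr", "rs"],
    "PSC":  [33, 35, 60, 70, 75, 80],
})

# Move Id and Freq into a two-level index, then sort on both levels.
indexed = df.set_index(["Id", "Freq"]).sort_index()
print(indexed)

# Select the PSC values of a single (Id, Freq) group by its key.
group_psc = indexed.loc[("a1", 0), "PSC"]
print(group_psc.tolist())
```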

CodePudding user response：

You can have a look at the pandas methods set_index and sort_index. For your specific problem, let's say you have a DataFrame called df:

df.set_index(['Id','Freq'])

will give you

        ID2  PSC
Id Freq
a1 0     xy   33
   0     yz   35
   1     xz   60
a2 0     pq   70
   1     qr   75
   0     rs   80

Then you can call sort_index() on the index levels of your choice, so that rows sharing the same (Id, Freq) pair end up next to each other and are easier to check for unique values.
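
Neither answer shows the uniqueness check from the question, so here is one possible sketch. It assumes "unique" means that no PSC value repeats within the same (Id, Freq) group; the names grouped, psc_unique and dupes are only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id":   ["a1", "a1", "a1", "a2", "a2", "a2"],
    "Freq": [0, 0, 1, 0, 1, 0],
    "ID2":  ["xy", "yz", "xz", "pq", "qr", "rs"],
    "PSC":  [33, 35, 60, 70, 75, 80],
})

grouped = df.groupby(["Id", "Freq"])["PSC"]

# A group is unique when its number of distinct PSC values equals its row count.
# Note: nunique() ignores NaN by default, so missing PSC values would count as repeats here.
psc_unique = grouped.nunique() == grouped.size()
print(psc_unique)

# Check one specific group, e.g. Id = a1 and Freq = 0.
print(psc_unique.loc[("a1", 0)])

# List the offending rows, if any (empty for this sample data).
dupes = df[df.duplicated(["Id", "Freq", "PSC"], keep=False)]
print(dupes)
```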