Filter out rows where all the grouped elements are equal to zero

Time:02-24

my df:

id1 id2  uid . . . 
 1  100   0
 1  101 1000
 1  101 1000
 2  102   0
 2  103   0
 3  104 1002
 3  104 1002
 3  104 1002
 3  104   0
 3  105   0
 3  106   0
 4  107   0
 4  107   0
 4  108   0
 4  108   0

I would like to group by id1 and filter out the id1s where all the uids are zero.

I tried the following:

df = df.groupby(by = 'id1').filter(lambda x: x['uid'].sum() > 0).reset_index(drop = True)

But the issue is that it sums up the non-zero uids and creates random uids by doing so.

Desired result:

id1 id2  uid . . . 
 1  100   0
 1  101 1000
 1  101 1000
 3  104 1002
 3  104 1002
 3  104 1002
 3  104   0
 3  105   0
 3  106   0

CodePudding user response:

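You could flag the "uid"s that are not equal to 0, then broadcast the group-wise max of that flag back to every row with transform. The result is True for every row of an "id1" whose set of "uid"s contains at least one non-0 value, so indexing df with it keeps those groups whole and drops the rest (so for example, "id1"=2 and "id1"=4 are dropped):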

out = df[df['uid'].ne(0).groupby(df['id1']).transform('max')]

Output:

    id1  id2   uid
0     1  100     0
1     1  101  1000
2     1  101  1000
5     3  104  1002
6     3  104  1002
7     3  104  1002
8     3  104     0
9     3  105     0
10    3  106     0
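
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the frame has only the three columns shown in the question (the ". . ." columns are left out), and it uses transform('any') in place of transform('max'); on a boolean Series the two should give the same mask, but 'any' states the intent more directly. The names mask and out_filter are made up for this illustration.

```python
import pandas as pd

# Sample data copied from the question (only the three visible columns).
df = pd.DataFrame({
    "id1": [1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4],
    "id2": [100, 101, 101, 102, 103, 104, 104, 104, 104, 105, 106,
            107, 107, 108, 108],
    "uid": [0, 1000, 1000, 0, 0, 1002, 1002, 1002, 0, 0, 0, 0, 0, 0, 0],
})

# True on every row whose id1 group contains at least one non-zero uid.
mask = df["uid"].ne(0).groupby(df["id1"]).transform("any")
out = df[mask]

# Same selection via groupby.filter; the kept rows come back unmodified.
out_filter = df.groupby("id1").filter(lambda g: g["uid"].ne(0).any())

print(out)
```

Both variants keep the original index, so add .reset_index(drop=True) if a fresh 0..n-1 index is wanted. Testing for "any non-zero" rather than "sum > 0" also stays correct if a group could ever contain negative values that cancel out.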