Group by id and create a dummy if a column's values do not include zeros

I have the following df, and I want to create a dummy column that equals 1 if and only if the id's group contains no zeros in column "count".

id    count
A      9
A      0
A      2
A      1
B      2
B      5
B      2
B      1
C      1
C      9
D      7
D      2
D      0

desired output

id    count   dummy
A      9      0
A      0      0
A      2      0
A      1      0
B      2      1
B      5      1
B      2      1
B      1      1
C      1      1
C      9      1
D      7      0
D      2      0
D      0      0

thanks

CodePudding user response：

df['dummy'] = df.groupby('id')['count'].transform(lambda col: int(0 not in col.unique()))


   id  count  dummy
0   A      9      0
1   A      0      0
2   A      2      0
3   A      1      0
4   B      2      1
5   B      5      1
6   B      2      1
7   B      1      1
8   C      1      1
9   C      9      1
10  D      7      0
11  D      2      0
12  D      0      0

CodePudding user response：

Here is one way to do it:

df['dummy']=df.groupby('id')['count'].transform(lambda x: 0 if x.eq(0).sum()>0 else 1)
df

    id  count   dummy
0    A      9       0
1    A      0       0
2    A      2       0
3    A      1       0
4    B      2       1
5    B      5       1
6    B      2       1
7    B      1       1
8    C      1       1
9    C      9       1
10   D      7       0
11   D      2       0
12   D      0       0

CodePudding user response：

You can use groupby and transform: check whether each group contains any zero, negate that result, and cast it to int.

df['dummy'] = df.groupby('id')['count'].transform(lambda g: int(not g.eq(0).any()))
print(df)

   id  count  dummy
0   A      9      0
1   A      0      0
2   A      2      0
3   A      1      0
4   B      2      1
5   B      5      1
6   B      2      1
7   B      1      1
8   C      1      1
9   C      9      1
10  D      7      0
11  D      2      0
12  D      0      0
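
To make the two steps visible (one answer per group, then spreading it back over the rows), here is a minimal self-contained sketch. It rebuilds the question's data; the names `has_zero` and `has_no_zero` are only illustrative, and using `map` in place of `transform` is a choice made here to expose the intermediate per-group result.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "count": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

# Step 1: one boolean per id, True when the group contains at least one zero.
has_zero = df["count"].eq(0).groupby(df["id"]).any()

# Step 2: negate it, then look up each row's id to spread the result over the rows.
has_no_zero = ~has_zero
df["dummy"] = df["id"].map(has_no_zero).astype(int)

print(has_zero)
print(df)
```

`transform` does both steps in one call, which is why the one-liners in this thread need no explicit lookup.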

CodePudding user response：

groupby().transform is the way to go, but I'd group the boolean series itself:

# transform `min` would work as well
df['dummy'] = df['count'].ne(0).groupby(df['id']).transform('all').astype(int)

Output:

   id  count  dummy
0   A      9      0
1   A      0      0
2   A      2      0
3   A      1      0
4   B      2      1
5   B      5      1
6   B      2      1
7   B      1      1
8   C      1      1
9   C      9      1
10  D      7      0
11  D      2      0
12  D      0      0
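
As a quick check that `all` and `min` give the same dummy, here is a small sketch on the question's data. The names `nonzero`, `via_all` and `via_min` are illustrative, and the `min` variant casts to int before grouping, which is an assumption made here to avoid relying on how `min` handles booleans.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "count": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

# True where the row's count is not zero.
nonzero = df["count"].ne(0)

# A group passes only if every row is non-zero.
via_all = nonzero.groupby(df["id"]).transform("all").astype(int)

# Same idea on 0/1 values: the group minimum is 1 only if no row is 0.
via_min = nonzero.astype(int).groupby(df["id"]).transform("min")

print((via_all == via_min).all())
```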

CodePudding user response：

Use a groupby on the series itself: convert the original series to 0/1 (non-zero or zero), then take the minimum within each group.

df['dummy'] = df['count'].ne(0).astype(int).groupby(df['id']).transform('min')

Output:

   id  count  dummy
0   A      9      0
1   A      0      0
2   A      2      0
3   A      1      0
4   B      2      1
5   B      5      1
6   B      2      1
7   B      1      1
8   C      1      1
9   C      9      1
10  D      7      0
11  D      2      0
12  D      0      0