I have the following df, and I want to create a dummy column that equals 1 if and only if the id has no zeros in column "count" (every row of that id gets the same value).
id count
A 9
A 0
A 2
A 1
B 2
B 5
B 2
B 1
C 1
C 9
D 7
D 2
D 0
Desired output:
id count dummy
A 9 0
A 0 0
A 2 0
A 1 0
B 2 1
B 5 1
B 2 1
B 1 1
C 1 1
C 9 1
D 7 0
D 2 0
D 0 0
thanks
CodePudding user response:
df['dummy'] = df.groupby('id')['count'].transform(lambda col: int(0 not in col.unique()))
id count dummy
0 A 9 0
1 A 0 0
2 A 2 0
3 A 1 0
4 B 2 1
5 B 5 1
6 B 2 1
7 B 1 1
8 C 1 1
9 C 9 1
10 D 7 0
11 D 2 0
12 D 0 0
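For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question (the construction itself is my assumption about how the data is stored, with integer counts); the transform line is the one from this answer.

```python
import pandas as pd

# Sample data copied from the question (assumed to be integer counts).
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "count": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

# For each id, the lambda returns a single 0/1 value, which transform
# broadcasts back to every row of that group.
df["dummy"] = df.groupby("id")["count"].transform(
    lambda col: int(0 not in col.unique())
)
print(df)
```

Because the lambda returns one scalar per group, transform repeats it across the group's rows, which is why all rows of A and D end up with 0.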
CodePudding user response:
Here is one way to do it:
df['dummy']=df.groupby('id')['count'].transform(lambda x: 0 if x.eq(0).sum()>0 else 1)
df
id count dummy
0 A 9 0
1 A 0 0
2 A 2 0
3 A 1 0
4 B 2 1
5 B 5 1
6 B 2 1
7 B 1 1
8 C 1 1
9 C 9 1
10 D 7 0
11 D 2 0
12 D 0 0
CodePudding user response:
You can use pandas.groupby and transform, then check whether each group contains any zero, negate that result, and convert it with astype(int).
df['dummy'] = df.groupby('id')['count'].transform(lambda g : (~(g.eq(0)).any()).astype(int))
print(df)
id count dummy
0 A 9 0
1 A 0 0
2 A 2 0
3 A 1 0
4 B 2 1
5 B 5 1
6 B 2 1
7 B 1 1
8 C 1 1
9 C 9 1
10 D 7 0
11 D 2 0
12 D 0 0
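To make the "any zero per group" check more visible, the same idea can be split into two steps. This is only a sketch under the assumption that the data looks like the sample in the question; the names has_zero and the use of map to bring the result back to the rows are my own choices, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question (assumed to be integer counts).
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "count": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

# Step 1: one boolean per id, True when that id has at least one zero.
has_zero = df["count"].eq(0).groupby(df["id"]).any()
print(has_zero)

# Step 2: look up each row's id in has_zero and flip it into the 0/1 dummy.
df["dummy"] = 1 - df["id"].map(has_zero).astype(int)
print(df)
```

The intermediate has_zero series has one entry per id, which can be handy if you also need the group-level flag on its own.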
CodePudding user response:
groupby().transform is the way to go, but I'd group on the boolean series itself:
# transform `min` would work as well
df['dummy'] = df['count'].ne(0).groupby(df['id']).transform('all').astype(int)
Output:
id count dummy
0 A 9 0
1 A 0 0
2 A 2 0
3 A 1 0
4 B 2 1
5 B 5 1
6 B 2 1
7 B 1 1
8 C 1 1
9 C 9 1
10 D 7 0
11 D 2 0
12 D 0 0
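As a quick check of the remark that `min` would work as well, here is a small sketch comparing the two. It assumes the sample data from the question; to stay on the safe side it takes `min` on the 0/1 version of the series rather than on the booleans directly, and the variable names are just for illustration.

```python
import pandas as pd

# Sample data copied from the question (assumed to be integer counts).
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "count": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

nonzero = df["count"].ne(0)  # True where the count is not zero

# 'all' on the boolean series: True only if every row in the group is nonzero.
via_all = nonzero.groupby(df["id"]).transform("all").astype(int)

# 'min' on the 0/1 series: the minimum is 0 as soon as one row is zero.
via_min = nonzero.astype(int).groupby(df["id"]).transform("min")

print(via_all.equals(via_min))
```

Both express the same condition: a group passes only if none of its rows fails the nonzero test.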
CodePudding user response:
Use a groupby on the series itself: convert the original series to 0/1 (1 where count is nonzero) and take the minimum within each id.
df['dummy'] = df['count'].ne(0).astype(int).groupby(df['id']).transform('min')
Output:
id count dummy
0 A 9 0
1 A 0 0
2 A 2 0
3 A 1 0
4 B 2 1
5 B 5 1
6 B 2 1
7 B 1 1
8 C 1 1
9 C 9 1
10 D 7 0
11 D 2 0
12 D 0 0