I have a dataframe df with a column A of random numbers and a column B of categories. I create another column C using the code below:
df.loc[df['A'] >= 50, 'C'] = 1
df.loc[df['A'] < 50, 'C'] = 0
I want to obtain a column 'D' that holds a running count of the rows where C is 1, restarting for each category in B, and 0 wherever C is 0. The required dataframe is given below.
Required df
A B C D
17 a 0 0
88 a 1 1
99 a 1 2
76 a 1 3
73 a 1 4
23 b 0 0
36 b 0 0
47 b 0 0
74 b 1 1
80 c 1 1
77 c 1 2
97 d 1 1
30 d 0 0
80 d 1 2
CodePudding user response:
Use GroupBy.cumcount with Series.mask:
df['D'] = df.groupby(['B', 'C']).cumcount().add(1).mask(df['C'].eq(0), 0)
print(df)
A B C D
17 a 0 0
88 a 1 1
99 a 1 2
76 a 1 3
73 a 1 4
23 b 0 0
36 b 0 0
47 b 0 0
74 b 1 1
80 c 1 1
77 c 1 2
97 d 1 1
30 d 0 0
80 d 1 2
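To run this end to end, here is a minimal sketch that rebuilds the sample frame from the question (values copied from the table above) and applies the same one-liner. Building C with astype(int) is my own shortcut, not the asker's code. Grouping on both B and C numbers the C == 1 rows separately within each category; the C == 0 rows get numbered as well, which is why the mask is needed to reset them to 0.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "A": [17, 88, 99, 76, 73, 23, 36, 47, 74, 80, 77, 97, 30, 80],
    "B": list("aaaaabbbbccddd"),
})

# Assumption: C is built in one step here instead of two df.loc assignments.
df["C"] = (df["A"] >= 50).astype(int)

# Number rows 1, 2, 3, ... within each (B, C) pair.
counts = df.groupby(["B", "C"]).cumcount().add(1)

# Keep the numbering only where C == 1; everything else becomes 0.
df["D"] = counts.mask(df["C"].eq(0), 0)

print(df)
```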
Or use numpy.where:
import numpy as np
df['D'] = np.where(df['C'].eq(0), 0, df.groupby(['B', 'C']).cumcount().add(1))
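Another route, not from the answer above but possibly easier to read: since C only holds 0 and 1, a cumulative sum of C within each B group already counts the 1s seen so far, so it should be enough to zero out the rows where C is 0. This is a sketch under that assumption (C strictly 0/1, no NaN); the frame below hard-codes B and C from the question's table.

```python
import pandas as pd

# B and C copied from the question's table; A is not needed for this step.
df = pd.DataFrame({
    "B": list("aaaaabbbbccddd"),
    "C": [0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1],
})

# Running total of C inside each category of B.
running = df.groupby("B")["C"].cumsum()

# Keep the running total where C == 1, otherwise use 0.
df["D"] = running.where(df["C"].eq(1), 0)

print(df)
```

Like the cumcount version, the count carries on across a 0 inside the same category (see category d), which matches the required output.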