I'm looking for a fast pandas way of labeling sections within a dataframe.
Suppose I have a dataframe column A containing strings. I'd like to create a new column B that numbers the sections between occurrences of the keyword 'hi' incrementally, like so:
A B
hi
a 1
b 1
hi
d 2
f 2
g 2
hi
CodePudding user response:
df.assign(C = df['A'].eq('hi').cumsum().mask(df['A'].eq('hi')))
Out:
A B C
0 hi NaN NaN
1 a 1.0 1.0
2 b 1.0 1.0
3 hi NaN NaN
4 d 2.0 2.0
5 f 2.0 2.0
6 g 2.0 2.0
7 hi NaN NaN
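For a self-contained check, here is a minimal sketch that builds the sample column A from the question and derives the labels from A alone. The variable name is_marker is my own choice for illustration, and the frame is rebuilt by hand, so treat it as an assumption about your data rather than a definitive implementation.

```python
import pandas as pd

# Sample data from the question: only column A exists up front.
df = pd.DataFrame({"A": ["hi", "a", "b", "hi", "d", "f", "g", "hi"]})

# True on every 'hi' row; `is_marker` is a hypothetical helper name.
is_marker = df["A"].eq("hi")

# Running count of 'hi' rows seen so far, blanked out on the 'hi' rows themselves.
df["B"] = is_marker.cumsum().mask(is_marker)

print(df)
```

B comes out as float because the 'hi' rows hold NaN. If you want integer labels, appending .astype("Int64") to the expression should give pandas' nullable integer dtype with <NA> on the 'hi' rows.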