There is a dataframe consists of a column of numbers that only includes '0, 1, 2'. Among them, '1' and '2' appear alternately, and there will not be two '1' or two '2' consecutively.
My goal is to change all the values in this column between 1 and 2 (start with 1 and including 1 and 2) to a special value say '9'. Maybe I can do this by looping through the whole df, but I wonder if pandas has an easier way to do it?
df = pd.DataFrame({'a':[2, 0, 1, 2, 0, 1, 0, 0, 2, 0, 1, 0, 0, 2, 0, 1]})
>>> df
a desired
0 2 2
1 0 0
2 1 9
3 2 9
4 0 0
5 1 9
6 0 9
7 0 9
8 2 9
9 0 0
10 1 9
11 0 9
12 0 9
13 2 9
14 0 0
15 1 1
CodePudding user response:
IIUC, you can form forward groups starting with 1, and reverse groups ending with 2, then form common groups. If a group starts with 1 and ends with 2, then fill it with 9:
g1 = df['a'].eq(1).cumsum()
g2 = df['a'].eq(2)[::-1].cumsum()
m = df.groupby([g1, g2])['a'].transform(lambda g: g.iloc[0]==1 and g.iloc[-1]==2)
df['desired'] = pd.Series(9, index=m.index).mask(~m, df['a'])
output:
a desired
0 2 2
1 0 0
2 1 9
3 2 9
4 0 0
5 1 9
6 0 9
7 0 9
8 2 9
9 0 0
10 1 9
11 0 9
12 0 9
13 2 9
14 0 0
15 1 1
CodePudding user response:
here is one way do it
df['b'] = np.where(df['a'].diff() > -2 , 9, 0)
# first and last value are exception, so handle them separately
df['b'].iat[0]= df['a'].iat[0]
df['b'].iat[-1]= df['a'].iat[-1]
df
a b
0 2 2
1 0 0
2 1 9
3 2 9
4 0 0
5 1 9
6 0 9
7 0 9
8 2 9
9 0 0
10 1 9
11 0 9
12 0 9
13 2 9
14 0 0
15 1 1