Home > database >  Modify all values between a pair of values in a column of dataframe
Modify all values between a pair of values in a column of dataframe

Time:08-24

There is a dataframe consists of a column of numbers that only includes '0, 1, 2'. Among them, '1' and '2' appear alternately, and there will not be two '1' or two '2' consecutively.

My goal is to change all the values in this column between 1 and 2 (start with 1 and including 1 and 2) to a special value say '9'. Maybe I can do this by looping through the whole df, but I wonder if pandas has an easier way to do it?

df = pd.DataFrame({'a':[2, 0, 1, 2, 0, 1, 0, 0, 2, 0, 1, 0, 0, 2, 0, 1]})

>>> df
    a  desired
0   2        2
1   0        0
2   1        9
3   2        9
4   0        0
5   1        9
6   0        9
7   0        9
8   2        9
9   0        0
10  1        9
11  0        9
12  0        9
13  2        9
14  0        0
15  1        1

CodePudding user response:

IIUC, you can form forward groups starting with 1, and reverse groups ending with 2, then form common groups. If a group starts with 1 and ends with 2, then fill it with 9:

g1 = df['a'].eq(1).cumsum()
g2 = df['a'].eq(2)[::-1].cumsum()

m = df.groupby([g1, g2])['a'].transform(lambda g: g.iloc[0]==1 and g.iloc[-1]==2)

df['desired'] = pd.Series(9, index=m.index).mask(~m, df['a'])

output:

    a  desired
0   2        2
1   0        0
2   1        9
3   2        9
4   0        0
5   1        9
6   0        9
7   0        9
8   2        9
9   0        0
10  1        9
11  0        9
12  0        9
13  2        9
14  0        0
15  1        1

CodePudding user response:

here is one way do it

df['b'] = np.where(df['a'].diff() > -2 , 9, 0)

# first and last value are exception, so handle them separately
df['b'].iat[0]= df['a'].iat[0]
df['b'].iat[-1]= df['a'].iat[-1]
df

    a   b
0   2   2
1   0   0
2   1   9
3   2   9
4   0   0
5   1   9
6   0   9
7   0   9
8   2   9
9   0   0
10  1   9
11  0   9
12  0   9
13  2   9
14  0   0
15  1   1


  • Related