Update series values with a difference of 1

Time:12-29

I have a series in a DataFrame:

df=pd.DataFrame()
df['yMax'] = [127, 300, 300, 322, 322, 322, 322, 344, 344, 344, 366, 366, 367, 367, 367, 388, 388, 388, 388, 389, 389, 402, 403, 403, 403]

For values that are very close to one another, say differing by exactly 1, I would like to remove that difference so that they become the same number, either by adding or subtracting 1.

So, for example, the resultant list would become:

df['yMax'] = [127, 300, 300, 322, 322, 322, 322, 344, 344, 344, 367, 367, 367, 367, 367, 389, 389, 389, 389, 389, 389, 403, 403, 403, 403]

I know we can easily find the difference between adjacent values with df['yMax'].diff():

0       NaN
1     173.0
2       0.0
3      22.0
4       0.0
5       0.0
6       0.0
7      22.0
8       0.0
9       0.0
10     22.0
11      0.0
12      1.0
13      0.0
14      0.0
15     21.0
16      0.0
17      0.0
18      0.0
19      1.0
20      0.0
21     13.0
22      1.0
23      0.0
24      0.0
Name: yMax, dtype: float64
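
The rows where the step is exactly 1 can be picked out with a boolean mask on that diff. This is only a minimal sketch, assuming the values are sorted in ascending order as in the example; the names `s`, `step_is_one` and `jump_positions` are illustrative and not from the original post.

```python
import pandas as pd

s = pd.Series(
    [127, 300, 300, 322, 322, 322, 322, 344, 344, 344, 366, 366, 367,
     367, 367, 388, 388, 388, 388, 389, 389, 402, 403, 403, 403],
    name="yMax",
)

# True where a value is exactly 1 greater than the value in the previous row
step_is_one = s.diff().eq(1)

# Index labels of the rows where such a step occurs
jump_positions = step_is_one[step_is_one].index.tolist()
print(jump_positions)  # [12, 19, 22]
```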

But how should I perform the transformation?

CodePudding user response:

import numpy as np
import pandas as pd
df=pd.DataFrame({'yMax':[127, 300, 300, 322, 322, 322, 322, 344, 344, 344, 366, 366, 367, 367, 367, 388, 388, 388, 388, 389, 389, 402, 403, 403, 403]})

Where a value is exactly 1 less than the value in the next row, replace it with that next value. Then group by the original values and take the maximum of the adjusted column, so that every row sharing an original value gets the same result. Code below:

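# Where the next row is exactly 1 larger, take the next row's value; otherwise keep the current one
df = df.assign(new_yMax=np.where(df['yMax'].diff(-1) == -1, df['yMax'].shift(-1), df['yMax']))

# Spread the largest adjusted value over each original value; cast back to int because shift() makes the column float
df = df.assign(new_yMax=df.groupby('yMax')['new_yMax'].transform('max').astype(int))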

df
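
As a possible alternative that does not depend on row order, here is a sketch that swaps in a different technique from the answer above. It assumes the rule "raise a value v to v + 1 whenever v + 1 also occurs in the column", which is one reading of the question rather than something the asker stated. It uses `Series.isin` and `Series.mask`; the names `s`, `has_neighbour_above`, `result` and `expected` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"yMax": [127, 300, 300, 322, 322, 322, 322, 344, 344, 344,
                            366, 366, 367, 367, 367, 388, 388, 388, 388, 389,
                            389, 402, 403, 403, 403]})
s = df["yMax"]

# True for every value v where v + 1 also appears somewhere in the column
has_neighbour_above = (s + 1).isin(s)

# Replace those values with v + 1; all other values are left unchanged
df["new_yMax"] = s.mask(has_neighbour_above, s + 1)

# Desired output as given in the question
expected = [127, 300, 300, 322, 322, 322, 322, 344, 344, 344, 367, 367, 367,
            367, 367, 389, 389, 389, 389, 389, 389, 403, 403, 403, 403]
result = df["new_yMax"].tolist()
print(result == expected)  # True
```

Like the diff-based version, this only merges a value with the number directly above it, so a chain such as 366, 367, 368 would not collapse to a single number.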