Is there a good recipe to go from, say:
datetime value
2021-10-05 09:39:00 1
2021-10-05 09:40:00 2
2021-10-05 09:41:00 3
2021-10-05 09:42:00 2 <--
2021-10-05 09:43:00 3
to:
datetime value
2021-10-05 09:39:00 1
2021-10-05 09:40:00 2
2021-10-05 09:41:00 3
2021-10-05 09:42:00 3 <--
2021-10-05 09:43:00 3
in Python/pandas?
Thanks.
CodePudding user response:
Find the rows that break the monotonic increase by testing whether the difference from the previous row equals -1,
then replace those rows with the value shifted up from the next row:
df['value'] = df['value'].where(df['value'].diff().ne(-1), df['value'].shift(-1))
print (df)
id value
0 0 1
1 0 2
2 0 3
3 0 3
4 0 3
If several consecutive values may break the monotonic increase, it is better to set
the non-matching rows to NaN and backfill them:
df['value'] = df['value'].where(df['value'].diff().ne(-1)).bfill().astype(int)
print (df)
id value
0 0 1
1 0 2
2 0 3
3 0 3
4 0 3
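Note that `diff().ne(-1)` only flags drops of exactly 1. As a minimal sketch, assuming any decrease relative to the previous row should count as a violation, the difference can be compared against zero instead. The sample frame and the `fixed` column name below are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: row 3 drops by 2 rather than by 1.
df = pd.DataFrame({"value": [1, 2, 4, 2, 4]})

# Keep rows whose difference from the previous row is not negative.
# The first row has a NaN difference, which is also treated as "keep".
keep = ~df["value"].diff().lt(0)

# Blank out the offending rows, then backfill them from the next kept row.
df["fixed"] = df["value"].where(keep).bfill().astype(int)
print(df["fixed"].tolist())  # → [1, 2, 4, 4, 4]
```

If the last row is the one that drops, there is nothing to backfill from, so the NaN remains and `astype(int)` would fail; that case needs separate handling.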
CodePudding user response:
I would do it in two steps:
- set the offending value to NaN, by testing whether the previous value (the shifted value) is equal to or larger than the current value.
- fill the NaNs with the next valid value (backfill)
Of course, this doesn't work if there are NaNs in the input. But in that case, monotonically increasing doesn't mean anything anymore.
In [60]: df = pd.DataFrame({'id': range(5), 'value': [1, 2, 3, 2, 3]}).set_index('id')
In [62]: df.loc[df['value'].shift() >= df['value'], 'value'] = np.nan
In [63]: df
Out[63]:
value
id
0 1.0
1 2.0
2 3.0
3 NaN
4 3.0
In [64]: df['value'].bfill()
Out[64]:
id
0 1.0
1 2.0
2 3.0
3 3.0
4 3.0
Name: value, dtype: float64
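The same two steps can be sketched as a standalone script. This is an illustrative variant, not the exact session above: it uses `Series.mask` instead of assigning `np.nan` through `.loc` (newer pandas versions may warn when NaN is written into an integer column in place), and it casts the result back to integers at the end.

```python
import pandas as pd

df = pd.DataFrame({"id": range(5), "value": [1, 2, 3, 2, 3]}).set_index("id")

# Step 1: flag rows whose previous value is equal to or larger than the
# current one, and replace them with NaN (the column becomes float).
not_increasing = df["value"].shift() >= df["value"]
masked = df["value"].mask(not_increasing)

# Step 2: backfill the NaNs from the next valid row and restore the int dtype.
df["value"] = masked.bfill().astype(int)
print(df["value"].tolist())  # → [1, 2, 3, 3, 3]
```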
CodePudding user response:
Do you mean:
df['value'] = df['value'].cummax()
Or, when every drop is exactly 1:
df.loc[df['value'] < df['value'].shift(), 'value'] += 1
Both snippets give:
>>> df
id value
0 0 1
1 1 2
2 2 3
3 3 3
4 4 3
>>>
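The two options differ once a dip is deeper than 1 or lasts more than one row. A small sketch with made-up data shows how `cummax()` behaves in that case: it replaces each value with the largest value seen so far, so the result is always non-decreasing.

```python
import pandas as pd

# Hypothetical series with a dip that spans two rows and drops by more than 1.
s = pd.Series([1, 5, 2, 3, 6])

# Running maximum: every value below the previous peak is lifted to that peak.
fixed = s.cummax()
print(fixed.tolist())  # → [1, 5, 5, 5, 6]

# is_monotonic_increasing is True for a non-decreasing sequence.
print(fixed.is_monotonic_increasing)  # → True
```

Unlike backfilling, this fills a dip with the preceding peak rather than with the next value, which matters whenever the two are not equal.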