I have this dataset:
import pandas as pd
col1 = [1,2,3,4,5,6,7,8]
col2 = [2,3,5,1,4,3,4,5]
df = pd.DataFrame({'Column1': col1, 'Column2': col2})
Column1 Column2
1 2
2 3
3 5
4 1
5 4
6 3
7 4
8 5
Whenever Column2 stops increasing, I want the preceding values in that increasing run to be filled with the last (highest) value of the run, so the expected output would be:
Column1 Column2
1 5
2 5
3 5
4 4
5 4
6 5
7 5
8 5
I tried doing this with a for loop that compares each value to the previous one, but this would require lots of for loops. Is there an efficient way of doing this?
CodePudding user response:
groupby the increasing stretches and transform with the last value:
df['Column2'] = (df.groupby(df['Column2'].diff().lt(0).cumsum())['Column2']
                   .transform('last')
                )
output:
Column1 Column2
0 1 5
1 2 5
2 3 5
3 4 4
4 5 4
5 6 5
6 7 5
7 8 5
Intermediate series used to define the groups:
df['Column2'].diff().lt(0).cumsum()
0 0
1 0
2 0
3 1
4 1
5 2
6 2
7 2
Name: Column2, dtype: int64
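To make the grouping key above easier to follow, here is a minimal sketch that builds it one step at a time on the question's data. The names step, starts_new_run, run_id and filled are only illustrative and do not appear in the answer itself.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5, 6, 7, 8],
                   'Column2': [2, 3, 5, 1, 4, 3, 4, 5]})

# Step 1: difference from the previous row (the first row has no previous value, so NaN).
step = df['Column2'].diff()

# Step 2: True wherever the value dropped, i.e. where a new increasing run starts.
# NaN compares as False, so the first row stays in run 0.
starts_new_run = step.lt(0)

# Step 3: a running count of the drops gives every run its own integer label.
run_id = starts_new_run.cumsum().rename('run_id')

# Show the three steps side by side.
print(pd.DataFrame({'Column2': df['Column2'], 'diff': step,
                    'drop': starts_new_run, 'run_id': run_id}))

# Broadcast the last value of each run back onto all rows of that run.
filled = df.groupby(run_id)['Column2'].transform('last')
print(filled.tolist())
```

Because transform returns a result aligned to the original index, it can be assigned straight back to df['Column2'], and the integer dtype is preserved.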
CodePudding user response:
Another solution:
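# keep only the last value of each run; the other rows become NaN on assignment
df.Column2 = df.Column2[(df.Column2.diff() <= 0).shift(-1, fill_value=True)]
# fill each NaN with the next kept value
df.Column2 = df.Column2.bfill()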
print(df)
Prints:
Column1 Column2
0 1 5.0
1 2 5.0
2 3 5.0
3 4 4.0
4 5 4.0
5 6 5.0
6 7 5.0
7 8 5.0
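The float output above comes from the NaN values introduced before the back-fill. If integers are wanted, one possible variant (a sketch of the same mask-and-bfill idea, not the answerer's exact code) masks with where and casts back at the end. The names s, is_run_end and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5, 6, 7, 8],
                   'Column2': [2, 3, 5, 1, 4, 3, 4, 5]})

s = df['Column2']

# True on the last row of each run: the next value is not larger,
# or there is no next row at all (fill_value=True covers the final row).
is_run_end = s.diff().le(0).shift(-1, fill_value=True)

# Blank out everything except the run ends, back-fill from the next run end,
# then restore the original integer dtype. The cast is safe because the final
# row is always a run end, so no NaN is left after bfill.
result = s.where(is_run_end).bfill().astype(s.dtype)

print(result.tolist())
```

Note that this approach tests for `<= 0`, so a repeated value also ends a run, whereas the groupby answer uses lt(0) and only splits on a strict drop. The two give the same result on the question's data but can differ when Column2 contains consecutive equal values.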