How to create a column to store trailing high value in Pandas DataFrame?

Time:12-30

Consider a DataFrame with only one column named values.

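import pandas as pd

# The column name must be a quoted string key.
data_dict = {"values": [5, 4, 3, 8, 6, 1, 2, 9, 2, 10]}
df = pd.DataFrame(data_dict)
display(df)  # Jupyter/IPython only; use print(df) in a plain script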

The output will look something like:

    values
0   5
1   4
2   3
3   8
4   6
5   1
6   2
7   9
8   2
9   10

I want to generate a new column that holds the trailing high of the values column, i.e. the highest value seen so far, up to and including the current row.

Expected Output:

    values  trailing_high
0   5       5
1   4       5
2   3       5
3   8       8
4   6       8
5   1       8
6   2       8
7   9       9
8   2       9
9   10      10

Right now I am using a for loop over df.iterrows() and calculating the value at each row. Because of this, the code is very slow.
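For reference, a loop of the kind described might look like the sketch below. The actual code is not shown in the question, so the variable names (trailing_high, current_high) and the structure are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"values": [5, 4, 3, 8, 6, 1, 2, 9, 2, 10]})

# Hypothetical reconstruction of the slow approach: track the running
# maximum by hand while visiting the rows one at a time.
trailing_high = []
current_high = float("-inf")
for _, row in df.iterrows():
    current_high = max(current_high, row["values"])
    trailing_high.append(current_high)

df["trailing_high"] = trailing_high
print(df)
```

Each iteration builds a Series for the row, which is what makes this pattern slow on large frames.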

Can anyone share a vectorized approach to speed this up?

CodePudding user response:

Use Series.cummax, which returns the cumulative (running) maximum:

df["trailing_high"] = df["values"].cummax()
print(df)

Output

   values  trailing_high
0       5              5
1       4              5
2       3              5
3       8              8
4       6              8
5       1              8
6       2              8
7       9              9
8       2              9
9      10             10
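As a quick sanity check, the same running maximum can also be computed on the underlying NumPy array with np.maximum.accumulate. The sketch below rebuilds the sample frame from the question and compares the two results; the name np_high is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"values": [5, 4, 3, 8, 6, 1, 2, 9, 2, 10]})

# Vectorized running maximum via pandas.
df["trailing_high"] = df["values"].cummax()

# Same computation on the underlying NumPy array.
np_high = np.maximum.accumulate(df["values"].to_numpy())

# Both approaches should agree element by element.
assert (df["trailing_high"].to_numpy() == np_high).all()
```

Either form avoids the Python-level loop. cummax keeps the result as a Series aligned with the DataFrame index, which is usually the more convenient choice here.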