I'm trying to find provisional minimums in a DataFrame. To find them, I search for values that are lower than the minimum of the previous 3 values in a DataFrame column. Tested in Google Colab and with Python 3.9 on Windows.
Maybe there is another, more Pandastic way of doing this.
The problem: when using <= everything seems to work fine, but when using < nothing is found, even though the provided data should match.
import pandas as pd
data_list = [55,66,77,88,99,88,77,66,55,54,65,67,68,70,73,78,83] # use any other values
low = pd.Series(data_list)
df = pd.DataFrame(low.values, columns=['Low'])
df
Low
0 55
1 66
2 77
3 88
4 99
5 88
6 77
7 66
8 55
...
df.plot(grid=True)
When trying to find values lower than OR EQUAL to the minimum of the 3 previous rows, everything works fine:
df[df['Low'] <= df['Low'].rolling(3).min()]
Low
5 88
6 77
7 66
8 55
9 54
The problem: when using just the < operator, nothing is found.
df[df['Low'] < df['Low'].rolling(3).min()]
These are the versions used in Google Colab (Python 3.7), updated:
!pip install numpy --upgrade
!pip install pandas --upgrade
!pip freeze ...
numpy==1.21.5
pandas==1.3.5
With a local Python 3.9 install, the behaviour is the same.
CodePudding user response:
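You say you are trying to find values lower than the 3 previous rows.
When you do df[df['Low'] <= df['Low'].rolling(3).min()] or df[df['Low'] < df['Low'].rolling(3).min()], you are not comparing against the 3 previous rows but against the current row and the 2 previous ones. Because that window includes the current value, its minimum can never be greater than the current value, so < never matches and <= matches only when the current value is itself the window minimum.
You need to shift the rolling minimum by one row to get the desired behavior: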
import pandas as pd
data_list = [55,66,77,88,99,88,77,66,55,54,65,67,68,70,73,78,83] # use any other values
low = pd.Series(data_list)
df = pd.DataFrame(low.values, columns=['Low'])
print(df[df['Low'] < df['Low'].rolling(3).min().shift()])
Output:
Low
6 77
7 66
8 55
9 54
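To make the difference between the two windows easier to see, here is a minimal sketch that runs the strict comparison against both the unshifted and the shifted rolling minimum. The variable names (window_min, prev_min, strict_unshifted, strict_shifted) are only illustrative and not part of the original question.

```python
import pandas as pd

data_list = [55, 66, 77, 88, 99, 88, 77, 66, 55, 54, 65, 67, 68, 70, 73, 78, 83]
df = pd.DataFrame({'Low': data_list})

# Minimum over the current row and the 2 previous rows (window includes the row itself).
window_min = df['Low'].rolling(3).min()

# Shifting by one row turns it into the minimum over the 3 previous rows only.
prev_min = window_min.shift()

# A value can never be strictly below a minimum that already includes it.
strict_unshifted = df.index[df['Low'] < window_min].tolist()

# Against the previous 3 rows, the strict comparison finds the descending run.
strict_shifted = df.index[df['Low'] < prev_min].tolist()

print(strict_unshifted)  # []
print(strict_shifted)    # [6, 7, 8, 9]
```

Comparisons against the NaN values at the start of the rolling result evaluate to False, so the first rows are simply left out of both results.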