pandas DataFrame: sequentially compare cells to a numeric value, and then update the value once a co

Time:12-16

I’ve got this dataframe:

                                   Time   Price
0           2021-11-01T13:30:00.001643Z  460.30
1            2021-11-01T13:30:00.00169Z  460.30
2           2021-11-01T13:30:00.001907Z  460.30
3        2021-11-01T13:30:00.002802497Z  460.31
4         2021-11-01T13:30:00.00349859Z  460.31
...                                 ...     ...
2854578  2021-12-03T21:00:00.118396616Z  453.39
2854579  2021-12-03T21:00:00.128718627Z  453.38
2854580  2021-12-03T21:00:00.287665293Z  453.38
2854581  2021-12-03T21:00:00.287665293Z  453.38
2854582  2021-12-03T21:00:00.907833812Z  453.25

[2854583 rows x 2 columns]

I’d like to compare the ‘Price’ cells to a value (dataframe['Price'].loc[0]), and then record when the price is not within +/- 0.1 of this value by entering the price under a new column (‘Range’). That price would then become the value that subsequent prices are assessed against, in the same manner as before.

This is my attempt; however, the value doesn’t seem to update from its original definition (dataframe['Range'].loc[dataframe['Range'].last_valid_index()] seems to only refer to dataframe['Price'].loc[0]):

dataframe['Range'] = numpy.nan
dataframe['Range'].loc[0] = dataframe['Price'].loc[0]
dataframe['Range'] = numpy.where(abs(dataframe['Price'] - dataframe['Range'].loc[dataframe['Range'].last_valid_index()]) > 0.10, dataframe['Price'], dataframe['Range'].loc[dataframe['Range'].last_valid_index()])

I’m relatively inexperienced with Python and pandas, so thanks in advance for your time and any help or comments!

CodePudding user response:

I would do something similar to the following.

  1. Set the reference price in a variable:
PRICE = df["Price"].loc[0]
  2. Set True/False in df["Range"] using a list comprehension:
df["Range"] = [PRICE - 0.1 < price < PRICE + 0.1 for price in df["Price"]]

This says that for every price in df["Price"], the list gets True if that price lies within 0.1 of PRICE and False otherwise, and the resulting list is then assigned to df["Range"]. Note that PRICE stays fixed at the first price here, so every row is compared against the same value.
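If the reference value really has to move each time a price breaks out of the band, as the question describes, the comparison depends on the previous result and a plain loop is probably the simplest way to express it. The following is only a minimal sketch: the function name mark_range_breaks, the tolerance parameter and the small sample frame are made up for illustration, and it assumes a strict "more than 0.1 away" test with the first price seeding the 'Range' column, as in the question's attempt.

```python
import numpy as np
import pandas as pd


def mark_range_breaks(prices, tolerance=0.1):
    """Hypothetical helper: return an array holding the price wherever it
    moved more than `tolerance` away from the current reference value,
    and NaN elsewhere. Each recorded price becomes the new reference."""
    values = np.asarray(prices, dtype=float)
    out = np.full(values.shape, np.nan)
    if values.size == 0:
        return out

    # Assumption: the first price seeds both the reference and the column.
    reference = values[0]
    out[0] = reference

    for i in range(1, values.size):
        if abs(values[i] - reference) > tolerance:
            out[i] = values[i]      # record the breakout price
            reference = values[i]   # later rows are compared against it
    return out


# Illustrative data only, not taken from the question's dataframe.
df = pd.DataFrame({"Price": [460.30, 460.31, 460.45, 460.50, 460.30, 460.25]})
df["Range"] = mark_range_breaks(df["Price"])
print(df)
```

Working on the underlying NumPy array rather than indexing the DataFrame row by row keeps the loop reasonably cheap, which may matter with close to three million rows; if it is still too slow, the same loop is the kind of code a JIT compiler such as Numba is usually applied to.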
