Pandas: how to avoid SettingWithCopyWarning, which may be making pandas mix up values?

Time:10-31

I am making a bot that, for now, downloads price data from a broker every X period of time. I have realised that, for some reason, values are switched between the open/high/low/close price columns, and that this is possibly due to the warning "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame".

I need help changing the lines of code below, which are used to process the downloaded data.

    for i in range(self.DE30EUR_1m_price_data.shape[0]):
        self.DE30EUR_1m_price_data['open'][i] = self.DE30EUR_1m_price_data['open'][i] / 1000
        self.DE30EUR_1m_price_data['close'][i] = self.DE30EUR_1m_price_data['open'][i] + (self.DE30EUR_1m_price_data['close'][i] / 1000)
        self.DE30EUR_1m_price_data['high'][i] = self.DE30EUR_1m_price_data['open'][i] + (self.DE30EUR_1m_price_data['high'][i] / 1000)
        self.DE30EUR_1m_price_data['low'][i] = self.DE30EUR_1m_price_data['open'][i] + (self.DE30EUR_1m_price_data['low'][i] / 1000)

(The indentation broke while pasting the code, so I re-indented it by hand.)

How can I change the code to get rid of the warning and, hopefully, eliminate any misplaced values? I have calculated other things on the dataframe in the same way further on in the code, so I want to start debugging at the very first occurrence of the flawed pattern.

Thank you in advance for your help :)

CodePudding user response:

Looping over rows in Pandas is strongly discouraged because it is inefficient. Pandas supports vectorized operations, meaning you can compute something for a whole column at once without looping through the rows.

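Regarding the warning: it appears because an expression like self.DE30EUR_1m_price_data['open'][i] = ... is chained indexing. The ['open'] part is evaluated first, and the assignment then goes into that intermediate object, which may be a copy rather than the original dataframe. This is not the way to go. Anyway, my code below should resolve it.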
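
To show the difference in isolation, here is a minimal sketch on a made-up two-row frame (the name `prices` and the numbers are hypothetical, not taken from the broker's data). It contrasts the chained form, left commented out, with a single `.loc` call that addresses row and column in one step:

```python
import pandas as pd

# Hypothetical sample data; the real frame comes from the broker's API.
prices = pd.DataFrame({
    "open": [15000000.0, 15001000.0],
    "close": [500.0, -250.0],
})

# Chained indexing: prices["open"] is evaluated first, and [0] then assigns
# into that intermediate object, which may not be the original frame.
# prices["open"][0] = prices["open"][0] / 1000

# Single-step indexing: one .loc call selects row label 0 and column "open"
# together, so the assignment always targets `prices` itself.
prices.loc[0, "open"] = prices.loc[0, "open"] / 1000
```

Whether the commented-out line warns, works, or silently does nothing can depend on the pandas version, which is why the `.loc` form is the safer habit when a single cell really does have to be set.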

If I understand your code correctly, you should be able to get the same result with the following code. (Please note I haven't tested it locally; this is just a general guideline.)

self.DE30EUR_1m_price_data['open'] = self.DE30EUR_1m_price_data['open'] / 1000
self.DE30EUR_1m_price_data['close'] = self.DE30EUR_1m_price_data['open'] + (self.DE30EUR_1m_price_data['close'] / 1000)
self.DE30EUR_1m_price_data['high'] = self.DE30EUR_1m_price_data['open'] + (self.DE30EUR_1m_price_data['high'] / 1000)
self.DE30EUR_1m_price_data['low'] = self.DE30EUR_1m_price_data['open'] + (self.DE30EUR_1m_price_data['low'] / 1000)

As you can see, there is no need for a for loop to iterate over the rows.
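
For something that can be run and checked on its own, here is a rough sketch of the same idea on a tiny made-up frame. It assumes the operator lost from the pasted code is `+`, i.e. that close/high/low arrive as offsets from open; the function name `normalize_prices` and the sample numbers are hypothetical.

```python
import pandas as pd


def normalize_prices(raw: pd.DataFrame) -> pd.DataFrame:
    """Rescale open by 1000 and turn close/high/low offsets into prices.

    Assumption: close/high/low are offsets relative to open.
    """
    # Work on an explicit copy so the assignments below never target
    # a slice of some other frame.
    out = raw.copy()
    out["open"] = out["open"] / 1000
    # This loops over three column names, not over rows; each assignment
    # is a vectorized operation on the whole column.
    for col in ("close", "high", "low"):
        out[col] = out["open"] + out[col] / 1000
    return out


# Hypothetical sample data standing in for the downloaded candles.
raw = pd.DataFrame({
    "open": [15000000.0, 15010000.0],
    "close": [5000.0, -2500.0],
    "high": [8000.0, 1000.0],
    "low": [-1000.0, -4000.0],
})
prices = normalize_prices(raw)
```

Taking the copy up front is a design choice rather than a requirement: if the frame was itself obtained by slicing another frame, assigning whole columns to it can still trigger the warning, and an explicit `.copy()` avoids that.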
