establish column 'day of month' for stock prices

Time:09-26

Starting from a dataframe with stock prices, I want to add a column, 'DayOfMonth', that numbers the trading days within each month.

This is what I did:

import yfinance as yf
import pandas as pd
df = yf.download('SPY', start='2020-07-02')
pd.set_option("display.max_rows", None, "display.max_columns", None)
df['Month'] = df.index.month
# Mark the first trading day of each month with 1
df.loc[(df['Month'].shift(0) != df['Month'].shift(1)) &
       (df['Month'].shift(1) == df['Month'].shift(2)), 'DayOfMonth'] = 1
# Count up from the previous row while the month stays the same
for i in range(1, len(df), 1):
    if df['Month'].iloc[i - 1] == df['Month'].iloc[i]:
        df['DayOfMonth'].iloc[i] = df['DayOfMonth'].iloc[i - 1] + 1

The result is correct:

                 Close  Month  DayOfMonth
Date
2020-08-31  349.309998      8        21.0
2020-09-01  352.600006      9         1.0
2020-09-02  357.700012      9         2.0
2020-09-03  345.390015      9         3.0
2020-09-04  342.570007      9         4.0
2020-09-08  333.209991      9         5.0
2020-09-09  339.790009      9         6.0
2020-09-10  333.890015      9         7.0
2020-09-11  334.059998      9         8.0
2020-09-14  338.459991      9         9.0
2020-09-15  340.170013      9        10.0
2020-09-16  338.820007      9        11.0
2020-09-17  335.839996      9        12.0
2020-09-18  330.649994      9        13.0
2020-09-21  326.970001      9        14.0
2020-09-22  330.299988      9        15.0
2020-09-23  322.640015      9        16.0
2020-09-24  323.500000      9        17.0
2020-09-25  328.730011      9        18.0
2020-09-28  334.190002      9        19.0
2020-09-29  332.369995      9        20.0
2020-09-30  334.890015      9        21.0
2020-10-01  337.040009     10         1.0

However, I got this warning:

C:\Users\....\anaconda3\lib\site-packages\pandas\core\indexing.py:670: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value)

Moreover, I believe my code is not efficient.

Is there a better way to do it?

Thank you

CodePudding user response:

Yes, there is.

Assuming that your index is sorted, do this:

df['DayOfMonth'] = df.groupby(pd.Grouper(freq="M")).cumcount() + 1

This is a combination of two other answers.
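To try the groupby/cumcount idea without downloading anything, here is a minimal sketch on a made-up business-day index (the dates and the `Close` values are placeholders, not real SPY data). It groups on year and month keys instead of `pd.Grouper(freq="M")`, since newer pandas versions deprecate the `"M"` alias in favor of `"ME"`; the counting logic is the same.

```python
import pandas as pd

# Synthetic stand-in for the yfinance download (assumption: sorted DatetimeIndex).
idx = pd.bdate_range("2020-08-27", "2020-09-03")
df = pd.DataFrame({"Close": range(len(idx))}, index=idx)

# Number the rows within each (year, month) group, starting at 1.
df["DayOfMonth"] = df.groupby([df.index.year, df.index.month]).cumcount() + 1

print(df)
```

Unlike the loop in the question, this also numbers the rows of the first, possibly partial, month in the data, starting from 1 at the first row available.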

CodePudding user response:

There is a very good explanation of this at the following link. I will not try to reproduce it here, but the warning comes from hidden chained indexing somewhere in your code. Check out the article: https://towardsdatascience.com/explaining-the-settingwithcopywarning-in-pandas-ebc19d799d25
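In the question's code the chained indexing is most likely `df['DayOfMonth'].iloc[i] = ...`, which first selects a column and then assigns into that intermediate object. As a rough sketch (on made-up dates, and seeding every row with 1 for simplicity, which differs from the original seeding step), the same loop can assign through a single `iloc` call on the frame itself:

```python
import pandas as pd

# Synthetic stand-in for the downloaded prices (assumption).
idx = pd.bdate_range("2020-08-27", "2020-09-03")
df = pd.DataFrame(index=idx)
df["Month"] = df.index.month
df["DayOfMonth"] = 1  # simplified seed: every row starts at 1

col = df.columns.get_loc("DayOfMonth")  # integer position of the column
for i in range(1, len(df)):
    if df["Month"].iloc[i - 1] == df["Month"].iloc[i]:
        # Single indexing call on df: no intermediate object is assigned to.
        df.iloc[i, col] = df.iloc[i - 1, col] + 1

print(df)
```

This removes the chained assignment, but the loop is still slow on long histories, so the vectorized approach from the other answer is usually preferable.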
