Starting from a dataframe of stock prices, I want to add a column ['DayOfMonth'] that numbers the trading days within each month.
This is what I did:
import yfinance as yf
import pandas as pd
df = yf.download('SPY', start='2020-07-02')
pd.set_option("display.max_rows", None, "display.max_columns", None)
df['Month'] = df.index.month
df.loc[(df['Month'].shift(0) != df['Month'].shift(1)) & (df['Month'].shift(1) ==
df['Month'].shift(2)), 'DayOfMonth'] = 1
for i in range(1, len(df), 1):
    if df['Month'].iloc[i - 1] == df['Month'].iloc[i]:
        df['DayOfMonth'].iloc[i] = df['DayOfMonth'].iloc[i - 1] + 1
The result is correct:
Close Month DayOfMonth
Date
2020-08-31 349.309998 8 21.0
2020-09-01 352.600006 9 1.0
2020-09-02 357.700012 9 2.0
2020-09-03 345.390015 9 3.0
2020-09-04 342.570007 9 4.0
2020-09-08 333.209991 9 5.0
2020-09-09 339.790009 9 6.0
2020-09-10 333.890015 9 7.0
2020-09-11 334.059998 9 8.0
2020-09-14 338.459991 9 9.0
2020-09-15 340.170013 9 10.0
2020-09-16 338.820007 9 11.0
2020-09-17 335.839996 9 12.0
2020-09-18 330.649994 9 13.0
2020-09-21 326.970001 9 14.0
2020-09-22 330.299988 9 15.0
2020-09-23 322.640015 9 16.0
2020-09-24 323.500000 9 17.0
2020-09-25 328.730011 9 18.0
2020-09-28 334.190002 9 19.0
2020-09-29 332.369995 9 20.0
2020-09-30 334.890015 9 21.0
2020-10-01 337.040009 10 1.0
However, I got this warning:
C:\Users\....\anaconda3\lib\site-packages\pandas\core\indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
iloc._setitem_with_indexer(indexer, value)
Moreover, I believe my code is not efficient.
Is there a better way to do it?
Thank you
CodePudding user response:
Yes, there is.
Assuming that your index is sorted, do this:
df['DayOfMonth'] = df.groupby(pd.Grouper(freq="M")).cumcount() + 1
This is a combination of two other answers.
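To see what the group-and-count approach produces without downloading anything, here is a minimal sketch on a synthetic business-day index. The dates and the placeholder 'Close' values are made up for illustration, and the grouping key uses df.index.to_period("M") rather than pd.Grouper(freq="M"); that is a swapped-in equivalent, chosen because newer pandas versions may emit a deprecation warning for the "M" offset alias.

```python
import pandas as pd

# Synthetic business-day index standing in for the downloaded prices
# (assumption: real market data would also skip exchange holidays).
idx = pd.bdate_range("2020-08-27", "2020-09-04")
df = pd.DataFrame({"Close": range(len(idx))}, index=idx)

# Group rows by calendar month, then number the rows 1, 2, 3, ... inside each group.
# cumcount() starts at 0, hence the + 1.
df["DayOfMonth"] = df.groupby(df.index.to_period("M")).cumcount() + 1

print(df)
```

Note that this numbers the first (possibly partial) month starting from 1 at the first available row, whereas the loop in the question leaves those rows as NaN.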
CodePudding user response:
There is a very good explanation of this in the following article. I will not try to reproduce it here, but the warning comes from chained indexing in your code: df['DayOfMonth'].iloc[i] = ... first selects the column and then assigns into that selection, which may be a copy. Check out the article: https://towardsdatascience.com/explaining-the-settingwithcopywarning-in-pandas-ebc19d799d25
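If you want to keep the loop, one way to avoid the chained assignment is to index the dataframe once, with both the row and the column position in a single .iloc call. This is only a sketch on synthetic dates (not the SPY data), and it assumes the very first row should count as day 1:

```python
import pandas as pd

# Synthetic business-day index standing in for the price data (illustrative only).
idx = pd.bdate_range("2020-08-27", "2020-09-04")
df = pd.DataFrame(index=idx)
df["Month"] = df.index.month

# Assumption: every month restarts at 1, including the first (partial) month.
df["DayOfMonth"] = 1.0

col = df.columns.get_loc("DayOfMonth")  # integer position of the target column
for i in range(1, len(df)):
    if df["Month"].iloc[i - 1] == df["Month"].iloc[i]:
        # A single indexing call on df itself, so no intermediate copy is involved.
        df.iloc[i, col] = df.iloc[i - 1, col] + 1

print(df)
```

This removes the warning but is still a Python-level loop, so the groupby approach from the other answer should scale better on long price histories.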