How do I remove rows of a Pandas DataFrame based on a certain condition?

Time:06-15

import datetime

import yfinance as yf
import numpy as np
import pandas as pd

ETF_DB = ['QQQ', 'EGFIX']
fundsret = yf.download(ETF_DB, start=datetime.date(2020,12,31), end=datetime.date(2022,4,30), interval='1mo')['Adj Close'].pct_change()
df = pd.DataFrame(fundsret)
df

Gives me: [image not available]

I'm trying to remove the rows in the DataFrame that aren't month end, such as the row 2021-03-22. How do I drop the rows where the date doesn't end in '01'?

CodePudding user response:

df.reset_index(inplace=True)

# Convert the date to datetime64
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')

#select only day = 1
filtered = df.loc[df['Date'].dt.day == 1]
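For reference, here is a minimal self-contained sketch of the same approach. The sample dates and the 'QQQ' values are made up for illustration, and it assumes the index is named 'Date' (so that reset_index() yields a 'Date' column), which may not match your actual download:

```python
import pandas as pd

# Hypothetical stand-in for the downloaded returns; the values are invented.
idx = pd.to_datetime(['2021-01-01', '2021-02-01', '2021-03-01', '2021-03-22'])
df = pd.DataFrame({'QQQ': [0.01, 0.02, 0.03, 0.04]}, index=idx.rename('Date'))

# Move the index into a regular 'Date' column, then keep rows dated the 1st
df = df.reset_index()
filtered = df.loc[df['Date'].dt.day == 1]

# Optionally restore the dates as the index afterwards
filtered = filtered.set_index('Date')
print(filtered)
```

If the dates are already a DatetimeIndex, the reset_index() step is not strictly needed; it only matters when you want to work with the dates as a column.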

CodePudding user response:

Did you mean month start?

You can use:

df = df[df.index.day==1]

reproducible example:

df = pd.DataFrame(columns=['A', 'B'],
                  index=['2021-01-01', '2021-02-01', '2021-03-01',
                         '2021-03-22', '2021-03-31'])
df.index = pd.to_datetime(df.index, dayfirst=False)

output:

              A    B
2021-01-01  NaN  NaN
2021-02-01  NaN  NaN
2021-03-01  NaN  NaN

end of month

For the end of month, you can add 1 day and check whether this jumps to the next month:

end = (df.index + pd.Timedelta('1d')).month != df.index.month

df = df[end]

or add an offset and check if the value is unchanged:

end = df.index == (df.index + pd.offsets.MonthEnd(0))

df = df[end]

output:

              A    B
2021-03-31  NaN  NaN
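As a possible alternative to the two checks above, a DatetimeIndex also exposes the boolean attributes is_month_start and is_month_end. The sketch below is only an illustration: column 'A' holds placeholder values, and it assumes the index is already a DatetimeIndex:

```python
import pandas as pd

# Same dates as the reproducible example; the values are placeholders.
idx = pd.to_datetime(['2021-01-01', '2021-02-01', '2021-03-01',
                      '2021-03-22', '2021-03-31'])
df = pd.DataFrame({'A': range(5)}, index=idx)

# Boolean masks computed directly from the DatetimeIndex
month_start = df[df.index.is_month_start]  # rows dated the first day of a month
month_end = df[df.index.is_month_end]      # rows dated the last day of a month

print(month_start)
print(month_end)
```

Note that these attributes test calendar month boundaries, not the last trading day of the month, so a business-day series may need a different check.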