This is my sample code. My database contains columns for every date of the year, going back multiple years. Each column corresponds to a specific date.
import pandas as pd
df = pd.DataFrame([[10, 5, 25, 67,25,56],
[20, 10, 26, 45, 56, 34],
[30, 3, 27, 34, 78, 34],
[40, 9, 28, 45, 34,76]],
columns=[pd.to_datetime('2022-09-14'), pd.to_datetime('2022-08-14'), pd.to_datetime('2022-07-14'), pd.to_datetime('2021-09-14'),
pd.to_datetime('2020-09-14'), pd.to_datetime('2019-09-14')])
Is there a way to select only those columns which fit a particular criteria based on year, month or quarter.
For example, I was hoping to get only those columns which is the same date as today (any starting date) for every year. For example, today is Sep 14, 2022 and I need columns only for Sep 14, 2021, Sep 14, 2020 and so on. Another option could be to do the same on a month or quarter basis. How can this be done in pandas?
CodePudding user response:
Yes, you can do:
# day
df.loc[:, df.columns.day == 14]
2022-09-14 2022-08-14 2022-07-14 2021-09-14 2020-09-14 2019-09-14
0 10 5 25 67 25 56
1 20 10 26 45 56 34
2 30 3 27 34 78 34
3 40 9 28 45 34 76
# month
df.loc[:, df.columns.month == 9]
2022-09-14 2021-09-14 2020-09-14 2019-09-14
0 10 67 25 56
1 20 45 56 34
2 30 34 78 34
3 40 45 34 76
# quarter
df.loc[:, df.columns.quarter == 3]
2022-09-14 2022-08-14 2022-07-14 2021-09-14 2020-09-14 2019-09-14
0 10 5 25 67 25 56
1 20 10 26 45 56 34
2 30 3 27 34 78 34
3 40 9 28 45 34 76