How can I group by dates only by month (without years)?-CodePudding

What I need is to group dataset only by months, so in the end the would be only 12 rows with data aggregated for each month. Unfortunately, when I try to use pd.Grouper it doesn't work (well, it changes days to 31, but it's not what I need). Maybe I should try using regex and delete years names and then try? Still unsure. Anyway, I hope that you'll help me.. P.S. Does Grouper works only with date as index? Without setting date to index, even if date in datetime format it just doesn't work..

ds = pd.read_csv('https://raw.githubusercontent.com/Lotaristo/Datasets/main/Projects/Homework/homework.csv')
ds.date = pd.to_datetime(ds.date)
ds.set_index('date', inplace=True)
ds.groupby(pd.Grouper(freq='M')).mean()

CodePudding user response：

One very simple solution may be to create another column month as a substring of date, and then group on this new column.

ds = pd.read_csv('https://raw.githubusercontent.com/Lotaristo/Datasets/main/Projects/Homework/homework.csv')

ds['month'] = ds.date.str[5:7]
aggr = ds.groupby('month').mean()

CodePudding user response：

You can use .month to access months of the index:

out = ds.groupby(ds.index.month).mean()

Output:

      гдз по русскому языку 5 класс  гдз по английскому языку 5 класс   ...
date                                                                    
1                         27.200000                         28.733333   ...
2                         34.466667                         34.000000   ...
3                         26.200000                         27.000000   ...
4                         36.600000                         36.666667   ...
5                         20.133333                         20.866667   ...
6                          0.266667                          0.533333   ...
7                          0.066667                          0.266667   ...
8                          0.000000                          0.533333   ...
9                         33.400000                         30.733333   ...
10                        29.466667                         32.666667   ...
11                        31.000000                         31.800000   ...
12                        29.600000                         32.733333   ...


[12 rows x 69 columns]