I have a dataframe which contains a specific column for the date which is called 'testdate'. And I have a period between two specific date, such as 20110501~20120731.
From a dataframe, I want to divide that dataframe into multiple dataframes according to the year-month of 'testdate'.
For example, if 'testdate' is between 20110501-20110531 then df1, if 'testdate' is between next month, then f2, ... and so on.
For example, a whole dataframe looks like this...
| StudentID | Testdate | Record |
| -------- | -------- | ------ |
| 1 | 20110528 | 50 |
| 2 | 20110601 | 75 |
| 3 | 20110504 | 100 |
| 4 | 20110719 | 82 |
| 5 | 20111120 | 42 |
| 6 | 20111103 | 95 |
| 7 | 20120520 | 42 |
| 8 | 20120503 | 95 |
But, I want to divide it like this...
[DF1]: name should be 201105
| StudentID | Testdate | Record |
| -------- | -------- | ------ |
| 1 | 20110528 | 50 |
| 3 | 20110504 | 100 |
[DF2]: name should be 201106
| StudentID | Testdate | Record |
| -------- | -------- | ------ |
| 2 | 20110601 | 75 |
[DF3]
| StudentID | Testdate | Record |
| -------- | -------- | ------ |
| 4 | 20110719 | 82 |
[DF4]
| StudentID | Testdate | Record |
| -------- | -------- | ------ |
| 5 | 20111120 | 42 |
| 6 | 20111103 | 95 |
[DF5]
| StudentID | Testdate | Record |
| -------- | -------- | ------ |
| 7 | 20120520 | 42 |
| 8 | 20120503 | 95 |
I found some codes for dividing a dataframe according to the quarter, but I could find any codes for my task.
How can I deal with it ? Many thanks to your help.
CodePudding user response:
Create a grouper by slicing yyyymm
from testdate
then group the dataframe and store each group inside a dict comprehension
s = df['Testdate'].astype(str).str[:6]
dfs = {f'df_{k}': g for k, g in df.groupby(s)}
# dfs['df_201105']
StudentID Testdate Record
0 1 20110528 50
2 3 20110504 100
# dfs['df_201106']
StudentID Testdate Record
1 2 20110601 75