Let it be the following Python Panda DataFrame:
code | date | color | counter |
---|---|---|---|
TTT | 2021-02-01 00:30:11 | blue | 2 |
AVC | 2022-02-01 10:00:01 | red | 12 |
BCH | 2022-02-02 22:00:01 | red | 12 |
POD | 2022-03-01 01:00:03 | blue | 1 |
UDY | 2022-05-01 09:02:01 | red | 3 |
YUD | 2022-12-01 09:02:01 | green | 0 |
I need to create a column by year and month that includes whether or not that row belongs (bool) to the specific month indicated in the column. An example of the resulting DataFrame is as follows:
code | date | color | counter | 2021-02 | 2022-02 | 2022-03 | 2022-05 | 2022-12 |
---|---|---|---|---|---|---|---|---|
TTT | 2021-02-01 00:30:11 | blue | 2 | True | False | False | False | False |
AVC | 2022-02-01 10:00:01 | red | 12 | False | True | False | False | False |
BCH | 2022-02-02 22:00:01 | red | 12 | False | True | False | False | False |
POD | 2022-03-01 01:00:03 | blue | 1 | False | False | True | False | False |
UDY | 2022-05-01 09:02:01 | red | 3 | False | False | False | True | False |
YUD | 2022-12-01 09:02:01 | green | 0 | False | False | False | False | True |
To solve the problem only the information in the date
column is used, but I include the complete DataFrame to avoid incompatibilities.
CodePudding user response:
Use get_dummies
with Series.dt.strftime
for YYYY-MM
format and parameter dtype
for booleans:
df = df.join(pd.get_dummies(df['date'].dt.strftime('%Y-%m'), dtype=bool))
print (df)
code date color counter 2021-02 2022-02 2022-03 \
0 TTT 2021-02-01 00:30:11 blue 2 True False False
1 AVC 2022-02-01 10:00:01 red 12 False True False
2 BCH 2022-02-02 22:00:01 red 12 False True False
3 POD 2022-03-01 01:00:03 blue 1 False False True
4 UDY 2022-05-01 09:02:01 red 3 False False False
5 YUD 2022-12-01 09:02:01 green 0 False False False
2022-05 2022-12
0 False False
1 False False
2 False False
3 False False
4 True False
5 False True
CodePudding user response:
You can try df.to_period
with str.get_dummies
out = df.join(pd.to_datetime(df['date']).dt.to_period("M")
.astype(str).str.get_dummies()
.astype(bool))
print(out)
code date color counter 2021-02 2022-02 2022-03 2022-05 2022-12
0 TTT 2021-02-01 00:30:11 blue 2 True False False False False
1 AVC 2022-02-01 10:00:01 red 12 False True False False False
2 BCH 2022-02-02 22:00:01 red 12 False True False False False
3 POD 2022-03-01 01:00:03 blue 1 False False True False False
4 UDY 2022-05-01 09:02:01 red 3 False False False True False
5 YUD 2022-12-01 09:02:01 green 0 False False False False True