I will need to pivot a column in pandas, would greatly appreciate any help.
Input:
ID | Status | Date |
---|---|---|
1 | Online | 2022-06-31 |
1 | Offline | 2022-07-28 |
2 | Online | 2022-08-01 |
3 | Online | 2022-07-03 |
3 | None | 2022-07-05 |
4 | Offline | 2022-05-02 |
5 | Online | 2022-04-04 |
5 | Online | 2022-04-06 |
Output: Pivot on Status
ID | Date | Online | Offline | None |
---|---|---|---|---|
1 | 2022-06-31 | 1 | 0 | 0 |
1 | 2022-07-28 | 0 | 1 | 0 |
2 | 2022-08-01 | 1 | 0 | 0 |
3 | 2022-07-03 | 1 | 0 | 0 |
3 | 2022-07-05 | 1 | 0 | 0 |
4 | 2022-05-02 | 0 | 0 | 1 |
5 | 2022-04-04 | 1 | 0 | 0 |
5 | 2022-04-06 | 1 | 0 | 0 |
Or even better output if I am able to merge the counts for example:
Output: Pivot on Status & merge
ID | Online | Offline | None |
---|---|---|---|
1 | 1 | 1 | 0 |
2 | 1 | 0 | 0 |
3 | 2 | 0 | 0 |
4 | 0 | 0 | 1 |
5 | 2 | 0 | 0 |
The main issue here is that I won't know the status values i.e. Offline, Online, None.
I believe doing it in pandas might be easier due to the dynamic nature of not knowing column values for the column I want to pivot on.
CodePudding user response:
df.assign(seq=1).pivot_table(index='ID', columns='Status', values='seq', aggfunc='sum').fillna(0)
Status None Offline Online
ID
1 0.0 1.0 1.0
2 0.0 0.0 1.0
3 1.0 0.0 1.0
4 0.0 1.0 0.0
5 0.0 0.0 2.0