I have a device_id's event data which might be successful sometime and unsuccessful sometime.
device_id | status |
---|---|
1 | Successful |
1 | UnSuccessful |
1 | UnSuccessful |
1 | UnSuccessful |
1 | Successful |
2 | Successful |
2 | UnSuccessful |
2 | UnSuccessful |
Is there a way to do a group by and get result for an Id in a single row like this:
device_id | success_count | unsuccessful_count |
---|---|---|
1 | 2 | 3 |
2 | 1 | 2 |
I have been trying several ways using group by but I haven't been able to get the success_count and unsuccessful_count for a device_id in single row.
CodePudding user response:
You need to group your data by device id and then pivot by status and count:
df.groupBy("device_id").pivot("status").count()