I have a huge table with the following format.
| Timestamp | Site_ID | Connected_status |
| ------------------------ | ------ | ---------------- |
| 2022-05-14 00:00:00 UTC | 12345 | True |
| 2022-05-14 00:01:00 UTC | 12345 | True |
| 2022-05-14 00:02:00 UTC | 12345 | True |
| 2022-05-14 00:03:00 UTC | 12345 | True |
...a few 1000 more entries for the same site_id (12345)
| 2022-05-14 00:00:00 UTC | 32145 | True |
| 2022-05-14 00:01:00 UTC | 32145 | True |
| 2022-05-14 00:02:00 UTC | 32145 | True |
| 2022-05-14 00:03:00 UTC | 32145 | True |
| 2022-05-14 00:04:00 UTC | 32145 | False |
| 2022-05-14 00:05:00 UTC | 32145 | False |
...a few 1000 more entries for the same site_id (32145)
This table captures every minute is a certain site_ID is connected or not. As a result, this table has several million records in it. I want to pull out whether a site_ID recorded both True and False 'Connected_status'. Thus I want the output to look like this
| Site_ID | Connected_status_change |
| ------ | ----------------------- |
| 31245 | True |
...and all other Site_ids where the connected_status has changed
I understand that I need to somehow group by all the Site_Ids and then run some kind of a filter on connected_status but I'm unable to figure it out.
CodePudding user response:
You can use 2 queries - the internal query will select all records with distinct - so you'll get only 1 record for true and 1 for false. than use external query to "count" the record group by the site_id, having the count column > 1
CodePudding user response:
SELECT Site_ID, COUNT(DISTINCT connected_status) AS CONNECTED_STATUS_CHANGE FROM yourtable GROUP BY Site_ID
This should return the number of distinct values of connected_status in the table, grouped by Site_ID. You can convert that to a boolean using IF THEN or CASE, etc. as you wish.