Home > Software engineering >  SQLite Count() of true boolean values in GROUP
SQLite Count() of true boolean values in GROUP

Time:12-12

I want to get messages from a view (it is a view because it contains messages from multiple sources) in an SQLite database. If a there are multiple messages from the same user, I want only the newest one. The view is already sorted by Date DESC. If I use the query SELECT *, COUNT(IsRead = false) FROM Messages GROUP BY User I get the newest message from each user and the amount of messages for each user. However instead of the total amount of messages, I only want the amount of unread messages (Read == false)

For the following table:

 ------- -------- ------------ 
| User  | IsRead |    Date    |
 ------- -------- ------------ 
| User1 | false  | 2020-01-05 |
| User2 | false  | 2020-01-04 |
| User1 | false  | 2020-01-03 |
| User3 | true   | 2020-01-02 |
| User2 | true   | 2020-01-01 |
| User3 | true   | 2020-01-01 |
 ------- -------- ------------ 

I would like to get the following result

 ------- -------- ------------ --------- 
| User  | IsRead |    Date    | notRead |
 ------- -------- ------------ --------- 
| User1 | false  | 2020-01-05 |       2 |
| User2 | false  | 2020-01-04 |       1 |
| User3 | true   | 2020-01-02 |       0 |
 ------- -------- ------------ --------- 

How can I achieve that? The COUNT(IsRead = false) part in my query is just the way I imagined it could work. I could not find anything about how to do that in sqlite.

Also: Am I correct about always getting the most recent message from each user, if the output from the view is already sorted by Date descending? That seemed to be the case in my tests but I just want to make sure that was not a fluke.

CodePudding user response:

From SELECT/The ORDER BY clause:

If a SELECT statement that returns more than one row does not have an ORDER BY clause, the order in which the rows are returned is undefined.

This means that even if the view is defined to return sorted rows, selecting from the view is not guaranteed to maintain that sort order.

Also, with this query:

SELECT *, COUNT(IsRead = false) FROM Messages GROUP BY User

even if you get the newest message from each user, this is not guaranteed, because it is not a documented feature.

So, for your question:

Am I correct about always getting the most recent message from each user, if the output from the view is already sorted by Date descending?

the answer is no.

You can't rely on coincidental results, but you can rely on documented features.

A documented feature is the use of Bare columns in an aggregate query which can solve your problem with a simple aggregation query:

SELECT User,
       IsRead,
       MAX(Date) Date,
       SUM(NOT IsRead) notRead -- sums all 0s (false) by converting them to 1s (true)
FROM Messages
GROUP BY User;

Or, use window functions:

SELECT DISTINCT User,    
       FIRST_VALUE(IsRead) OVER (PARTITION BY User ORDER BY Date DESC) IsRead,
       Max(Date) OVER (PARTITION BY User) Date,
       SUM(NOT IsRead) OVER (PARTITION BY User) notRead 
FROM Messages;

See the demo.

CodePudding user response:

To get the number of unread messages per user, you can use a subquery to count the number of unread messages for each user, and then join that result to your main query. Here's an example:

SELECT m.*, unread.notRead
FROM Messages m
LEFT JOIN (
  SELECT User, COUNT(IsRead = 0) AS notRead
  FROM Messages
  GROUP BY User
) unread
ON m.User = unread.User
GROUP BY m.User

This query first counts the number of unread messages for each user and stores the result in a subquery. Then, it uses a LEFT JOIN to join the subquery to the Messages table, based on the User column. Finally, it groups the results by User to ensure that each user only appears once in the result set.

As for your second question, if the view is already sorted by Date in descending order, then you will indeed get the most recent message for each user in the result set. However, if the view is not sorted in this way, then you may need to add an ORDER BY clause to your query to ensure that the results are sorted correctly.

  • Related