I have a sqlite3 database maintained on an AWS exchange that is regularly updated by a Python script. One of the things it tracks is when any team generates a new post for a given topic. The entries look something like this:
id | client | team | date | industry | city |
---|---|---|---|---|---|
895 | acme industries | blueteam | 2022-06-30 | construction | springfield |
I'm trying to create a table that shows me how many entries for construction
occur each day. Right now, the entries with data populate, but they exclude dates with no entries. For example, if I search for just
SELECT date, count(id) as num_records
from mytable
WHERE industry = "construction"
group by date
order by date asc
I'll get results that looks like this:
date | num_records |
---|---|
2022-04-01 | 3 |
2022-04-04 | 1 |
How can I make sqlite output like this:
date | num_records |
---|---|
2022-04-02 | 3 |
2022-04-02 | 0 |
2022-04-03 | 0 |
2022-04-04 | 1 |
I'm trying to generate some graphs from this data and need to be able to include all dates for the target timeframe.
EDIT/UPDATE:
The table does not already include every date; it only includes dates relevant to an entry. If no team posts work on a day, the date
column will jump from day 1 (e.g. 2022-04-01) to day 3 (2022-04-03).
CodePudding user response:
Given that your "mytable" table contains all dates you need as an assumption, you can first select all of your dates, then apply a LEFT JOIN
to your own query, and map all resulting NULL values for the "num_records" field to "0" using the COALESCE
function.
WITH cte AS (
SELECT date,
COUNT(id) AS num_records
FROM mytable
WHERE industry = "construction"
GROUP BY date
ORDER BY date
)
SELECT dates.date,
COALESCE(cte.num_records, 0) AS num_records
FROM (SELECT date FROM mytable) dates
LEFT JOIN cte
ON dates.date = cte.date