I need to count data for the last month. How can I achieve this? Subqueries in SELECT clause and correlated subqueries are not allowed. Something similar to this:
count(crimes.crime_id) OVER (PARTITION BY ((crimes.month::date - interval '1 month')::date), crimes.category) AS "Previous month crimes"
But that doesn't work and selects by current month:
Crime category | Month | Previous month crimes | Current month crimes
----------------------- ------------ ----------------------- ----------------------
anti-social-behaviour | 2021-01-01 | 1 | 1
bicycle-theft | 2021-01-01 | 19 | 19
bicycle-theft | 2021-02-01 | 4 | 4
bicycle-theft | 2021-03-01 | 18 | 18
burglary | 2021-01-01 | 61 | 61
burglary | 2021-02-01 | 42 | 42
burglary | 2021-03-01 | 48 | 48
criminal-damage-arson | 2021-01-01 | 60 | 60
criminal-damage-arson | 2021-02-01 | 54 | 54
criminal-damage-arson | 2021-03-01 | 64 | 64
Query:
SELECT DISTINCT
crimes.category AS "Crime category",
crimes.month AS "Month",
count(crimes.crime_id) OVER (PARTITION BY ((crimes.month::date - interval '1 month')::date), crimes.category) AS "Previous month crimes",
count(crimes.crime_id) OVER (PARTITION BY crimes.month, crimes.category) AS "Current month crimes"
FROM crimes
WHERE crimes.month >= :start_month AND crimes.month <= :end_month
GROUP BY crimes.month, crimes.category, crimes.crime_id
ORDER BY crimes.category, crimes.month ASC;
Table:
CREATE TABLE IF NOT EXISTS crimes(
"crime_id" bigserial PRIMARY KEY,
"category" VARCHAR(255) NOT NULL,
"persistent_id" VARCHAR(255) NOT NULL UNIQUE,
"month" DATE NOT NULL,
"location" bigint NOT NULL REFERENCES locations (location_id),
"context" VARCHAR(255) NOT NULL,
"id" bigint NOT NULL UNIQUE,
-- "location_type" location_type NOT NULL,
"location_type" VARCHAR(255) NOT NULL,
"location_subtype" VARCHAR(255) NOT NULL,
"outcome_status" bigint REFERENCES outcomes (status_id)
);
Sample data for crimes table:
crime_id | category | persistent_id | month | location | context | id | location_type | location_subtype | outcome_status
---------- ----------------------- ------------------------------------------------------------------ ------------ ---------- --------- ---------- --------------- ------------------ ----------------
1 | anti-social-behaviour | | 2021-01-01 | 56047 | | 89873829 | Force | |
2 | bicycle-theft | 197505c1b681339abd93e87200b9d68779d08e0b89154f3568cf13615b0e3147 | 2021-01-01 | 39811 | | 89921014 | Force | | 1
3 | bicycle-theft | 403ecf74716180b5d4d8f64d6c96cf05351684c0dc924f0d520331ed559697b2 | 2021-01-01 | 57734 | | 89905063 | Force | | 1
4 | bicycle-theft | 8f04c99ded0a5ab35d8d48037c1a8230c8bf06226f5f09a5d45fb914572460d1 | 2021-01-01 | 57735 | | 89941994 | Force | | 1
5 | burglary | 340ae50ef443ba7df87f0a69ff80fc8aa2dd46500d40ef8696baa4f3dac7de57 | 2021-01-01 | 57736 | | 89934048 | Force | | 1
6 | burglary | 7f85d33e112a93702c54f6d9b101b5fc2962cc30fcd6aca59991fa8a00df5fb6 | 2021-01-01 | 56114 | | 89936322 | Force | | 2
7 | burglary | e3e609ace7096d5b9f7b68a5c3d26653dc79f6d5d6f22c959f99654cc67ee61a | 2021-01-01 | 57734 | | 89936709 | Force | | 1
8 | burglary | 46cf1c8597841acc963b525b70d4f3c2bbee0977b97b7a059e94762cab350ab5 | 2021-01-01 | 57699 | | 89894592 | Force | | 1
9 | burglary | 517f7c304d89b92f8cbf429b4d873bb456768148e566c53f4f31a772d277ddcf | 2021-01-01 | 57737 | | 89902228 | Force | | 1
10 | burglary | 6e12fbf5ee3a56a4a75f0fb91f5008c1ac8e6f18ab74c62f789739208491e208 | 2021-01-01 | 57705 | | 89930518 | Force | | 3
CodePudding user response:
Maybe something like this. Remove crime_id
from GROUP BY
, so that the simple aggregate is per month/category. Then use window functions to obtain the prior count
per category
, based on the month ordering. COALESCE simply handles the null
cases (for rows with no prior month) by returning 0. The DISTINCT
is not required, unless you're not selecting enough detail to identify each group properly. I'm removing that.
Remember, if your WHERE
clause eliminates monthly data you need for LAG
to find previous months counts, you can filter later, after the LAG
calculation. In other words calculate current and previous results first (with a slightly larger date range, or no range), then filter those results by the final date range required, using CTE terms or a derived table.
SELECT
crimes.category AS "Crime category",
crimes.month AS "Month",
COALESCE(LAG(count(crimes.crime_id)) OVER (PARTITION BY crimes.category ORDER BY month), 0) AS "Previous month crimes",
count(crimes.crime_id) AS "Current month crimes"
FROM crimes
WHERE crimes.month >= :start_month AND crimes.month <= :end_month
GROUP BY crimes.month, crimes.category
ORDER BY crimes.category, crimes.month ASC
;