dear community!
I'm a bit lost with my SQL query on PRESTO. Essentially, I have a view with the following columns:
branch | year_month (varchar) | sellid [... and others]
Germany | 2022-06 | 123q1239
What I need to do is fairly simple, I need to create a table like this:
branch | year_month | count(sellid) actual month | count(sellid) last month
I attempted the following among other things, but unfortunately I'm still getting errors.
SELECT branch, year_month, count(sellid)as active_locs,
CASE
WHEN TO_DATE(year_month, '%Y-%m') = date_format(TO_DATE(year_month, '%Y-%m') - interval '1' month, '%Y-%m') THEN count(sellid) END as locs_LM
FROM table
GROUP BY branch, year_month
This is the exact error I'm getting:
An error occurred when executing the SQL command: SELECT branch, year_month, count(sellid)as active_locs, CASE WHEN TO_DATE(year_month, '%Y-%m') = date_format(TO_DATE(year_month, '%Y-%m') - interv...
Query failed (#20220628_093246_00650_e4kcc): line 3:39: '=' cannot be applied to date, varchar [DB Errorcode=1] 1 statement failed.
Could someone help me out??
Thanks in advance
CodePudding user response:
date_format
returns varchar while TO_DATE
returns date, so you have type mismatch. Also you need to use date_parse
with provided format:
SELECT branch, year_month, count(sellid)as active_locs,
CASE
WHEN year_month = date_format(date_parse(year_month, '%Y-%m') - interval '1' month, '%Y-%m') THEN count(sellid) END as locs_LM
FROM dataset
GROUP BY branch, year_month
This will make query executing, but will not make it right. To make it right you can use windows functions and subquery:
--sample data
WITH dataset(branch, year_month, sellid) AS (
VALUES ('Germany', '2022-06', '123q1239'),
('Germany', '2022-05', '123q1239')
)
-- sample query
SELECT branch,
date_format(year_month, '%Y-%m') year_month,
active_locs,
if( -- if previous record is for previous month use previous active_locs
lag(year_month) over (partition by branch order by year_month)
= year_month - interval '1' month,
lag(active_locs) over (partition by branch order by year_month),
0
) locs_LM
FROM (
SELECT branch,
date_parse(year_month, '%Y-%m') year_month,
count(sellid) as active_locs
FROM dataset
GROUP BY 1,
2
)
Output:
branch | year_month | active_locs | locs_LM |
---|---|---|---|
Germany | 2022-05 | 1 | 0 |
Germany | 2022-06 | 1 | 1 |