I am using the max() windows function to get the maximum money raised for each year considering there are three categories of campaign length. I need to see which campaign length i.e '<=30 days', '36 - 60 days' and '>60 days' is able to raise more money year on year.
I wrote
select
year,
campaign_length_category,
money_raised,
max(money_raised) over (partition by year ) as max_money_raised
FROM
(
SELECT
year,
campaign_length_category,
concat('$', format(sum(pledged),2,'en_US')) as money_raised
FROM
(
select
extract(year from launched) Year,
case
when campaign_length >= 1 and campaign_length <= 30 then '<=30 days'
when campaign_length > 35 and campaign_length <= 60 then '36 - 60 days'
else '>60 days'
end as campaign_length_category,
pledged
from
(
select
launched,
outcome,
datediff(cast(deadline as date),cast(launched as date)) campaign_length,
pledged
from
campaign
) t1
)t2
group by 1, 2
)t3
order by 1
However, it's not showing the maximum value for each year in the output. For the year 2011 and 2012 it fetches wrong values.
year | campaign_length_category | money_raised | max_money_raised |
---|---|---|---|
2009 | <=30 days | $25,852.12 | $64,088.48 |
2009 | >60 days | $64,088.48 | $64,088.48 |
2009 | 36 - 60 days | $31,978.64 | $64,088.48 |
2010 | <=30 days | $201,063.08 | $467,862.95 |
2010 | >60 days | $467,862.95 | $467,862.95 |
2010 | 36 - 60 days | $432,416.84 | $467,862.95 |
2011 | <=30 days | $1,634,463.10 | $742,348.24 |
2011 | 36 - 60 days | $1,710,892.85 | $742,348.24 |
2011 | >60 days | $742,348.24 | $742,348.24 |
2012 | 36 - 60 days | $2,492,257.73 | $5,410,974.02 |
2012 | <=30 days | $5,410,974.02 | $5,410,974.02 |
2012 | >60 days | $1,434,506.99 | $5,410,974.02 |
I am not sure why the max() window function picks up incorrect max values for some years as given. Please enlighten me if I am doing something wrong
CodePudding user response:
You need to move your formatting to only the first, outer select. By doing the formatting in the inner select, you are making max(money_raised)
do string maximum of the formatted string, and "$742,348.24" is indeed greater than "$1,710,892.85", because the first characters are equal, and for the second characters, '7' is greater than '1'.
So:
select
year,
campaign_length_category,
concat('$', format(pledged,2,'en_US')) as money_raised
concat('$', format(max(pledged) over (partition by year),2,'en_US')) as max_money_raised
FROM
(
SELECT
year,
campaign_length_category,
sum(pledged) as pledged