I am completely rewriting this question, I just cant crack it
IDB DB2 SQL
(from a Chicago Crime Dataset) Which community area is most crome prone?
When I use this code, it does correctly count and sort the data
select community_area_number as community_area_number, count(community_area_number) as total_area_crime
from chicago_crime_data
group by community_area_number
order by total_area_crime desc;
the problem is, it lists all the data descending, but no matter what MAX statement I use, either in the select or the order by statement, it wont show just the max values. The max values are 43, so I would like to to show both 'community_area_numbers' that have 43.
Instead it shows the entire list.
also, yes I understand I can just do a LIMIT 2 command, but that would be cheating since I manually checked that there are 2 max values, but if this data changed or i didnt know that, it doesnt solve anything
thanks in advance
CodePudding user response:
Like @topsail mentioned. You could use a rank function.
From the table you have above you could do the following
SELECT t.* FROM
(
SELECT *,
RANK() OVER (Order by Total_Area_Crime DESC) rnk
from
table1
)t
WHERE t.rnk = 1
So your full query should look something like this:
With cte AS (
SELECT MAX(COMMUNITY_AREA_NUMBER) AS COMMUNITY_AREA_NUMBER,
COUNT(COMMUNITY_AREA_NUMBER) AS TOTAL_AREA_CRIME
FROM CHICAGO_CRIME_DATA
GROUP BY COMMUNITY_AREA_NUMBER
ORDER BY TOTAL_AREA_CRIME DESC;
)
SELECT t.* FROM
(
SELECT *,
RANK() OVER (Order by Total_Area_Crime DESC) rnk
from
cte
)t
WHERE t.rnk = 1
CodePudding user response:
What you would be looking for is the standard SQL clause FETCH WITH TIES
;
select community_area_number, count(*) as total_area_crime
from chicago_crime_data
group by community_area_number
order by total_area_crime desc
fetch first row with ties;
Unfortunately, though, DB2 doesn't support WITH TIES
in FETCH FIRST
.
The classic way (that is before we had the window functions RANK
and DENSE_RANK
) is to use a subquery: Get the maximum value, then get all rows with that maximum. I am using a CTE (aka WITH
clause) here in order not to have to write everything twice.
with counted as
(
select community_area_number, count(*) as total_area_crime
from chicago_crime_data
group by community_area_number
)
select community_area_number, total_area_crime
from counted
where total_area_crime = (select max(total_area_crime) from counted);
(Please not that this is a mere COUNT(*)
, because we want to count rows per community_area_number.)
CodePudding user response:
It turns out the professor did want us to use the Limit command. Here is the final answer:
SELECT COMMUNITY_AREA_NUMBER, COUNT(ID) AS CRIMES_RECORDED
FROM CHICAGO_CRIME_DATA
GROUP BY COMMUNITY_AREA_NUMBER
ORDER BY CRIMES_RECORDED DESC LIMIT 1;
thanks to all those who responded :D