LIMITING MAX VALUES IN SQL-CodePudding

I am completely rewriting this question, I just cant crack it

IDB DB2 SQL

(from a Chicago Crime Dataset) Which community area is most crome prone?

When I use this code, it does correctly count and sort the data

select community_area_number as community_area_number, count(community_area_number) as total_area_crime
from chicago_crime_data
group by community_area_number
order by total_area_crime desc;

the problem is, it lists all the data descending, but no matter what MAX statement I use, either in the select or the order by statement, it wont show just the max values. The max values are 43, so I would like to to show both 'community_area_numbers' that have 43.

Instead it shows the entire list.

Here is a screenshot

also, yes I understand I can just do a LIMIT 2 command, but that would be cheating since I manually checked that there are 2 max values, but if this data changed or i didnt know that, it doesnt solve anything

thanks in advance

CodePudding user response：

Like @topsail mentioned. You could use a rank function.

From the table you have above you could do the following

SELECT t.* FROM
   (
    SELECT *,
       RANK() OVER (Order by Total_Area_Crime DESC) rnk
    from 
    table1
    )t
WHERE t.rnk = 1

db fiddle

So your full query should look something like this:

With cte AS (
    SELECT MAX(COMMUNITY_AREA_NUMBER) AS COMMUNITY_AREA_NUMBER, 
    COUNT(COMMUNITY_AREA_NUMBER) AS TOTAL_AREA_CRIME
    FROM CHICAGO_CRIME_DATA 
    GROUP BY COMMUNITY_AREA_NUMBER
    ORDER BY TOTAL_AREA_CRIME DESC;
)
SELECT t.* FROM
   (
    SELECT *,
       RANK() OVER (Order by Total_Area_Crime DESC) rnk
    from 
    cte
    )t
WHERE t.rnk = 1

CodePudding user response：

What you would be looking for is the standard SQL clause FETCH WITH TIES;

select community_area_number, count(*) as total_area_crime
from chicago_crime_data
group by community_area_number
order by total_area_crime desc
fetch first row with ties;

Unfortunately, though, DB2 doesn't support WITH TIES in FETCH FIRST.

The classic way (that is before we had the window functions RANK and DENSE_RANK) is to use a subquery: Get the maximum value, then get all rows with that maximum. I am using a CTE (aka WITH clause) here in order not to have to write everything twice.

with counted as
(
  select community_area_number, count(*) as total_area_crime
  from chicago_crime_data
  group by community_area_number
)
select community_area_number, total_area_crime
from counted
where total_area_crime = (select max(total_area_crime) from counted);

(Please not that this is a mere COUNT(*), because we want to count rows per community_area_number.)

CodePudding user response：

It turns out the professor did want us to use the Limit command. Here is the final answer:

SELECT COMMUNITY_AREA_NUMBER, COUNT(ID) AS CRIMES_RECORDED 
FROM CHICAGO_CRIME_DATA
GROUP BY COMMUNITY_AREA_NUMBER 
ORDER BY CRIMES_RECORDED DESC LIMIT 1;

thanks to all those who responded :D