I have a database that has id and distance columns.
I need to count the number of id's that are between a certain distance.
For example, I want to count how many ids are between 1km and 2km in distance, so I'm using this code:
SELECT COUNT (distance)
FROM table
WHERE distance BETWEEN 1000 AND 2000
Returns a COUNT of 240,600
When I want to count the number of ids that are between 2km and 3km and I use the same query, am I'm double counting the 2km routes as the BETWEEN operator is inclusive.
SELECT COUNT (distance)
FROM table
WHERE distance BETWEEN 2000 AND 3000
Returns a COUNT of 353,440
As I understand it the two queries above will both count all ids where the distance is 2000.
I'm curious to know if there is another way to count things like distance (and there are a lot of rows) or do I need to GROUP BY And then count?
Any help will be grateful.
Thanks,
CodePudding user response:
You can try this sort of thing to get a histogram table
SELECT DIV(distance, 1000) * 1000 distance_from,
(DIV(distance, 1000) * 1000) 999 distance_to,
COUNT(*) num
FROM mytable
GROUP BY DIV(distance, 1000) * 1000
The expression DIV(distance, 1000)
creates groups of distance values when you use it with GROUP BY
.
As you have discovered, BETWEEN
has limited usefulness for handling ranges of numbers. You may want distance >= 1000 AND distance < 2000
for the range [1000, 2000).
CodePudding user response:
If I understand correctly, then these:
WHERE distance BETWEEN 1000 AND 2000
WHERE distance BETWEEN 2000 AND 3000
should be:
WHERE distance BETWEEN 1000 AND 1999
WHERE distance BETWEEN 2000 AND 2999
These two ranges are mutually exclusive -and- there are exactly 1000 whole numbers in each range including start and end. That being said, it would rather write it as:
WHERE distance >= 1000 AND distance < 2000
WHERE distance >= 2000 AND distance < 3000