Home > database >  How can I count distinct values of certain attributes within a date range?
How can I count distinct values of certain attributes within a date range?

Time:04-22

I need to compute how many different types of fruit were sold in each county within a certain date range, using county_name and distinct fruit_codes.

I got a county table:

county_code county_name
1 Colfax
2 Catron
3 Harding

And I got a fruit table:

county_code fruit_code date
1 1 2020-01-01
1 2 2020-01-01
1 3 2020-01-01
1 4 2020-01-01
2 1 2020-01-01
2 2 2020-01-01
3 1 2020-01-01
3 2 2020-01-01
3 3 2020-01-01
1 1 2021-01-01
1 3 2021-01-01
2 2 2021-01-01
3 2 2021-01-01
3 3 2021-01-01

So my expected result would be:

county_name fruit_code
Colfax 4
Catron 2
Harding 3

I tried the following code, but the number of distinct fruit_codes is not exactly the correct one, unfortunately (note: the actual table is much longer, of course).

WITH unique_fruits AS (
    SELECT fruit_code FROM fruit
    WHERE date BETWEEN '2020-01-01' AND '2021-01-01'
    GROUP BY fruit_code HAVING COUNT(DISTINCT county_code) = 1
)
SELECT c.county_name, COUNT(*) AS no_of_unique_fruits
  FROM fruit AS f
  JOIN county AS c
    ON f.county_code = c.county_code
   AND f.fruit_code IN (SELECT fruit_code FROM unique_fruits)
GROUP BY c.county_name
ORDER BY no_of_unique_fruits DESC

What have I forgotten here?

CodePudding user response:

You can solve this easier

WITH fruitcodespercountry AS(
SELECT "county_code", COUNT(DISTINCT "fruit_code") fruit_code
FROM Fruit WHERE "date" BETWEEN '2020-01-01' AND '2021-01-01'
GROUP  BY "county_code")
SELECT "county_name" ,fruit_code FROM fruitcodespercountry f JOIN country c ON f."county_code" = c."county_code"
county_name | fruit_code
:---------- | ---------:
Colfax      |          4
Catron      |          2
Harding     |          3

db<>fiddle here

CodePudding user response:

You need a LEFT join of county to fruit (so that you get a result with 0 for counties that do not meet the conditions) and group by county.
Also, the condition about the date should be placed in the ON clause:

SELECT c.county_name,
       COUNT(DISTINCT f.fruit_code) total
FROM county c LEFT JOIN fruit f
ON f.county_code = c.county_code AND date BETWEEN '2020-01-01' AND '2021-01-01'
GROUP BY c.county_code;

I assume that county_code is the primary key of county.

See the demo.

  • Related