Home > Software engineering >  SQL: How can I count unique attribute values using two attributes and joining two tables?
SQL: How can I count unique attribute values using two attributes and joining two tables?

Time:04-18

I'm a beginner in SQL and I've been struggling for hours and hours now over a problem that seems to be quite easy and I didn't find an appropriate answer:

Simplified, I have two tables, districts and streetdistricts, which contain information about city districts and streets. Every district has a unique number dkey and every street has a unique street number stkey (as primary keys respectively).

Here's an example:

Table districts:

dkey name
1 Inner City
2 Outer City
3 Outskirts

Table streetdistricts:

stkey dkey
113 1
126 2
148 2
148 3
152 3
154 3

What I want to do now is to find out how many streets are there per district that are located only in one single district. So that means I do not have to just remove duplicates (like street with stkey 148 here), but instead to remove streets that are situated in more than one district completely so that I only see the districts and the number of streets per district that are just located in one district only.

For this example, this would be:

name           number_of_street_in_just_this_district
Inner City     1
Outer City     1
Outskirts      2

I've tried many things, but I always get stuck, mostly because when I SELECT the name of the district, it is also needed in GROUP BY as SQL says, but when I add it, then either the whole number of streets (here: 6) or at least the number including the duplicates (here: 5) are displayed, but not the right answer of 3.

Or I'm not able to JOIN the tables correctly so to get the output I want. Here is my last try:

SELECT SUM(StreetDistricts.dkey) as d_number, StreetDistricts.stkey, COUNT(StreetDistricts.stkey) as numb
FROM StreetDistricts
INNER JOIN Districts ON Districts.dkey = StreetDistricts.dkey
GROUP BY StreetDistricts.stkey
HAVING COUNT(StreetDistricts.dkey) = 1
ORDER BY d_number DESC

This works to get me the correct sum of rows, but I was not able to combine/join it with the other table to receive name and number of unique streets.

Please help me, thanks!

CodePudding user response:

First obtain the streets that are found in only one district (cte1). Then count just those streets per district. This should do it:

WITH cte1 AS (
        SELECT stkey FROM StreetDistricts GROUP BY stkey HAVING COUNT(DISTINCT dkey) = 1
     )
SELECT d.name, COUNT(*) AS n
  FROM StreetDistricts AS s
  JOIN Districts       AS d
    ON s.dkey = d.dkey
   AND s.stkey IN (SELECT stkey FROM cte1)
 GROUP BY d.dkey
;

Result:

 ------------ --- 
| name       | n |
 ------------ --- 
| Inner City | 1 |
| Outer City | 1 |
| Outskirts  | 2 |
 ------------ --- 

Note: I used the fact that dkey is the primary key of Districts to avoid having to GROUP BY d.name as well. This is guaranteed by functional dependence. If your database doesn't guarantee that with a constraint, just add d.name to the final GROUP BY terms.

The test case:

CREATE TABLE Districts (dkey int primary key, name varchar(30));

CREATE TABLE StreetDistricts (stkey int, dkey int);

INSERT INTO Districts VALUES
  (1,'Inner City')
, (2,'Outer City')
, (3,'Outskirts')
;

INSERT INTO StreetDistricts VALUES
  (113,1)
, (126,2)
, (148,2)
, (148,3)
, (152,3)
, (154,3)
;
  • Related