Home > Software design >  SQL query to find distinct values in joined tables
SQL query to find distinct values in joined tables

Time:10-15

I have two tables, one titled Members and one titled Addresses.

Each member can have multiple addresses, and each address can have multiple members. How do I write a query to find a count of how many members have multiple instances of the same address?

The address string (street_address) isn't stored directly on the member table but is accessed through a unique address id(address_id) located on the member table. That address id is a record(row on the address table) that stores the address string along with other information.

Sample data:

Member

name address_id1 address_id2
Jane Smith 1234 5678

Address

address_id street_address
1234 123 Main St.
5678 123 Main St.

I would like a count of the number of members who fit this scenario, multiple address_ids with the same street_address. Using AWS Redshift

Query attempted:

SELECT COUNT(*)
FROM member
    join address
    on distinct street_address; 

CodePudding user response:

I'm not sure if this is what you want to achieve, but that's how I understood it from the description. Taking into account your current data structure, you can try this query:

SELECT 
    * 
FROM members
    LEFT JOIN addresses AS first_address ON (first_address.address_id = members.address_id)
    LEFT JOIN addresses AS second_address ON (second_address.address_id = members.address_id2)
WHERE second_address.street_address = first_address.street_address

About your DB structure

You should consider changing the structure of your tables. You have a many-to-many relationship here.

The problem with your structure is that now each user can have a maximum of 2 addresses, as long as it may be fine in business terms, if you want to be able to add 3 addresses, you will add another column 'address_id3', for the 4th address the column 'address_id4' e.t.c. If these addresses are 40, you will have 40 columns. The second next problem is that you will probably have users who will still have 1 address, even though your table will hold 40 such fields.

The solution to such problems is creating intermediate tables:

Member table

member_id member
1 John
2 Bob

Address table

address_id street_address
1 First
2 Second

Intermediate table

address_id member_id
1 1
2 1
2 2

This way, you can store multiple addresses for multiple members without the overhead of blank data when not needed.

One member can then have up to 50,000 addresses and the size of your tables is still the same (I mean table structure - columns - not data)

  • Related