I have two tables, one titled Members and one titled Addresses.
Each member can have multiple addresses, and each address can have multiple members. How do I write a query to find a count of how many members have multiple instances of the same address?
The address string (street_address) isn't stored directly on the member table but is accessed through a unique address id(address_id) located on the member table. That address id is a record(row on the address table) that stores the address string along with other information.
Sample data:
Member
name | address_id1 | address_id2 |
---|---|---|
Jane Smith | 1234 | 5678 |
Address
address_id | street_address |
---|---|
1234 | 123 Main St. |
5678 | 123 Main St. |
I would like a count of the number of members who fit this scenario, multiple address_ids with the same street_address. Using AWS Redshift
Query attempted:
SELECT COUNT(*)
FROM member
join address
on distinct street_address;
CodePudding user response:
I'm not sure if this is what you want to achieve, but that's how I understood it from the description. Taking into account your current data structure, you can try this query:
SELECT
*
FROM members
LEFT JOIN addresses AS first_address ON (first_address.address_id = members.address_id)
LEFT JOIN addresses AS second_address ON (second_address.address_id = members.address_id2)
WHERE second_address.street_address = first_address.street_address
About your DB structure
You should consider changing the structure of your tables. You have a many-to-many relationship here.
The problem with your structure is that now each user can have a maximum of 2 addresses, as long as it may be fine in business terms, if you want to be able to add 3 addresses, you will add another column 'address_id3', for the 4th address the column 'address_id4' e.t.c. If these addresses are 40, you will have 40 columns. The second next problem is that you will probably have users who will still have 1 address, even though your table will hold 40 such fields.
The solution to such problems is creating intermediate tables:
Member table
member_id | member |
---|---|
1 | John |
2 | Bob |
Address table
address_id | street_address |
---|---|
1 | First |
2 | Second |
Intermediate table
address_id | member_id |
---|---|
1 | 1 |
2 | 1 |
2 | 2 |
This way, you can store multiple addresses for multiple members without the overhead of blank data when not needed.
One member can then have up to 50,000 addresses and the size of your tables is still the same (I mean table structure - columns - not data)