I have table called location
that looks like this:
| zip_code | city | state |
| -------- | ------------- | ----- |
| 01108-2005 | | |
| 02130-1911 | Jamaica Plain | MA |
| 02130-1911 | Jamaica Plain | |
| 02138-1557 | Cambridge | MA |
| 02138-1557 | | MA |
| 02138-1557 | | |
| 02370-2509 | | MA |
| 02370-2509 | | MA |
I only want unique zip codes and as many corresponding city/state values as possible. For example:
- the zip code 01108-2005 is already unique, so I will keep it even though there is no city or state.
- the zip code 02138-1557 at least has state in one record, but not in the other, so I want to only keep the record that has a state.
- the zip code 02130-1911 has a record with only city as well as a record with both city and state. I want the record with both city and state, for it is the most complete.
I've tried the following code:
SELECT DISTINCT zip_code, city, state
FROM location
ORDER BY zip_code, city DESC, state DESC
which treats each record as distinct. Is there a way to only select the most complete records, or to merge all records that share a zip code?
I am new to StackOverflow (and to MySQL) so any tips on how I can make my question easier to understand are appreciated.
CodePudding user response:
Using this concept: MySQL Orderby a number, Nulls last
Without test data: Simple query for location
SELECT zip_Code, City, State
FROM (SELECT zip_Code, city, state, row_number() over (partition by zip_Code order by isNull(city), city ASC, isnull(state), state ASC) RN
FROM LOCATION) subq
WHERE RN=1
With Test Data: TESTED:
SELECT zip_Code, City, State
FROM (SELECT zip_Code, city, state, row_number() over (partition by zip_Code order by isNull(city), city ASC, isnull(state), state ASC) RN
FROM (SELECT '02130-1911' zip_Code, 'Jamaica Plain' city, 'MA' state UNION ALL
SELECT '02130-1911','Jamaica Plains', 'MA' UNION ALL
SELECT '02130-1911', 'aAKLAND' , 'MA' UNION ALL
SELECT '01108-2005', NULL , null UNION ALL
SELECT '02130-1911', 'Jamaica Plain', NULL UNION ALL
SELECT '02138-1557', 'Cambridge' , 'MA' UNION ALL
SELECT '02138-1557', NULL , 'MA' UNION ALL
SELECT '02138-1557', NULL , NULL UNION ALL
SELECT '02370-2509', NULL , 'MA' UNION ALL
SELECT '02370-2509', NULL , 'MA' ) LOCATION) subQ
WHERE RN=1
Example: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=a03800d0e61713ee61b4c550d395c096
Giving us below and handling edge case I believe This is giving us the city/state alphabetically first when multiple exist. We could check to see if a city/state is populated assign a 1 add both together. Use result to assign a rank/row and return all such occurrences if multiple are "as complete". Like I said Edge cases and business rules... :P
------------ ----------- -------
| zip_Code | City | State |
------------ ----------- -------
| 01108-2005 | | |
| 02130-1911 | aAKLAND | MA | <-- My edge case
| 02138-1557 | Cambridge | MA |
| 02370-2509 | | MA |
------------ ----------- -------
CodePudding user response:
Check for not null in state
SELECT DISTINCT zip_code, city, state
FROM location
WHERE state is not null
ORDER BY zip_code, city DESC, state DESC