I need to join 2 tables.
In the first table (geoname) there is a list of city codes, country codes and area codes. In the second table (alternatename) there is a list of available translations for each city code.
The sqlfiddle example is here: http://sqlfiddle.com/#!9/7df393/1
What I need is: give me an ordered list of cities by name, taking the italian language version if available or the english if it is not available or the NULL language if previous are not available. For each language, take the preferred version if available.
I tried with many queries, but I don't get the expected result. This is the last one:
SELECT g.geonameid, name
FROM geoname g
LEFT JOIN alternatename a ON g.geonameid = a.geonameid AND
(a.lang = 'it' OR a.lang = 'en' OR a.lang = NULL)
WHERE name IS NOT NULL and fcode="PPL"
GROUP BY g.geonameid
ORDER BY isPreferred DESC, name ASC
LIMIT 0,20
The result is
geonameid name
78910 Milan
12345 New York City
34542 Tampere
But I expect
geonameid name
78910 Milano
12345 New York
34542 Tampere
71132 Sanaa
Any help?
Table schema:
CREATE TABLE geoname (
geonameid INT(11) PRIMARY KEY,
country CHAR(2),
fcode VARCHAR(5));
CREATE TABLE alternatename (
id_name INT(11) PRIMARY KEY AUTO_INCREMENT,
geonameid INT(11),
lang CHAR(2),
name VARCHAR(250),
isPreferred TINYINT(1) NULL);
INSERT INTO geoname (geonameid,country,fcode) VALUES
("12345","US","PPL"),
("78910","IT","PPL"),
("34542","FI","PPL"),
("5","IR","PPL"),
("71132","YE","ADM1");
INSERT INTO alternatename (geonameid,lang,name,isPreferred) VALUES
("12345","en","New York City",NULL),
("12345","en","Big Apple",NULL),
("12345","en","New York",1),
("12345","it","La Grande Mela",NULL),
("12345","it","New York",1),
("12345","ru","New York",1),
("78910","en","Milan",1),
("78910","it","Milano",1),
("34542","en","Tampere",NULL),
("5","fa","Yekāhī",NULL),
("71132","ar","صنعاء",NULL),
("71132","fr","Muhafadat Sanaa",NULL),
("71132",NULL,"Sanaa",NULL);
CodePudding user response:
In MySQL 8 you can use a lateral join that gets the top name record for each location according to your sort order.
SELECT g.geonameid,
a.name
FROM geoname g
JOIN LATERAL (SELECT a.name
FROM alternatename a
WHERE g.geonameid = a.geonameid
AND a.lang IN ('it',
'en')
OR a.lang IS NULL
ORDER BY a.lang DESC,
coalesce(a.ispreferred, 0) DESC
LIMIT 1) a
WHERE g.fcode = 'PPL'
LIMIT 0,
20;
Side note: Don't get used to use double quotes for string or date literals. Yes, sadly MySQL accepts that, but in SQL double quotes are usually for identifiers, such as column names. Should you ever use another DBMS (or future MySQL versions become more sane about this) you'll likely get an "invalid object name" error. Always use single quotes for string or date literals.
Also note that = NULL
is never true as NULL
isn't equal to anything, even NULL
. Use IS NULL
to check if a value is `NULL.
CodePudding user response:
One solution using conditional aggregation :
SELECT
g.geonameid,
COALESCE(
MAX(CASE WHEN a.lang = 'it' THEN name END),
MAX(CASE WHEN a.lang = 'en' THEN name END),
MAX(CASE WHEN a.lang IS NULL THEN name END)
) as name
FROM
geoname g
LEFT JOIN alternatename a ON g.geonameid = a.geonameid AND (a.lang = 'it' OR a.lang = 'en' OR a.lang IS NULL)
WHERE
a.name IS NOT NULL
AND g.fcode = "PPL"
GROUP BY
g.geonameid
ORDER BY
a.isPreferred DESC,
a.name ASC
LIMIT 0,20
Output :
geonameid | name |
---|---|
78910 | Milano |
12345 | New York |
71132 | Sanaa |
34542 | Tampere |
Another solution would be to use subqueries, or to join multiple times on the alternatename
table (for each language you want to consider in the order by), then still use coalesce
to get first non null retrieved values.
CodePudding user response:
An option for MySQL 8 is using ROW_NUMBER()
, in your sample scripts, there is no index on alternatename.geonameid
, better add one.
WITH selection AS (
SELECT
g.geonameid,
a.name,
ROW_NUMBER() OVER(ORDER BY a.lang DESC, COALESCE(a.isPreferred, 0) DESC) name_order
FROM geoname g
LEFT JOIN alternatename a ON g.geonameid = a.geonameid AND ( a.lang IN ( 'it', 'en' ) OR a.lang IS NULL )
WHERE g.fcode = 'PPL' AND g.geonameid = 12345
LIMIT 0, 20
)
SELECT geonameid, name
FROM selection
WHERE name_order = 1
Edit
WITH selection AS (
SELECT
g.geonameid,
a.name,
ROW_NUMBER() OVER(PARTITION BY g.geonameid ORDER BY a.lang DESC, COALESCE(a.isPreferred, 0) DESC) name_order
FROM geoname g
LEFT JOIN alternatename a ON g.geonameid = a.geonameid AND ( a.lang IN ( 'it', 'en' ) OR a.lang IS NULL )
WHERE g.fcode = 'PPL'
)
SELECT geonameid, name
FROM selection
WHERE name_order = 1
LIMIT 0, 20
;