I'm building a Pokemon API, and I'm having trouble with a repeating row when joining via a junction table. It is a SQL Server database. A watered down version of my tables looks like this:
To recreate my database:
CREATE DATABASE PokeApi;
USE PokeApi;
CREATE TABLE Pokemon (
PokemonId INT PRIMARY KEY IDENTITY,
PokemonName VARCHAR(55),
NationalDexNumber INT,
PokemonHeight INT,
PokemonWeight DECIMAL(10,2),
Category VARCHAR(50),
Gender VARCHAR(15),
GenderDifference BIT,
TypeOne VARCHAR(15),
TypeTwo VARCHAR(15),
);
CREATE TABLE Abilities (
AbilityId INT PRIMARY KEY IDENTITY,
AbilityName VARCHAR(20)
);
CREATE TABLE PokemonAbilitiesMapped (
PokemonId INT NOT NULL,
AbilityId INT,
CONSTRAINT PK_PokemonAbilitiesMapped PRIMARY KEY
(
PokemonId,
AbilityId
),
FOREIGN KEY(PokemonId) REFERENCES Pokemon(PokemonId),
FOREIGN KEY(AbilityId) REFERENCES Abilities(AbilityId)
);
INSERT INTO Pokemon
VALUES
('Venusaur', 3, 79, 220.5, 'Seed', 'Male / Female', 1, 'Grass', 'Poison'),
('Mega Venusaur', 3, 94, 342.8, 'Seed', 'Male / Female', 0, 'Grass', 'Poison'),
('Gigantamax Venusaur', 3, 945, NULL, 'Seed', 'Male / Female', 1, 'Grass', 'Poison');
INSERT INTO Abilities
VALUES
('Overgrow'),
('Thick Fat'),
('Test');
INSERT INTO PokemonAbilitiesMapped
VALUES
(1, 1),
(2, 2),
(3, 1),
(3, 3);
My query:
SELECT p.PokemonName, p.NationalDexNumber, p.PokemonHeight, p.PokemonWeight,
p.Category, p.Gender, p.GenderDifference, p.TypeOne, p.TypeTwo,
a.AbilityName
FROM Pokemon p
JOIN PokemonAbilitiesMapped pam
ON p.PokemonId = pam.PokemonId
JOIN Abilities a
ON pam.AbilityId = a.AbilityId
Outputs this:
PokemonName | NationalDexNumber | PokemonHeight | PokemonWeight | Category | Gender | GenderDifference | TypeOne | TypeTwo | AbilityName |
---|---|---|---|---|---|---|---|---|---|
Venusaur | 3 | 79 | 220.50 | Seed | Male / Female | True | Grass | Poison | Overgrow |
Mega Venusaur | 3 | 94 | 342.80 | Seed | Male / Female | False | Grass | Poison | Thick Fat |
Gigantamax Venusaur | 3 | 945 | null | Seed | Male / Female | True | Grass | Poison | Overgrow |
Gigantamax Venusaur | 3 | 945 | null | Seed | Male / Female | True | Grass | Poison | Test |
As you can see, Gigantamax Venusaur is listed twice because it has two abilities. I was thinking that I could add another column called "AbilityNameTwo" only if the pokemon queried had a second ability to avoid the row repeating.
What is the best way to avoid repeating data in this scenario? If I should add another column when joining, how would I write that as a query?
CodePudding user response:
It's perfectly normal to have results like that when joining data. If you really want to, you can have "ability two", but what if some pokemon has three, or five? And how do you order them? Here's an example of how you'd do it for two abilities and ordering alphabetically:
WITH abilitiesMapped AS
(
SELECT pam.PokemonId
, a.AbilityName
, ROW_NUMBER() OVER (PARTITION BY pam.PokemonId
ORDER BY a.AbilityName) AS abilityNo
FROM PokemonAbilitiesMapped pam
JOIN Abilities a
ON a.AbilityId = pam.AbilityId
)
SELECT p.PokemonName, p.NationalDexNumber, p.PokemonHeight, p.PokemonWeight,
p.Category, p.Gender, p.GenderDifference, p.TypeOne, p.TypeTwo
, ability1.abilityName AS ability1
, ability2.abilityName AS ability2
FROM Pokemon p
LEFT OUTER
JOIN abilitiesMapped ability1
ON ability1.PokemonId = p.PokemonId
AND ability1.abilityNo = 1
LEFT OUTER
JOIN abilitiesMapped ability2
ON ability2.PokemonId = p.PokemonId
AND ability1.abilityNo = 2
CTE abilitiesMapped takes your mappings and abilities and generates an abilityNo for each pokemon (PARTITION BY pam.PokemonId
) where the number is ordered by the ability name (ORDER BY ...
). Then this CTE is joined twice for abilityNo 1 and 2. LEFT JOIN
is important because a pokemon does not necessarily need to have two abilities, and you will get a NULL in ability2 in such case (with an INNER JOIN
you'd only get the ones that have two abilities because abilityNo = 2 needs to exist in order to satisfy the join condition).
CodePudding user response:
Thanks for your post with create, insert statements. It was very helpful.
To avoid duplication of the ability name, You can make the ability names as comma separated list. I am using STR_AGG
to generate comma separated list. It is available from SQL Server 2017 onwards. For earlier versions, you have to use FOR XML PATH
to generate comma separated list.
SELECT p.PokemonName, p.NationalDexNumber, p.PokemonHeight, p.PokemonWeight,
p.Category, p.Gender, p.GenderDifference, p.TypeOne, p.TypeTwo,
STRING_AGG(a.AbilityName,',') AS Abilities
FROM Pokemon p
JOIN PokemonAbilitiesMapped pam
ON p.PokemonId = pam.PokemonId
JOIN Abilities a
ON pam.AbilityId = a.AbilityId
GROUP BY p.PokemonName, p.NationalDexNumber, p.PokemonHeight, p.PokemonWeight,
p.Category, p.Gender, p.GenderDifference, p.TypeOne, p.TypeTwo
PokemonName | NationalDexNumber | PokemonHeight | PokemonWeight | Category | Gender | GenderDifference | TypeOne | TypeTwo | Abilities |
---|---|---|---|---|---|---|---|---|---|
Gigantamax Venusaur | 3 | 945 | NULL | Seed | Male / Female | 1 | Grass | Poison | Overgrow,Test |
Mega Venusaur | 3 | 94 | 342.80 | Seed | Male / Female | 0 | Grass | Poison | Thick Fat |
Venusaur | 3 | 79 | 220.50 | Seed | Male / Female | 1 | Grass | Poison | Overgrow |