I have my current sql query which I will later convert to a HIVE query.
SELECT home_team, COUNT(home_team)
from results
WHERE results.home_score < results.away_score
GROUP BY home_team
UNION
SELECT away_team, COUNT(away_team)
FROM results
WHERE results.away_score < results.home_score
GROUP BY away_team
And it currently returns two occurrences of a country, once as a home_team and once as an away_team. Current results
How can I modify this query so it adds the count(home_team) and makes the country only appear once? Ex. Argentina : 50
I've tried to put both select queries in brackets and then count the sum being returned but I seem to always get an error when I do order by.
CodePudding user response:
This should resolve your issue
-- Create a temporary table
WITH counted AS (
SELECT home_team, COUNT(home_team) AS row_nb FROM results WHERE results.home_score < results.away_score GROUP BY home_team
UNION
SELECT away_team, COUNT(away_team) FROM results WHERE results.away_score < results.home_score GROUP BY away_team
)
-- And then sum the row_nb
SELECT home_team, SUM(row_nb) FROM counted;
CodePudding user response:
After unioning the results, aggregate by team and add up the lost games with SUM
.
And you must use UNION ALL
instead of UNION [DISTINCT]
. Otherwise, for a team that lost five games at home and five games away, five games will only counted once instead of twice. (I.e. the team will be shown with five lost games instead of with the ten games they really lost.)
Remember: UNION
removes duplicates and is very seldom needed. UNION ALL
is used more often.
SELECT team, SUM(lost) AS lost_games
FROM
(
SELECT home_team AS team, COUNT(*) AS lost
FROM results
WHERE home_score < away_score
GROUP BY home_team
UNION ALL
SELECT away_team AS team, COUNT(*) AS lost
FROM results
WHERE away_score < home_score
GROUP BY away_team
)
GROUP BY team
ORDER BY team;