I'm very new to SQL, thank you for bearing with me.
I'm using PostgreSQL, trying to produce a simple summary statistic on the percentage of home team or away team victories in a dataset of soccer penalty shootouts. I've tried doing this with a few common table expressions, which I then try to perform some operations on (divide one from another) like so:
WITH home_team_wins (match) AS (
SELECT match
FROM penalties
WHERE neutral_stadium = 0
AND attacker_home = 1
AND match_winner = 1
GROUP BY match
),
away_team_wins (match) AS (
SELECT match
FROM penalties
WHERE neutral_stadium = 0
AND attacker_home = 0
AND match_winner = 1
),
num_home_wins (match) AS (
SELECT COUNT(DISTINCT match)
FROM home_team_wins
),
num_away_wins (match) AS (
SELECT COUNT(DISTINCT match)
FROM away_team_wins
),
total_matches (match_count) AS (
SELECT COUNT(DISTINCT match)
FROM penalties
WHERE neutral_stadium = 0
)
SELECT num_home_wins.match / total_matches.match_count, num_away_wins.match / total_matches.match_count
FROM num_home_wins, num_away_wins, total_matches;
I'm confused by the results, and don't fully understand what's happening beneath the hood. I expect to see a couple decimal values (in this case, 0.485 and 0.515), but instead both results come out to zero:
If I don't operate on the CTE variables and simply select them, I see the expected numbers:
SELECT num_home_wins.match, num_away_wins.match, total_matches.match_count
FROM num_home_wins, num_away_wins, total_matches;
Obviously, dividing these values shouldn't equal zero. Why does operating on them produce this weird result?
I won't be surprised if there's a more efficient way to do this, which will also be helpful to know, but he main question above still stands. Thank you!
CodePudding user response:
Try something like this to make your query more compact :
SELECT COUNT(DISTINCT match) FILTER (WHERE attacker_home = 1 AND match_winner = 1) :: numeric / COUNT(DISTINCT match)
, COUNT(DISTINCT match) FILTER (WHERE attacker_home = 0 AND match_winner = 1) :: numeric / COUNT(DISTINCT match)
FROM penalties
WHERE neutral_stadium = 0
and as stated by @klin, you must cast to numeric to avoid zero as the result of an integer division.