Home > Net >  operating on multiple postgreSQL CTEs
operating on multiple postgreSQL CTEs

Time:01-03

I'm very new to SQL, thank you for bearing with me.

I'm using PostgreSQL, trying to produce a simple summary statistic on the percentage of home team or away team victories in a dataset of soccer penalty shootouts. I've tried doing this with a few common table expressions, which I then try to perform some operations on (divide one from another) like so:

WITH home_team_wins (match) AS (
        SELECT match
        FROM penalties
        WHERE neutral_stadium = 0
            AND attacker_home = 1
            AND match_winner = 1
        GROUP BY match
    ),
    away_team_wins (match) AS (
        SELECT match
        FROM penalties
        WHERE neutral_stadium = 0
            AND attacker_home = 0
            AND match_winner = 1
    ),
    num_home_wins (match) AS (
        SELECT COUNT(DISTINCT match)
        FROM home_team_wins
    ),
    num_away_wins (match) AS (
        SELECT COUNT(DISTINCT match)
        FROM away_team_wins
    ),
    total_matches (match_count) AS (
        SELECT COUNT(DISTINCT match)
        FROM penalties
        WHERE neutral_stadium = 0
    )
SELECT num_home_wins.match / total_matches.match_count, num_away_wins.match / total_matches.match_count
FROM num_home_wins, num_away_wins, total_matches;
 

I'm confused by the results, and don't fully understand what's happening beneath the hood. I expect to see a couple decimal values (in this case, 0.485 and 0.515), but instead both results come out to zero:

weird result

If I don't operate on the CTE variables and simply select them, I see the expected numbers:

SELECT num_home_wins.match, num_away_wins.match, total_matches.match_count
FROM num_home_wins, num_away_wins, total_matches;

expected match counts

Obviously, dividing these values shouldn't equal zero. Why does operating on them produce this weird result?

I won't be surprised if there's a more efficient way to do this, which will also be helpful to know, but he main question above still stands. Thank you!

CodePudding user response:

Try something like this to make your query more compact :

SELECT COUNT(DISTINCT match) FILTER (WHERE attacker_home = 1 AND match_winner = 1) :: numeric / COUNT(DISTINCT match)
     , COUNT(DISTINCT match) FILTER (WHERE attacker_home = 0 AND match_winner = 1) :: numeric / COUNT(DISTINCT match)
  FROM penalties
 WHERE neutral_stadium = 0

and as stated by @klin, you must cast to numeric to avoid zero as the result of an integer division.

  • Related