Home > Back-end >  Why does my subquery not work as expected?
Why does my subquery not work as expected?

Time:04-19

I am using DataCamp. The course is 'Joining Data in SQL/Subquery Inside Where'. The aim is:

Select all fields from populations with records corresponding to larger than 1.15 times the average you calculated in the first task for 2015.

I've entered the following SQL code:

SELECT *
FROM   populations
WHERE  life_expectancy > 1.15 * (SELECT Avg(life_expectancy)
                                 FROM   populations
                                 WHERE  year = 2015) 

enter image description here

The query however returns values corresponding to all years (2015,2010 etc.). When I type:

SELECT *
FROM   populations
WHERE  life_expectancy > 1.15 * (SELECT Avg(life_expectancy)
                                 FROM   populations)
       AND year = 2015 

enter image description here

It only gives me results for the year 2015. Which is what I am looking for. Why does the first code not return data only for 2015 but the second code does?.

CodePudding user response:

In your first query, the condition: year=2015 applies only to the subquery, i.e. the subquery says 'give me the average life expectancy in 2015'. The main query has no year restriction, it only says 'give me all countries and years where the life expectancy column is 15% above the value returned by the subquery.

The second query, on the other hand, has year=2015 applying to the whole query.

CodePudding user response:

There's a few things going on here, so let's look at them one at a time.

Suppose you have the following data (simplified for calculations):

country_code year life_expectancy
AUS 2010 60
AUT 2010 70
CAN 2010 95
AUS 2015 55
AUT 2015 90
CAN 2015 95

The average life expectancy in 2015 (your first subquery) is 80.

Adding 15 % to that, we get 1.15 x 80 = 92.

CAN has a life_expectancy of 95 in both 2010 and 2015, so if you don't specify which year you want in your main query, you will get both rows.

In your second query, your subquery returns the average over all years, which would be 77.5. Adding 15 % we get 1.15 x 77.5 = 89.125.

With this data, you'll suddenly get both CAN and AUT, where if you only want the average over 2015, you should only get CAN.

You probably want something like this:

SELECT *
FROM populations
WHERE year = 2015
  AND life_expectancy > 1.15 * (SELECT AVG(life_Expectancy)
                                FROM populations
                                WHERE year = 2015)
  •  Tags:  
  • sql
  • Related