I am using DataCamp. The course is 'Joining Data in SQL/Subquery Inside Where'. The aim is:
Select all fields from populations with records corresponding to larger than 1.15 times the average you calculated in the first task for 2015.
I've entered the following SQL code:
SELECT *
FROM populations
WHERE life_expectancy > 1.15 * (SELECT Avg(life_expectancy)
FROM populations
WHERE year = 2015)
The query however returns values corresponding to all years (2015,2010 etc.). When I type:
SELECT *
FROM populations
WHERE life_expectancy > 1.15 * (SELECT Avg(life_expectancy)
FROM populations)
AND year = 2015
It only gives me results for the year 2015. Which is what I am looking for. Why does the first code not return data only for 2015 but the second code does?.
CodePudding user response:
In your first query, the condition: year=2015 applies only to the subquery, i.e. the subquery says 'give me the average life expectancy in 2015'. The main query has no year restriction, it only says 'give me all countries and years where the life expectancy column is 15% above the value returned by the subquery.
The second query, on the other hand, has year=2015 applying to the whole query.
CodePudding user response:
There's a few things going on here, so let's look at them one at a time.
Suppose you have the following data (simplified for calculations):
country_code | year | life_expectancy |
---|---|---|
AUS | 2010 | 60 |
AUT | 2010 | 70 |
CAN | 2010 | 95 |
AUS | 2015 | 55 |
AUT | 2015 | 90 |
CAN | 2015 | 95 |
The average life expectancy in 2015 (your first subquery) is 80.
Adding 15 % to that, we get 1.15 x 80 = 92.
CAN has a life_expectancy of 95 in both 2010 and 2015, so if you don't specify which year you want in your main query, you will get both rows.
In your second query, your subquery returns the average over all years, which would be 77.5. Adding 15 % we get 1.15 x 77.5 = 89.125.
With this data, you'll suddenly get both CAN and AUT, where if you only want the average over 2015, you should only get CAN.
You probably want something like this:
SELECT *
FROM populations
WHERE year = 2015
AND life_expectancy > 1.15 * (SELECT AVG(life_Expectancy)
FROM populations
WHERE year = 2015)