I have been struggling with a problem for hours. I have found myself down multiple rabbit holes and into the realms of DeCorrelated SubQueries which are frankly beyond me...
I have two tables and I'm trying to pull from both without a common column to join against. I need to take the a value from table 1, find the closest value (that is lower) in table 2 and then pull related data from table 2.
table_1
id | score |
---|---|
1 | 99.983545 |
2 | 98.674359 |
3 | 97.832475 |
4 | 96.184545 |
5 | 93.658572 |
6 | 89.963544 |
7 | 87.427353 |
8 | 82.883345 |
table_2
average_level | percentile |
---|---|
99.743545 | 99 |
97.994359 | 98 |
97.212485 | 97 |
96.987545 | 96 |
95.998573 | 95 |
88.213584 | 94 |
87.837384 | 93 |
80.982147 | 92 |
From the two tables above I need to:
- Take the id and score
- identify the closest average_level to the score
- include the correlating average_level and percentile
The hoped for output would look like this...
id | score | average_level | percentile |
---|---|---|---|
1 | 99.983545 | 99.743545 | 99 |
2 | 98.674359 | 97.994359 | 98 |
3 | 97.832475 | 97.212485 | 97 |
4 | 96.184545 | 95.998573 | 95 |
5 | 93.658572 | 88.213584 | 94 |
6 | 89.963544 | 88.213584 | 94 |
7 | 87.427353 | 80.982147 | 92 |
8 | 82.883345 | 80.982147 | 92 |
Any help or advice would be very much appreciated
CodePudding user response:
You can do this by joining both tables with table_1.score >= table_2.average_level and then getting the max(average_level) and max(average_level) - which will be the closet yet inferior or equal values from table_2 - and grouping by the fields in table_1:
SELECT TABLE_1.ID, TABLE_1.SCORE,
MAX(TABLE_2.AVERAGE_LEVEL) AS AVERAGE_LEVEL,
MAX(TABLE_2.PERCENTILE) AS PERCENTILE
FROM TABLE_1 INNER JOIN TABLE_2
ON TABLE_1.SCORE >= TABLE_2.AVERAGE_LEVEL
GROUP BY TABLE_1.ID, TABLE_1.SCORE
ORDER BY TABLE_1.ID
I add the fiddle example here, it also includes @Ömer's answer
CodePudding user response:
if we say first table score and second one avarage you can try this.
select *
from Score s
inner join average a on a.Percentile = (select top(1) al.Percentile from average al order by Abs(average_level - s.score))
CodePudding user response:
Use lag
function on percentile table to attach the values from next higher row e.g.:
l_level | l_percentile | u_level | u_percentile |
---|---|---|---|
99.743545 | 99 | null | null |
97.994359 | 98 | 99.743545 | 99 |
97.212485 | 97 | 97.994359 | 98 |
96.987545 | 96 | 97.212485 | 97 |
Now join it with score table using the condition:
on score >= l_level
and (score < u_level or u_level is null)
This will join each score with exactly one range. In order to find whether the score value is closer to the lower bound or upper bound, just use case
:
case when u_level - score < score - l_level then u_level else l_level end,
case when u_level - score < score - l_level then u_percentile else l_percentile end,
Window functions should, in theory, give you better performance than self joins.