I want to calculate the average ratings for each location using the most recent (highest timestamp) records from each user. (if a user’s most recent check-in didn’t include a rating, we would use the most recent rating they provided).
Below is the example of the table I am working with and has many more rows not shown
This is what I have so far..
select t.loc_id, t.loc_name, tm.avgRating
from Table_T as t
inner join (
select loc_id, loc_name, avg(cast(rating as decimal)) as avgRating, max(Timestamp) as LatestDate
from Table_T
group by loc_id, loc_name) as tm
on t.loc_id = tm.loc_id and t.Timestamp = tm.LatestDate
I feel like I need to use the CASE function to reference the older timestamp if there isn't a rating, but I am having a hard time figuring out where to put it.
Any help will be much appreciated!
CodePudding user response:
Your approach doesn't seem to factor in the last rating for the user but only finding the overall average. You may try the following where we use ROW_NUMBER
ordered by the latest timestamp to determine the most recent non-null rating for that user at the location before finding the average for each location.
SELECT
t.loc_id,
t.loc_name,
AVG(rating) as avgRating
FROM (
SELECT
t.loc_id,
t.loc_name,
CAST(rating as DECIMAL(12,2)) rating,
CASE WHEN rating IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY loc_id,user_id ORDER BY t1.Timestamp DESC) END as rn
FROM
Table_T t1
) t
WHERE rn=1
GROUP BY
t.loc_id,
t.loc_name
or
SELECT
t.loc_id,
t.loc_name,
AVG(rating) as avgRating
FROM (
SELECT
t.loc_id,
t.loc_name,
CAST(rating as DECIMAL(12,2)) rating,
ROW_NUMBER() OVER (PARTITION BY loc_id,user_id ORDER BY t1.Timestamp DESC) as rn
FROM
Table_T t1
WHERE rating is NOT NULL
) t
WHERE rn=1
GROUP BY
t.loc_id,
t.loc_name
Let me know if this works for you
CodePudding user response:
You can use a row-numbering solution for this
I've assumed you actually want the most recent rating
per location and user, which the other answer does not give
SELECT
loc_id,
loc_name,
AVG(CAST(rating AS decimal(18,2))) AS avgRating,
MAX([Timestamp]) AS LatestDate
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY loc_id, loc_name, user_id
ORDER BY [Timestamp] DESC) AS rn
FROM Table_T AS t
WHERE rating IS NOT NULL
) AS t
WHERE rn = 1
GROUP BY
loc_id,
loc_name;