Home > Enterprise >  Calculating average from most recent records for all locations
Calculating average from most recent records for all locations

Time:09-17

I want to calculate the average ratings for each location using the most recent (highest timestamp) records from each user. (if a user’s most recent check-in didn’t include a rating, we would use the most recent rating they provided).

Below is the example of the table I am working with and has many more rows not shown

Image

This is what I have so far..

select t.loc_id, t.loc_name, tm.avgRating
from Table_T as t
inner join (
    select  loc_id, loc_name, avg(cast(rating as decimal)) as avgRating, max(Timestamp) as LatestDate
    from Table_T
    group by  loc_id, loc_name) as tm
on t.loc_id = tm.loc_id and t.Timestamp = tm.LatestDate

I feel like I need to use the CASE function to reference the older timestamp if there isn't a rating, but I am having a hard time figuring out where to put it.

Any help will be much appreciated!

CodePudding user response:

Your approach doesn't seem to factor in the last rating for the user but only finding the overall average. You may try the following where we use ROW_NUMBER ordered by the latest timestamp to determine the most recent non-null rating for that user at the location before finding the average for each location.

SELECT
     t.loc_id,
     t.loc_name,
     AVG(rating) as avgRating
FROM (
    SELECT
        t.loc_id,
        t.loc_name,
        CAST(rating as DECIMAL(12,2)) rating,
        CASE WHEN rating IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY loc_id,user_id ORDER BY t1.Timestamp DESC) END as rn
    FROM
        Table_T t1
) t
WHERE rn=1
GROUP BY
     t.loc_id,
     t.loc_name

or

SELECT
     t.loc_id,
     t.loc_name,
     AVG(rating) as avgRating
FROM (
    SELECT
        t.loc_id,
        t.loc_name,
        CAST(rating as DECIMAL(12,2)) rating,
        ROW_NUMBER() OVER (PARTITION BY loc_id,user_id ORDER BY t1.Timestamp DESC) as rn
    FROM
        Table_T t1
    WHERE rating is NOT NULL
) t
WHERE rn=1
GROUP BY
     t.loc_id,
     t.loc_name

Let me know if this works for you

CodePudding user response:

You can use a row-numbering solution for this

I've assumed you actually want the most recent rating per location and user, which the other answer does not give

SELECT
  loc_id,
  loc_name,
  AVG(CAST(rating AS decimal(18,2))) AS avgRating,
  MAX([Timestamp]) AS LatestDate
FROM (
    SELECT *,
      ROW_NUMBER() OVER (PARTITION BY loc_id, loc_name, user_id
          ORDER BY [Timestamp] DESC) AS rn
    FROM Table_T AS t
    WHERE rating IS NOT NULL
) AS t
WHERE rn = 1
GROUP BY
  loc_id,
  loc_name;
  • Related