Home > Back-end >  AWS Timestream query to get average measure for the first month of samples
AWS Timestream query to get average measure for the first month of samples

Time:04-10

In AWS Timestream I am trying to get the average heart rate for the first month since we have received heart rate samples for a specific user and the average for the last week. I'm having trouble with the query to get the first month part. When I try to use MIN(time) in the where clause I get the error: WHERE clause cannot contain aggregations, window functions or grouping operations.

SELECT * FROM "DATABASE"."TABLE" 
WHERE measure_name = 'heart_rate' AND time < min(time)   30

If I add it as a column and try to query on the column, I get the error: Column 'first_sample_time' does not exist

SELECT MIN(time) AS first_sample_time FROM "DATABASE"."TABLE" 
WHERE measure_name = 'heart_rate' AND time > first_sample_time

Also if I try to add to MIN(time) I get the error: line 1:18: ' ' cannot be applied to timestamp, integer

SELECT MIN(time)   30 AS first_sample_time FROM "DATABASE"."TABLE"

Here is what I finally came up with but I'm wondering if there is a better way to do it?

  WITH first_month AS (
  SELECT 
    Min(time) AS creation_date, 
    From_milliseconds(
      To_milliseconds(
        Min(time)
      )   2628000000
    ) AS end_of_first_month, 
    USER 
  FROM 
    "DATABASE"."TABLE" 
  WHERE 
    USER = 'xxx' 
    AND measure_name = 'heart_rate' 
  GROUP BY 
    USER
), 
first_month_avg AS (
  SELECT 
    Avg(hm.measure_value :: DOUBLE) AS first_month_average, 
    fm.USER 
  FROM 
    "DATABASE"."TABLE" hm 
    JOIN first_month fm ON hm.USER = fm.USER 
  WHERE 
    measure_name = 'heart_rate' 
    AND hm.time BETWEEN fm.creation_date 
    AND fm.end_of_first_month 
  GROUP BY 
    fm.USER
), 
last_week_avg AS (
  SELECT 
    Avg(measure_value :: DOUBLE) AS last_week_average, 
    USER 
  FROM 
    "DATABASE"."TABLE" 
  WHERE 
    measure_name = 'heart_rate' 
    AND time > ago(14d) 
    AND USER = 'xxx' 
  GROUP BY 
    USER
) 
SELECT 
  lwa.last_week_average, 
  fma.first_month_average, 
  lwa.USER 
FROM 
  first_month_avg fma 
  JOIN last_week_avg lwa ON fma.USER = lwa.USER

Is there a better or more efficient way to do this?

CodePudding user response:

I can see you've run into a few challenges along the way to your solution, and hopefully I can clear these up for you and also propose a cleaner way of reaching your solution.

Filtering on aggregates As you've experienced first hand, SQL doesn't allow aggregates in the where statement, and you also cannot filter on new columns you've created in the select statement, such as aggregates or case statements, as those columns/results are not present in the table you're querying.

Fortunately there are ways around this, such as:

Making your main query a subquery, and then filtering on the result of that query, like below

Select * from (select *,count(that_good_stuff) as total_good_stuff from tasty_table group by 1,2,3) where total_good_stuff > 69

This works because the aggregate column (count) is no longer an aggregate at the time it's called in the where statement, it's in the result of the subquery.

  1. Having clause

If a subquery isn't your cup of tea, you can use the having clause straight after your group by statement, which acts like a where statement except exclusively for handling aggregates.

This is better than resorting to a subquery in most cases, as it's more readable and I believe more efficient.

select *,count(that_good_stuff) as total_good_stuff from tasty_table group by 1,2,3 having total_good_stuff > 69

Finally, window statements are fantastic...they've really helped condense many queries I've made in the past by removing the need for subqueries/ctes. If you could share some example raw data (remove any pii of course) I'd be happy to share an example for your use case.

Nevertheless, hope this helps! Tom

  • Related