I have the following structure of data:
average | date | hash
20 | 2017-08-29 | 9xxx
15 | 2017-08-29 | 9xxx
26 | 2017-08-29 | 1as0
24 | 2017-09-01 | 1as0
21 | 2017-09-01 | 1as0
22 | 2017-12-20 | 9xxx
21 | 2017-12-20 | 1as0
24 | 2017-12-20 | 1as0
... | ... | ...
Columns date & hash have duplicates.
I'm trying to build a SQL query that removes the duplicates (rows with the same hash & date). However, the average values of the duplicated rows must be combined into a single average per date and per hash.
In other words, I'd like to receive:
- Average for hash based on all duplicated values (from hash column)
- Per date
So it should look like:
average | date | hash
17.5 | 2017-08-29 | 9xxx
26 | 2017-08-29 | 1as0
22.5 | 2017-09-01 | 1as0
22 | 2017-12-20 | 9xxx
22.5 | 2017-12-20 | 1as0
... | ... | ...
It looks rather simple, but I'm new to SQL, so it may not be obvious to me. I spent many hours googling and didn't find a solution that I was able to implement successfully.
So far, I have tried some queries with DISTINCT, GROUP BY, and an inner query. Each time I miss something (for instance, the average of the subgroups per date).
My last idea was to:
- create a separate column with the average for each hash, and then...
- ... group the values per date
But with this approach, the average is computed over all hashes for a date, not per hash subgroup (duplicates). I can't manage to collect all the values for each date (year/month/day).
SELECT *, AVG(avg_tmpr_c) OVER (PARTITION BY wthr_date) as av_for_hash
FROM testing
Would anyone be so kind as to help me?
CodePudding user response:
select "date", hash, avg(average) as avg
from tbl
group by "date", hash
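To check the query above against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database via Python's sqlite3 module. The choice of SQLite, the REAL column type, the alias avg_value, and the added ORDER BY are assumptions made for the demo; the question does not say which database is in use.

```python
import sqlite3

# Sample rows copied from the question: (average, date, hash).
ROWS = [
    (20, "2017-08-29", "9xxx"),
    (15, "2017-08-29", "9xxx"),
    (26, "2017-08-29", "1as0"),
    (24, "2017-09-01", "1as0"),
    (21, "2017-09-01", "1as0"),
    (22, "2017-12-20", "9xxx"),
    (21, "2017-12-20", "1as0"),
    (24, "2017-12-20", "1as0"),
]


def average_per_date_and_hash(rows):
    """Return one (date, hash, avg_value) row per distinct (date, hash) pair."""
    conn = sqlite3.connect(":memory:")
    # Table and column names follow the answer; REAL is an assumed type.
    conn.execute('CREATE TABLE tbl (average REAL, "date" TEXT, hash TEXT)')
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = conn.execute(
        'SELECT "date", hash, AVG(average) AS avg_value '
        'FROM tbl GROUP BY "date", hash '
        'ORDER BY "date", hash'  # ordering added only to make output stable
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in average_per_date_and_hash(ROWS):
        print(row)
```

Grouping by both columns at once is what collapses the duplicates: each distinct (date, hash) pair becomes one output row, and AVG runs only over the rows in that pair.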
CodePudding user response:
First we collect the sum and the row count of average per hash and date. This intermediate result could be saved in a temporary table. Then we add up those sums and counts and compute the final average.
select date, hash , sum(average ) sum_a , count(*) count_a from tbl group by date, hash
--- per date
select date , sum(sum_a) * 1.0 / sum(count_a) as date_avg from  -- * 1.0 avoids integer division in some databases
( select date, hash , sum(average ) sum_a , count(*) count_a
from tbl
group by date, hash ) a
group by date
--- per hash
select hash , sum(sum_a) * 1.0 / sum(count_a) as hash_avg from  -- * 1.0 avoids integer division in some databases
( select date, hash , sum(average ) sum_a , count(*) count_a
from tbl
group by date, hash ) a
group by hash
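One reason to carry sum_a and count_a through the inner query, rather than averaging the per-group averages, is that the groups can have different sizes. The sketch below runs the "per date" query on the question's sample rows and puts a naive average of averages next to it for comparison. SQLite (through Python's sqlite3), the REAL column type, and the names weighted_avg and avg_of_avgs are assumptions for illustration only.

```python
import sqlite3

# Sample rows copied from the question: (average, date, hash).
ROWS = [
    (20, "2017-08-29", "9xxx"),
    (15, "2017-08-29", "9xxx"),
    (26, "2017-08-29", "1as0"),
    (24, "2017-09-01", "1as0"),
    (21, "2017-09-01", "1as0"),
    (22, "2017-12-20", "9xxx"),
    (21, "2017-12-20", "1as0"),
    (24, "2017-12-20", "1as0"),
]

PER_DATE_QUERY = """
SELECT "date",
       SUM(sum_a) / SUM(count_a) AS weighted_avg,  -- every row counts once
       AVG(sum_a / count_a)      AS avg_of_avgs    -- every hash counts once
FROM (SELECT "date", hash, SUM(average) AS sum_a, COUNT(*) AS count_a
      FROM tbl
      GROUP BY "date", hash) a
GROUP BY "date"
ORDER BY "date"
"""


def per_date_averages(rows):
    """Map each date to (weighted_avg, avg_of_avgs)."""
    conn = sqlite3.connect(":memory:")
    # REAL is an assumed type; it keeps the divisions floating point.
    conn.execute('CREATE TABLE tbl (average REAL, "date" TEXT, hash TEXT)')
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = {d: (w, n) for d, w, n in conn.execute(PER_DATE_QUERY)}
    conn.close()
    return result


if __name__ == "__main__":
    for date, (weighted, naive) in per_date_averages(ROWS).items():
        print(date, round(weighted, 3), round(naive, 3))
```

For 2017-08-29 the two numbers differ (61 / 3 versus the mean of 17.5 and 26) because 9xxx contributes two rows and 1as0 only one. On 2017-09-01, where only one hash is present, they coincide.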