How to sum value by DATE_FORMAT(date,'%Y-%m-%d')
and id the first 20 rows of data(order by data desc), and sum the remaining value if greater than 20 else 0? supposed I have below data with below SQL, how to do it? thanks so much for any advice.
SELECT SUM(value), id, DATE_FORMAT(date,'%Y-%m-%d')
FROM test_table
GROUP BY id, DATE_FORMAT(date,'%Y-%m-%d')
sum(value) id DATE_FORMAT(date,'%Y-%m-%d')
--------------------------------------------
64.98 123456 2021-01-01
64.98 123456 2021-01-02
64.98 123456 2021-01-03
64.98 123456 2021-01-04
64.98 123456 2021-01-05
64.98 123456 2021-01-06
72.34 123456 2021-01-07
64.98 123456 2021-01-08
64.98 123456 2021-01-09
103.80 123456 2021-01-10
64.98 123456 2021-01-11
64.98 123456 2021-01-12
64.98 123456 2021-01-13
64.98 123456 2021-01-14
64.98 123456 2021-01-15
64.98 123456 2021-01-16
64.98 123456 2021-01-17
64.98 123456 2021-01-18
64.98 123456 2021-01-19
64.98 123456 2021-01-20
64.98 123456 2021-01-21
64.98 123456 2021-01-22
64.98 123456 2021-01-23
64.98 123456 2021-01-24
144.98 123456 2021-01-25
64.98 123456 2021-01-26
64.98 123456 2021-01-27
64.98 123456 2021-01-28
64.98 123456 2021-01-29
64.98 123456 2021-01-30
64.98 123456 2021-01-31
64.98 123456 2021-02-01
64.98 123456 2021-02-02
64.98 123456 2021-02-03
64.98 123456 2021-02-04
720.92 123456 2021-02-05
66.98 123456 2021-02-06
66.98 123456 2021-02-07
66.98 123456 2021-02-08
64.98 123456 2021-02-09
64.98 123456 2021-02-10
64.98 223456 2021-01-20
54.98 223456 2021-01-21
...................................
I need the output like below:
id day_1 day_2 ..... day_20 day_other
--------------------------------------------
123456 64.98 64.98 ..... 234.67 2342.12
223456 64.98 64.98 ..... 234.67 2342.12
CodePudding user response:
Finally, I solved this problem by below SQL, thanks so much for FanoFN's answer
SELECT id,
SUM(CASE WHEN rn = 1 THEN vals ELSE 0 END) AS 'day_1',
SUM(CASE WHEN rn = 2 THEN vals ELSE 0 END) AS 'day_2',
SUM(CASE WHEN rn = 3 THEN vals ELSE 0 END) AS 'day_3',
SUM(CASE WHEN rn = 4 THEN vals ELSE 0 END) AS 'day_4',
SUM(CASE WHEN rn = 5 THEN vals ELSE 0 END) AS 'day_5',
SUM(CASE WHEN rn = 6 THEN vals ELSE 0 END) AS 'day_6',
SUM(CASE WHEN rn = 7 THEN vals ELSE 0 END) AS 'day_7',
SUM(CASE WHEN rn = 8 THEN vals ELSE 0 END) AS 'day_8',
SUM(CASE WHEN rn = 9 THEN vals ELSE 0 END) AS 'day_9',
SUM(CASE WHEN rn = 10 THEN vals ELSE 0 END) AS 'day_10',
SUM(CASE WHEN rn = 11 THEN vals ELSE 0 END) AS 'day_11',
SUM(CASE WHEN rn = 12 THEN vals ELSE 0 END) AS 'day_12',
SUM(CASE WHEN rn = 13 THEN vals ELSE 0 END) AS 'day_13',
SUM(CASE WHEN rn = 14 THEN vals ELSE 0 END) AS 'day_14',
SUM(CASE WHEN rn = 15 THEN vals ELSE 0 END) AS 'day_15',
SUM(CASE WHEN rn = 16 THEN vals ELSE 0 END) AS 'day_16',
SUM(CASE WHEN rn = 17 THEN vals ELSE 0 END) AS 'day_17',
SUM(CASE WHEN rn = 18 THEN vals ELSE 0 END) AS 'day_18',
SUM(CASE WHEN rn = 19 THEN vals ELSE 0 END) AS 'day_19',
SUM(CASE WHEN rn = 20 THEN vals ELSE 0 END) AS 'day_20',
SUM(CASE WHEN rn > 20 THEN vals ELSE 0 END) AS 'day_others'
FROM
(SELECT SUM(value) AS vals, id, DATE_FORMAT(date,'%Y-%m-%d') AS dt,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt DESC) AS rn
FROM test_table
GROUP BY id, dt) v
GROUP BY id;
CodePudding user response:
Note: Unfortunately, part of the component in this solution won't work on older MySQL version; as OP mentioned in her previous question, she is using MySQL 5.7.
I see, in that case you can upgrade the query a bit. Let's consider that maybe the condition will change somehow to first 30 rows of data, then you might want to use prepared statement so that you don't need to keep on changing the query.
I'll be using 3 variables for this example:
a.@columns
: for generating the columnsSUM(CASE ..
.
b.@sql
: for the final query.
c.@maxrows
: to define the "first of xx rows".All the variables will be set as:
SET @columns := NULL;
SET @sql := NULL;
SET @maxrows := XX; /*how many first rows*/
Setting @columns
and @sql
as NULL
is just a precaution, and @maxrows
is just the rows you required to pivot, in your current situation it should be SET @maxrows := 20;
- Setting the
@columns
variables.
WITH RECURSIVE cte AS (
SELECT 1 seq UNION ALL
SELECT seq 1 FROM cte WHERE seq 1 <= @maxrows)
SELECT GROUP_CONCAT(cols SEPARATOR ', \r\n')
INTO @columns
FROM (
SELECT CONCAT('SUM(CASE WHEN rn =',seq,' THEN vals ELSE 0 END) AS "day_',seq,'"')
AS cols
FROM cte UNION
SELECT CONCAT('SUM(CASE WHEN rn >',seq,' THEN vals ELSE 0 END) AS "day_others"')
FROM cte WHERE seq=@maxrows) c;
There are two components here, first is using common table expression to generate row sequences based on @maxrows
then the second one is concatenating the SUM(CASE ..
that you have in your SELECT
query. The last condition of WHEN rn > 20
is generated separately and combined using UNION
with the first 20 SUM(CASE ..
. Lastly, the GEOUP_CONCAT()
is necessary to make the generated syntax into one single row so that it can be assign to the @columns
variable. Note: setting variable will only be valid if it's a single row. Otherwise, you'll get Subquery returns more than 1 row
error.
Run this query to check the @columns
variable afterward: SELECT @columns;
- Setting the
@sql
variable:
SELECT CONCAT('SELECT id,',@columns,'
FROM
(SELECT SUM(value) AS vals, id, DATE_FORMAT(date,"%Y-%m-%d") AS dt,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
FROM test_table
GROUP BY id, date) v
GROUP BY id;') INTO @sql;
This is basically just concatenating the subquery that you already have with @columns
variable and assign it to @sql
variable. Check the variable: SELECT @sql;
- The last part is just preparing the statement based on
@sql
, execute then deallocate it:
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Demo fiddle with separated components and checkings..
P/S: You should be able to copy this query and directly run it however if there's an error (other than the possible table or column names), then it's probably caused by the group_concat_max_len
setting. If so, you can check the setting by running:
SHOW VARIABLES like 'group_concat_max_len';
If it shows 1024
(in bytes), that means it's the default setting and with the operation we're going with our GROUP_CONCAT()
here, it's not going to work out. Therefore, you might want to either permanently increase the value or you can add this syntax before your full query:
SET SESSION group_concat_max_len = 1000000;