I am trying to create a MySQL 5.x store procedure that backfills data given the following constraints:
- For each specialId, a row must be returned for the last day of each month in the activity_date column.
- the minimum and maximum activity_date in the table determines the amount of months that should be returned per specialId.
- For each row that does not already have an activity_date in the data, we backfill data with count = 0, brand = happyInc, the specialId, and the activity_date. The rest of the row can be null
Here is the data as it is in the table:
CREATE TABLE `item` (
`id` int NOT NULL AUTO_INCREMENT,
`brand` varchar(45) DEFAULT NULL,
`count` int DEFAULT NULL,
`specialId` varchar(45) DEFAULT NULL,
`other_data` varchar(45) DEFAULT NULL,
`activity_date` datetime DEFAULT NULL,
PRIMARY KEY (`id`));
INSERT INTO `item3` VALUES (1,'happyInc',2,'1234567890','something','2020-04-30 00:00:00'),
(2,'happyInc',10,'1234567890','something','2020-05-31 00:00:00'),
(3,'happyInc',30,'1234567890','something','2020-06-30 00:00:00'),
(4,'happyInc',23,'1234567890','something','2020-07-31 00:00:00'),
(5,'happyInc',4,'5555555555','something','2020-05-31 00:00:00'),
(6,'happyInc',7,'5555555555','something','2020-07-31 00:00:00'),
(7,'happyInc',1,'5555555555','something','2020-10-31 00:00:00'),
(8,'happyInc',4,'5555555555','something','2020-11-30 00:00:00'),
(9,'happyInc',3,'9999999999','something','2019-12-31 00:00:00'),
(10,'happyInc',19,'9999999999','something','2021-03-31 00:00:00');
id brand count specialId other_data activity_date
1 happyInc 2 1234567890 something 2020-04-30
2 happyInc 10 1234567890 something 2020-05-31
3 happyInc 30 1234567890 something 2020-06-30
4 happyInc 23 1234567890 something 2020-07-31
5 happyInc 4 5555555555 something 2020-05-31
6 happyInc 7 5555555555 something 2020-07-31
7 happyInc 1 5555555555 something 2020-10-31
8 happyInc 4 5555555555 something 2020-11-30
9 happyInc 3 9999999999 something 2019-12-31
10 happyInc 19 9999999999 something 2021-03-31
Here is the output I am trying to get
id brand count specialId other_data activity_date
null happyInc 0 1234567890 null 2019-12-31
null happyInc 0 1234567890 null 2020-01-31
null happyInc 0 1234567890 null 2020-02-29
null happyInc 0 1234567890 null 2020-03-31
1 happyInc 2 1234567890 something 2020-04-30
2 happyInc 10 1234567890 something 2020-05-31
3 happyInc 30 1234567890 something 2020-06-30
4 happyInc 23 1234567890 something 2020-07-31
null happyInc 0 1234567890 null 2020-08-31
null happyInc 0 1234567890 null 2020-09-30
null happyInc 0 1234567890 null 2020-10-31
null happyInc 0 1234567890 null 2020-11-30
null happyInc 0 1234567890 null 2020-12-31
null happyInc 0 1234567890 null 2021-01-31
null happyInc 0 1234567890 null 2021-02-28
null happyInc 0 1234567890 null 2021-03-31
null happyInc 0 5555555555 null 2019-12-31
null happyInc 0 5555555555 null 2020-01-31
null happyInc 0 5555555555 null 2020-02-29
null happyInc 0 5555555555 null 2020-03-31
null happyInc 0 5555555555 null 2020-04-30
5 happyInc 4 5555555555 something 2020-05-31
null happyInc 0 5555555555 null 2020-06-30
6 happyInc 7 5555555555 something 2020-07-31
null happyInc 0 5555555555 null 2020-08-31
null happyInc 0 5555555555 null 2020-09-30
7 happyInc 1 5555555555 something 2020-10-31
8 happyInc 4 5555555555 something 2020-11-30
null happyInc 0 5555555555 null 2020-12-31
null happyInc 0 5555555555 null 2021-01-31
null happyInc 0 5555555555 null 2021-02-28
null happyInc 0 5555555555 null 2021-03-31
9 happInc 3 9999999999 something 2019-12-31
null happyInc 0 9999999999 null 2020-01-31
null happyInc 0 9999999999 null 2020-02-29
null happyInc 0 9999999999 null 2020-03-31
null happyInc 0 9999999999 null 2020-04-30
null happyInc 0 9999999999 null 2020-05-31
null happyInc 0 9999999999 null 2020-06-30
null happyInc 0 9999999999 null 2020-07-31
null happyInc 0 9999999999 null 2020-08-31
null happyInc 0 9999999999 null 2020-09-30
null happyInc 0 9999999999 null 2020-10-31
null happyInc 0 9999999999 null 2020-11-30
null happyInc 0 9999999999 null 2020-12-31
null happyInc 0 9999999999 null 2021-01-31
null happyInc 0 9999999999 null 2021-02-28
10 happyInc 19 9999999999 something 2021-03-31
But here is the output I am getting
id brand count specialId other_data activity_date
9 happyInc 3 9999999999 something 2019-12-31 00:00:00
null happyInc 0 null null 2020-01-31 00:00:00
null happyInc 0 null null 2020-02-29 00:00:00
null happyInc 0 null null 2020-03-31 00:00:00
1 happyInc 2 1234567890 something 2020-04-30 00:00:00
5 happyInc 4 5555555555 something 2020-05-31 00:00:00
2 happyInc 10 1234567890 something 2020-05-31 00:00:00
3 happyInc 30 1234567890 something 2020-06-30 00:00:00
6 happyInc 7 5555555555 something 2020-07-31 00:00:00
4 happyInc 23 1234567890 something 2020-07-31 00:00:00
null happyInc 0 null null 2020-08-31 00:00:00
null happyInc 0 null null 2020-09-30 00:00:00
7 happyInc 1 5555555555 something 2020-10-31 00:00:00
8 happyInc 4 5555555555 something 2020-11-30 00:00:00
null happyInc 0 null null 2020-12-31 00:00:00
null happyInc 0 null null 2021-01-31 00:00:00
null happyInc 0 null null 2021-02-28 00:00:00
10 happyInc 19 9999999999 something 2021-03-31 00:00:00
using this store procedure
CREATE DEFINER=`root`@`localhost` PROCEDURE `backfill_procedure`()
BEGIN
DROP TEMPORARY TABLE IF EXISTS date_temp_table;
CREATE TEMPORARY TABLE date_temp_table
select * from (
select date_add('1980-01-31 00:00:00.000', INTERVAL n3.num*100 n2.num*10 n1.num MONTH ) as date from
(select 0 as num union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6 union all select 7
union all select 8 union all select 9) n1,
(select 0 as num union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9) n2,
(select 0 as num union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9) n3
) calendar
where calendar.date >= (SELECT activity_date from item where activity_date is not null ORDER BY activity_date ASC LIMIT 1)
and calendar.date <= (SELECT activity_date from item where activity_date is not null ORDER BY activity_date DESC LIMIT 1)
order by date;
SELECT id, IFNULL(brand,'happyInc') as brand, IFNULL(count,0) as count, specialId, other_data, IFNULL(activity_date, date_temp_table.date) as activity_date from item RIGHT JOIN date_temp_table on item.activity_date = date_temp_table.date;
END
The solution escapes me. Any help would be greatly appreciated. Also, sorry for posting this again. The previous post did not have my work effort so I removed it and posted it showing my work. Thanks for understanding.
CodePudding user response:
SELECT item.id,
brand,
COALESCE(item.`count`,0) as `count`,
specialId,
item.other_data,
activity_date
FROM ( SELECT LAST_DAY(date1.`date` INTERVAL num1.num 5*num2.num 25*num3.num MONTH) activity_date,
date1.last_date
FROM (SELECT MIN(activity_date) `date`, LAST_DAY(MAX(activity_date)) last_date
FROM item) date1
CROSS JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4) num1
CROSS JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4) num2
CROSS JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4) num3
HAVING activity_date <= last_date
) date2
CROSS JOIN ( SELECT DISTINCT brand, specialId
FROM item
) spId
LEFT JOIN item USING (brand, specialId, activity_date)
ORDER BY 4,6
fiddle with step-by-step explanations.