Home > other >  MySQL query optimization - add a new column with the max value of the group. The value should be rep
MySQL query optimization - add a new column with the max value of the group. The value should be rep

Time:11-10

I have a mysql table with the following columns- Date, country, YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license,item_description,amount, units, arp.

Total number of records = 7.1 million

Now i want to add a new column which shows the max value of arp for the group (country, YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license,item_description). This max value has to repeated for the all the members in the group.

I have tried the following 2 methods. Both works but both of them takes a long time to update the data more than 2 hrs.

I'm looking for any better solutions that can do this faster. Please help.

Method 1- update the table from subquery

UPDATE table a
INNER JOIN (SELECT Country,YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license, item_description, MAX(arp) AS max_arp FROM table GROUP BY 1,2,3,4,5,6,7,8,9) b  

on a.country = b.country AND a.year = b.year AND a.month=b.month AND a.supercategory=b.supercategory AND a.class=b.class AND a.corp_manufacturer=b.corp_manufacturer AND a.brand=b.brand AND a.license=b.license AND a.item_description = b.item_description

SET a.max_arp= b.max_arp

Method 2 - create temp tables with indexes and join them in to a new table

INSERT INTO  temp1
(SELECT Country,YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license, item_description, MAX(arp) AS max_arp 
    FROM table 
    GROUP BY 1,2,3,4,5,6,7,8,9);

INSERT IGNORE INTO  temp2
SELECT a.*,b.max_arp AS SRP FROM  table  a  JOIN  temp1 b
ON a.country = b.country AND a.year = b.year AND a.month=b.month AND a.supercategory=b.supercategory AND a.class=b.class AND a.corp_manufacturer=b.corp_manufacturer 
    AND a.brand=b.brand AND a.license=b.license AND a.item_description = b.item_description;

CodePudding user response:

We can compute the group count with a window function. That said, why create a column when you can just use a view:

create myview v as
select t.*,
    max(arp) over(partition by 
        country, 
        year, 
        month, 
        supercategory, 
        class, 
        corp_manufacturer, 
        brand, 
        license,
        item_description
    ) max_arp
from mytable t

With this technique, you get an always up-to-date perspective at your data, for 0 maintenance costs. Note that this requires MySQL 8.0.

If you insist on storing the information, then in MySQL I would recommend the update/join syntax. Assuming that your table has a primary key called id:

update mytable t
inner join (
    select id,
        max(arp) over(partition by 
            country, 
            year, 
            month, 
            supercategory, 
            class, 
            corp_manufacturer, 
            brand, 
            license,
            item_description
        ) max_arp
    from mytable t
) x on x.id = t.id
set t.max_arp = x.max_arp

CodePudding user response:

Create a table with the desired arp values:

CREATE TABLE the_groups (
    PRIMARY KEY (Country,YEAR, MONTH, supercategory,
                 class, corp_manufacturer,
                 brand, license, item_description)
    ) AS
  SELECT  Country,YEAR, MONTH, supercategory,
          class, corp_manufacturer,
          brand, license, item_description,
          MAX(arp) AS max_arp
    FROM  table
    GROUP BY  1,2,3,4,5,6,7,8,9;

(That is much faster than UPDATE.)

If necessary, you can keep the_groups, but you would need to maintain it whenever tables is modified.

Meanwhile, you can have the effect of what you asked for via

CREATE VIEW table_plus_maxarp AS
    SELECT a.*, b.max_arp
        FROM `table` AS a
        JOIN `the_groups` AS b
            USING (Country,YEAR, MONTH, supercategory,
                   class, corp_manufacturer,
                   brand, license, item_description)
           ;

CodePudding user response:

I think it will better if your table has the following two indexes :

  1. Index for field arp.
  2. Index for fields Country, YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license, item_description.

After that you can use what ever you want.

For using temporary table it will be faster if temp1 has the same Index Country, YEAR, MONTH, .... as in your table. The second part of your query will be:

    UPDATE table a, temp1 b ON 
    a.country = b.country AND a.year = b.year AND a.month=b.month AND  
    a.supercategory=b.supercategory AND a.class=b.class AND  
    a.corp_manufacturer=b.corp_manufacturer AND a.brand=b.brand AND  
    a.license=b.license AND a.item_description = b.item_description
        
    SET a.max_arp= b.max_arp;
  • Related