SQL query needed for a complex structure

Time:12-20

I need to build a tricky SQL query that returns the highest-priority rule for a customer's session and geo IP data. I would appreciate your input and experience in arriving at an optimized version that meets the requirements.

I attached the following tables: rule, rule_attribute, rule_attribute_value.

rule - table where all rules are stored (screenshot of the 'rule' table)

rule_attribute - table where all rule attributes are stored (screenshot of the 'rule_attribute' table)

rule_attribute_value - table where all rule attribute values are stored (screenshot of the 'rule_attribute_value' table)

When the customer logs in, I have access to all of these attributes (customer_id, customer_group_id, country_id, subdivision_one_id, subdivision_two_id). Only customer_id and customer_group_id always have values. The others are optional, but they depend on each other: we can't have subdivisions without first selecting a country, and we can't have a second subdivision without selecting a country and then the first subdivision.
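To make the dependency between the optional attributes concrete, here is a minimal Python sketch of a check on the session data. It assumes the intended rule is that each geo level requires the level above it, and that empty values arrive as None; the function name is hypothetical.

```python
# Geo attributes ordered from broadest to most specific.
GEO_KEYS = ["country_id", "subdivision_one_id", "subdivision_two_id"]


def is_valid_session(session):
    """Return True if the session respects the attribute dependencies."""
    # customer_id and customer_group_id must always be present.
    if session.get("customer_id") is None or session.get("customer_group_id") is None:
        return False
    # Once a geo level is missing, every deeper level must be missing too.
    seen_gap = False
    for key in GEO_KEYS:
        if session.get(key) is None:
            seen_gap = True
        elif seen_gap:
            return False
    return True
```

For example, a session with a country and a first subdivision passes, while one with only subdivision_two_id set is rejected.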

I would like to get the highest-priority rule that matches the session data, as efficiently as possible. I have a solution that involves some application code, but I want to see whether it's possible directly in SQL.

Here are some examples of session data arrays:

Array
(
    [customer_id] => 2
    [customer_group_id] => 1
    [current_store_id] => 0
    [country_id] => 15
    [subdivision_one_id] => 224
    [subdivision_two_id] => 
)

Array
(
    [customer_id] => 2
    [customer_group_id] => 1
    [current_store_id] => 0
    [country_id] => 15
    [subdivision_one_id] => 
    [subdivision_two_id] => 
)

Array
(
    [customer_id] => 3
    [customer_group_id] => 2
    [current_store_id] => 0
    [country_id] => 
    [subdivision_one_id] => 
    [subdivision_two_id] => 
)

Looking forward to hearing back from you.

Thank you in advance.

CodePudding user response:

Right now, this is my approach, taking the first array as the payload:

Array
(
    [customer_id] => 2
    [customer_group_id] => 1
    [current_store_id] => 0
    [country_id] => 15
    [subdivision_one_id] => 224
    [subdivision_two_id] => 
)

  1. Generate one query per attribute to check for matching rules, for example SELECT ... WHERE store_id = 0 AND attribute_id = 1 AND value = 1 and SELECT ... WHERE store_id = 0 AND attribute_id = 4 AND value = 224. That gives 5 queries, one for each attribute.

  2. Following these queries, I will have an array with possible matching rules.

  3. Then, I will loop through those possible matching rules and verify if they match from bottom to top (subdivision_two_id, subdivision_one_id, country_id ...). If one rule does not match at a certain point, I just skip it and move on to the next rule.

  4. If no rules match, there is nothing to apply. If several rules match, I take the one with the highest priority.

That's the approach I have right now.
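The four steps above could be sketched roughly as follows, using Python's sqlite3 module with an in-memory database. Everything about the data is an assumption: the real tables are only shown as screenshots, so the schema, the sample rows, the 'subdivision_two' attribute code, and the idea that a lower position means a higher priority are all hypothetical. The sketch also assumes each rule stores at most one value per attribute, and that a rule matches when every attribute it defines equals the session value.

```python
import sqlite3

# Hypothetical schema and rows (the real tables are not available).
SCHEMA = """
CREATE TABLE rule (rule_id INTEGER PRIMARY KEY, position INTEGER);
CREATE TABLE rule_attribute (attribute_id INTEGER PRIMARY KEY, attribute_code TEXT);
CREATE TABLE rule_attribute_value (rule_id INTEGER, attribute_id INTEGER,
                                   store_id INTEGER, value INTEGER);
INSERT INTO rule VALUES (1, 10), (2, 20), (3, 5);
INSERT INTO rule_attribute VALUES (1, 'customer'), (2, 'customer_group'),
    (3, 'country'), (4, 'subdivision_one'), (5, 'subdivision_two');
INSERT INTO rule_attribute_value VALUES
    (1, 2, 0, 1), (1, 3, 0, 15), (1, 4, 0, 224),  -- rule 1: group 1, country 15, subdivision 224
    (2, 2, 0, 1),                                  -- rule 2: group 1 only
    (3, 2, 0, 1), (3, 3, 0, 99);                   -- rule 3: group 1, country 99
"""

# Attribute codes from top to bottom; session keys are the code plus "_id".
ORDER = ["customer", "customer_group", "country", "subdivision_one", "subdivision_two"]

JOIN = ("FROM rule_attribute_value rav "
        "JOIN rule_attribute ra ON ra.attribute_id = rav.attribute_id ")


def best_rule(conn, session, store_id=0):
    # Steps 1 and 2: one query per non-empty attribute, collecting candidates.
    candidates = set()
    for code in ORDER:
        value = session.get(code + "_id")
        if value is None:
            continue
        rows = conn.execute(
            "SELECT rav.rule_id " + JOIN +
            "WHERE rav.store_id = ? AND ra.attribute_code = ? AND rav.value = ?",
            (store_id, code, value))
        candidates.update(row[0] for row in rows)

    # Step 3: check each candidate from bottom to top, skipping mismatches.
    matching = []
    for rule_id in candidates:
        defined = dict(conn.execute(
            "SELECT ra.attribute_code, rav.value " + JOIN +
            "WHERE rav.rule_id = ? AND rav.store_id = ?", (rule_id, store_id)))
        if all(session.get(code + "_id") == defined[code]
               for code in reversed(ORDER) if code in defined):
            matching.append(rule_id)

    # Step 4: no match means no rule; otherwise take the lowest position.
    if not matching:
        return None
    marks = ",".join("?" * len(matching))
    row = conn.execute(
        f"SELECT rule_id FROM rule WHERE rule_id IN ({marks}) "
        "ORDER BY position ASC LIMIT 1", matching).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first_payload = {"customer_id": 2, "customer_group_id": 1, "country_id": 15,
                 "subdivision_one_id": 224, "subdivision_two_id": None}
print(best_rule(conn, first_payload))
```

With these sample rows, rule 3 is dropped in step 3 because its country differs from the session, and the remaining candidates are ranked by position.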

CodePudding user response:

Without a better understanding of the rules and data, this is the best I can come up with. It is based on your first array example:

SELECT `r`.*
FROM `rule_attribute_value` `rav`
INNER JOIN `rule` `r`
    ON `rav`.`rule_id` = `r`.`rule_id`
INNER JOIN `rule_attribute` `ra`
    ON `rav`.`attribute_id` = `ra`.`attribute_id`
WHERE
    (`rav`.`store_id` = 0 AND `ra`.`attribute_code` = 'customer'        AND `rav`.`value` = 2) OR
    (`rav`.`store_id` = 0 AND `ra`.`attribute_code` = 'customer_group'  AND `rav`.`value` = 1) OR
    (`rav`.`store_id` = 0 AND `ra`.`attribute_code` = 'country'         AND `rav`.`value` = 15) OR
    (`rav`.`store_id` = 0 AND `ra`.`attribute_code` = 'subdivision_one' AND `rav`.`value` = 224)
GROUP BY `r`.`rule_id`
HAVING COUNT(DISTINCT `rav`.`attribute_id`) = 4 /* 4 IS THE NUMBER OF ATTRIBUTES BEING QUERIED */
ORDER BY `r`.`position` ASC
LIMIT 1;
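
Since the number in the HAVING clause has to follow the number of attributes being queried, the query would normally be generated from the session payload. Below is a hedged Python sketch of that, run against SQLite rather than MySQL so it is self-contained. The schema, sample rows, the 'subdivision_two' attribute code, and the helper names are assumptions for illustration; empty payload values are represented as None.

```python
import sqlite3

# Session key -> attribute_code. 'subdivision_two' is an assumed code.
ATTRIBUTE_KEYS = {
    "customer_id": "customer",
    "customer_group_id": "customer_group",
    "country_id": "country",
    "subdivision_one_id": "subdivision_one",
    "subdivision_two_id": "subdivision_two",
}


def build_rule_query(session, store_id=0):
    """Build the grouped query and its parameters from the non-empty attributes."""
    pairs = [(code, session[key]) for key, code in ATTRIBUTE_KEYS.items()
             if session.get(key) is not None]
    conditions = " OR ".join(
        "(ra.attribute_code = ? AND rav.value = ?)" for _ in pairs)
    sql = f"""
        SELECT r.rule_id
        FROM rule_attribute_value rav
        INNER JOIN rule r ON rav.rule_id = r.rule_id
        INNER JOIN rule_attribute ra ON rav.attribute_id = ra.attribute_id
        WHERE rav.store_id = ? AND ({conditions})
        GROUP BY r.rule_id
        HAVING COUNT(DISTINCT rav.attribute_id) = ?
        ORDER BY r.position ASC
        LIMIT 1"""
    params = [store_id]
    for code, value in pairs:
        params += [code, value]
    params.append(len(pairs))  # number of attributes being queried
    return sql, params


def find_rule(conn, session, store_id=0):
    row = conn.execute(*build_rule_query(session, store_id)).fetchone()
    return row[0] if row else None


# Hypothetical schema and rows, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rule (rule_id INTEGER PRIMARY KEY, position INTEGER);
CREATE TABLE rule_attribute (attribute_id INTEGER PRIMARY KEY, attribute_code TEXT);
CREATE TABLE rule_attribute_value (rule_id INTEGER, attribute_id INTEGER,
                                   store_id INTEGER, value INTEGER);
INSERT INTO rule VALUES (1, 10), (2, 1);
INSERT INTO rule_attribute VALUES (1, 'customer'), (2, 'customer_group'),
    (3, 'country'), (4, 'subdivision_one'), (5, 'subdivision_two');
INSERT INTO rule_attribute_value VALUES
    (1, 1, 0, 2), (1, 2, 0, 1), (1, 3, 0, 15), (1, 4, 0, 224),
    (2, 1, 0, 2), (2, 2, 0, 1), (2, 3, 0, 15);
""")
first_payload = {"customer_id": 2, "customer_group_id": 1, "current_store_id": 0,
                 "country_id": 15, "subdivision_one_id": 224,
                 "subdivision_two_id": None}
print(find_rule(conn, first_payload))
```

One consequence of this approach is worth noting: the count only covers the attributes present in the session, so a rule that also defines a deeper subdivision still qualifies for a session that stops at the country level. If that is not wanted, an extra condition on the rule's own attribute count would be needed.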