Home > Enterprise >  parse a binary decision tree and find all path by inversing the rule of the decision tree traversing
parse a binary decision tree and find all path by inversing the rule of the decision tree traversing

Time:02-07

the decision tree rule right now that I am having

{
  "id": -1,
  "rule": "TOTAL_REVENUE <= 300",
  "left": {
        "id": "0",
        "rule": "TOTAL_DATA_DUR <= 39.5794",
        
        "left": {
          "id": "1",
          "rule": "TOTAL_DATA_DUR <= 0.7408",
         
          "left": null,
          "right": {
            "id": "3",
            "rule": "TOTAL_PACKAGE_REVENUE <= 15.1350",
           
            "left": {
              "id": "4",
              "rule": "TOTAL_PACKAGE_REVENUE_14DAYS <= 12.5000",
             
              "left": null,
              "right": {
                "id": "6",
              
                "value": 84.62
                
              }
            },
            "right": null
          }
        },
        "right": {
          "id": "8",
          "rule": "TOTAL_DATA_DUR <= 301.6211",
        
          "left": null,
          "right": {
            "id": "10",
            "rule": "TOTAL_DATA_DUR <= 6898.9146",
           
            "left": {
              "id": "11",
              "rule": "TOTAL_PACKAGE_REVENUE <= 14.5000",
             
              "left": null,
              "right": {
                "id": "13",
                "rule": "TOTAL_PACKAGE_REVENUE <= 16.0000",
             
                "left": {
                  "id": "14",
                 
                 
                  "value": 84.96
                  
                },
                "right": {
                  "id": "15",
                  "rule": "TOTAL_PACKAGE_REVENUE <= 19.5000",
                  "left": null,
                  "right": {
                    "id": "17",
                   
                   
                    "value": 70.8
                    
                  }
                }
              }
            },
            "right": null
          }
        }
      }
}

the output that I am trying to get is

path1 -> TOTAL_REVENUE <= 300 and TOTAL_DATA_DUR <= 39.5794 TOTAL_DATA_DUR > 0.7408 and TOTAL_PACKAGE_REVENUE <= 15.1350 and TOTAL_PACKAGE_REVENUE_14DAYS > 12.5000

here you can see TOTAL_DATA_DUR > 0.7408 and TOTAL_PACKAGE_REVENUE_14DAYS > 12.5000 is being reversed as it is traversing through the right the rest of the condition is <= because its going through left

path2 -> TOTAL_REVENUE <= 300 and TOTAL_DATA_DUR > 39.5794 and TOTAL_DATA_DUR >  301.6211 and TOTAL_DATA_DUR <= 6898.9146 and TOTAL_PACKAGE_REVENUE > 14.5000 and TOTAL_PACKAGE_REVENUE <= 16.0000

path3 -> TOTAL_REVENUE <= 300 and TOTAL_DATA_DUR > 39.5794 and TOTAL_DATA_DUR >  301.6211 and TOTAL_DATA_DUR <= 6898.9146 and TOTAL_PACKAGE_REVENUE > 14.5000 and TOTAL_PACKAGE_REVENUE > 16.0000 and TOTAL_PACKAGE_REVENUE > 19.5000

I am fairly new to coding how will I get the required output using recursion

the code that I am working on right now

from collections import deque
import json
def isLeaf1(node):
    return node.get('left') is None and node.get('right') is None



all_paths = []


def printRootToLeafPaths1(node, path, node_type=None):
    # base case
    if node is None:

        return

    # include the current node to the path
    if node.get('rule') is not None:
        path.append(node.get('rule'))

    # if a leaf node is found, print the path
    if isLeaf1(node):
        all_paths.append(list(path))


    # recur for the left and right subtree
    printRootToLeafPaths1(node.get('left'), path, 'left')
    printRootToLeafPaths1(node.get('right'), path, 'right')

    # backtrack: remove the current node after the left, and right subtree are done
    path.pop()



# The main function to print paths from the root node to every leaf node
def printRootToLeafPath1(root):
    # list to store root-to-leaf path
    path = deque()
    printRootToLeafPaths1(root, path)

json_rule ='{"id":-1,"rule":"TOTAL_REVENUE <= 300","left":{"id":"0","rule":"TOTAL_DATA_DUR <= 39.5794","left":{"id":"1","rule":"TOTAL_DATA_DUR <= 0.7408","left":null,"right":{"id":"3","rule":"TOTAL_PACKAGE_REVENUE <= 15.1350","left":{"id":"4","rule":"TOTAL_PACKAGE_REVENUE_14DAYS <= 12.5000","left":null,"right":{"id":"6","value":84.62}},"right":null}},"right":{"id":"8","rule":"TOTAL_DATA_DUR <= 301.6211","left":null,"right":{"id":"10","rule":"TOTAL_DATA_DUR <= 6898.9146","left":{"id":"11","rule":"TOTAL_PACKAGE_REVENUE <= 14.5000","left":null,"right":{"id":"13","rule":"TOTAL_PACKAGE_REVENUE <= 16.0000","left":{"id":"14","value":84.96},"right":{"id":"15","rule":"TOTAL_PACKAGE_REVENUE <= 19.5000","impurity":"0.47643265235457066","samples":"304","left":null,"right":{"id":"17","value":70.8}}}},"right":null}}}}'

printRootToLeafPath1(json.loads(json_rule))

can anyone tell me what changes should I make in this code in order to obtains the output paths ?

CodePudding user response:

Some remarks:

  • Instead of printing in the recursive function, yield. That way, the caller can decide what they do with the paths: print them, or something else.

  • It is bad practice to use a global variable and let the function mutate that variable all_paths. Instead aim to make the function "pure", so it does not need that and can produce the paths through yielding them (or returning them).

  • The pop() call is unconditional, while the append() call is conditional. This looks as it there is a possibility to pop something when it was not appended in the same function, so it will be error prone. There is a better way to do recursion -- next point:

  • Instead of passing a partial path to the recursive call, let the recursive call give the paths it has found "as if" it was the root, and then prefix the current node's rule to it. This is the better practice for recursion.

  • Apparently, when there is no "rule" key in the node, it is a leaf-node. In that case it is not needed to look for left/right anymore, as it is understood that a node has either a "rule" key and children, or has no "rule" key and no children. In the case of a leaf, just yield an empty path which the caller can extend.

  • There is no logic in your code that negates a rule. For that you can use list of pairs that maps each operator to its opposite. And if none of those match the rule, then just apply a default not (rule) format. This logic assumes that a rule uses just one operator.

  • Your algorithm makes no use of the "value" properties in the leaves of the tree. I think you will need to use that information at some point, but I will ignore it for now.

Here is a possible implementation, taking the above points into account:

def negate(rule):
    mapper = (("<=", ">"), (">=", "<"), (">", "<="), ("<", ">="))
    for operator, negated in mapper:
        if operator in rule:
            return rule.replace(operator, negated)
    return "not ("   rule   ")"  # Default when operator not recognised

def nodeToLeafPaths(node):
    if not node:  # base case: nothing in this direction
        return
    rule = node.get('rule')
    if rule is None:  # base case: a leaf with a value
        yield []  # empty path
        return

    negated_rule = negate(rule)
    for path in nodeToLeafPaths(node.get('left')):
        yield [rule, *path]  # Extend path with current rule
    for path in nodeToLeafPaths(node.get('right')):
        yield [negated_rule, *path]

# Transform paths (lists) to AND-rules (strings):
def rootToLeafConjugations(root):
    return [" AND ".join(path) for path in nodeToLeafPaths(root)]

The main driver code could look like this:

import json

json_rule ='{"id":-1,"rule":"TOTAL_REVENUE <= 300","left":{"id":"0","rule":"TOTAL_DATA_DUR <= 39.5794","left":{"id":"1","rule":"TOTAL_DATA_DUR <= 0.7408","left":null,"right":{"id":"3","rule":"TOTAL_PACKAGE_REVENUE <= 15.1350","left":{"id":"4","rule":"TOTAL_PACKAGE_REVENUE_14DAYS <= 12.5000","left":null,"right":{"id":"6","value":84.62}},"right":null}},"right":{"id":"8","rule":"TOTAL_DATA_DUR <= 301.6211","left":null,"right":{"id":"10","rule":"TOTAL_DATA_DUR <= 6898.9146","left":{"id":"11","rule":"TOTAL_PACKAGE_REVENUE <= 14.5000","left":null,"right":{"id":"13","rule":"TOTAL_PACKAGE_REVENUE <= 16.0000","left":{"id":"14","value":84.96},"right":{"id":"15","rule":"TOTAL_PACKAGE_REVENUE <= 19.5000","impurity":"0.47643265235457066","samples":"304","left":null,"right":{"id":"17","value":70.8}}}},"right":null}}}}'

for rule in rootToLeafConjugations(json.loads(json_rule)):
    print(rule)
  •  Tags:  
  • Related