Cartesian product of dictionaries with a condition

Time:03-10

I am trying to run a grid search to optimize a function. The computing time is too high, mostly because my grid contains useless parameter combinations. So far I only build a plain cartesian product, which gives me more parameter sets than I need.

I have the following input:

dict_params = {"params_a": [0, 1],
               "params_b": ["x1", "x2", "x3"],
               "params_c": {"x1": ["a1", "a2"],
                            "x2": ["b1", "b2"],
                            "x3": ["c1", "c2"]
                           }
               }

I expect the following output:

output_expected = [{"params_a" : 0, "params_b" : "x1", "params_c": "a1"},
                   {"params_a" : 0, "params_b" : "x1", "params_c": "a2"},
                   {"params_a" : 0, "params_b" : "x2", "params_c": "b1"},
                   {"params_a" : 0, "params_b" : "x2", "params_c": "b2"},
                   {"params_a" : 0, "params_b" : "x3", "params_c": "c1"},
                   {"params_a" : 0, "params_b" : "x3", "params_c": "c2"},
                   {"params_a" : 1, "params_b" : "x1", "params_c": "a1"},
                   {"params_a" : 1, "params_b" : "x1", "params_c": "a2"},
                   {"params_a" : 1, "params_b" : "x2", "params_c": "b1"},
                   {"params_a" : 1, "params_b" : "x2", "params_c": "b2"},
                   {"params_a" : 1, "params_b" : "x3", "params_c": "c1"},
                   {"params_a" : 1, "params_b" : "x3", "params_c": "c2"}
                  ]

Any output format is fine. I tried to populate a dataframe with the full cartesian product and then keep only the rows where the "params_c" value belongs to the row's "params_b" key, but I could not make it work.

Thanks for reading

CodePudding user response:

With plain Python you can do it like this (the keys are shortened to "a", "b" and "c" here):

dict_params = {"a": [0, 1],
               "b": ["x1", "x2", "x3"],
               "c": {"x1": ["a1", "a2"],
                     "x2": ["b1", "b2"],
                     "x3": ["c1", "c2"]
                    }
               }

res = []

for param_a in dict_params['a']:
    for param_b in dict_params['b']:
        # only iterate over the "c" values allowed for this "b"
        for param_c in dict_params['c'][param_b]:
            res.append({'a': param_a, 'b': param_b, 'c': param_c})

print(res)
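
The nested loops above hard-code the three parameter names. If the grid later gains more independent parameters, a generator along the following lines might be easier to extend. This is only a sketch: the function name `conditional_grid` and its arguments `dependent_key` and `parent_key` are made up for illustration, and it assumes exactly one parameter depends on exactly one other.

```python
from itertools import product


def conditional_grid(params, dependent_key, parent_key):
    """Yield one dict per valid combination.

    `params[dependent_key]` is assumed to be a dict that maps each value
    of `parent_key` to the list of values allowed for `dependent_key`.
    """
    # Every parameter except the dependent one is combined freely.
    independent = {k: v for k, v in params.items() if k != dependent_key}
    keys = list(independent)
    for combo in product(*independent.values()):
        base = dict(zip(keys, combo))
        # Only the values allowed for the current parent value are used.
        for value in params[dependent_key][base[parent_key]]:
            yield {**base, dependent_key: value}


dict_params = {"params_a": [0, 1],
               "params_b": ["x1", "x2", "x3"],
               "params_c": {"x1": ["a1", "a2"],
                            "x2": ["b1", "b2"],
                            "x3": ["c1", "c2"]}}

grid = list(conditional_grid(dict_params, "params_c", "params_b"))
print(len(grid))  # 12 combinations instead of the 36 of the full product
```

Because it is a generator, the combinations can also be consumed one at a time in the grid search loop without building the whole list first.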

CodePudding user response:

One way I see using pandas:

from itertools import product
import pandas as pd

# dict_params is the dictionary from the question
(pd.DataFrame(product(dict_params['params_a'], dict_params['params_b']),
              columns=['params_a', 'params_b'])
   .assign(params_c=lambda d: d['params_b'].map(pd.Series(dict_params['params_c'])))
   .explode('params_c')
)

output:

   params_a params_b params_c
0         0       x1       a1
0         0       x1       a2
1         0       x2       b1
1         0       x2       b2
2         0       x3       c1
2         0       x3       c2
3         1       x1       a1
3         1       x1       a2
4         1       x2       b1
4         1       x2       b2
5         1       x3       c1
5         1       x3       c2
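
Note that the index in the output above repeats, because `explode` keeps the index of the row each value came from. If a list of dicts like `output_expected` in the question is wanted, something like the following could work. It is a sketch that assumes pandas is installed; the variable names `df` and `records` are only illustrative.

```python
from itertools import product

import pandas as pd

dict_params = {"params_a": [0, 1],
               "params_b": ["x1", "x2", "x3"],
               "params_c": {"x1": ["a1", "a2"],
                            "x2": ["b1", "b2"],
                            "x3": ["c1", "c2"]}}

df = (
    pd.DataFrame(list(product(dict_params["params_a"], dict_params["params_b"])),
                 columns=["params_a", "params_b"])
    # attach the list of allowed params_c values to each row
    .assign(params_c=lambda d: d["params_b"].map(dict_params["params_c"]))
    # one row per allowed value
    .explode("params_c")
    # replace the repeated index left by explode with 0..n-1
    .reset_index(drop=True)
)

# one dict per row, keyed by column name
records = df.to_dict("records")
print(len(records))
```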