Python - Generate permutations by key from key value pair list

Time:10-02

Problem:

I am generating a search query from key=value pairs. The system being queried does not support searching by the same field twice. I need to generate all unique permutations (assuming that is the correct word) of the pairs so I can generate multiple queries.

Example query:

python test.py --search field_1="books" and (field_2="paper" or (field_2="abcd" and field_4="test")) and field_20=80 and field_20="443" and not field_13=test or field19="test" and field19="4"

Ignore the boolean operations. After parsing I end up with:

['field_1="books"', 'field_2="paper"', 'field_2="abcd"', 'field_4="test"', 'field_20="80"', 'field_20="443"', 'field_13="test"', 'field19="test"', 'field19="4"']

The number and names of the fields used or reused depend on the user. I want to use this list to generate the output below.

Desired Output:

['field_1="books"', 'field_4="test"', 'field_13="test"', 'field_2="paper"', 'field_20="80"', 'field19="test"']
['field_1="books"', 'field_4="test"', 'field_13="test"', 'field_2="abcd"',  'field_20="443"', 'field19="4"']
and so on...

Or a list of dicts is fine too. I just need every permutation where the same key (field_x) is not used twice in the same list.

Attempts:

I tried to split out the repeated fields and generate permutations of only those, planning to append the result to the non-repeated fields. This seems far more involved than it should be.

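# repeat_keys is assumed to hold the keys that occur more than once (its definition is not shown here)
repeat_pairs = []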
once_pairs = []

for pair in search_pairs:
    key = pair.split('=')[0]
    if key in repeat_keys:
        repeat_pairs.append(pair)
    else:
        once_pairs.append(pair)


print(search_pairs)


def gen_queries(repeat_list):
    master_query_list = []
    for item in repeat_list:
        tmp_list = repeat_list[:]
        
        key = item.split('=')[0]
        value = item.split('=')[1]
        
        build = []
        build.append(item)
        
        tmp_list.remove(item)
        
        for sub in tmp_list:
            sub_key = sub.split('=')[0]
            sub_value = sub.split('=')[1]

            if key != sub_key:                    
                build.append(sub)
                tmp_list.remove(sub)
        
        master_query_list.append(build)
        
    
    master_query_list.sort()

    for item in master_query_list:
        print(item)
    
gen_queries(repeat_pairs) 

Outputs:

['field19="4"', 'field_2="paper"', 'field_20="80"', 'field_2="test"']
['field19="test"', 'field_2="paper"', 'field_20="80"', 'field_2="test"']
['field_20="443"', 'field_2="paper"', 'field_2="test"', 'field19="4"']
['field_20="80"', 'field_2="paper"', 'field_2="test"', 'field19="4"']
['field_2="abcd"', 'field_20="80"', 'field19="test"']
['field_2="paper"', 'field_20="80"', 'field19="test"']
['field_2="test"', 'field_20="80"', 'field19="test"']

This feels like something simple and doable with recursion but my brain just isn't clicking.

CodePudding user response:

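Group the strings into "bins" by key, then take the Cartesian product of the bins. Each element of the product picks exactly one condition per key, so no key appears twice in a query: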

conds = ['field_1="books"', 'field_2="paper"', 'field_2="abcd"', 'field_4="test"', 'field_20="80"', 'field_20="443"', 'field_13="test"', 'field19="test"', 'field19="4"']

from collections import defaultdict
from itertools import product

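# Collect the conditions that share a key into the same bin.
bins = defaultdict(list)
for c in conds:
    # Split on the first '=' only, so a value containing '=' does not break unpacking.
    k, _ = c.split('=', 1)
    bins[k].append(c)

# Each tuple from product() holds exactly one condition per key.
for q in product(*bins.values()):
    print(q)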

Result:

('field_1="books"', 'field_2="paper"', 'field_4="test"', 'field_20="80"', 'field_13="test"', 'field19="test"')
('field_1="books"', 'field_2="paper"', 'field_4="test"', 'field_20="80"', 'field_13="test"', 'field19="4"')
('field_1="books"', 'field_2="paper"', 'field_4="test"', 'field_20="443"', 'field_13="test"', 'field19="test"')
('field_1="books"', 'field_2="paper"', 'field_4="test"', 'field_20="443"', 'field_13="test"', 'field19="4"')
('field_1="books"', 'field_2="abcd"', 'field_4="test"', 'field_20="80"', 'field_13="test"', 'field19="test"')
('field_1="books"', 'field_2="abcd"', 'field_4="test"', 'field_20="80"', 'field_13="test"', 'field19="4"')
('field_1="books"', 'field_2="abcd"', 'field_4="test"', 'field_20="443"', 'field_13="test"', 'field19="test"')
('field_1="books"', 'field_2="abcd"', 'field_4="test"', 'field_20="443"', 'field_13="test"', 'field19="4"')
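Since the question says a list of dicts would also be fine, the same binning idea can be adapted to return dicts. The following is only a sketch under a few assumptions: the function name queries_as_dicts is made up for illustration, every condition is assumed to contain at least one '=', and the surrounding double quotes are stripped from the values.

```python
from collections import defaultdict
from itertools import product


def queries_as_dicts(conds):
    """Return one dict per combination, with each key used exactly once.

    Hypothetical helper: the name and the quote stripping are assumptions,
    not part of the original answer.
    """
    bins = defaultdict(list)
    for c in conds:
        # Split on the first '=' only, so values containing '=' stay intact.
        key, value = c.split('=', 1)
        bins[key].append(value.strip('"'))
    keys = list(bins)
    # product() picks one value per key; zip pairs each pick with its key.
    return [dict(zip(keys, combo)) for combo in product(*bins.values())]


conds = ['field_1="books"', 'field_2="paper"', 'field_2="abcd"',
         'field_4="test"', 'field_20="80"', 'field_20="443"',
         'field_13="test"', 'field19="test"', 'field19="4"']

queries = queries_as_dicts(conds)
print(len(queries))  # 8, i.e. 1 * 2 * 1 * 2 * 1 * 2
for q in queries:
    print(q)
```

The number of queries is the product of the bin sizes, so it grows quickly when many fields are repeated. In that case it may be preferable to iterate over product() lazily instead of building the whole list.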