Permutations and combinations in Python

I'm working on an OCR use case and have identified common misclassifications from the confusion matrix, for example '1' being confused with 'J', and '2' being confused with 'Z' and 'J'.

For a given word, I am trying to write a Python script that generates all the permutations of the word that account for these misclassifications.

Example:

Common Misclassifications: {'1':['J'],'2':['Z','J']}
Input: "AB1CD2"
Output: AB1CD2, AB1CDZ, ABJCD2, ABJCDZ, AB1CDJ, ABJCDJ

How do I go about solving this?

CodePudding user response：

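itertools.product should help. Build one tuple of options per confusable character (the character itself followed by its misclassifications) and take the Cartesian product of those tuples: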

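from itertools import product
misclass = {'1':['J'],'2':['Z','J']}
# One tuple per confusable character: the character itself, then its misclassifications
misclass_items = [tuple([k, *v]) for k, v in misclass.items()]
# x ranges over the options for '1', y over the options for '2'
print(["AB" + x + "CD" + y for (x, y) in product(*misclass_items)])
# ['AB1CD2', 'AB1CDZ', 'AB1CDJ', 'ABJCD2', 'ABJCDZ', 'ABJCDJ']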
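
The snippet above hard-codes the "AB" and "CD" pieces, so it only fits this one word. A minimal sketch of the same idea for an arbitrary word might look like the following; the function name `variants` is hypothetical, and the sketch assumes that any character missing from the dictionary has no confusions:

```python
from itertools import product

def variants(word, misclass):
    """Return every spelling of `word` allowed by the `misclass` mapping."""
    # For each position: the character itself, followed by its known confusions.
    # Characters absent from `misclass` contribute a single option.
    options = [[ch, *misclass.get(ch, [])] for ch in word]
    # product picks one option per position; join each pick back into a string.
    return ["".join(combo) for combo in product(*options)]

misclass = {'1': ['J'], '2': ['Z', 'J']}
print(variants("AB1CD2", misclass))
# ['AB1CD2', 'AB1CDZ', 'AB1CDJ', 'ABJCD2', 'ABJCDZ', 'ABJCDJ']
```

Because the options are built per position, this also covers words in which the same confusable character appears more than once.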

CodePudding user response：

You get a neat solution by using a dictionary of all possible classifications, not just all mis-classifications. That is, you first "enrich" your misclassification dictionary with all possible correct classifications.

from itertools import product

all_characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
common_misclass = {'1':['J'],'2':['Z','J']}
input_string = "AB1CD2"

common_class = {}
for char in all_characters:
    if char in common_misclass:
        common_class[char] = [char] + common_misclass[char]
    else:
        common_class[char] = [char]

possible_outputs = ["".join(tup) for tup in 
    product(*[common_class[letter] for letter in input_string])]

print(possible_outputs)
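
The number of outputs is the product of the option counts at each position, so it can grow quickly for long strings with many confusable characters. As a rough sketch (the names `options`, `total` and `iter_outputs` are illustrative, not part of the answer above), the count can be checked up front and the strings generated lazily instead of being collected into one list. Using `dict.get` here is an assumption that characters outside the dictionary, such as lowercase letters, should simply map to themselves:

```python
from itertools import islice, product
from math import prod  # available in Python 3.8+

common_misclass = {'1': ['J'], '2': ['Z', 'J']}
input_string = "AB1CD2"

# Per-position options: the character itself, then its known confusions.
options = [[ch] + common_misclass.get(ch, []) for ch in input_string]

# Number of outputs = product of the option counts (here 1*1*2*1*1*3).
total = prod(len(opts) for opts in options)

def iter_outputs(options):
    """Yield candidate strings one at a time instead of building a list."""
    for tup in product(*options):
        yield "".join(tup)

print(total)                                   # 6
print(list(islice(iter_outputs(options), 3)))  # ['AB1CD2', 'AB1CDZ', 'AB1CDJ']
```

Checking `total` first makes it possible to skip or cap inputs that would otherwise expand into an unmanageable number of candidates.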