I am working on some sentence formation like this:
sentence = "PERSON is ADJECTIVE"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"]}
I would now need all possible combinations to form this sentence from the dictionary, like:
Alice is cute
Alice is intelligent
Bob is cute
Bob is intelligent
Carol is cute
Carol is intelligent
The above use case was relatively simple, and I handled it with the following code:
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"]}
for i in dictionary["PERSON"]:
    for j in dictionary["ADJECTIVE"]:
        print(f"{i} is {j}")
But can we also make this scale up for longer sentences?
Example:
sentence = "PERSON is ADJECTIVE and is from COUNTRY"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}
This should again provide all possible combinations like:
Alice is cute and is from USA
Alice is intelligent and is from USA
.
.
.
.
Carol is intelligent and is from India
I tried the approach from https://www.pythonpool.com/python-permutations/ , but the words of the sentence all get mixed up. How can we keep a few words fixed, like the words "and is from" in this example?
Essentially if any key in the dictionary is equal to the word in the string, then the word should be replaced by the list of values from the dictionary.
Any thoughts would be really helpful.
CodePudding user response:
You can first replace the dictionary keys in sentence with {} so that you can easily format the string in a loop. Then you can use itertools.product to create the Cartesian product of dictionary.values(), and simply loop over it to create your desired sentences.
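from itertools import product
# Replace every word that is a dictionary key with a positional placeholder.
sentence = ' '.join([('{}' if w in dictionary else w) for w in sentence.split()])
# Lazily fill the placeholders with each tuple from the Cartesian product.
mapped_sentences_generator = (sentence.format(*tple) for tple in product(*dictionary.values()))
for s in mapped_sentences_generator:
    print(s)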
Output:
Alice is cute and is from USA
Alice is cute and is from Japan
Alice is cute and is from China
Alice is cute and is from India
Alice is intelligent and is from USA
Alice is intelligent and is from Japan
Alice is intelligent and is from China
Alice is intelligent and is from India
Bob is cute and is from USA
Bob is cute and is from Japan
Bob is cute and is from China
Bob is cute and is from India
Bob is intelligent and is from USA
Bob is intelligent and is from Japan
Bob is intelligent and is from China
Bob is intelligent and is from India
Carol is cute and is from USA
Carol is cute and is from Japan
Carol is cute and is from China
Carol is cute and is from India
Carol is intelligent and is from USA
Carol is intelligent and is from Japan
Carol is intelligent and is from China
Carol is intelligent and is from India
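Note that this relies on the dictionary preserving insertion order, which is guaranteed from Python 3.7 onwards (in CPython 3.6 it is an implementation detail). On older versions, use collections.OrderedDict rather than dict. It also assumes the keys appear in the sentence in the same order as in the dictionary, because the {} placeholders are filled positionally.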
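If the keys might appear in the sentence in a different order than in the dictionary, one possible workaround is to build the product in sentence order instead. This is only a sketch: the function name expand_in_sentence_order is made up for this illustration, and it assumes every key appears as a whitespace-separated word and that the fixed words contain no literal braces.

```python
from itertools import product


def expand_in_sentence_order(sentence, dictionary):
    """Yield every sentence, filling key words in the order they occur."""
    words = sentence.split()
    # Keys in the order they occur in the sentence, not in the dictionary.
    # A key that occurs twice is treated as two independent slots.
    keys_in_order = [w for w in words if w in dictionary]
    # Same positional template as above: key words become {}.
    template = " ".join("{}" if w in dictionary else w for w in words)
    for combo in product(*(dictionary[k] for k in keys_in_order)):
        yield template.format(*combo)


dictionary = {"PERSON": ["Alice", "Bob"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan"]}
# COUNTRY comes before ADJECTIVE here, unlike in the dictionary.
for s in expand_in_sentence_order("PERSON is from COUNTRY and is ADJECTIVE", dictionary):
    print(s)
```

The sentences then come out grouped by the order of the keys in the sentence, so the iteration order differs from the output listed above.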
CodePudding user response:
I would base my answer on two building blocks: itertools.product and zip.
itertools.product will allow us to get the various combinations of our dictionary list values.
zip with the original keys and each combination above will give us (key, value) tuples that we can use with replace.
import itertools
sentence = "PERSON is ADJECTIVE and is from COUNTRY"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}
keys = dictionary.keys()
for values in itertools.product(*dictionary.values()):
    new_sentence = sentence
    # Substitute each key with the value chosen for this combination.
    for tpl in zip(keys, values):
        new_sentence = new_sentence.replace(*tpl)
    print(new_sentence)
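One caveat with str.replace is that it matches substrings, so a key such as PERSON would also be replaced inside a longer word like PERSONAL. A hedged sketch of an alternative, assuming keys should only match as whole words and start and end with word characters: do the substitution in a single pass with re.sub. The helper name fill and the sample sentence are hypothetical and only serve this illustration.

```python
import itertools
import re


def fill(sentence, mapping):
    """Replace whole-word occurrences of each key in one pass."""
    # \b restricts matches to whole words; re.escape guards special characters.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], sentence)


sentence = "PERSON is ADJECTIVE and has a PERSONAL blog"
dictionary = {"PERSON": ["Alice", "Bob"], "ADJECTIVE": ["cute", "intelligent"]}

keys = list(dictionary)
for values in itertools.product(*dictionary.values()):
    # PERSONAL is left untouched because PERSON only matches as a whole word.
    print(fill(sentence, dict(zip(keys, values))))
```

Because everything is replaced in one pass, a substituted value that happens to contain another key is not replaced again.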
If you happen to control the "sentence" template and can write it as:
sentence = "{PERSON} is {ADJECTIVE} and is from {COUNTRY}"
Then you can simplify this to:
sentence = "{PERSON} is {ADJECTIVE} and is from {COUNTRY}"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}
keys = dictionary.keys()
for values in itertools.product(*dictionary.values()):
    new_sentence = sentence.format(**dict(zip(keys, values)))
    print(new_sentence)
Both should give you results like:
Alice is cute and is from USA
Alice is cute and is from Japan
...
Carol is intelligent and is from China
Carol is intelligent and is from India
Note that the order of appearance in the template is not important and both solutions should work with a template of:
sentence = "PERSON is from COUNTRY and is ADJECTIVE"
or, for the second solution:
sentence = "{PERSON} is from {COUNTRY} and is {ADJECTIVE}"
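To check the point about ordering, the brace-based version can be wrapped in a small generator and run against the reordered template. This is just a sketch: the name expand is hypothetical, and it assumes every placeholder in the template has a matching dictionary key.

```python
import itertools


def expand(template, dictionary):
    """Yield template.format(...) for every combination of dictionary values."""
    keys = list(dictionary)
    for values in itertools.product(*dictionary.values()):
        # Named placeholders are looked up by key, so template order is irrelevant.
        yield template.format(**dict(zip(keys, values)))


dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}

sentences = list(expand("{PERSON} is from {COUNTRY} and is {ADJECTIVE}", dictionary))
print(sentences[0])    # Alice is from USA and is cute
print(len(sentences))  # 24, i.e. 3 * 2 * 4
```

Since it is a generator, the sentences can also be consumed one at a time instead of being collected into a list, which may help if the dictionary grows and the number of combinations gets large.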