Transform Python dictionary from {'a': ['A', 'B'], 'b': [...

I'm starting out with this dictionary:

lower_to_upper = {"a": ["A", "B"],
                  "b": ["B", "C"]}

Then, I put all the capital letters into one list and print that list...

upper_list = [upper for lo_uppers in lower_to_upper.values() for upper in lo_uppers]
print(f"upper_list:     {upper_list}")

... as expected gives this:

upper_list:     ['A', 'B', 'B', 'C']

Next, I create two dictionaries based on the list of upper-case letters: one that has zeros as its values (upper_to_count) and another that has empty lists as its values (upper_to_lower):

upper_to_count = dict.fromkeys(upper_list, 0)
upper_to_lower = dict.fromkeys(upper_list, [])

Next, I want upper_to_count to contain the number of entries of the original dictionary in which each upper-case letter occurs. Along the same lines, I want to populate the (for now) empty lists in upper_to_lower with the lower-case strings that were the keys for the respective upper-case letters.

for i, upper_list in enumerate(lower_to_upper.values()):
    lower = list(lower_to_upper.keys())[i]
    for upper in upper_list:
        upper_to_count[upper] += 1
        upper_to_lower[upper].append(lower)

The result for upper_to_count is as expected, but the result for upper_to_lower is not what I expected. These statements ...

print(f"upper_to_count: {upper_to_count}")
print(f"upper_to_lower: {upper_to_lower}")

... print:

upper_to_count: {'A': 1, 'B': 2, 'C': 1}
upper_to_lower: {'A': ['a', 'a', 'b', 'b'], 'B': ['a', 'a', 'b', 'b'], 'C': ['a', 'a', 'b', 'b']}

My expectation was this:

print(f"upper_to_lower_exp: {upper_to_lower_exp}")
upper_to_lower_exp: {'A': ['a'], 'B': ['a', 'b'], 'C': ['b']}

That is a dictionary looking like this:

upper_to_lower_exp = {'A': ['a'],
                      'B': ['a', 'b'],
                      'C': ['b']}

I don't understand why I get upper_to_lower and not upper_to_lower_exp. What is it that I don't know that produces the output in upper_to_lower rather than upper_to_lower_exp?

I apologise for the strangeness of the title of my question. Thanks a lot for any help in advance!

CodePudding user response：

When iterating over keys and values in parallel, the best practice is to use .items():

lower_to_upper = {"a": ["A", "B"],
                  "b": ["B", "C"]}

upper_to_lower_exp = {}
for key, values in lower_to_upper.items():
    
    # iterate over each value
    for value in values:
        
        # if no key has been created so far for the value create one
        if value not in upper_to_lower_exp:
            upper_to_lower_exp[value] = []
        
        # append the key to the list of the corresponding value
        upper_to_lower_exp[value].append(key)

# upper_to_count is just upper_to_lower_exp but with the length of the lists instead
upper_to_count = {k: len(vs) for k, vs in upper_to_lower_exp.items()}
print(upper_to_count)
print(upper_to_lower_exp)

Output

{'A': 1, 'B': 2, 'C': 1}
{'A': ['a'], 'B': ['a', 'b'], 'C': ['b']}
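
As a possible variation on the loop above, the "create the list if the key is missing" step can be delegated to collections.defaultdict. This is only a sketch of an alternative, not something the question requires; the variable names lower and uppers are mine.

```python
from collections import defaultdict

lower_to_upper = {"a": ["A", "B"],
                  "b": ["B", "C"]}

# defaultdict(list) builds a fresh empty list the first time a key is accessed,
# so no explicit "if value not in ..." check is needed
upper_to_lower = defaultdict(list)
for lower, uppers in lower_to_upper.items():
    for upper in uppers:
        upper_to_lower[upper].append(lower)

# convert back to a plain dict for printing and comparison
upper_to_lower = dict(upper_to_lower)
print(upper_to_lower)  # {'A': ['a'], 'B': ['a', 'b'], 'C': ['b']}
```

Because each missing key gets its own new list, the values are never shared between keys.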

CodePudding user response：

Not sure how you are initializing the upper_to_lower dictionary, but your code as-is works if initialized like this:

# `list(set(upper_list))` contains 'A', 'B' and 'C', in arbitrary order
upper_to_lower = {k: [] for k in list(set(upper_list))}

for i, upper_list in enumerate(lower_to_upper.values()):
    lower = list(lower_to_upper.keys())[i]
    for upper in upper_list:
        upper_to_count[upper] += 1
        upper_to_lower[upper].append(lower)

Output:

{'B': ['a', 'b'], 'A': ['a'], 'C': ['b']}
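
Note that a set does not keep the order of the letters, which is why 'B' comes first in the output above. If the original order matters, one option (a sketch, assuming Python 3.7+ where dicts preserve insertion order; unique_uppers is a name I made up) is to de-duplicate with dict.fromkeys without passing a value:

```python
upper_list = ["A", "B", "B", "C"]

# dict.fromkeys(upper_list) keeps the first occurrence of each letter, in order;
# no mutable default is passed here, so nothing is shared
unique_uppers = list(dict.fromkeys(upper_list))

# the comprehension then gives every key its own empty list
upper_to_lower = {k: [] for k in unique_uppers}
print(unique_uppers)   # ['A', 'B', 'C']
print(upper_to_lower)  # {'A': [], 'B': [], 'C': []}
```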

CodePudding user response：

I found the answer. The crucial tidbit in the official documentation is that all values of a dictionary created with dict.fromkeys() "refer to just a single instance", be that None or, for example, an empty list. Thus, when I append elements to this list, every key in the final dictionary points to the same list.
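
A quick check along these lines makes the sharing visible. It is a minimal sketch; the names shared and same_object are only for illustration.

```python
upper_list = ["A", "B", "B", "C"]
shared = dict.fromkeys(upper_list, [])

# every value is the very same list object, not merely an equal list
same_object = shared["A"] is shared["B"] is shared["C"]
print(same_object)  # True

# appending through one key is therefore visible through every key
shared["A"].append("a")
print(shared)  # {'A': ['a'], 'B': ['a'], 'C': ['a']}
```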

classmethod fromkeys(iterable[, value]) Create a new dictionary with keys from iterable and values set to value.

fromkeys() is a class method that returns a new dictionary. value defaults to None. All of the values refer to just a single instance, so it generally doesn’t make sense for value to be a mutable object such as an empty list. To get distinct values, use a dict comprehension instead.
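
Applied to my code, the dict comprehension suggested by the documentation would look roughly like this. It is a sketch in which I also switched the loop to .items(); the names lower and uppers are just for illustration.

```python
lower_to_upper = {"a": ["A", "B"],
                  "b": ["B", "C"]}

upper_list = [upper for uppers in lower_to_upper.values() for upper in uppers]

# sharing the int 0 is harmless: ints are immutable and += rebinds the key
upper_to_count = dict.fromkeys(upper_list, 0)
# the comprehension evaluates [] anew for each element, so no list is shared
upper_to_lower = {upper: [] for upper in upper_list}

for lower, uppers in lower_to_upper.items():
    for upper in uppers:
        upper_to_count[upper] += 1
        upper_to_lower[upper].append(lower)

print(upper_to_count)  # {'A': 1, 'B': 2, 'C': 1}
print(upper_to_lower)  # {'A': ['a'], 'B': ['a', 'b'], 'C': ['b']}
```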

I had no idea. Many thanks to the answers provided!