Finding relationships between elements of a dictionary whose keys and values are both tuples

I have a dictionary in which both the keys and the values are tuples (1-to-1 relationship). The keys hold names and the values hold IDs.

I want to find out which names are most likely to appear together with which IDs. For example, 'James' often appears with 'Gamma', 'Harper' often comes with 'Delta', etc.

What I tried is basically to list the most frequent elements among the keys and among the values, and then guess their likelihood manually.

d = {('Amelia', 'James', 'Noah'):('Iota', 'Epsilon', 'Gamma'),
('James', 'Lucas', 'Elijah'):('Beta', 'Theta', 'Eta'),
('Harper', 'Emma', 'Ava'):('Eta', 'Iota', 'Delta'),
('Harper', 'James', 'Amelia'):('Gamma', 'Delta', 'Epsilon'),
('Olivia', 'James', 'Liam'):('Zeta', 'Gamma', 'Eta'),
('Oliver', 'Charlotte', 'Evelyn'):('Iota', 'Alpha', 'Eta'),
('Elijah', 'Oliver', 'James'):('Gamma', 'Zeta', 'Epsilon'),
('Ethan', 'Harper', 'Emma'):('Alpha', 'Epsilon', 'Delta')}


getting_keys = list(d.keys())

# putting all elements in keys into a list
keys = [item for t in getting_keys for item in t]

# get a list of unique keys
unique_keys = set(keys)

# print the counts of occurrence of each unique key
for k in unique_keys:
    print(k, keys.count(k))


getting_values = list(d.values())

# putting all elements in values into a list
values = [item for t in getting_values for item in t]

# get a list of unique values
unique_values = set(values)

# print the counts of occurrence of each unique value
for v in unique_values:
    print(v, values.count(v))

The printout shows that 'James' appears most often among the keys (5 times) and 'Gamma' appears most often among the values (4 times). So it can be concluded that 'James' often comes with 'Gamma'.

Could anyone suggest a better way to find this out? Thank you.

CodePudding user response:

Using collections.Counter, we can build a two-way map of counters: one from each name to the IDs it shared an entry with, and one from each ID to the names it shared an entry with. This data is prepared once and can then be queried with a name (or an ID) to find the element on the other side that it appeared with most often.

from collections import Counter

data = {('Amelia', 'James', 'Noah'):('Iota', 'Epsilon', 'Gamma'),
('James', 'Lucas', 'Elijah'):('Beta', 'Theta', 'Eta'),
('Harper', 'Emma', 'Ava'):('Eta', 'Iota', 'Delta'),
('Harper', 'James', 'Amelia'):('Gamma', 'Delta', 'Epsilon'),
('Olivia', 'James', 'Liam'):('Zeta', 'Gamma', 'Eta'),
('Oliver', 'Charlotte', 'Evelyn'):('Iota', 'Alpha', 'Eta'),
('Elijah', 'Oliver', 'James'):('Gamma', 'Zeta', 'Epsilon'),
('Ethan', 'Harper', 'Emma'):('Alpha', 'Epsilon', 'Delta')}


def prepare_values(data):
    """Prepare a counter both ways, key to value and value to key."""
    relation_data = {}
    for key, value in data.items():
        for k in key:
            for v in value:
                # Count each name/ID pair that shares an entry, in both directions.
                relation_data.setdefault(k, Counter())[v] += 1
                relation_data.setdefault(v, Counter())[k] += 1

    return relation_data


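def find_likelyhood(data, name, with_count=False):
    """Return the element that shared the most entries with name."""
    # A Counter returns 0 for a missing key, so unrelated elements never win.
    max_value = max(data, key=lambda x: data[x][name])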
    if with_count:
        return max_value, data[max_value][name]
    return max_value


prepared_data = prepare_values(data)
print(find_likelyhood(prepared_data, 'James')) # Gamma
print(find_likelyhood(prepared_data, 'Harper')) # Delta
print(find_likelyhood(prepared_data, 'Delta', with_count=True))  # ('Harper', 3)
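
If you want a ranked list instead of only the single best match, Counter.most_common can do the lookup directly. Below is a minimal sketch of that idea, not a definitive implementation. The names build_cooccurrence, top_partners and appearances are made up for this example, and the "share" value (co-occurrences divided by the number of entries the element appears in) is just one possible way to express likelihood. It also assumes that no string is used both as a name and as an ID.

```python
from collections import Counter, defaultdict

data = {('Amelia', 'James', 'Noah'): ('Iota', 'Epsilon', 'Gamma'),
        ('James', 'Lucas', 'Elijah'): ('Beta', 'Theta', 'Eta'),
        ('Harper', 'Emma', 'Ava'): ('Eta', 'Iota', 'Delta'),
        ('Harper', 'James', 'Amelia'): ('Gamma', 'Delta', 'Epsilon'),
        ('Olivia', 'James', 'Liam'): ('Zeta', 'Gamma', 'Eta'),
        ('Oliver', 'Charlotte', 'Evelyn'): ('Iota', 'Alpha', 'Eta'),
        ('Elijah', 'Oliver', 'James'): ('Gamma', 'Zeta', 'Epsilon'),
        ('Ethan', 'Harper', 'Emma'): ('Alpha', 'Epsilon', 'Delta')}


def build_cooccurrence(data):
    """Return (relation, appearances).

    relation[x][y] is the number of entries in which x and y appear on
    opposite sides; appearances[x] is the number of entries containing x.
    """
    relation = defaultdict(Counter)
    appearances = Counter()
    for names, ids in data.items():
        # dict.fromkeys drops duplicates inside a tuple but keeps the order.
        names = list(dict.fromkeys(names))
        ids = list(dict.fromkeys(ids))
        for element in names + ids:
            appearances[element] += 1
        for name in names:
            for id_ in ids:
                relation[name][id_] += 1
                relation[id_][name] += 1
    return relation, appearances


def top_partners(relation, appearances, element, n=3):
    """Return up to n (partner, count, share) tuples, most frequent first."""
    partners = relation.get(element, Counter())
    return [(partner, count, count / appearances[element])
            for partner, count in partners.most_common(n)]


relation, appearances = build_cooccurrence(data)
print(top_partners(relation, appearances, 'James', n=2))
# [('Gamma', 4, 0.8), ('Epsilon', 3, 0.6)]
print(top_partners(relation, appearances, 'Delta', n=1))
# [('Harper', 3, 1.0)]
```

The share puts the raw count in context: 'Gamma' turns up in 4 of the 5 entries that contain 'James', while 'Harper' is present in every entry that contains 'Delta'. Keep in mind that elements with equal counts are returned in the order they were first seen, so ties beyond the top results are not meaningful.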