Python: Obtaining index of an element within a value list of a dictionary-CodePudding

I have a dictionary with key:value list pairings, and I intend to find the index of the value list that contains the desired element.

E.g., if the dictionary is:

my_dict = {"key1":['v1'], "key2":None, "key3":['v2','v3'], "key4":['v4','v5','v6']}

Then, given element 'v2' I should be able to get index 2.

For a value list with one element, the index can be obtained with: list(my_dict.values()).index(['v1']) , however this approach does not work with lists containing multiple elements.

Using for loop, it can be obtained via:

for key, value in my_dict.items():
  if value is None:
    continue
  if 'v2' in value:
    print (list(my_dict.keys()).index(key))

Is there a neater (pythonic) way to obtain the same?

CodePudding user response：

If you are looking for the key corresponding to a value, you can reverse the dictionary like so:

reverse_dict = {e: k for k, v in my_dict.items() if v for e in v}

Careful with duplicate values though. The last occurence will override the previous ones.

CodePudding user response：

Don't know if it's the best solution but this works:

value = 'v2'
list(map(lambda x : value in x, list(map(lambda x : x[1] or [], list(my_dict.items()))))).index(True)

CodePudding user response：

You've got an XY problem. You want to know the key that points to a value, and you think you need to find the enumeration index iterating the values so you can then use it to find the key by iteration as well. You don't need all that. Just find the key directly:

my_dict = {"key1":['v1'], "key2":None, "key3":['v2','v3'], "key4":['v4','v5','v6']}

value = 'v2'
# Iterate key/vals pairs in genexpr; if the vals contains value, yield the key,
# next stops immediately for the first key yielded so you don't iterate the whole dict
# when the value is found on an early key
key_for_value = next(key for key, vals in my_dict.items() if vals and value in vals)
print(key_for_value)

Try it online!

That'll raise StopIteration if the value doesn't exist, otherwise it directly retrieves the first key where the values list for that key contains the desired value.

If you don't really have an XY problem, and the index is important (it shouldn't be, that's a misuse of dicts) it's trivial to produce it as well, changing the extraction of the key to get both, e.g.:

index, key_for_value = next((i, key) for i, (key, vals) in enumerate(my_dict.items()) if vals and value in vals)

Mind you, this is a terrible solution if you need to perform these lookups a lot and my_dict isn't trivially small; it's O(n) on the total number of values, so a large dict would take quite a while to check (relative to the cost of just looking up an arbitrary key, which is average-case O(1)). In that case, ideally, if my_dict doesn't change much/at all, you'd construct a reversed dictionary up-front to find the key(s) associated with a value, e.g.:

from collections import defaultdict

my_dict = {"key1":['v1'], "key2":None, "key3":['v2','v3'], "key4":['v4','v5','v6']}

reversed_my_dict = defaultdict(set)
for key, vals in my_dict:
    for val in vals:
        reversed_my_dict[val].add(key)
reversed_my_dict = dict(reversed_my_dict)  # Optional: Prevents future autovivification of keys
                                           # by converting back to plain dict

after which you can cheaply determine the key(s) associated with a given value with:

reversed_my_dict.get(value, ()) # Using .get prevents autovivification of keys even if still a defaultdict

which returns the set of all keys that map to that value, if any, or the empty tuple if not (if you convert back to dict above, reversed_my_dict[value] would also work if you'd prefer to get a KeyError when the value is missing entirely; leaving it a defaultdict(set) would silently construct a new empty set, map it to the key and return it, which is fine if this happens rarely, but a problem if you test thousands of unmapped values and create a corresponding thousands of empty sets for no benefit, consuming memory wastefully).

Which you choose depends on how big my_dict is (for small my_dict, O(n) work doesn't really matter that much), how many times you need to search it (fewer searches mean less gain from reversed dict), and whether it's regularly modified. For that last point, if it's never modified, or rarely modified between lookups, rebuilding the reversed dict from scratch after each modification might be worth it for simplicity (assuming you perform many lookups per rebuild); if it's frequently modified, the reversed dict might still be worth it, you'd just have to update both the forward and reversed dicts rather than just one, e.g., expanding:

# New key
my_dict[newkey] = [newval1, newval2]

# Add value
my_dict[existingkey].append(newval)

# Delete value
my_dict[existingkey].remove(badval)

# Delete key
del my_dict[existingkey]

to:

# New key
newvals = my_dict[newkey] = [newval1, newval2]
for newval in newvals:
    reversed_my_dict[newval].add(newkey)  # reversed_my_dict.setdefault(newval, set()).add(newkey) if not defaultdict(set) anymore

# Add value
my_dict[existingkey].append(newval)
reversed_my_dict[newval].add(existingkey) # reversed_my_dict.setdefault(newval, set()).add(existingkey) if not defaultdict(set) anymore

# Delete value
my_dict[existingkey].remove(badval)
if badval not in my_dict[existingkey]:  # Removed last copy; test only needed if one key can hold same value more than once
    reversed_my_dict[badval].discard(existingkey)
    # Optional delete badval from reverse mapping if last key removed:
    if not reversed_my_dict[badval]:
        del reversed_my_dict[badval]

# Delete key
# set() conversion not needed if my_dict's value lists guaranteed not to contain duplicates
for badval in set(my_dict.pop(existingkey)):
    reversed_my_dict[badval].discard(existingkey)
    # Optional delete badval from reverse mapping if last key removed:
    if not reversed_my_dict[badval]:
        del reversed_my_dict[badval]

respectively, roughly doubling the work incurred by modifications, in exchange for always getting O(1) lookups in either direction.