Home > Mobile >  dictionary like data structure with ordered keys and selection between interval of key values
dictionary like data structure with ordered keys and selection between interval of key values

Time:10-19

Say I want to store ordered values where the key values represent a lower bound. Like this example:

d = {1: "pear", 4: "banana", 7: "orange"}

I can access the first object by d[1]. Say I want to store it so that I can access the first object "pear" by calling for any value between [1,4). If I input any "keyvalue" between [4,7) I want "banana" to be returned. Is there any type of data structure like that in python? I found intervalTrees, but it looked a bit more advanced than what I was looking for. In intervalTrees the intervals which are the keys, can be overlapping and I don't want that. Or maybe it is not a dictionary of any type I want since you can mix datatypes as keys in one dictionary. What do you think?

EDIT: From the tip I got, this would be a working code:

import bisect

d = [(1, "pear"), (4, "banana"), (7,"orange") ]
keys = [j[0] for j in d]

for v in range(1,10):
    print("Using input ", v)
    i = bisect.bisect(keys, v) - 1
    out = d[i]
    print(out)
    print("")

# Or using SortedDict
from sortedcontainers import SortedDict

d2 = SortedDict()
d2[1] = 'pear'
d2[4] = 'banana'
d2[7] = 'orange'

for v in range(1,10):
    print("Using input ", v)
    i = bisect.bisect(d2.keys(), v) - 1
    j = d2.keys()[i]
    out = d2[j]
    print(out)
    print("")

CodePudding user response:

The data structure you're looking for is a binary search tree (BST), and preferably a balanced BST. Your dictionary keys are the keys of the BST, and each node would just have an additional field to store the corresponding value. Then your lookup is just a lower-bound / bisect-left on the keys. Looking up Python implementations for Red-Black trees or AVL trees returns many possible packages.

There is no builtin library for always-sorted data. If you never need to add or delete keys, you can use bisect with (key, value) tuples in a sorted list.

For a pure Python implementation that allows modification, I would recommend checking out SortedDict from the SortedContainers library. It's built to be a drop-in replacement for BST's, is very usable and tested, and claims to outperform pointer-based BST's in memory and speed on reasonably sized datasets (but does not have the same asymptotic guarantees as a BST). You can also provide a custom key for comparing objects of different types.

  • Related