Most Recently Used Index - DS&A / Interview Question-CodePudding

Recently got the following question on an interview with FAANG. Could not understand for the life of me the optimal solution that the interviewer described, and after struggling with it for awhile, I've come here for help. The question is:

Design a most recently used index retriever. The retriever will be fed (in a stream) integers, and your class has to return the most recently used index for all integers that have been fed.

Example:

r = Retriever()
r.stream(1)
r.stream(2)
r.get_index(1) # should return 1
r.get_index(2) # should return 0 since it's the most recently used
r.stream(1) 
r.get_index(1) # should return 0 since it was most recently used in the previous line

Pretty simple, right? The tough part that I'm struggling with, is how to get the optimal solution, which apparently can implement both stream and get_index in O(logN) time.

All I have going on is the fact that the interviewer mentioned that it is possible with a Binary Tree.

CodePudding user response：

Keep track of the current "time". The "time" is just a counter that increments by 1 when an integer is fed.
Maintain a map: integer -> time last fed.

A map can be implemented either as a hashmap, or as a binary tree.

Below is an implementation of class Retriever in python. The map is a python dict, which is a hashmap behind the scene.

class Retriever:
    def __init__(self):
        self.current_time = 0
        self.time_last_seen = {}
    def feed(self, x):
        self.current_time  = 1
        self.time_last_seen[x] = self.current_time
    def get_index(self, x):
        if x in self.time_last_seen:
            return self.current_time - self.time_last_seen[x]
        else:
            raise KeyError(x)

r = Retriever()
r.feed(1)
r.feed(2)
print( r.get_index(1) ) #  1
print( r.get_index(2) ) #  0
r.feed(1) 
print( r.get_index(1) ) #  0

CodePudding user response：

Implement a balanced, threaded binary tree (like AVL) and enrich each node with a size attribute that should always represent the number of nodes in its subtree (including itself). Unlike with binary search trees, an insertion always happens at the left-most leaf, where the new node becomes its left child. This way that new node has index 1 at that moment. By rebalancing (e.g. AVL rotations) the tree remains balanced.

Accompany this with a hash map that maps a value to a node in the tree. If a value is streamed again, this map will point to the node, which can then be removed (using AVL mechanics -- don't forget size updates in the ancestor nodes). After this removal, it is inserted again as would happen with a new value.

To get the node at a given index, one can walk down the tree, using the size information to choose the correct child, and eventually find the right node and corresponding value.