Home > Blockchain >  Data Structure Problem - Reverse Chronological Order
Data Structure Problem - Reverse Chronological Order

Time:04-09

I was asked a question in an interview related to the data structure.

Problem: Getting stream of data and it should be shown in the reverse chronological order. No duplicates.

1. Which data structure to use?
2. What would be your solution approach?

Example

 Data    Output
  
(first set of data)
 A B  ->  A B   
 
(new streamed data i.e. D E arrives)
 D E  ->  D E A B 

(new streamed data i.e. A F arrives)
 A F  ->  A F D E B 

Can someone please share some knowledge?

My approach:

Data structure: Array

Algorithm

  1. Create an empty array
  2. Insert the incoming data into the top of the list
  3. When new data arrives, do a linear search.
  4. If data is already present then remove it and insert the new data on top of the list.
  5. Repeat

CodePudding user response:

I think the best soluttion would be an Hash-map of pointer to linked-list Nodes, I'll explain:

The hash-map is used for searching in O(1) time,

In the hash-map, the keys will be the data, and the values will be the pointer to the node in the linked list with the value of the key.

With the hash-map, it's possible to find if an element is present in the current linked list (in O(1) time), and changing the position of an element in O(1) time (the positive of using a pointer to a node in a linked list)

Over all, the Update of the new data will be O(1) time and the data structures are O(n) space.

If something in the answer is unsatisfying, be sure to leave a comment :)

CodePudding user response:

For order reversal, the classical data structure is a stack (FILO = first in, last out). New entries are pushed on top of the stack. In the end, the stack entries are popped off in reverse order.

To prevent duplicates, one could use a hashset. Whenever a new entry arrives, its hash signature is computed. If the hash signature is present in the hash set, the entry is blocked as duplicate. Otherwise, the hash is registered as new member of the set and the entry is pushed to the stack.

Compared to linear search, the hashset is faster. But it requires extra memory.

CodePudding user response:

I think an easy solution for this would be to use a doubly linked list with a hashmap.

  1. O(1) check for an element's presence in the list
  2. If the element is already present, O(1) removal from the old position (no need to traverse the entire list since you have next and prev pointers in doubly linked list nodes) and O(1) insertion at the list head
  3. If the element isn't presented, O(1) insertion at the list head
  4. O(N) traversal to get all elements in order

CodePudding user response:

I would use an array of arrays to store the data stream, and hashset structure when showing the output with no duplicate.

1. Initialize empty array `x`.
2. While new stream `a` arrives, x.push(a)

# To show data with no duplicate,
3. Initialize an empty set `shown`.
4. Repeat while `x` is not empty:
   a = x.pop()
   For each element in a:
      if element not in shown:
         display(element)
         shown.add(element)

Reverse-chronological order is maintained by push and pop. And duplicates are skipped by the in-set check. Given that hashset inclusion test is O(1), the complexity is linear in the number of incoming elements.

This is what I thought, but maybe this is too naive. Comments are welcome.

  • Related