Home > OS >  What is the time complexity of a HashMap implemented using dynamic array?
What is the time complexity of a HashMap implemented using dynamic array?

Time:09-23

They say HashMap's put and get operations have a constant time complexity, is it still going to be O(1) if it is implemented with a dynamic array?

ex:

 public class HashMap <key, value>{
    private class Entry <k, v>{
        private k key;
        private v value;

        public Entry(k key, v value){
            this.key = key;
            this.value = value;
        }
    }

    private ArrayList < LinkedList<Entry<key, value>> > = new ArrayList<>();
    
    // the rest of the implementation
    // ...

}

CodePudding user response:

HashMap already uses a dynamic array:

/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;

Using an ArrayList instead of a manually resized array does not increase the time complexity.

CodePudding user response:

The issue for puts is how often you're going to need to extend the ArrayList.

This is probably little different in overhead to extending a plain array: you have to allocate a new one and rehash.

Note you'll need to know the intended ArrayList size in order to compute the hash index (as hash code % array size), so you should allocate the ArrayList with that capacity initially and then populate with nulls - since the array elements don't exist until added to the list.

Similarly for when you rehash.

You can of course do it wrong: you could compute a size for use in computing the hash index, and then not extend the ArrayList accordingly. Then you'd suffer from arbitrary need to extend the ArrayList whenever you had an index higher than one you'd seen before, which may require reallocation internal to the ArrayList.

In short: there's no significant performance penalty for using an ArrayList if you do it in a reasonable way, but no particular benefit either.

CodePudding user response:

What is the time complexity of a HashMap implemented using dynamic array?

The short answer is: It depends on how you actually implement the put and get methods, and the rehashing. Assuming that you have gotten it right, then the complexity would be the same as with classic arrays.

However, a typical hash table implementation will not benefit from using a dynamic array.

  • In between resizes, a hash table has an array whose size is fixed. Entries are added and removed from buckets, but the number of buckets and their positions in the array do not change. Since the code is not changing the array size, there is no benefit in using dynamic arrays in between the resizes.

  • When the hash table resizes, it needs to create a new array (typically about twice the size of the current one). Then it recompute the entry -> bucket mappings and redistributes the entries. Note that a new array is required, since it is not feasible to redistribute the entries "in place". Secondly, the size of the new array will be known (and fixed) when it is allocated. So again there is no benefit here from using a dynamic array.

Add to this that for all primitive operations on an array, the equivalent operations for a dynamic array (i.e. ArrayList) have a small performance overhead. So there will be a small performance hit from using a dynamic array in a hash table implementation.


Finally, you need to be careful when talking about complexity of hash table implementations. The average complexity of HashMap.put (and similar) is O(1) amortized.

  • A single put operation may be O(N) if it triggers a resize.

  • If the hash function is pathological, all operations can be O(N).

  • If you choose an inappropriate load factor, performance will suffer.

  • If you implement an incorrect resizing policy then performance will suffer.

(Amortized means averaged over all similar operations on the table. For example, N insertions with different keys into an empty table is O(N) ... or O(1) amortized per insertion.)

In short: the complexity of hash tables is ... complex.

  • Related