Home > Back-end >  What is the Complexity of this Hashmap method
What is the Complexity of this Hashmap method

Time:07-04

I know that the insertion in HashMap takes O(1) time Complexity, so for inserting 'n' elements the complexity should be O(n). I have a little doubt about the below method. Please take a look.

    private static Map<Character,Character> mapCharacters(String message){
        Map<Character, Character> map = new HashMap<>();
        char idx = 'a';
        for (char ch : message.toCharArray()) { // a loop - O(n)
            if (!map.containsKey(ch)) {         // containsKey - take O(n) to check the whole map
                map.put(ch, idx);
                  idx;
            }
        }
        return map;
    }

What I am doing is to map the decrypted message with the sequential abcd.

So, My confusion is whether the complexity of the above method is O(n^2) or O(n). Please help me to clear my doubt.

CodePudding user response:

It's amortised O(n). containsKey and put are both amortised O(1); see What is the time complexity of HashMap.containsKey() in java? in particular.

CodePudding user response:

The time complexity of both containsKey() and put() will depend on the implementation of the equals/hashCode contract of the object that is used as a key.

Under the hood, HashMap maintains an array of buckets. And each bucket corresponds to a range of hashes. Each non-empty bucket contains nodes (that with the map-entry data) that form either a list or a tree (since Java 8).

The process of determining the right bucket is almost instant, unless the process of computing the hash is heavy, but anyway it's considered to have a constant time complexity O(1). Accessing the bucket (i.e. accessing array element) is also O(1), but then things can become tricky. Assuming that bucket contains only a few elements, checking them will not significantly increase the total cost (while performing both containsKey() and put() we should check whether provided key exists in the map) and overall time-complexity will be O(1).

But in the case when all the map-entries for some reason ended up in the same bucket then iterating over the list will result in O(n), and if nodes are stored as a Red-black tree the worse case cost of both containsKey() and put() will be O(log n).

In case Character as a key we have a proper implementation of the equals/hashCode. Proper hash function is important to ensure that objects would be evenly spread among buckets. But if you were using instead a custom object which hashCode() is broken, for instance always returning the same number, then all entries will end up in the same bucket and performance will be poor.

Collisions, situations when different objects produces hashes in the range that is being mapped to the same bucket, are inevitable, we need a good hash function to make the number of collisions as fewer as possible, so that bucket would be "evenly" occupied and the map get expanded when the number of non-empty buckets exceeds load factor.

The time complexity of this particular method is linear O(n). But might not be the case if you would change the type of key.

CodePudding user response:

Given your original code below:

private static Map<Character,Character> mapCharacters(String message){
    Map<Character, Character> map = new HashMap<>();
    char idx = 'a';
    for (char ch : message.toCharArray()) {
        if (!map.containsKey(ch)) {
            map.put(ch, idx);
              idx;
        }
    }
    return map;
}

Here are some observations about the time complexity:

  • Iterating each character in message will be O(n) – pretty clearly if there are 10 characters it takes 10 steps, 10,000 characters will take 10,000 steps. The time complexity is directly proportional to n.
  • Inside the loop, you're checking if the map contains that character already – using containsKey(), this is a constant-time operation which is defined as O(1) time complexity. The idea is that the actual time to evaluate the result of containsKey() should be the same whether your map has 10 keys or 10,000 keys.
  • Inside the if, you have map.put() which is another constant-time operation, another O(1) operation.
  • Combined together: for every nth character, you spend time iterating it in the loop, checking if the map has it already, and adding it; all of this would be 3n (three times n) which is just O(n).

Separately, you could make a small improvement to your code by replacing this block:

if (!map.containsKey(ch)) {
    map.put(ch, idx);
      idx;
}

with something like below (which retains the same behavior of running idx only if you've put something, and specifically if the previous thing was either "not present" or perhaps had been set with "null" - though the "set as null" doesn't apply from your code sample):

Character previousValue = map.putIfAbsent(ch, idx);
if (previousValue == null) {
      idx;
}

Though the time complexity wouldn't change, using containsKey() makes the code clearer, and also takes advantage of well-tested, performant code that's available for free.

Under the covers, calling Map.putIfAbsent() is implemented like below (as of OpenJDK 17) showing an initial call to get() followed by put() if the key isn't present.

default V putIfAbsent(K key, V value) {
    V v = get(key);
    if (v == null) {
        v = put(key, value);
    }

    return v;
}
  • Related