Home > Software engineering >  How to know when to use a dictionary in cases where an array feels natural python
How to know when to use a dictionary in cases where an array feels natural python

Time:12-12

I was trying to solve the "Hash Tables: Ransom Note" Hackerrank challenge in python. I come from a MATLAB background so I am used to working with arrays. The question is "Given the words in the magazine and the words in the ransom note, print Yes if he can replicate his ransom note exactly using whole words from the magazine; otherwise, print No."

I have two solutions to the problem; the first using arrays/lists which is what I would naturally write (and is what the input is given as!), and the second using a dictionary which feels unnatural to me. The issue is that the first approach times-out on a couple of the examples (so the second runs much quicker, and is the expected solution).

def checkMagazine(magazine, note):
    # Using arrays
    for word in note:
        if word not in magazine:
            print("No")
            return
        else:
            magazine.remove(word)
    print("Yes")
    return
    
    # Using dictionaries
    dict = {}
    for word in magazine:
        dict[word] = dict.get(word,0)   1
    
    for word in note:
        if dict.get(word,0) == 0:
            print('No')
            return
        else:
            dict[word] -= 1
    print('Yes')
    return

Could someone give me pointers as to a) why the dictionary approach runs quicker and b) how to recognise problems where a dictionary would be the best solution.

CodePudding user response:

The key difference is that dictionaries are optimized for keyed access.

Locating a specific item in a list (e.g. doing word in magazine or magazine.remove(word)) is an O(n) operation that involves iterating over the entire list to look for the item.

Locating a specific item in a dictionary (e.g. dict.get(word) or dict[word] -= 1) is an O(1) operation.

Any time you have a collection of items where you will need to access a specific item by a particular identifier, a dictionary is the obvious choice. If you need to access items by position/order instead of identity (e.g. "pop the last item that was added"), a list is more appropriate.

CodePudding user response:

If magazine is a list, every word not in magazine takes O(n) time, because you have to walk through the entire list to see if word is there or not. (And that bound is tight; when the word is indeed not in the list, you have to check every value to verify that.)

It also takes O(n) time to build your dictionary, but after that each lookup takes only O(1) time, as the value itself can be used to determine where in the dict it is, instead of having to compare it to every value in the dict.

So use a dict when you'll be performing enough lookups that the time you save with O(1) lookups makes up for the time spend building the dict in the first place.

  • Related