How to make a dictionary with two lists and index of lists?

I have two parallel lists of data like:

genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]

I want to get this result:

album = {"classic":{0:500, 2:150, 3:800}, "pop":{1:600, 4:2500}} # want to make

So I tried this code:

    album = dict.fromkeys(genres,dict())
    # album = {'classic': {}, 'pop': {}}

    for i in range(len(genres)):
        for key,value in album.items():
            if genres[i] == key:
                album[key].update({i:plays[i]})

The result for album is wrong. It looks like

{'classic': {0: 500, 1: 600, 2: 150, 3: 800, 4: 2500},
 'pop': {0: 500, 1: 600, 2: 150, 3: 800, 4: 2500}}

That is, every plays value was added for both of the genres, instead of being added only to the genre that corresponds to the number.

Why does this occur? How can I fix the problem?

CodePudding user response：

Try replacing album = dict.fromkeys(genres,dict()) with

album = {genre: {} for genre in genres}

The reason your dict.fromkeys call does not work is explained in the documentation for dict.fromkeys:

fromkeys() is a class method that returns a new dictionary. value defaults to None. All of the values refer to just a single instance, so it generally doesn’t make sense for value to be a mutable object such as an empty list. To get distinct values, use a dict comprehension instead.

That is, when you write album = dict.fromkeys(genres,dict()), album['classic'] and album['pop'] are the same object. Adding an item through either key therefore shows up under the other, because there is only one dictionary.
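One way to confirm the sharing is an identity check with `is`. The following is a minimal sketch; the names `shared` and `distinct` are only illustrative and do not come from the question.

```python
genres = ["classic", "pop", "classic", "classic", "pop"]

# fromkeys stores the one dict object it was given under every key.
shared = dict.fromkeys(genres, dict())
shared_same = shared["classic"] is shared["pop"]
print(shared_same)  # True

# The comprehension evaluates {} on every iteration, so the keys get separate dicts.
distinct = {genre: {} for genre in genres}
distinct_same = distinct["classic"] is distinct["pop"]
print(distinct_same)  # False
```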

Alternatively, you can use defaultdict and zip:

from collections import defaultdict

genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]

album = defaultdict(dict)
for i, (genre, play) in enumerate(zip(genres, plays)):
    album[genre][i] = play

print(dict(album))
# {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}

The dict(album) is redundant in most cases; you can use album like a dict.
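One case where the conversion can still matter is lookups of missing keys. The sketch below, using made-up keys "jazz" and "rock", shows that a defaultdict is a dict subclass but silently inserts an entry when a missing key is read, while a plain dict copy raises KeyError.

```python
from collections import defaultdict

album = defaultdict(dict)
album["pop"][1] = 600

# defaultdict is a subclass of dict, so ordinary dict operations work on it.
is_dict = isinstance(album, dict)
print(is_dict)  # True

# Caveat: merely reading a missing key inserts an empty dict for it.
album["jazz"]
keys_after_lookup = sorted(album)
print(keys_after_lookup)  # ['jazz', 'pop']

# A plain dict copy raises KeyError for a missing key instead.
plain = dict(album)
try:
    plain["rock"]
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```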

CodePudding user response：

Use:

d = {}
for i, genre in enumerate(genres):
    if genre in d:
        # genre already seen: add this index to its inner dict
        d[genre].update({i: plays[i]})
    else:
        # first time this genre appears: create its inner dict
        d[genre] = {i: plays[i]}

print(d)
# {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}

CodePudding user response：

There are two issues here. First, the for key,value in album.items(): loop is redundant: it scans every key only to find the one equal to genres[i], which album[genres[i]] would reach directly. This is inefficient, but it does not cause the wrong result, because the if check lets only the matching key through.

The important problem is that after album = dict.fromkeys(genres,dict()), the two values in album will be the same dictionary. dict() happens before the call to dict.fromkeys, and the resulting object is passed in. dict.fromkeys() uses that same object as the value for each key - it does not make a copy.
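The evaluation order can be made visible with a small counting factory. This is only a sketch; `make_dict` and `created` are hypothetical helpers introduced for the demonstration.

```python
created = []  # records every dict the factory builds


def make_dict():
    """Hypothetical factory: build a dict and remember that we did."""
    d = {}
    created.append(d)
    return d


keys = ["classic", "pop"]

# make_dict() runs once, before fromkeys is called; its result is reused for each key.
album = dict.fromkeys(keys, make_dict())
calls_fromkeys = len(created)
print(calls_fromkeys)  # 1

created.clear()

# In a comprehension the value expression runs once per key.
album = {k: make_dict() for k in keys}
calls_comprehension = len(created)
print(calls_comprehension)  # 2
```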

To solve this, use a dict comprehension to create the dictionary instead:

album = {g: {} for g in genres}
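For reference, here is a sketch of the code from the question with that one line changed and the redundant inner loop removed. It is one possible cleanup, not the only one.

```python
genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]

# One fresh dict per genre.
album = {g: {} for g in genres}

# genres[i] is already the key, so index album directly
# instead of scanning album.items() for a match.
for i in range(len(genres)):
    album[genres[i]][i] = plays[i]

print(album)
# {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}
```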

This is analogous to the problem in "List of lists changes reflected across sublists unexpectedly", except that instead of a list of lists it is a dict with dict values, and instead of creating the problematic data by multiplication we create it with a method call. The underlying logic is the same, and the natural solution works the same way.
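For comparison, the list version of the same pitfall looks roughly like this. The variable names `grid` and `fixed` are illustrative.

```python
# Multiplication repeats a reference to the same inner list.
grid = [[0] * 3] * 2
grid[0][0] = 1
print(grid)  # [[1, 0, 0], [1, 0, 0]]

# A comprehension builds a new inner list on every iteration.
fixed = [[0] * 3 for _ in range(2)]
fixed[0][0] = 1
print(fixed)  # [[1, 0, 0], [0, 0, 0]]
```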

Another approach is to create the key-value pairs in album only when they are first needed, by checking for their presence first.
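One compact way to write that check is dict.setdefault, which returns the existing value for a key or inserts the given default first. This is a sketch of that variant; an explicit "if genre not in album" test does the same job.

```python
genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]

album = {}
for i, (genre, play) in enumerate(zip(genres, plays)):
    # setdefault returns album[genre], inserting a new {} first if the key is missing.
    album.setdefault(genre, {})[i] = play

print(album)
# {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}
```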

Yet another approach is to use a tool that automates that on-demand creation, for example defaultdict from the standard library collections module. That looks like:

from collections import defaultdict
# other code until we get to:
album = defaultdict(dict)
# whenever we try `album[k].update(v)`, if there is not already an
# `album[k]`, it will automatically create `album[k] = dict()` first
# - with a new dictionary, created just then.

CodePudding user response：

@j1-lee answered it correctly, but in case you want to avoid defaultdict and use a plain dictionary, here is the code.

genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]

all_genres_plays = zip(genres, plays)
album = {}
for index, single_genre_play in enumerate(all_genres_plays):
    genre, play = single_genre_play
    if genre not in album:
        album[genre] = {}
    album[genre][index] = play

print(album)

output:

{'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}