I have two parallel lists of data like:
genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]
I want to get this result:
album = {"classic":{0:500, 2:150, 3:800}, "pop":{1:600, 4:2500}} # want to make
So I tried this code:
album = dict.fromkeys(genres,dict())
# album = {'classic': {}, 'pop': {}}
for i in range(len(genres)):
    for key,value in album.items():
        if genres[i] == key:
            album[key].update({i:plays[i]})
The result for album is wrong. It looks like:
{'classic': {0: 500, 1: 600, 2: 150, 3: 800, 4: 2500},
'pop': {0: 500, 1: 600, 2: 150, 3: 800, 4: 2500}}
That is, every plays value was added to both genres, instead of being added only to the genre at the same index.
Why does this occur? How can I fix the problem?
CodePudding user response:
Try replacing album = dict.fromkeys(genres,dict()) with:
album = {genre: {} for genre in genres}
The reason your dict.fromkeys call does not work is explained in the documentation:
"fromkeys() is a class method that returns a new dictionary. value defaults to None. All of the values refer to just a single instance, so it generally doesn't make sense for value to be a mutable object such as an empty list. To get distinct values, use a dict comprehension instead."
That is, when you write album = dict.fromkeys(genres,dict()), album['classic'] and album['pop'] are both the same object. Any item you add to one of them also appears in the other, because there is only one dictionary behind both keys.
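The sharing can be checked directly with the is operator. Below is a minimal sketch that reuses the genres list from the question; the names shared and distinct are only illustrative.

```python
genres = ["classic", "pop", "classic", "classic", "pop"]

# dict() is evaluated once, so every key points at the same dictionary.
shared = dict.fromkeys(genres, dict())
print(shared["classic"] is shared["pop"])  # True

# The comprehension evaluates {} anew for each item, so the values are separate objects.
distinct = {genre: {} for genre in genres}
print(distinct["classic"] is distinct["pop"])  # False

# Writing through one key of the shared version is visible through the other.
shared["classic"][0] = 500
print(shared)  # {'classic': {0: 500}, 'pop': {0: 500}}
```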
Alternatively, you can use defaultdict and zip:
from collections import defaultdict
genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]
album = defaultdict(dict)
for i, (genre, play) in enumerate(zip(genres, plays)):
    album[genre][i] = play
print(dict(album))
# {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}
The dict(album) call is redundant in most cases; you can use album like a regular dict.
CodePudding user response:
Use:
In [1059]: d = {}
In [1060]: for c,i in enumerate(genres):
      ...:     if i in d:
      ...:         d[i].update({c:plays[c]})
      ...:     else:
      ...:         d[i] = {c:plays[c]}
      ...:
In [1061]: d
Out[1061]: {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}
CodePudding user response:
There are two issues here. First, the for key,value in album.items(): loop is redundant: genres[i] is already a key of album, so album[genres[i]] can be used directly. It does not cause the wrong result, though, because dictionary keys are unique and the if test matches exactly one key on each pass.
The important problem is that after album = dict.fromkeys(genres,dict()), the two values in album will be the same dictionary. The dict() call is evaluated before the call to dict.fromkeys, and the resulting object is passed in. dict.fromkeys() uses that same object as the value for each key; it does not make a copy.
To solve this, use a dict comprehension to create the dictionary instead:
album = {g: {} for g in genres}
This is an analogous problem to List of lists changes reflected across sublists unexpectedly, except that instead of a list-of-lists it is a dict-with-dict values, and instead of creating the problematic data by multiplication we create it with a method. The underlying logic is the same, however, and the natural solution works in the same way as well.
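For comparison, here is a small sketch of the two cases side by side. The variable names and sample keys are made up for illustration.

```python
# List multiplication repeats a reference to one inner list.
rows = [[]] * 3
rows[0].append("x")
print(rows)  # [['x'], ['x'], ['x']]

# dict.fromkeys repeats a reference to one value object in the same way.
table = dict.fromkeys(["a", "b", "c"], [])
table["a"].append("x")
print(table)  # {'a': ['x'], 'b': ['x'], 'c': ['x']}

# Building each inner object inside a comprehension gives independent containers.
rows_ok = [[] for _ in range(3)]
table_ok = {key: [] for key in ["a", "b", "c"]}
rows_ok[0].append("x")
table_ok["a"].append("x")
print(rows_ok)   # [['x'], [], []]
print(table_ok)  # {'a': ['x'], 'b': [], 'c': []}
```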
Another approach is to create the key-value pairs in album only when they are first needed, by checking for their presence first.
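One way to sketch that check-first approach is with dict.setdefault, which returns the existing value for a key, or stores and returns the given default when the key is missing. The example uses the lists from the question and is one possible form, not the only one.

```python
genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]

album = {}
for i, genre in enumerate(genres):
    # setdefault stores the new {} only if genre is not yet a key,
    # then returns whichever dictionary is stored under that key.
    album.setdefault(genre, {})[i] = plays[i]

print(album)
# {'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}
```

An explicit if genre not in album: test does the same job in two lines and may read more clearly to some.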
Yet another approach is to use a tool that automates that on-demand creation - for example, defaultdict from the standard library collections module. That way looks like:
from collections import defaultdict
# other code until we get to:
album = defaultdict(dict)
# whenever we try `album[k].update(v)`, if there is not already an
# `album[k]`, it will automatically create `album[k] = dict()` first
# - with a new dictionary, created just then.
CodePudding user response:
@j1-lee answered it correctly, but in case you want to avoid defaultdict and use a plain dictionary, here is the code:
genres = ["classic", "pop", "classic", "classic", "pop"]
plays = [500, 600, 150, 800, 2500]
all_genres_plays = zip(genres, plays)
album = {}
for index, single_genre_play in enumerate(all_genres_plays):
    genre, play = single_genre_play
    if genre not in album:
        album[genre] = {}
    album[genre][index] = play
print(album)
output:
{'classic': {0: 500, 2: 150, 3: 800}, 'pop': {1: 600, 4: 2500}}