I have a nested dictionary like this one:
my_dict[user_profile][user_id][level] = [[9999, 'Heavy Purchaser', 340, 'Star_chest', 999, 1000],
[9999, 'Heavy Purchaser', 340, 'Star_chest', 998, 5],
[9999, 'Heavy Purchaser', 340, 'Star_chest', 3, 1],
[9999, 'Heavy Purchaser', 340, 'Star_chest', 4, 1]]
Basically, for each user_profile and user_id, I'm collecting the rewards received per level.
The number of lists contained in my_dict[user_profile][user_id][level]
is variable, not fixed.
A reward looks like this: [9999, 'Heavy Purchaser', 340, 'Star_chest', 999, 1000]
I want to create a DataFrame of rewards using the fastest, most efficient solution. In the end this is what I want:
ID user_profile user_id Chest_type item_code amount
9999 'Heavy Purchaser' 340 'Star_chest' 999 1000
9999 'Heavy Purchaser' 340 'Star_chest' 4 1
9999 'Heavy Purchaser' 340 'Star_chest' 3 1
I tried appending each list with df.loc[df.shape[0]] = list_with_rewards, but it takes too much time. Any suggestions?
CodePudding user response:
The data that you are starting with is not a nested dictionary; it is just a nested list. You may want to consider transitioning to a nested dictionary, which would seem to make more sense for the type of data you are gathering, but that is another question. :)
In pandas, generally the last thing you want to do is add to a data frame row by row, or do anything row by row in general. If you look through the docs for DataFrame, there are several ways to create one from data, depending on the data structure or file type and the data orientation. Your data is a "list of lists" where each list can be interpreted as a "record", that is, one row in a data frame or database. So you can just use the from_records() constructor. Behold:
In [7]: import pandas as pd
In [8]: data = [[9999, 'Heavy Purchaser', 340, 'Star_chest', 999, 1000],
...: [9999, 'Heavy Purchaser', 340, 'Star_chest', 998, 5],
...: [9999, 'Heavy Purchaser', 340, 'Star_chest', 3, 1],
...: [9999, 'Heavy Purchaser', 340, 'Star_chest', 4, 1]]
In [9]: type(data)
Out[9]: list
In [10]: pd.DataFrame.from_records(data, columns=['ID', 'user', 'user_id', 'chest', 'count', 'amount'])
Out[10]:
ID user user_id chest count amount
0 9999 Heavy Purchaser 340 Star_chest 999 1000
1 9999 Heavy Purchaser 340 Star_chest 998 5
2 9999 Heavy Purchaser 340 Star_chest 3 1
3 9999 Heavy Purchaser 340 Star_chest 4 1
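If the rewards really are stored as my_dict[user_profile][user_id][level], as the question describes, the same idea still applies: gather every reward list into one flat list first, then build the data frame once. The following is only a sketch under that assumption. The sample dictionary, the level keys 1 and 2, and the column names (taken from the question's desired output) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample shaped like my_dict[user_profile][user_id][level];
# the level keys (1 and 2) are invented for this illustration.
my_dict = {
    "Heavy Purchaser": {
        340: {
            1: [
                [9999, "Heavy Purchaser", 340, "Star_chest", 999, 1000],
                [9999, "Heavy Purchaser", 340, "Star_chest", 998, 5],
            ],
            2: [
                [9999, "Heavy Purchaser", 340, "Star_chest", 3, 1],
                [9999, "Heavy Purchaser", 340, "Star_chest", 4, 1],
            ],
        }
    }
}

# Column names taken from the desired output in the question.
columns = ["ID", "user_profile", "user_id", "Chest_type", "item_code", "amount"]

# Walk the three dictionary levels and collect every reward into one flat list.
rows = [
    reward
    for users in my_dict.values()
    for levels in users.values()
    for rewards in levels.values()
    for reward in rewards
]

# Build the data frame in a single call instead of appending row by row.
df = pd.DataFrame.from_records(rows, columns=columns)
print(df)
```

Because the data frame is constructed only once, the cost of the repeated df.loc assignments is avoided; the nested comprehension just does plain Python list building.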